<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta content=
    "HTML Tidy for Windows (vers 1st February 2003), see www.w3.org"
          name="generator">
    <title>
      Quick Start
    </title>
    <meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
    <link rel="stylesheet" href="theme/style.css" type="text/css">
    </head>
  <body>
    <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
      <tr>
        <td width="10"></td>
        <td width="85%">
          <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Quick
          Start</b></font>
        </td>
        <td width="112">
          <a href="http://spirit.sf.net"><img src="theme/spirit.gif"
               width="112" height="48" align="right" border="0"></a>
        </td>
      </tr>
    </table><br>
    <table border="0">
      <tr>
        <td width="10"></td>
        <td width="30">
          <a href="../index.html"><img src="theme/u_arr.gif" border="0"></a>
        </td>
        <td width="30">
          <a href="introduction.html"><img src="theme/l_arr.gif" border="0">
          </a>
        </td>
        <td width="30">
          <a href="basic_concepts.html"><img src="theme/r_arr.gif" border="0">
          </a>
        </td>
      </tr>
    </table>
    <h2>
      <b>Why would you want to use Spirit?</b>
    </h2>
    <p>
      Spirit is designed to be a practical parsing tool. At the very least, the
      ability to generate a fully-working parser from a formal EBNF
      specification inlined in C++ significantly reduces development time.
      While it may be practical to use a full-blown, stand-alone parser such as
      YACC or ANTLR when we want to develop a computer language such as C or
      Pascal, it is certainly overkill to bring in the big guns when we wish to
      write extremely small micro-parsers. At that end of the spectrum,
      programmers typically approach the job at hand not as a formal parsing
      task but through ad hoc hacks using primitive tools such as
      <tt>scanf</tt>. True, there are tools such as regular-expression
      libraries (such as <a href=
      "http://www.boost.org/libs/regex/index.html">boost regex</a>) or scanners
      (such as <a href="http://www.boost.org/libs/tokenizer/index.html">boost
      tokenizer</a>), but these tools do not scale well when we need to write
      more elaborate parsers. Attempting to write even a moderately-complex
      parser using these tools leads to code that is hard to understand and
      maintain.
    </p>
    <p>
      One prime objective is to make the tool easy to use. When one thinks of a
      parser generator, the usual reaction is "it must be big and complex with
      a steep learning curve." Not so. Spirit is designed to be fully scalable.
      The framework is structured in layers. This permits learning on an
      as-needed basis, after only learning the minimal core and basic concepts.
    </p>
    <p>
      For development simplicity and ease in deployment, the entire framework
      consists of only header files, with no libraries to link against or
      build. Just put the Spirit distribution in your include path, compile, and
      run. Code size is very tight: in the quick start example that we shall
      present in a short while, the code size is dominated by the instantiation
      of the <tt>std::vector</tt> and <tt>std::iostream</tt>.
    </p>
    <h2>
    <b>Trivial Example #1</b></h2>
    <p>Create a parser that will parse
      a floating-point number.
    </p>
    <pre><code><font color="#000000">    </font></code><span class="identifier">real_p</span>
</pre>
<p>
      (You've got to admit, that's trivial!) The above code actually generates
      a Spirit <tt>real_parser</tt> (a built-in parser) which parses a floating
      point number. Take note that parsers that are meant to be used directly
      by the user end with "<tt>_p</tt>" in their names as a Spirit convention.
      Spirit has many pre-defined parsers and consistent naming conventions
      help you keep from going insane!
  </p>
    <h2>
      <b>Trivial Example #2</b></h2>
    <p>
      Create a parser that will accept a line consisting of two floating-point
      numbers.
    </p>
    
<pre><code><font color="#000000">    </font></code><code><span class=
"identifier">real_p</span> <span class=
      "special">&gt;&gt;</span> <span class="identifier">real_p</span></code>
</pre>
<p>
      Here you see the familiar floating-point numeric parser
      <code><tt>real_p</tt></code> used twice, once for each number. What's
      that <tt class="operators">&gt;&gt;</tt> operator doing in there? Well,
      the two parsers had to be separated by something, and this was chosen as
      the "followed by" sequence operator. The above expression creates a parser
      from two simpler parsers, gluing them together with the sequence operator.
      The result is a parser that is a composition of smaller parsers.
      Whitespace between numbers can implicitly be consumed depending on how
      the parser is invoked (see below).
  </p>
    <p>
      Note: when we combine parsers, we end up with a "bigger" parser, but it's
      still a parser. Parsers can get bigger and bigger, nesting more and more,
      but whenever you glue two parsers together, you end up with one bigger
      parser. This is an important concept.
    </p>
    <h2>
      <b>Trivial Example #3</b></h2>
    <p>
      Create a parser that will accept an arbitrary number of floating-point
      numbers. (Arbitrary means anything from zero to infinity.)
    </p>
    
<pre><code><font color="#000000">    </font></code><code><span class=
"special">*</span><span class="identifier">real_p</span></code>
</pre>
<p>
      This is like a regular-expression Kleene Star, though the syntax might
      look a bit odd for a C++ programmer not used to seeing the <tt class=
      "operators">*</tt> operator overloaded like this. Actually, if you know
      regular expressions it may look odd too since the star is <b>before</b>
      the expression it modifies. C'est la vie. Blame it on the fact that we
      must work with the syntax rules of C++.
  </p>
    <p>
      Any expression that evaluates to a parser may be used with the Kleene
      Star. Keep in mind, though, that due to C++ operator precedence rules you
      may need to put the expression in parentheses for complex expressions.
      The Kleene Star is also known as a Kleene Closure, but we call it the
      Star in most places.
    </p>
    <h3>
      <b><a name="list_of_numbers"></a> Example #4 [ A Just Slightly Less Trivial Example ]</b>
    </h3>
    <p>
 This example creates a parser that accepts a comma-delimited list of numbers and puts the numbers in a vector.
</p>
    <h4><strong> Step 1. Create the parser</strong></h4>
    <pre><code><font color="#000000">    </font></code><code><span class=
"identifier">real_p</span> <span class=
      "special">&gt;&gt;</span> <span class="special">*(</span><span class=
      "identifier">ch_p</span><span class="special">(</span><span class=
      "literal">','</span><span class="special">)</span> <span class=
      "special">&gt;&gt;</span> <span class=
      "identifier">real_p</span><span class="special">)</span></code>
</pre>
    <p>
      Notice <tt>ch_p(',')</tt>. It is a literal character parser that can
      recognize the comma <tt>','</tt>. In this case, the Kleene Star is
      modifying a more complex parser, namely, the one generated by the
      expression:
    </p>
    
    <pre><code><font color="#000000">    </font></code><code><span class=
      "special">(</span><span class="identifier">ch_p</span><span class=
      "special">(</span><span class="literal">','</span><span class=
      "special">)</span> <span class="special">&gt;&gt;</span> <span class=
      "identifier">real_p</span><span class="special">)</span></code>
</pre>
<p>
      Note that this is a case where the parentheses are necessary. The Kleene
      star encloses the complete expression above.
  </p>
    <h4>
      <b><strong>Step 2. </strong>Using a Parser (now that it's created)</b></h4>
    <p>
      Now that we have created a parser, how do we use it? As with any C++
      temporary object, we can either store it in a variable or use it
      directly in a function call.
    </p>
    <p>
      We'll gloss over some low-level C++ details and just get to the good
      stuff.
    </p>
    <p>
      If <b><tt>r</tt></b> is a rule, we can store the parser in it. (Don't
      worry about exactly what rules are for now; they are discussed later.
      Suffice it to say that a rule is a placeholder variable that can hold a
      parser.) The assignment looks like this:
    </p>
    
<pre><code><font color="#000000">    </font></code><code><font color="#000000"><span class=
      "identifier">r</span> <span class="special">=</span> <span class=
      "identifier">real_p</span> <span class=
      "special">&gt;&gt; *(</span><span class=
      "identifier">ch_p</span><span class="special">(</span><span class=
      "literal">','</span><span class="special">) &gt;&gt;</span> <span class=
      "identifier">real_p</span><span class="special">);</span></font></code>
</pre>
<p>
      Not too exciting, just an assignment like any other C++ expression you've
      used for years. The cool thing about storing a parser in a rule is this:
      a rule is itself a parser, and now you can refer to it <b>by name</b>.
      (In this case the name is <tt><b>r</b></tt>.) Notice that this is now a
      full assignment statement, so we terminate it with a semicolon,
      "<tt>;</tt>".
  </p>
    <p>
      That's it. We're done with defining the parser. So the next step is now
      invoking this parser to do its work. There are a couple of ways to do
      this. For now, we shall use the free <tt>parse</tt> function that takes
      in a <tt>char const*</tt>. The function accepts three arguments:
    </p>
    <blockquote>
      <p>
        <img src="theme/bullet.gif" width="12" height="12"> The null-terminated
        <tt>char const*</tt> input<br>
         <img src="theme/bullet.gif" width="12" height="12"> The parser
        object<br>
         <img src="theme/bullet.gif" width="12" height="12"> Another parser
        called the <b>skip parser</b>
      </p>
    </blockquote>
    <p>
      In our example, we wish to skip spaces and tabs. Another parser named
      <tt>space_p</tt> is included in Spirit's repertoire of predefined
      parsers. It is a very simple parser that recognizes whitespace. We
      shall use <tt>space_p</tt> as our skip parser. The skip parser is the one
      responsible for skipping characters in between parser elements such as
      the <tt>real_p</tt> and the <tt>ch_p</tt>.
    </p>
    <p>
      Ok, so now let's parse!
    </p>
    
<pre><code><font color="#000000">    </font></code><code><font color="#000000"><span class=
"identifier">r</span> <span class="special">=</span> <span class=
"identifier">real_p</span> <span class=
      "special">&gt;&gt;</span> <span class="special">*(</span><span class=
      "identifier">ch_p</span><span class="special">(</span><span class=
      "literal">','</span><span class="special">)</span> <span class=
      "special">&gt;&gt;</span> <span class=
      "identifier">real_p</span><span class="special">);
</span> <span class="identifier">   parse</span><span class=
"special">(</span><span class="identifier">str</span><span class=
"special">,</span> <span class="identifier">r</span><span class=
"special">,</span> <span class="identifier">space_p</span><span class=
"special">)</span> <span class=
"comment">// Not a full statement yet, patience...</span></font></code>
</pre>
<p>
      The parse function returns an object (called <tt>parse_info</tt>) that
      holds, among other things, the result of the parse. In this example, we
      need to know:
  </p>
    <blockquote>
      <p>
        <img src="theme/bullet.gif" width="12" height="12"> Did the parser
        successfully recognize the input <tt>str</tt>?<br>
         <img src="theme/bullet.gif" width="12" height="12"> Did the parser
        <b>fully</b> parse and consume the input up to its end?
      </p>
    </blockquote>
    <p>
      To get a complete picture of what we have so far, let us also wrap this
      parser inside a function:
    </p>
    
<pre><code><font color="#000000">    </font></code><code><font color="#000000"><span class=
"keyword">bool
</span> <span class="identifier">   parse_numbers</span><span class=
"special">(</span><span class="keyword">char</span> <span class=
"keyword">const</span><span class="special">*</span> <span class=
"identifier">str</span><span class="special">)
    {
</span> <span class="keyword">       return</span> <span class=
"identifier">parse</span><span class="special">(</span><span class=
"identifier">str</span><span class="special">,</span> <span class=
"identifier">real_p</span> <span class=
      "special">&gt;&gt;</span> <span class="special">*(</span><span class=
      "literal">','</span> <span class="special">&gt;&gt;</span> <span class=
      "identifier">real_p</span><span class="special">),</span> <span class=
      "identifier">space_p</span><span class="special">).</span><span class=
      "identifier">full</span><span class="special">;
    }</span></font></code>
</pre>
<p>
      Note in this case we dropped the named rule and inlined the parser
      directly in the call to parse. Upon calling parse, the expression
      evaluates into a temporary, unnamed parser which is passed into the
      parse() function, used, and then destroyed.
  </p>
    <table border="0" width="80%" align="center">
      <tr>
        <td class="note_box">
          <img src="theme/note.gif" width="16" height="16"><b>char and wchar_t
          operands</b><br>
          <br>
           The careful reader may notice that the parser expression has
          <tt class="quotes">','</tt> instead of <tt>ch_p(',')</tt> as the
          previous examples did. This is OK because of the C++ rules for
          operator overloading. There are <tt>&gt;&gt;</tt> operators that are
          overloaded to accept a <tt>char</tt> or <tt>wchar_t</tt> argument on
          their left or right (but not both). An operator may be overloaded if
          at least one of its parameters is a user-defined type. In this case, the
          <tt>real_p</tt> is the 2nd argument to <tt>operator<span class=
          "operators">&gt;&gt;</span></tt>, and so the proper overload of
          <tt class="operators">&gt;&gt;</tt> is used, converting
              <tt class="quotes">','</tt> into a character literal parser.<br>
          <br>
           The problem with omitting the <tt>ch_p</tt> call should be obvious:
          <tt>'a' &gt;&gt; 'b'</tt> is <b>not</b> a spirit parser, it is a
          numeric expression, right-shifting the ASCII (or another encoding)
          value of <tt class="quotes">'a'</tt> by the ASCII value of
              <tt class="quotes">'b'</tt>. However, both <tt>ch_p('a') &gt;&gt;
              'b'</tt> and <tt>'a' &gt;&gt; ch_p('b')</tt> are Spirit sequence
              parsers for the letter <tt class="quotes">'a'</tt> followed by
              <tt class="quotes">'b'</tt>. You'll get used to it, sooner or
              later.
        </td>
      </tr>
    </table>
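    <p>
      The point about two plain characters can be checked at compile time
      using only the standard library. The sketch below inspects the type of a
      <tt>char &gt;&gt; char</tt> expression without evaluating it (evaluating
      <tt>'a' &gt;&gt; 'b'</tt> would shift by more bits than a typical
      <tt>int</tt> holds). The names <tt>shifted_type</tt> and
      <tt>char_shift_is_int</tt> are hypothetical, and a C++11 compiler is
      assumed.
    </p>

```cpp
#include <type_traits>
#include <utility>

// Type of the expression formed by right-shifting one char by another, the
// same operand types as in 'a' >> 'b'. decltype never evaluates its operand.
typedef decltype(std::declval<char>() >> std::declval<char>()) shifted_type;

// Both operands are promoted to int, so the result is a plain int and not
// a parser: no user-defined type is involved, so no overload can apply.
static_assert(std::is_same<shifted_type, int>::value,
              "char >> char is a built-in integer shift");

// The same fact as a value, in case it is needed at run time.
constexpr bool char_shift_is_int = std::is_same<shifted_type, int>::value;
```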
    <p>
      Take note that the object returned from the parse function has a member
      called <tt>full</tt>, which is true if both of our requirements above
      are met (i.e. the parser fully parsed the input).
    </p>
    <h4>
      <b> Step 3. Semantic Actions</b></h4>
    <p>
      Our parser above is really nothing but a recognizer. It answers the
      question <i class="quotes">"did the input match our grammar?"</i>, but it
      does not remember any data, nor does it have any side effects.
      Remember: we want to put the parsed numbers into a vector. This is done
      in an <b>action</b> that is linked to a particular parser. For example,
      whenever we parse a real number, we wish to store the parsed number after
      a successful match. We now wish to extract information from the parser.
      Semantic actions do this. Semantic actions may be attached to any point
      in the grammar specification. These actions are C++ functions or functors
      that are called whenever a part of the parser successfully recognizes a
      portion of the input. Say you have a parser <b>P</b>, and a C++ function
      <b>F</b>, you can make the parser call <b>F</b> whenever it matches an
      input by attaching <b>F</b>:
    </p>
    
<pre><code><font color="#000000">    </font></code><code><font color="#000000"><span class=
"identifier">P</span><span class="special">[&amp;</span><span class=
"identifier">F</span><span class="special">]</span></font></code>
</pre>
<p>
      Or if <b>F</b> is a function object (a functor):
  </p>
    
<pre><code><font color="#000000">    </font></code><code><font color="#000000"><span class=
"identifier">P</span><span class="special">[</span><span class=
"identifier">F</span><span class="special">]</span></font></code>
</pre>
<p>
      The function/functor signature depends on the type of the parser to which
      it is attached. The parser <tt>real_p</tt> passes a single argument: the
      parsed number. Thus, if we were to attach a function <b>F</b> to
      <tt>real_p</tt>, we need <b>F</b> to be declared as:
  </p>
    
<pre><code>    </code><code><span class=
"keyword">void</span> <span class="identifier">F</span><span class=
"special">(</span><span class="keyword">double</span> <span class=
"identifier">n</span><span class="special">);</span></code></pre>
<p>
      For our example however, again, we can take advantage of some predefined
      semantic functors and functor generators (<img src="theme/lens.gif"
         width="15" height="16"> A functor generator is a function that returns
         a functor). For our purpose, Spirit has a functor generator
         <tt>push_back_a(c)</tt>. In brief, this semantic action, when called,
         <b>appends</b> the parsed value it receives from the parser it is
         attached to, to the container <tt>c</tt>.
  </p>
    <p>
      Finally, here is our complete comma-separated list parser:
    </p>
    
<pre><code><font color="#000000">    </font></code><code><font color="#000000"><span class=
"keyword">bool
</span>    <span class="identifier">parse_numbers</span><span class=
"special">(</span><span class="keyword">char</span> <span class=
"keyword">const</span><span class="special">*</span> <span class=
"identifier">str</span><span class="special">,</span> <span class=
"identifier">vector</span><span class="special">&lt;</span><span class=
"keyword">double</span><span class=
      "special">&gt;&amp;</span> <span class="identifier">v</span><span class=
      "special">)
    {
</span>        <span class="keyword">return</span> <span class=
"identifier">parse</span><span class="special">(</span><span class=
"identifier">str</span><span class="special">,

</span>        <span class="comment">    //  Begin grammar
</span>        <span class="special">    (
</span>                <span class="identifier">real_p</span><span class=
"special">[</span><span class="identifier">push_back_a</span><span class=
"special">(</span><span class="identifier">v</span><span class=
"special">)]</span> <span class="special">&gt;&gt;</span> <span class=
"special">*(</span><span class="literal">','</span> <span class=
"special">&gt;&gt;</span> <span class=
      "identifier">real_p</span><span class="special">[</span><span class=
      "identifier">push_back_a</span><span class="special">(</span><span class=
      "identifier">v</span><span class="special">)])
            )
</span> <span class="special">           ,
</span>        <span class="comment">    //  End grammar

</span>        <span class="identifier">    space_p</span><span class=
"special">).</span><span class="identifier">full</span><span class="special">;
    }</span></font></code>
</pre>
<p>
      This is the same parser as above, this time with appropriate semantic
      actions attached at strategic places to extract the parsed numbers and
      store them in the vector <tt>v</tt>. The <tt>parse_numbers</tt> function
      returns true when the whole input is parsed successfully.
  </p>
    <p>
      <img src="theme/lens.gif" width="15" height="16"> The full source code
      can be <a href="../example/fundamental/number_list.cpp">viewed here</a>.
      This is part of the Spirit distribution.
    </p>
    <table border="0">
      <tr>
        <td width="10"></td>
        <td width="30">
          <a href="../index.html"><img src="theme/u_arr.gif" border="0"></a>
        </td>
        <td width="30">
          <a href="introduction.html"><img src="theme/l_arr.gif" border="0">
          </a>
        </td>
        <td width="30">
          <a href="basic_concepts.html"><img src="theme/r_arr.gif" border="0">
          </a>
        </td>
      </tr>
    </table><br>
    <hr size="1">
    <p class="copyright">
      Copyright &copy; 1998-2003 Joel de Guzman<br>
       Copyright &copy; 2002 Chris Uzdavinis<br>
      <br>
       <font size="2">Use, modification and distribution is subject to the
      Boost Software License, Version 1.0. (See accompanying file
      LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)</font>
    </p>
    <blockquote>&nbsp;
      
    </blockquote>
  </body>
</html>
Commit	Line	Data
7c673cae FG	1	<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
	2	<html>
	3	<head>
	4	<meta content=
	5	"HTML Tidy for Windows (vers 1st February 2003), see www.w3.org"
	6	name="generator">
	7	<title>
	8	Quick Start
	9	</title>
	10	<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
	11	<link rel="stylesheet" href="theme/style.css" type="text/css">
	12	</head>
	13	<body>
	14	<table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
	15	<tr>
	16	<td width="10"></td>
	17	<td width="85%">
	18	<font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Quick
	19	Start</b></font>
	20	</td>
	21	<td width="112">
	22	<a href="http://spirit.sf.net"><img src="theme/spirit.gif"
	23	width="112" height="48" align="right" border="0"></a>
	24	</td>
	25	</tr>
	26	</table><br>
	27	<table border="0">
	28	<tr>
	29	<td width="10"></td>
	30	<td width="30">
	31	<a href="../index.html"><img src="theme/u_arr.gif" border="0"></a>
	32	</td>
	33	<td width="30">
	34	<a href="introduction.html"><img src="theme/l_arr.gif" border="0">
	35	</a>
	36	</td>
	37	<td width="30">
	38	<a href="basic_concepts.html"><img src="theme/r_arr.gif" border="0">
	39	</a>
	40	</td>
	41	</tr>
	42	</table>
	43	<h2>
	44	<b>Why would you want to use Spirit?</b>
	45	</h2>
	46	<p>
	47	Spirit is designed to be a practical parsing tool. At the very least, the
	48	ability to generate a fully-working parser from a formal EBNF
	49	specification inlined in C++ significantly reduces development time.
	50	While it may be practical to use a full-blown, stand-alone parser such as
	51	YACC or ANTLR when we want to develop a computer language such as C or
	52	Pascal, it is certainly overkill to bring in the big guns when we wish to
	53	write extremely small micro-parsers. At that end of the spectrum,
	54	programmers typically approach the job at hand not as a formal parsing
	55	task but through ad hoc hacks using primitive tools such as
	56	<tt>scanf</tt>. True, there are tools such as regular-expression
	57	libraries (such as <a href=
	58	"http://www.boost.org/libs/regex/index.html">boost regex</a>) or scanners
	59	(such as <a href="http://www.boost.org/libs/tokenizer/index.html">boost
	60	tokenizer</a>), but these tools do not scale well when we need to write
	61	more elaborate parsers. Attempting to write even a moderately-complex
	62	parser using these tools leads to code that is hard to understand and
	63	maintain.
	64	</p>
65	<p>
66	One prime objective is to make the tool easy to use. When one thinks of a
67	parser generator, the usual reaction is "it must be big and complex with
68	a steep learning curve." Not so. Spirit is designed to be fully scalable.
69	The framework is structured in layers. This permits learning on an
70	as-needed basis, after only learning the minimal core and basic concepts.
71	</p>
72	<p>
73	For development simplicity and ease in deployment, the entire framework
74	consists of only header files, with no libraries to link against or
75	build. Just put the spirit distribution in your include path, compile and
76	run. Code size? -very tight. In the quick start example that we shall
77	present in a short while, the code size is dominated by the instantiation
78	of the <tt>std::vector</tt> and <tt>std::iostream</tt>.
79	</p>
80	<h2>
81	<b>Trivial Example #1</b></h2>
82	<p>Create a parser that will parse
83	a floating-point number.
84	</p>
85	<pre><code><font color="#000000"> </font></code><span class="identifier">real_p</span>
86	</pre>
87	<p>
88	(You've got to admit, that's trivial!) The above code actually generates
89	a Spirit <tt>real_parser</tt> (a built-in parser) which parses a floating
90	point number. Take note that parsers that are meant to be used directly
91	by the user end with "<tt>_p</tt>" in their names as a Spirit convention.
92	Spirit has many pre-defined parsers and consistent naming conventions
93	help you keep from going insane!
94	</p>
95	<h2>
96	<b>Trivial Example #2</b></h2>
97	<p>
98	Create a parser that will accept a line consisting of two floating-point
99	numbers.
100	</p>
101
102	<pre><code><font color="#000000"> </font></code><code><span class=
103	"identifier">real_p</span> <span class=
104	"special">>></span> <span class="identifier">real_p</span></code>
105	</pre>
106	<p>
107	Here you see the familiar floating-point numeric parser
108	<code><tt>real_p</tt></code> used twice, once for each number. What's
109	that <tt class="operators">>></tt> operator doing in there? Well,
110	they had to be separated by something, and this was chosen as the
111	"followed by" sequence operator. The above program creates a parser from
112	two simpler parsers, glueing them together with the sequence operator.
113	The result is a parser that is a composition of smaller parsers.
114	Whitespace between numbers can implicitly be consumed depending on how
115	the parser is invoked (see below).
116	</p>
117	<p>
118	Note: when we combine parsers, we end up with a "bigger" parser, But it's
119	still a parser. Parsers can get bigger and bigger, nesting more and more,
120	but whenever you glue two parsers together, you end up with one bigger
121	parser. This is an important concept.
122	</p>
123	<h2>
124	<b>Trivial Example #3</b></h2>
125	<p>
126	Create a parser that will accept an arbitrary number of floating-point
127	numbers. (Arbitrary means anything from zero to infinity)
128	</p>
129
130	<pre><code><font color="#000000"> </font></code><code><span class=
131	"special">*</span><span class="identifier">real_p</span></code>
132	</pre>
133	<p>
134	This is like a regular-expression Kleene Star, though the syntax might
135	look a bit odd for a C++ programmer not used to seeing the <tt class=
136	"operators">*</tt> operator overloaded like this. Actually, if you know
137	regular expressions it may look odd too since the star is <b>before</b>
138	the expression it modifies. C'est la vie. Blame it on the fact that we
139	must work with the syntax rules of C++.
140	</p>
141	<p>
142	Any expression that evaluates to a parser may be used with the Kleene
143	Star. Keep in mind, though, that due to C++ operator precedence rules you
144	may need to put the expression in parentheses for complex expressions.
145	The Kleene Star is also known as a Kleene Closure, but we call it the
146	Star in most places.
147	</p>
148	<h3>
149	<b><a name="list_of_numbers"></a> Example #4 [ A Just Slightly Less Trivial Example</b>
150	] </h3>
151	<p>
152	This example will create a parser that accepts a comma-delimited list of numbers and put the numbers in a vector.
153	</p>
154	<h4><strong> Step 1. Create the parser</strong></h4>
155	<pre><code><font color="#000000"> </font></code><code><span class=
156	"identifier">real_p</span> <span class=
157	"special">>></span> <span class="special">*(</span><span class=
158	"identifier">ch_p</span><span class="special">(</span><span class=
159	"literal">','</span><span class="special">)</span> <span class=
160	"special">>></span> <span class=
161	"identifier">real_p</span><span class="special">)</span></code>
162	</pre>
163	<p>
164	Notice <tt>ch_p(',')</tt>. It is a literal character parser that can
165	recognize the comma <tt>','</tt>. In this case, the Kleene Star is
166	modifying a more complex parser, namely, the one generated by the
167	expression:
168	</p>
169
170	<pre><code><font color="#000000"> </font></code><code><span class=
171	"special">(</span><span class="identifier">ch_p</span><span class=
172	"special">(</span><span class="literal">','</span><span class=
173	"special">)</span> <span class="special">>></span> <span class=
174	"identifier">real_p</span><span class="special">)</span></code>
175	</pre>
176	<p>
177	Note that this is a case where the parentheses are necessary. The Kleene
178	star encloses the complete expression above.
179	</p>
180	<h4>
181	<b><strong>Step 2. </strong>Using a Parser (now that it's created)</b></h4>
182	<p>
183	Now that we have created a parser, how do we use it? Like the result of
184	any C++ temporary object, we can either store it in a variable, or call
185	functions directly on it.
186	</p>
187	<p>
188	We'll gloss over some low-level C++ details and just get to the good
189	stuff.
190	</p>
191	<p>
192	If <b><tt>r</tt></b> is a rule (don't worry about what rules exactly are
193	for now. This will be discussed later. Suffice it to say that the rule is
194	a placeholder variable that can hold a parser), then we store the parser
195	as a rule like this:
196	</p>
197
198	<pre><code><font color="#000000"> </font></code><code><font color="#000000"><span class=
199	"identifier">r</span> <span class="special">=</span> <span class=
200	"identifier">real_p</span> <span class=
201	"special">>> *(</span><span class=
202	"identifier">ch_p</span><span class="special">(</span><span class=
203	"literal">','</span><span class="special">) >></span> <span class=
204	"identifier">real_p</span><span class="special">);</span></font></code>
205	</pre>
206	<p>
207	Not too exciting, just an assignment like any other C++ expression you've
208	used for years. The cool thing about storing a parser in a rule is this:
209	rules are parsers, and now you can refer to it <b>by name</b>. (In this
210	case the name is <tt><b>r</b></tt>). Notice that this is now a full
211	assignment expression, thus we terminate it with a semicolon,
212	"<tt>;</tt>".
213	</p>
214	<p>
215	That's it. We're done with defining the parser. So the next step is now
216	invoking this parser to do its work. There are a couple of ways to do
217	this. For now, we shall use the free <tt>parse</tt> function that takes
218	in a <tt>char const*</tt>. The function accepts three arguments:
219	</p>
220	<blockquote>
221	<p>
222	<img src="theme/bullet.gif" width="12" height="12"> The null-terminated
223	<tt>const char*</tt> input<br>
224	<img src="theme/bullet.gif" width="12" height="12"> The parser
225	object<br>
226	<img src="theme/bullet.gif" width="12" height="12"> Another parser
227	called the <b>skip parser</b>
228	</p>
229	</blockquote>
230	<p>
In our example, we wish to skip spaces and tabs. Spirit's repertoire of
predefined parsers includes one named <tt>space_p</tt>, a very simple
parser that recognizes whitespace. We shall use <tt>space_p</tt> as our
skip parser. The skip parser is responsible for skipping characters
between parser elements such as the <tt>real_p</tt> and the
<tt>ch_p</tt>.
237	</p>
238	<p>
239	Ok, so now let's parse!
240	</p>
241
<pre><code><font color="#000000"> </font></code><code><font color="#000000"><span class=
"identifier">r</span> <span class="special">=</span> <span class=
"identifier">real_p</span> <span class=
"special">&gt;&gt;</span> <span class="special">*(</span><span class=
"identifier">ch_p</span><span class="special">(</span><span class=
"literal">','</span><span class="special">)</span> <span class=
"special">&gt;&gt;</span> <span class=
"identifier">real_p</span><span class="special">);
</span> <span class="identifier"> parse</span><span class=
"special">(</span><span class="identifier">str</span><span class=
"special">,</span> <span class="identifier">r</span><span class=
"special">,</span> <span class="identifier">space_p</span><span class=
"special">)</span> <span class=
"comment">// Not a full statement yet, patience...</span></font></code>
</pre>
257	<p>
258	The parse function returns an object (called <tt>parse_info</tt>) that
259	holds, among other things, the result of the parse. In this example, we
260	need to know:
261	</p>
262	<blockquote>
263	<p>
264	<img src="theme/bullet.gif" width="12" height="12"> Did the parser
265	successfully recognize the input <tt>str</tt>?<br>
266	<img src="theme/bullet.gif" width="12" height="12"> Did the parser
267	<b>fully</b> parse and consume the input up to its end?
268	</p>
269	</blockquote>
270	<p>
271	To get a complete picture of what we have so far, let us also wrap this
272	parser inside a function:
273	</p>
274
<pre><code><font color="#000000"> </font></code><code><font color="#000000"><span class=
"keyword">bool
</span> <span class="identifier"> parse_numbers</span><span class=
"special">(</span><span class="keyword">char</span> <span class=
"keyword">const</span><span class="special">*</span> <span class=
"identifier">str</span><span class="special">)
{
</span> <span class="keyword"> return</span> <span class=
"identifier">parse</span><span class="special">(</span><span class=
"identifier">str</span><span class="special">,</span> <span class=
"identifier">real_p</span> <span class=
"special">&gt;&gt;</span> <span class="special">*(</span><span class=
"literal">','</span> <span class="special">&gt;&gt;</span> <span class=
"identifier">real_p</span><span class="special">),</span> <span class=
"identifier">space_p</span><span class="special">).</span><span class=
"identifier">full</span><span class="special">;
}</span></font></code>
</pre>
293	<p>
294	Note in this case we dropped the named rule and inlined the parser
295	directly in the call to parse. Upon calling parse, the expression
296	evaluates into a temporary, unnamed parser which is passed into the
297	parse() function, used, and then destroyed.
298	</p>
299	<table border="0" width="80%" align="center">
300	<tr>
301	<td class="note_box">
<img src="theme/note.gif" width="16" height="16"><b>char and wchar_t
operands</b><br>
<br>
The careful reader may notice that the parser expression has
<tt class="quotes">','</tt> instead of <tt>ch_p(',')</tt> as the
previous examples did. This is ok due to the C++ rules for operator
overloading. There are <tt>&gt;&gt;</tt> operators that are overloaded
to accept a <tt>char</tt> or <tt>wchar_t</tt> argument on their left or
right (but not both). An operator may be overloaded if at least one
of its parameters is a user-defined type. In this case, the
<tt>real_p</tt> is the 2nd argument to <tt>operator<span class=
"operators">&gt;&gt;</span></tt>, and so the proper overload of
<tt class="operators">&gt;&gt;</tt> is used, converting
<tt class="quotes">','</tt> into a character literal parser.<br>
<br>
The problem with omitting the <tt>ch_p</tt> call should be obvious:
<tt>'a' &gt;&gt; 'b'</tt> is <b>not</b> a Spirit parser; it is a
numeric expression, right-shifting the ASCII (or another encoding)
value of <tt class="quotes">'a'</tt> by the ASCII value of
<tt class="quotes">'b'</tt>. However, both <tt>ch_p('a') &gt;&gt;
'b'</tt> and <tt>'a' &gt;&gt; ch_p('b')</tt> are Spirit sequence
parsers for the letter <tt class="quotes">'a'</tt> followed by
<tt class="quotes">'b'</tt>. You'll get used to it, sooner or
later.
326	</td>
327	</tr>
328	</table>
329	<p>
Take note that the object returned from the parse function has a member
called <tt>full</tt>, which is true if both of our requirements above
are met (i.e. the parser fully parsed the input).
333	</p>
334	<h4>
335	<b> Step 3. Semantic Actions</b></h4>
336	<p>
Our parser above is really nothing but a recognizer. It answers the
question <i class="quotes">"did the input match our grammar?"</i>, but it
does not remember any data, nor does it produce any side effects.
Remember: we want to put the parsed numbers into a vector. This is done
in an <b>action</b> that is linked to a particular parser. For example,
whenever we parse a real number, we wish to store the parsed number after
a successful match. Semantic actions are how we extract information from
the parser, and they may be attached to any point in the grammar
specification. These actions are C++ functions or functors that are
called whenever a part of the parser successfully recognizes a portion of
the input. Say you have a parser <b>P</b> and a C++ function <b>F</b>;
you can make the parser call <b>F</b> whenever it matches an input by
attaching <b>F</b>:
350	</p>
351
<pre><code><font color="#000000"> </font></code><code><font color="#000000"><span class=
"identifier">P</span><span class="special">[&amp;</span><span class=
"identifier">F</span><span class="special">]</span></font></code>
</pre>
356	<p>
357	Or if <b>F</b> is a function object (a functor):
358	</p>
359
<pre><code><font color="#000000"> </font></code><code><font color="#000000"><span class=
"identifier">P</span><span class="special">[</span><span class=
"identifier">F</span><span class="special">]</span></font></code>
</pre>
364	<p>
365	The function/functor signature depends on the type of the parser to which
366	it is attached. The parser <tt>real_p</tt> passes a single argument: the
367	parsed number. Thus, if we were to attach a function <b>F</b> to
368	<tt>real_p</tt>, we need <b>F</b> to be declared as:
369	</p>
370
<pre><code> </code><code><span class=
"keyword">void</span> <span class="identifier">F</span><span class=
"special">(</span><span class="keyword">double</span> <span class=
"identifier">n</span><span class="special">);</span></code></pre>
375	<p>
For our example, however, we can again take advantage of some predefined
semantic functors and functor generators (<img src="theme/lens.gif"
width="15" height="16"> A functor generator is a function that returns
a functor). For our purpose, Spirit has a functor generator
<tt>push_back_a(c)</tt>. In brief, this semantic action, when called,
<b>appends</b> the parsed value it receives from the parser it is
attached to, to the container <tt>c</tt>.
383	</p>
384	<p>
385	Finally, here is our complete comma-separated list parser:
386	</p>
387
<pre><code><font color="#000000"> </font></code><code><font color="#000000"><span class=
"keyword">bool
</span> <span class="identifier">parse_numbers</span><span class=
"special">(</span><span class="keyword">char</span> <span class=
"keyword">const</span><span class="special">*</span> <span class=
"identifier">str</span><span class="special">,</span> <span class=
"identifier">vector</span><span class="special">&lt;</span><span class=
"keyword">double</span><span class=
"special">&gt;&amp;</span> <span class="identifier">v</span><span class=
"special">)
{
</span> <span class="keyword">return</span> <span class=
"identifier">parse</span><span class="special">(</span><span class=
"identifier">str</span><span class="special">,

</span> <span class="comment"> // Begin grammar
</span> <span class="special"> (
</span> <span class="identifier">real_p</span><span class=
"special">[</span><span class="identifier">push_back_a</span><span class=
"special">(</span><span class="identifier">v</span><span class=
"special">)]</span> <span class="special">&gt;&gt;</span> <span class=
"special">*(</span><span class="literal">','</span> <span class=
"special">&gt;&gt;</span> <span class=
"identifier">real_p</span><span class="special">[</span><span class=
"identifier">push_back_a</span><span class="special">(</span><span class=
"identifier">v</span><span class="special">)])
)
</span> <span class="special"> ,
</span> <span class="comment"> // End grammar

</span> <span class="identifier"> space_p</span><span class=
"special">).</span><span class="identifier">full</span><span class="special">;
}</span></font></code>
</pre>
422	<p>
This is the same parser as above, this time with appropriate semantic
actions attached at strategic places to extract the parsed numbers and
stuff them into the vector <tt>v</tt>. The <tt>parse_numbers</tt>
function returns true when the whole input was parsed successfully.
427	</p>
428	<p>
429	<img src="theme/lens.gif" width="15" height="16"> The full source code
430	can be <a href="../example/fundamental/number_list.cpp">viewed here</a>.
431	This is part of the Spirit distribution.
432	</p>
433	<table border="0">
434	<tr>
435	<td width="10"></td>
436	<td width="30">
437	<a href="../index.html"><img src="theme/u_arr.gif" border="0"></a>
438	</td>
439	<td width="30">
440	<a href="introduction.html"><img src="theme/l_arr.gif" border="0">
441	</a>
442	</td>
443	<td width="30">
444	<a href="basic_concepts.html"><img src="theme/r_arr.gif" border="0">
445	</a>
446	</td>
447	</tr>
448	</table><br>
449	<hr size="1">
450	<p class="copyright">
451	Copyright © 1998-2003 Joel de Guzman<br>
452	Copyright © 2002 Chris Uzdavinis<br>
453	<br>
454	<font size="2">Use, modification and distribution is subject to the
455	Boost Software License, Version 1.0. (See accompanying file
456	LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)</font>
457	</p>
461	</body>
462	</html>