Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | <html> |
2 | <head> | |
3 | <title>The Grammar</title> | |
4 | <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> | |
5 | <link rel="stylesheet" href="theme/style.css" type="text/css"> | |
6 | </head> | |
7 | ||
8 | <body> | |
9 | <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2"> | |
10 | <tr> | |
11 | <td width="10"> | |
12 | </td> | |
13 | <td width="85%"> | |
14 | <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>The Grammar</b></font> | |
15 | </td> | |
16 | <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td> | |
17 | </tr> | |
18 | </table> | |
19 | <br> | |
20 | <table border="0"> | |
21 | <tr> | |
22 | <td width="10"></td> | |
23 | <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> | |
24 | <td width="30"><a href="scanner.html"><img src="theme/l_arr.gif" border="0"></a></td> | |
25 | <td width="30"><a href="subrules.html"><img src="theme/r_arr.gif" border="0"></a></td> | |
26 | </tr> | |
27 | </table> | |
28 | <p>The <b>grammar</b> encapsulates a set of rules. The <tt>grammar</tt> class | |
29 | is a protocol base class. It is essentially an interface contract. The <tt>grammar</tt> | |
30 | is a template class that is parameterized by its derived class, <tt>DerivedT</tt>, | |
31 | and its context, <tt>ContextT</tt>. The template parameter ContextT defaults | |
32 | to <tt>parser_context</tt>, a predefined context. </p> | |
33 | <p>You need not be concerned at all with the ContextT template parameter unless | |
34 | you wish to tweak the low level behavior of the grammar. Detailed information | |
35 | on the ContextT template parameter is provided <a href="indepth_the_parser_context.html">elsewhere</a>. | |
36 | The <tt>grammar</tt> relies on the template parameter DerivedT, a grammar subclass, | |
37 | to define the actual rules.</p> | |
38 | <p>Presented below is the public API. There may actually be more template parameters | |
39 | after <tt>ContextT</tt>. Everything after the <tt>ContextT</tt> parameter is | |
40 | strictly for internal use and should not concern the client.</p> | |
41 | <pre><code><font color="#000000"><span class=identifier> </span><span class=keyword>template</span><span class=special>< | |
42 | </span><span class=keyword>typename </span><span class=identifier>DerivedT</span><span class=special>, | |
43 | </span><span class=keyword>typename </span><span class=identifier>ContextT </span><span class=special>= </span><span class=identifier>parser_context</span><span class=special><</span><span class=special>> > | |
44 | </span><span class=keyword>struct </span><span class=identifier>grammar</span><span class=special>;</span></font></code></pre> | |
45 | <h2>Grammar definition</h2> | |
46 | <p>A concrete sub-class inheriting from <tt>grammar</tt> is expected to have a | |
47 | nested template class (or struct) named <tt>definition</tt>:</p> | |
48 | <blockquote> | |
49 | <p><img src="theme/bullet.gif" width="13" height="13"> It is a nested template | |
50 | class with a typename <tt>ScannerT</tt> parameter.<br> | |
51 | <img src="theme/bullet.gif" width="13" height="13"> Its constructor defines | |
52 | the grammar rules.<br> | |
53 | <img src="theme/bullet.gif" width="13" height="13"> Its constructor is passed | |
54 | in a reference to the actual grammar <tt>self</tt>.<br> | |
55 | <img src="theme/bullet.gif" width="13" height="13"> It has a member function | |
56 | named <tt>start</tt> that returns a reference to the start <tt>rule</tt>.</p> | |
57 | </blockquote> | |
58 | <h2>Grammar skeleton</h2> | |
59 | <pre><code><font color="#000000"><span class=special> </span><span class=keyword>struct </span><span class=identifier>my_grammar </span><span class=special>: </span><span class=keyword>public </span><span class=identifier>grammar</span><span class=special><</span><span class=identifier>my_grammar</span><span class=special>> | |
60 | </span><span class=special>{ | |
61 | </span><span class=keyword>template </span><span class=special><</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>> | |
62 | </span><span class=keyword>struct </span><span class=identifier>definition | |
63 | </span><span class=special>{ | |
64 | </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>> </span><span class=identifier>r</span><span class=special>; | |
65 | </span><span class=identifier>definition</span><span class=special>(</span><span class=identifier>my_grammar </span><span class=keyword>const</span><span class=special>& </span><span class=identifier>self</span><span class=special>) </span><span class=special>{ </span><span class=identifier>r </span><span class=special>= </span><span class=comment>/*..define here..*/</span><span class=special>; </span><span class=special>} | |
66 | </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>> </span><span class=keyword>const</span><span class=special>& </span><span class=identifier>start</span><span class=special>() </span><span class=keyword>const </span><span class=special>{ </span><span class=keyword>return </span><span class=identifier>r</span><span class=special>; </span><span class=special>} | |
67 | </span><span class=special>}; | |
68 | </span><span class=special>};</span></font></code></pre> | |
69 | <p>Decoupling the scanner type from the rules that form a grammar allows the grammar | |
70 | to be used in different contexts possibly using different scanners. We do not | |
71 | care what scanner we are dealing with. The user-defined <tt>my_grammar</tt> | |
72 | can be used with <b>any</b> type of scanner. Unlike the rule, the grammar is | |
73 | not tied to a specific scanner type. See <a href="faq.html#scanner_business">"Scanner | |
74 | Business"</a> to see why this is important and to gain further understanding | |
75 | of this scanner-rule coupling problem.</p> | |
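<p>To make the decoupling concrete, here is a minimal sketch of the pattern that uses only the standard library. It is not Spirit's implementation, and every name in it (<tt>toy_grammar</tt>, <tt>toy_rule</tt>, <tt>plain_scanner</tt>, <tt>skipping_scanner</tt>, <tt>digits</tt>, <tt>all_digits</tt>) is hypothetical. It assumes only that the grammar base class instantiates the nested <tt>definition</tt> template for whatever scanner type it is handed; the real library is considerably more involved.</p>

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical scanner that hands out every character as-is.
struct plain_scanner
{
    explicit plain_scanner(std::string const& t) : text(t), pos(0) {}
    bool at_end() { return pos >= text.size(); }
    char peek() { return text[pos]; }
    void advance() { ++pos; }
    std::string text;
    std::size_t pos;
};

// Hypothetical scanner that silently skips blanks.
struct skipping_scanner
{
    explicit skipping_scanner(std::string const& t) : text(t), pos(0) {}
    void skip() { while (pos < text.size() && text[pos] == ' ') ++pos; }
    bool at_end() { skip(); return pos >= text.size(); }
    char peek() { skip(); return text[pos]; }
    void advance() { skip(); ++pos; }
    std::string text;
    std::size_t pos;
};

// A rule is tied to one scanner type, in the spirit of rule<ScannerT>.
template <typename ScannerT>
struct toy_rule
{
    std::function<bool(ScannerT&)> body;
    bool parse(ScannerT& scan) const { return body(scan); }
};

// The grammar base is not tied to a scanner: parse() is a template and
// builds DerivedT::definition<ScannerT> for the scanner it receives.
template <typename DerivedT>
struct toy_grammar
{
    template <typename ScannerT>
    bool parse(ScannerT& scan) const
    {
        DerivedT const& self = static_cast<DerivedT const&>(*this);
        typename DerivedT::template definition<ScannerT> def(self);
        return def.start().parse(scan);
    }
};

// Matches one or more digits followed by the end of the input.
struct digits : public toy_grammar<digits>
{
    template <typename ScannerT>
    struct definition
    {
        toy_rule<ScannerT> r;
        definition(digits const& /*self*/)
        {
            r.body = [](ScannerT& scan) -> bool {
                bool any = false;
                while (!scan.at_end() &&
                       std::isdigit(static_cast<unsigned char>(scan.peek())))
                {
                    scan.advance();
                    any = true;
                }
                return any && scan.at_end();
            };
        }
        toy_rule<ScannerT> const& start() const { return r; }
    };
};

// The same grammar type works with either scanner.
template <typename ScannerT>
bool all_digits(std::string const& text)
{
    ScannerT scan(text);
    digits g;
    return g.parse(scan);
}
```

<p>In this sketch, <tt>all_digits&lt;plain_scanner&gt;</tt> and <tt>all_digits&lt;skipping_scanner&gt;</tt> share the one <tt>digits</tt> grammar, while each <tt>toy_rule</tt> inside it is bound to a single scanner type.</p>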
76 | <h2>Instantiating and using my_grammar</h2> | |
77 | <p>Our grammar above may be instantiated and put into action:</p> | |
78 | <pre><code><font color="#000000"><span class=special> </span><span class=identifier>my_grammar </span><span class=identifier>g</span><span class=special>; | |
79 | ||
80 | </span><span class=keyword>if </span><span class=special>(</span><span class=identifier>parse</span><span class=special>(</span><span class=identifier>first</span><span class=special>, </span><span class=identifier>last</span><span class=special>, </span><span class=identifier>g</span><span class=special>, </span><span class=identifier>space_p</span><span class=special>).</span><span class=identifier>full</span><span class=special>) | |
81 | </span><span class=identifier>cout </span><span class=special><< </span><span class=string>"parsing succeeded\n"</span><span class=special>; | |
82 | </span><span class=keyword>else | |
83 | </span><span class=identifier>cout </span><span class=special><< </span><span class=string>"parsing failed\n"</span><span class=special>;</span></font></code></pre> | |
84 | <p><tt>my_grammar</tt> <b>IS-A </b>parser and can be used anywhere a parser is | |
85 | expected, even referenced by another rule:</p> | |
86 | <pre><code><font color="#000000"><span class=special> </span><span class=identifier>rule</span><span class=special><> </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>g </span><span class=special>>> </span><span class=identifier>str_p</span><span class=special>(</span><span class=string>"cool huh?"</span><span class=special>);</span></font></code></pre> | |
87 | <table width="80%" border="0" align="center"> | |
88 | <tr> | |
89 | <td class="note_box"><img src="theme/alert.gif" width="16" height="16"> <b>Referencing | |
90 | grammars<br> | |
91 | </b><br> | |
92 | Like the rule, the grammar is also held by reference when it is placed on | |
93 | the right-hand side of an EBNF expression. It is the responsibility of the | |
94 | client to ensure that the referenced grammar stays in scope and is not | |
95 | destroyed while it is being referenced. </td> | |
96 | </tr> | |
97 | </table> | |
98 | <h2><a name="full_grammar"></a>Full Grammar Example</h2> | |
99 | <p>Recalling our original calculator example, here it is now rewritten using a | |
100 | grammar:</p> | |
101 | <pre><code><font color="#000000"><span class=special> </span><span class=keyword>struct </span><span class=identifier>calculator </span><span class=special>: </span><span class=keyword>public </span><span class=identifier>grammar</span><span class=special><</span><span class=identifier>calculator</span><span class=special>> | |
102 | </span><span class=special>{ | |
103 | </span><span class=keyword>template </span><span class=special><</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>> | |
104 | </span><span class=keyword>struct </span><span class=identifier>definition | |
105 | </span><span class=special>{ | |
106 | </span><span class=identifier>definition</span><span class=special>(</span><span class=identifier>calculator </span><span class=keyword>const</span><span class=special>& </span><span class=identifier>self</span><span class=special>) | |
107 | </span><span class=special>{ | |
108 | </span><span class=identifier>group </span><span class=special>= </span><span class=literal>'(' </span><span class=special>>> </span><span class=identifier>expression </span><span class=special>>> </span><span class=literal>')'</span><span class=special>; | |
109 | </span><span class=identifier>factor </span><span class=special>= </span><span class=identifier>integer </span><span class=special>| </span><span class=identifier>group</span><span class=special>; | |
110 | </span><span class=identifier>term </span><span class=special>= </span><span class=identifier>factor </span><span class=special>>> </span><span class=special>*((</span><span class=literal>'*' </span><span class=special>>> </span><span class=identifier>factor</span><span class=special>) </span><span class=special>| </span><span class=special>(</span><span class=literal>'/' </span><span class=special>>> </span><span class=identifier>factor</span><span class=special>)); | |
111 | </span><span class=identifier>expression </span><span class=special>= </span><span class=identifier>term </span><span class=special>>> </span><span class=special>*((</span><span class=literal>'+' </span><span class=special>>> </span><span class=identifier>term</span><span class=special>) </span><span class=special>| </span><span class=special>(</span><span class=literal>'-' </span><span class=special>>> </span><span class=identifier>term</span><span class=special>)); | |
112 | </span><span class=special>} | |
113 | ||
114 | </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>> </span><span class=identifier>expression</span><span class=special>, </span><span class=identifier>term</span><span class=special>, </span><span class=identifier>factor</span><span class=special>, </span><span class=identifier>group</span><span class=special>; | |
115 | ||
116 | </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>> </span><span class=keyword>const</span><span class=special>& | |
117 | </span><span class=identifier>start</span><span class=special>() </span><span class=keyword>const </span><span class=special>{ </span><span class=keyword>return </span><span class=identifier>expression</span><span class=special>; </span><span class=special>} | |
118 | </span><span class=special>}; | |
119 | </span><span class=special>};</span></font></code></pre> | |
120 | <p><img src="theme/lens.gif" width="15" height="16"> A fully working example with | |
121 | <a href="semantic_actions.html">semantic actions</a> can be <a href="../example/fundamental/calc_plain.cpp">viewed | |
122 | here</a>. This is part of the Spirit distribution. </p> | |
123 | <table width="80%" border="0" align="center"> | |
124 | <tr> | |
125 | <td class="note_box"><img src="theme/lens.gif" width="15" height="16"> <b>self</b><br> | |
126 | <br> | |
127 | You might notice that the definition of the grammar has a constructor that | |
128 | accepts a const reference to the outer grammar. In the example above, notice | |
129 | that <tt>calculator::definition</tt> takes in a <tt>calculator const& | |
130 | self</tt>. While this is unused in the example above, in many cases, this | |
131 | is very useful. The self argument is the definition's window to the outside | |
132 | world. For example, the calculator class might have a reference to some | |
133 | state information that the definition can update, through | |
134 | <a href="semantic_actions.html">semantic actions</a>, as parsing proceeds. </td> | |
135 | </tr> | |
136 | </table> | |
137 | <h2>Grammar Capsules</h2> | |
138 | <p>As a grammar becomes complicated, it is a good idea to group parts into logical | |
139 | modules. For instance, when writing a language, it might be wise to put expressions | |
140 | and statements into separate grammar capsules. The grammar takes advantage of | |
141 | the encapsulation properties of C++ classes. The declarative nature of classes | |
142 | makes them a perfect fit for the definition of grammars. Since the grammar is | |
143 | nothing more than a class declaration, we can conveniently publish it in header | |
144 | files. The idea is that once written and fully tested, a grammar can be reused | |
145 | in many contexts. We now have the notion of grammar libraries.</p> | |
146 | <h2><a name="multithreading"></a>Reentrancy and multithreading</h2> | |
147 | <p>An instance of a grammar may be used in different places multiple times without | |
148 | any problem. The implementation is tuned to allow this at the expense of some | |
149 | overhead. However, we can save considerable cycles and bytes if we are certain | |
150 | that a grammar will only have a single instance. If this is desired, simply | |
151 | define <tt>BOOST_SPIRIT_SINGLE_GRAMMAR_INSTANCE</tt> before including any Spirit | |
152 | header files.</p> | |
153 | <pre><font face="Courier New, Courier, mono"><code><span class="preprocessor"> #define</span></code></font><span class="preprocessor"><code><font face="Courier New, Courier, mono"> </font><tt>BOOST_SPIRIT_SINGLE_GRAMMAR_INSTANCE</tt></code></span></pre> | |
154 | <p> On the other hand, if a grammar is intended to be used in multithreaded code, | |
155 | we should then define <tt>BOOST_SPIRIT_THREADSAFE</tt> before including any | |
156 | Spirit header files. In this case, it is also necessary to link against <a href="http://www.boost.org/libs/thread/doc/index.html">Boost.Threads</a>.</p> | |
157 | <pre><font face="Courier New, Courier, mono"><span class="preprocessor"> #define</span></font> <span class="preprocessor"><tt>BOOST_SPIRIT_THREADSAFE</tt></span></pre> | |
158 | <h2>Using more than one grammar start rule </h2> | |
159 | <p>Sometimes it is desirable to have more than one visible entry point to a grammar | |
160 | (apart from the start rule). To allow additional start points, Spirit provides | |
161 | a helper template <tt>grammar_def</tt>, which may be used as a base class for | |
162 | the nested <tt>definition</tt> class of your <tt>grammar</tt>. Here's an example:</p> | |
163 | <pre><code> <span class="comment">// this header has to be explicitly included</span> | |
164 | <span class="preprocessor">#include</span> <span class="string"><boost/spirit/utility/grammar_def.hpp></span> | |
165 | ||
166 | <span class=keyword>struct </span><span class=identifier>calculator2 </span><span class=special>: </span><span class=keyword>public </span><span class=identifier>grammar</span><span class=special><</span><span class=identifier>calculator2</span><span class=special>> | |
167 | { | |
168 | </span> <span class="keyword">enum</span> | |
169 | { | |
170 | expression = 0, | |
171 | term = 1, | |
172 | factor = 2 | |
173 | }; | |
174 | ||
175 | <span class=special> </span><span class=keyword>template </span><span class=special><</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>> | |
176 | </span><span class=keyword>struct </span><span class=identifier>definition | |
177 | </span><span class="special">:</span> <span class="keyword">public</span><span class=identifier> grammar_def</span><span class="special"><</span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>>,</span> same<span class="special">,</span> same<span class="special">></span> | |
178 | <span class=special>{</span> | |
179 | <span class=identifier>definition</span><span class=special>(</span><span class=identifier>calculator2 </span><span class=keyword>const</span><span class=special>& </span><span class=identifier>self</span><span class=special>) | |
180 | { | |
181 | </span><span class=identifier>group </span><span class=special>= </span><span class=literal>'(' </span><span class=special>>> </span><span class=identifier>expression </span><span class=special>>> </span><span class=literal>')'</span><span class=special>; | |
182 | </span><span class=identifier>factor </span><span class=special>= </span><span class=identifier>integer </span><span class=special>| </span><span class=identifier>group</span><span class=special>; | |
183 | </span><span class=identifier>term </span><span class=special>= </span><span class=identifier>factor </span><span class=special>>> *((</span><span class=literal>'*' </span><span class=special>>> </span><span class=identifier>factor</span><span class=special>) | (</span><span class=literal>'/' </span><span class=special>>> </span><span class=identifier>factor</span><span class=special>)); | |
184 | </span><span class=identifier>expression </span><span class=special>= </span><span class=identifier>term </span><span class=special>>> *((</span><span class=literal>'+' </span><span class=special>>> </span><span class=identifier>term</span><span class=special>) | (</span><span class=literal>'-' </span><span class=special>>> </span><span class=identifier>term</span><span class=special>));</span> | |
185 | ||
186 | <span class="keyword">this</span><span class="special">-></span>start_parsers<span class="special">(</span>expression<span class="special">,</span> term<span class="special">,</span> factor<span class="special">);</span> | |
187 | <span class="special">}</span> | |
188 | ||
189 | <span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>> </span><span class=identifier>expression</span><span class=special>, </span><span class=identifier>term</span><span class=special>, </span><span class=identifier>factor, group</span><span class=special>; | |
190 | </span><span class=special> }; | |
191 | };</span></code></pre> | |
192 | <p>The <tt>grammar_def</tt> template has to be instantiated with the types of | |
193 | all the rules you wish to make visible from outside the <tt>grammar</tt>:</p> | |
194 | <pre><code><span class=identifier> </span><span class=identifier>grammar_def</span><span class="special"><</span><span class=identifier>rule</span><span class=special><</span><span class=identifier>ScannerT</span><span class=special>>,</span> same<span class="special">,</span> same<span class="special">></span></code> </pre> | |
195 | <p>The shorthand notation <tt>same</tt> indicates that the type specified by | |
196 | the previous template parameter is to be reused (e.g. <code><tt>rule<ScannerT></tt></code>). | |
197 | Obviously, <tt>same</tt> may not be used as the first template parameter. </p> | |
198 | <table width="80%" border="0" align="center"> | |
199 | <tr> | |
200 | <td class="note_box"> <img src="theme/bulb.gif" width="13" height="18"> <strong>grammar_def | |
201 | start types</strong><br> | |
202 | <br> | |
203 | It may not be obvious, but it is interesting to note that aside from rule<>s, | |
204 | any parser type may be specified (e.g. chlit<>, strlit<>, int_parser<>, | |
205 | etc.).</td> | |
206 | </tr> | |
207 | </table> | |
208 | <p>Using the grammar_def class, there is no need to provide a <tt>start()</tt> member | |
209 | function anymore. Instead, you'll have to insert a call to <tt>this->start_parsers()</tt> | |
210 | (which is a member function of the <tt>grammar_def</tt> template) to define | |
211 | the start symbols for your <tt>grammar</tt>. <img src="theme/note.gif" width="16" height="16"> | |
212 | Note that the number and the sequence of the rules used as the parameters to | |
213 | the <tt>start_parsers()</tt> function should match the types specified in the | |
214 | <tt>grammar_def</tt> template:</p> | |
215 | <pre><code> <span class="keyword">this</span><span class="special">-></span>start_parsers<span class="special">(</span>expression<span class="special">,</span> term<span class="special">,</span> factor<span class="special">);</span></code></pre> | |
216 | <p> The grammar entry point may be specified using the following syntax:</p> | |
217 | <pre><code><font color="#000000"><span class=identifier> g</span><span class="special">.</span><span class=identifier>use_parser</span><span class="special"><</span><span class=identifier>N</span><span class=special>>() </span><span class="comment">// where g is your grammar and N is the zero-based index of the entry point</span></font></code></pre> | |
218 | <p>This sample shows how to use the <tt>term</tt> rule from the <tt>calculator2</tt> | |
219 | grammar above:</p> | |
220 | <pre><code><font color="#000000"><span class=identifier> calculator2 g</span><span class=special>; | |
221 | ||
222 | </span><span class=keyword>if </span><span class=special>(</span><span class=identifier>parse</span><span class=special>(</span><span class=identifier> | |
223 | first</span><span class=special>, </span><span class=identifier>last</span><span class=special>, | |
224 | </span><span class=identifier>g</span><span class="special">.</span><span class=identifier>use_parser</span><span class="special"><</span><span class=identifier>calculator2::term</span><span class=special>>(),</span><span class=identifier> | |
225 | space_p</span><span class=special> | |
226 | ).</span><span class=identifier>full</span><span class=special>) | |
227 | { | |
228 | </span><span class=identifier>cout </span><span class=special><< </span><span class=string>"parsing succeeded\n"</span><span class=special>; | |
229 | } | |
230 | </span><span class=keyword>else</span> <span class="special">{</span> | |
231 | <span class=identifier>cout </span><span class=special><< </span><span class=string>"parsing failed\n"</span><span class=special>; | |
232 | }</span></font></code></pre> | |
233 | <p>The template argument to <tt>use_parser<></tt> must be the zero-based | |
234 | index into the list of rules passed to the <tt>start_parsers()</tt> | |
235 | function call. </p> | |
236 | <table width="80%" border="0" align="center"> | |
237 | <tr> | |
238 | <td class="note_box"><img src="theme/note.gif" width="16" height="16"> <tt><strong>use_parser<0></strong></tt><br> | |
239 | <br> | |
240 | Note that using <span class="literal">0</span> (zero) as the template parameter | |
241 | to <tt>use_parser</tt> is equivalent to using the start rule, exported by | |
242 | conventional means through the <tt>start()</tt> function, as shown in the | |
243 | first <tt><a href="grammar.html#full_grammar">calculator</a></tt> sample | |
244 | above. So this notation may be used even for grammars that export only one rule | |
245 | through the <tt>start()</tt> function. On the other hand, calling a | |
246 | <tt>grammar</tt> without the <tt>use_parser</tt> notation will execute the | |
247 | rule specified as the first parameter to the <tt>start_parsers()</tt> function. | |
248 | </td> | |
249 | </tr> | |
250 | </table> | |
251 | <p>The maximum number of usable start rules is limited by the preprocessor constant:</p> | |
252 | <pre> <span class="identifier">BOOST_SPIRIT_GRAMMAR_STARTRULE_TYPE_LIMIT</span> <span class="comment">// defaults to 3</span></pre> | |
253 | <table border="0"> | |
254 | <tr> | |
255 | <td width="10"></td> | |
256 | <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> | |
257 | <td width="30"><a href="scanner.html"><img src="theme/l_arr.gif" border="0"></a></td> | |
258 | <td width="30"><a href="subrules.html"><img src="theme/r_arr.gif" border="0"></a></td> | |
259 | </tr> | |
260 | </table> | |
261 | <br> | |
262 | <hr size="1"> | |
263 | <p class="copyright">Copyright © 1998-2003 Joel de Guzman<br> | |
264 | Copyright © 2003-2004 Hartmut Kaiser <br> | |
265 | <br> | |
266 | <font size="2">Use, modification and distribution is subject to the Boost Software | |
267 | License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at | |
268 | http://www.boost.org/LICENSE_1_0.txt) </font> </p> | |
269 | <p> </p> | |
270 | </body> | |
271 | </html> |