Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | <html> |
2 | <head> | |
3 | <!-- Generated by the Spirit (http://spirit.sf.net) QuickDoc --> | |
4 | <title>Distinct Parser</title> | |
5 | <link rel="stylesheet" href="theme/style.css" type="text/css"> | |
6 | </head> | |
7 | <body> | |
8 | <table width="100%" height="48" border="0" background="theme/bkd2.gif" cellspacing="2"> | |
9 | <tr> | |
10 | <td width="10"> | |
11 | </td> | |
12 | <td width="85%"> | |
13 | <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Distinct Parser </b></font></td> | |
14 | <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" align="right" border="0"></a></td> | |
15 | </tr> | |
16 | </table> | |
17 | <br> | |
18 | <table border="0"> | |
19 | <tr> | |
20 | <td width="10"></td> | |
21 | <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> | |
22 | <td width="30"><a href="scoped_lock.html"><img src="theme/l_arr.gif" border="0"></a></td> | |
23 | <td width="30"><a href="symbols.html"><img src="theme/r_arr.gif" border="0"></a></td> | |
24 | </tr> | |
25 | </table> | |
26 | <h3>Distinct Parsers</h3><p> | |
27 | The distinct parsers are utility parsers which ensure that matched input is | |
28 | not immediately followed by a forbidden pattern. Their typical usage is to | |
29 | distinguish keywords from identifiers.</p> | |
30 | <h3>distinct_parser</h3> | |
31 | <p> | |
32 | The basic use of <tt>distinct_parser</tt> is as a replacement for the <tt>str_p</tt> parser. For | |
33 | example, the <tt>declaration_rule</tt> defined as:</p> | |
34 | <pre> | |
35 | <code><span class=identifier>rule</span><span class=special><</span><span class="identifier">ScannerT</span><span class=special>> </span><span class=identifier>declaration_rule </span><span class=special>= </span><span class=identifier>str_p</span><span class=special>(</span><span class=string>"declare"</span><span class=special>) >> </span><span class=identifier>lexeme_d</span><span class=special>[+</span><span class=identifier>alpha_p</span><span class=special>]; | |
36 | </span></code></pre> | |
37 | <p> | |
38 | would correctly match the input "declare abc", but it would also match "declareabc", which is usually not intended. To avoid this, we can | |
39 | use <tt>distinct_parser</tt>:</p> | |
40 | <code> | |
41 | <pre> | |
42 | <span class=comment>// keyword_p may be defined in the global scope | |
43 | </span><span class=identifier>distinct_parser</span><span class=special><> </span><span class=identifier>keyword_p</span><span class=special>(</span><span class=string>"a-zA-Z0-9_"</span><span class=special>); | |
44 | ||
45 | </span><span class=identifier>rule</span><span class=special><</span><span class="identifier">ScannerT</span><span class=special>> </span><span class=identifier>declaration_rule </span><span class=special>= </span><span class=identifier>keyword_p</span><span class=special>(</span><span class=string>"declare"</span><span class=special>) >> </span><span class=identifier>lexeme_d</span><span class=special>[+</span><span class=identifier>alpha_p</span><span class=special>]; | |
46 | </span></pre> | |
47 | </code> | |
48 | <p> | |
49 | <tt>keyword_p</tt> works like the <tt>str_p</tt> parser, but it matches only | |
50 | when the matched input is not immediately followed by one of the characters | |
51 | from the set passed to the constructor of <tt>keyword_p</tt>. In the example, | |
52 | "declare" must not be immediately followed by a letter, a digit, | |
53 | or an underscore.</p> | |
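<p>
The check described above can be sketched without Spirit. The following is a minimal standalone illustration, not the library's implementation: the helper names <tt>is_identifier_char</tt> and <tt>starts_with_distinct_keyword</tt> are hypothetical, and the forbidden set is hard-coded to letters, digits and the underscore.</p>

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Characters that may not directly follow a keyword: letters, digits, '_'.
// In the default "C" locale this corresponds to the set "a-zA-Z0-9_".
inline bool is_identifier_char(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Hypothetical helper (not a Spirit API): true when `input` starts with
// `keyword` and the keyword is not immediately followed by an identifier
// character. Reaching the end of the input right after the keyword is fine.
inline bool starts_with_distinct_keyword(const std::string& input,
                                         const std::string& keyword)
{
    if (input.compare(0, keyword.size(), keyword) != 0)
        return false;                        // keyword text is not present
    if (input.size() == keyword.size())
        return true;                         // nothing follows the keyword
    return !is_identifier_char(input[keyword.size()]);
}

// Usage: the two inputs discussed in the text.
inline void check_declare_inputs()
{
    assert(starts_with_distinct_keyword("declare abc", "declare"));
    assert(!starts_with_distinct_keyword("declareabc", "declare"));
}
```

<p>
Only the character directly after the keyword is inspected; whitespace skipping and the rest of the rule are left out of this sketch.</p>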
54 | <p> | |
55 | See the full <a href="../example/fundamental/distinct/distinct_parser.cpp">example here</a>.</p> | |
56 | <h3>distinct_directive</h3><p> | |
57 | For more sophisticated cases, for example when keywords are stored in a | |
58 | symbol table, we can use <tt>distinct_directive</tt>.</p> | |
59 | <pre> | |
60 | <code><span class=identifier>distinct_directive</span><span class=special><> </span><span class=identifier>keyword_d</span><span class=special>(</span><span class=string>"a-zA-Z0-9_"</span><span class=special>); | |
61 | ||
62 | </span><span class=identifier>symbols</span><span class=special><> </span><span class=identifier>keywords</span><span class=special>; </span><span class=identifier>keywords </span><span class=special>= </span><span class=string>"declare"</span><span class=special>, </span><span class=string>"begin"</span><span class=special>, </span><span class=string>"end"</span><span class=special>; | |
63 | </span><span class=identifier>rule</span><span class=special><</span><span class="identifier">ScannerT</span><span class=special>> </span><span class=identifier>keyword </span><span class=special>= </span><span class=identifier>keyword_d</span><span class=special>[</span><span class=identifier>keywords</span><span class=special>]; | |
64 | </span></code></pre> | |
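<p>
As a rough standalone illustration of the same idea (not Spirit code), the sketch below looks up a keyword from a table and then applies the distinct check. The helper names are hypothetical, and the table lookup is assumed to pick the longest matching entry.</p>

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// The keyword table from the example above.
inline std::vector<std::string> sample_keywords()
{
    return {"declare", "begin", "end"};
}

// Hypothetical helper (not a Spirit API): returns the longest entry of
// `keywords` found at the start of `input`, provided it is not immediately
// followed by a letter, a digit, or an underscore. Returns an empty string
// when no keyword matches distinctly.
inline std::string match_keyword_from_table(const std::string& input,
                                            const std::vector<std::string>& keywords)
{
    std::string best;
    for (const std::string& kw : keywords) {
        // Keep the longest keyword that is a prefix of the input.
        if (kw.size() > best.size() && input.compare(0, kw.size(), kw) == 0)
            best = kw;
    }
    if (best.empty())
        return best;                          // no table entry matched
    if (input.size() > best.size()) {
        unsigned char next = static_cast<unsigned char>(input[best.size()]);
        if (std::isalnum(next) || next == '_')
            return std::string();             // forbidden follow-up character
    }
    return best;
}

// Usage: "begin" is a keyword, "beginning" is an identifier.
inline void check_keyword_table()
{
    assert(match_keyword_from_table("begin x", sample_keywords()) == "begin");
    assert(match_keyword_from_table("beginning", sample_keywords()).empty());
}
```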
65 | <h3>dynamic_distinct_parser and dynamic_distinct_directive</h3><p> | |
66 | In some cases a set of forbidden follow-up characters is not sufficient. | |
67 | For example, ASN.1 naming conventions allow identifiers to contain dashes, | |
68 | but not double dashes (which mark the beginning of a comment). | |
69 | Furthermore, identifiers can't end with a dash. So a matched keyword can't | |
70 | be followed by an alphanumeric character or by exactly one dash, but it can be | |
71 | followed by two dashes.</p> | |
72 | <p> | |
73 | This is where <tt>dynamic_distinct_parser</tt> and <tt>dynamic_distinct_directive</tt> come into play. The constructor of <tt>dynamic_distinct_parser</tt> accepts a | |
74 | parser that matches any input that <strong>must NOT</strong> follow the keyword.</p> | |
75 | <pre> | |
76 | <code><span class=comment>// Alphanumeric characters and a dash followed by a non-dash | |
77 | // may not follow an ASN.1 identifier. | |
78 | </span><span class=identifier>dynamic_distinct_parser</span><span class=special><> </span><span class=identifier>keyword_p</span><span class=special>(</span><span class=identifier>alnum_p </span><span class=special>| (</span><span class=literal>'-' </span><span class=special>>> ~</span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'-'</span><span class=special>))); | |
79 | ||
80 | </span><span class=identifier>rule</span><span class=special><</span><span class="identifier">ScannerT</span><span class=special>> </span><span class=identifier>declaration_rule </span><span class=special>= </span><span class=identifier>keyword_p</span><span class=special>(</span><span class=string>"declare"</span><span class=special>) >> </span><span class=identifier>lexeme_d</span><span class=special>[+</span><span class=identifier>alpha_p</span><span class=special>]; | |
81 | </span></code></pre> | |
82 | <p> | |
83 | Since <tt>dynamic_distinct_parser</tt> internally uses a rule, its type | |
84 | depends on the scanner type. So <tt>keyword_p</tt> shouldn't be defined | |
85 | globally, but rather within the grammar.</p> | |
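<p>
To make the ASN.1 rule concrete, here is a minimal standalone sketch (not Spirit code) of the forbidden tail <tt>alnum_p | ('-' >> ~ch_p('-'))</tt>. The function names are hypothetical and chosen only for this illustration.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Mirrors alnum_p | ('-' >> ~ch_p('-')): true when the text at `pos` is an
// alphanumeric character, or a dash followed by a character other than a dash.
inline bool is_forbidden_asn1_tail(const std::string& s, std::size_t pos)
{
    if (pos >= s.size())
        return false;                         // end of input: nothing follows
    if (std::isalnum(static_cast<unsigned char>(s[pos])))
        return true;
    // A single trailing dash at the very end of the input is not rejected,
    // because the second half of the expression needs a character to match.
    return s[pos] == '-' && pos + 1 < s.size() && s[pos + 1] != '-';
}

// Hypothetical helper: true when `input` starts with `keyword` and the
// forbidden tail does not match right after it.
inline bool starts_with_asn1_keyword(const std::string& input,
                                     const std::string& keyword)
{
    return input.compare(0, keyword.size(), keyword) == 0
        && !is_forbidden_asn1_tail(input, keyword.size());
}

// Usage: one dash continues an identifier, two dashes start a comment.
inline void check_asn1_inputs()
{
    assert(!starts_with_asn1_keyword("declare-abc", "declare"));
    assert(starts_with_asn1_keyword("declare-- comment", "declare"));
}
```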
86 | <p> | |
87 | See the full <a href="../example/fundamental/distinct/distinct_parser_dynamic.cpp">example here</a>.</p> | |
88 | <h3>How it works</h3><p> | |
89 | When <tt>keyword_p_1</tt> and <tt>keyword_p_2</tt> are defined as</p> | |
90 | <code><pre> | |
91 | <span class=identifier>distinct_parser</span><span class=special><> </span><span class=identifier>keyword_p_1</span><span class=special>(</span><span class=identifier>forbidden_chars</span><span class=special>); | |
92 | </span><span class=identifier>dynamic_distinct_parser</span><span class=special><> </span><span class=identifier>keyword_p_2</span><span class=special>(</span><span class=identifier>forbidden_tail_parser</span><span class=special>); | |
93 | </span></pre></code> | |
94 | <p> | |
95 | the parsers</p> | |
96 | <code><pre> | |
97 | <span class=identifier>keyword_p_1</span><span class=special>(</span><span class=identifier>str</span><span class=special>) | |
98 | </span><span class=identifier>keyword_p_2</span><span class=special>(</span><span class=identifier>str</span><span class=special>) | |
99 | </span></pre></code> | |
100 | <p> | |
101 | are equivalent to the parser expressions</p> | |
102 | <code><pre> | |
103 | <span class=identifier>lexeme_d</span><span class=special>[</span><span class=identifier>chseq_p</span><span class=special>(</span><span class=identifier>str</span><span class=special>) >> ~</span><span class=identifier>epsilon_p</span><span class=special>(</span><span class=identifier>chset_p</span><span class=special>(</span><span class=identifier>forbidden_chars</span><span class=special>))] | |
104 | </span><span class=identifier>lexeme_d</span><span class=special>[</span><span class=identifier>chseq_p</span><span class=special>(</span><span class=identifier>str</span><span class=special>) >> ~</span><span class=identifier>epsilon_p</span><span class=special>(</span><span class=identifier>forbidden_tail_parser</span><span class=special>)] | |
105 | </span></pre></code> | |
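<p>
Both expressions share one shape: match the character sequence, then apply a negative lookahead that consumes no input. A minimal standalone sketch of that shape follows. It is not Spirit code; <tt>match_distinct</tt>, <tt>in_char_set</tt> and <tt>no_match</tt> are hypothetical names, and the character set is given as an explicit list rather than a range specification.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returned when the keyword does not match distinctly.
const std::size_t no_match = std::string::npos;

// Hypothetical helper (not a Spirit API). `tail_matches(input, pos)` plays the
// role of the forbidden tail. On success the result is always keyword.size():
// the tail is only inspected, never consumed.
template <typename TailPredicate>
std::size_t match_distinct(const std::string& input,
                           const std::string& keyword,
                           TailPredicate tail_matches)
{
    if (input.compare(0, keyword.size(), keyword) != 0)
        return no_match;                      // character sequence not found
    if (tail_matches(input, keyword.size()))
        return no_match;                      // negative lookahead failed
    return keyword.size();
}

// Tail predicate for the character-set form: true when the character at
// `pos` exists and belongs to `chars`.
struct in_char_set
{
    std::string chars;
    bool operator()(const std::string& input, std::size_t pos) const
    {
        return pos < input.size()
            && chars.find(input[pos]) != std::string::npos;
    }
};

// Usage: the "(" after "if" is looked at but not consumed.
inline void check_lookahead()
{
    in_char_set letters = {"abcdefghijklmnopqrstuvwxyz"};
    assert(match_distinct("if(x)", "if", letters) == 2);
    assert(match_distinct("iffy", "if", letters) == no_match);
}
```

<p>
A predicate for the dynamic form would follow the same signature, which is why a single template is enough for both cases in this sketch.</p>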
106 | <table border="0"> | |
107 | <tr> | |
108 | <td width="10"></td> | |
109 | <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> | |
110 | <td width="30"><a href="scoped_lock.html"><img src="theme/l_arr.gif" border="0"></a></td> | |
111 | <td width="30"><a href="symbols.html"><img src="theme/r_arr.gif" border="0"></a></td> | |
112 | </tr> | |
113 | </table> | |
114 | <br> | |
115 | <hr size="1"> | |
116 | <p class="copyright">Copyright © 2003-2004 | |
117 | ||
118 | ||
119 | Vaclav Vesely<br><br> | |
120 | <font size="2">Use, modification and distribution is subject to the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) </font> </p> | |
121 | </body> | |
122 | </html> |