Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | [/============================================================================== |
2 | Copyright (C) 2001-2015 Hartmut Kaiser | |
3 | Copyright (C) 2001-2011 Joel de Guzman | |
4 | ||
5 | Distributed under the Boost Software License, Version 1.0. (See accompanying | |
6 | file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) | |
7 | ===============================================================================/] | |
8 | ||
9 | [/////////////////////////////////////////////////////////////////////////////] | |
10 | [section:primitive_attributes Attributes of Primitive Components] | |
11 | ||
12 | Parsers in __spirit__ are fully attributed. __x3__ parsers always /expose/ an | |
13 | attribute specific to their type. This is called the /synthesized attribute/, as | |
14 | it is returned from a successful match and represents the matched input. For | |
15 | instance, numeric parsers such as `int_` or `double_` return the `int` or | |
16 | `double` value converted from the matched input sequence. Other primitive parser | |
17 | components have equally intuitive attribute types; for instance, `ascii::char_` | |
18 | exposes `char`. Primitive parsers apply the | |
19 | normal C++ convertibility rules: you can use any C++ type to receive the parsed | |
20 | value as long as the attribute type of the parser is convertible to the type | |
21 | provided. The following example shows how a synthesized parser attribute (the | |
22 | `int` value) is extracted by calling the API function `x3::parse`: | |
23 | ||
24 | int value = 0; | |
25 | std::string str("123"); | |
26 | std::string::iterator strbegin = str.begin(); | |
27 | x3::parse(strbegin, str.end(), x3::int_, value); // value == 123 | |
28 | ||
29 | For a full list of available parser primitives and their attribute types please | |
30 | see the sections __sec_x3_primitive__. | |
31 | ||
32 | [endsect] | |
33 | ||
34 | [/////////////////////////////////////////////////////////////////////////////] | |
35 | [section:compound_attributes Attributes of Compound Components] | |
36 | ||
37 | __x3__ implements well-defined attribute type propagation rules for all compound | |
38 | parsers, such as sequences, alternatives, Kleene star, etc. For instance, the | |
39 | main attribute propagation rule for sequences is: | |
40 | ||
41 | a: A, b: B --> (a >> b): tuple<A, B> | |
42 | ||
43 | which reads as: | |
44 | ||
45 | [:Given `a` and `b` are parsers, and `A` is the attribute type of `a`, and `B` | |
46 | is the attribute type of `b`, then the attribute type of `a >> b` | |
47 | will be `tuple<A, B>`.] | |
48 | ||
49 | [note The notation `tuple<A, B>` is used as a placeholder expression for any | |
50 | fusion sequence holding the types A and B, such as `boost::fusion::tuple<A, B>` | |
51 | or `std::pair<A, B>` (for more information see __fusion__).] | |
52 | ||
53 | As you can see, in order for a type to be compatible with the attribute type | |
54 | of a compound expression it has to | |
55 | ||
56 | * either be convertible to the attribute type, | |
57 | * or it has to expose certain functionalities, i.e. it needs to conform to a | |
58 | concept compatible with the component. | |
59 | ||
60 | Each compound component implements its own set of attribute propagation rules. | |
61 | For a full list of how the different compound parsers consume attributes | |
62 | see the sections __sec_x3_compound__. | |
63 | ||
64 | [heading The Attribute of Sequence Parsers] | |
65 | ||
66 | Sequences require an attribute type to expose the concept of a fusion sequence, | |
67 | where all elements of that fusion sequence have to be compatible with the | |
68 | corresponding element of the component sequence. For example, the expression: | |
69 | ||
70 | double_ >> double_ | |
71 | ||
72 | is compatible with any fusion sequence holding two types, where both types have | |
73 | to be compatible with `double`. The first element of the fusion sequence has to | |
74 | be compatible with the attribute of the first `double_`, and the second element | |
75 | of the fusion sequence has to be compatible with the attribute of the second | |
76 | `double_`. Given an instance of `std::pair<double, double>`, | |
77 | we can directly use the expression above to parse input and fill the | |
78 | attribute: | |
79 | ||
80 | // the following parses "1.0 2.0" into a pair of double | |
81 | std::string input("1.0 2.0"); | |
82 | std::string::iterator strbegin = input.begin(); | |
83 | std::pair<double, double> p; | |
84 | x3::phrase_parse(strbegin, input.end(), | |
85 | x3::double_ >> x3::double_, // parser grammar | |
86 | x3::space, // delimiter grammar | |
87 | p); // attribute to fill while parsing | |
88 | ||
89 | [tip *For sequences only:* __x3__ exposes a set of API functions | |
90 | usable mainly with sequences. Very much like the functions of the `scanf` | |
91 | and `printf` families, these functions allow passing the attributes for | |
92 | each of the elements of the sequence separately. Using the corresponding | |
93 | overload of __x3__'s parse function, the expression above | |
94 | could be rewritten as: | |
95 | `` | |
96 | double d1 = 0.0, d2 = 0.0; | |
97 | x3::phrase_parse(begin, end, x3::double_ >> x3::double_, x3::space, d1, d2); | |
98 | `` | |
99 | where the first attribute is used for the first `double_`, and | |
100 | the second attribute is used for the second `double_`. | |
101 | ] | |
102 | ||
103 | [heading The Attribute of Alternative Parsers] | |
104 | ||
105 | Alternative parsers are all about - well - alternatives. In | |
106 | order to store possibly different result (attribute) types from the different | |
107 | alternatives we use the data type __boost_variant__. The main attribute | |
108 | propagation rule of these components is: | |
109 | ||
110 | a: A, b: B --> (a | b): variant<A, B> | |
111 | ||
112 | Alternatives have a second very important attribute propagation rule: | |
113 | ||
114 | a: A, b: A --> (a | b): A | |
115 | ||
116 | often simplifying things significantly. If all sub expressions of | |
117 | an alternative expose the same attribute type, the overall alternative | |
118 | will expose exactly the same attribute type as well. | |
119 | ||
120 | [endsect] | |
121 | ||
122 | [/////////////////////////////////////////////////////////////////////////////] | |
123 | [section:more_compound_attributes More About Attributes of Compound Components] | |
124 | ||
125 | While parsing input, it is often desirable to combine some | |
126 | constant elements with variable parts. For instance, let us look at the example | |
127 | of parsing a complex number, which is written as `(real, imag)`, | |
128 | where `real` and `imag` are the variables representing the real and imaginary | |
129 | parts of our complex number. This can be achieved by writing: | |
130 | ||
131 | '(' >> double_ >> ", " >> double_ >> ')' | |
132 | ||
133 | Literals (such as `'('` and `", "`) do /not/ expose any attribute | |
134 | (well actually, they do expose the special type `unused_type`, but in this | |
135 | context `unused_type` is interpreted as if the component does not expose any | |
136 | attribute at all). It is very important to understand that the literals don't | |
137 | consume any of the elements of a fusion sequence passed to this component | |
138 | sequence. As said, they just don't expose any attribute and don't produce | |
139 | (consume) any data. The following example shows this: | |
140 | ||
141 | // the following parses "(1.0, 2.0)" into a pair of double | |
142 | std::string input("(1.0, 2.0)"); | |
143 | std::string::iterator strbegin = input.begin(); | |
144 | std::pair<double, double> p; | |
145 | x3::parse(strbegin, input.end(), | |
146 | '(' >> x3::double_ >> ", " >> x3::double_ >> ')', // parser grammar | |
147 | p); // attribute to fill while parsing | |
148 | ||
149 | where the first element of the pair passed in as the attribute is still | |
150 | associated with the first `double_`, and the second element is associated with | |
151 | the second `double_` parser. | |
152 | ||
153 | This behavior should be familiar as it conforms to the way other input and | |
154 | output formatting libraries such as `scanf`, `printf` or `boost::format` are | |
155 | handling their variable parts. In this context you can think of __x3__'s | |
156 | primitive components (such as the `double_` above) as | |
157 | type-safe placeholders for the attribute values. | |
158 | ||
159 | [tip Similarly to the tip provided above, this example could be rewritten | |
160 | using /Spirit's/ multi-attribute API function: | |
161 | `` | |
162 | double d1 = 0.0, d2 = 0.0; | |
163 | x3::parse(begin, end, '(' >> x3::double_ >> ", " >> x3::double_ >> ')', d1, d2); | |
164 | `` | |
165 | which provides a clear and comfortable syntax, more similar to the | |
166 | placeholder based syntax as exposed by `printf` or `boost::format`. | |
167 | ] | |
168 | ||
169 | Let's take a look at this from a more formal perspective: | |
170 | ||
171 | a: A, b: Unused --> (a >> b): A | |
172 | ||
173 | which reads as: | |
174 | ||
175 | [:Given `a` and `b` are parsers, and `A` is the attribute type of | |
176 | `a`, and `unused_type` is the attribute type of `b`, then the attribute type | |
177 | of `a >> b` will be `A` as well. This rule applies regardless of | |
178 | the position the element exposing the `unused_type` is at.] | |
179 | ||
180 | This rule is the key to understanding attribute handling in | |
181 | sequences as soon as literals are involved. It is as if elements with | |
182 | `unused_type` attributes 'disappeared' during attribute propagation. Notably, | |
183 | this is not only true for sequences but for any compound component. For | |
184 | instance, for alternative components the corresponding rule is: | |
185 | ||
186 | a: A, b: Unused --> (a | b): A | |
187 | ||
188 | again allowing the overall attribute type of an expression to be simplified. | |
189 | ||
190 | [endsect] | |
191 | ||
192 | [/////////////////////////////////////////////////////////////////////////////] | |
193 | [section:nonterminal_attributes Attributes of Nonterminals] | |
194 | ||
195 | Nonterminals are the main means of constructing more complex parsers out of | |
196 | simpler ones. The nonterminals in the parser world are very similar to functions | |
197 | in an imperative programming language. They can be used to encapsulate parser | |
198 | expressions for a particular input sequence. After being defined, the | |
199 | nonterminals can be used as 'normal' parsers in more complex expressions | |
200 | whenever the encapsulated input needs to be recognized. Parser nonterminals in | |
201 | __x3__ usually return a value (the synthesized attribute). | |
202 | ||
203 | The type of the synthesized attribute has to be explicitly specified while | |
204 | defining the particular nonterminal. Example (ignore ID for now): | |
205 | ||
206 | x3::rule<ID, int> r; | |
207 | ||
208 | [endsect] | |
209 |