Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | [/============================================================================== |
2 | Copyright (C) 2001-2011 Hartmut Kaiser | |
3 | Copyright (C) 2001-2011 Joel de Guzman | |
4 | ||
5 | Distributed under the Boost Software License, Version 1.0. (See accompanying | |
6 | file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) | |
7 | ===============================================================================/] | |
8 | ||
9 | [section:attributes Attributes] | |
10 | ||
11 | [/////////////////////////////////////////////////////////////////////////////] | |
12 | [section:primitive_attributes Attributes of Primitive Components] | |
13 | ||
14 | Parsers and generators in __spirit__ are fully attributed. __qi__ parsers always | |
15 | /expose/ an attribute specific to their type. This is called the /synthesized | |
16 | attribute/ as it is returned from a successful match representing the matched | |
17 | input sequence. For instance, numeric parsers, such as `int_` or `double_`, | |
18 | return the `int` or `double` value converted from the matched input sequence. | |
19 | Other primitive parser components have similarly intuitive attribute types: | |
20 | `ascii::char_`, for instance, exposes a `char`. For primitive parsers the | |
21 | normal C++ convertibility rules apply: you can use any | |
22 | C++ type to receive the parsed value as long as the attribute type of the | |
23 | parser is convertible to the type provided. The following example shows how a | |
24 | synthesized parser attribute (the `int` value) is extracted by calling the | |
25 | API function `qi::parse`: | |
26 | ||
27 | int value = 0; | |
28 | std::string str("123"); | |
29 | std::string::iterator strbegin = str.begin(); | |
30 | qi::parse(strbegin, str.end(), int_, value); // value == 123 | |
31 | ||
32 | The attribute type of a generator defines what data types this generator is | |
33 | able to consume in order to produce its output. __karma__ generators always | |
34 | /expect/ an attribute specific to their type. This is called the /consumed | |
35 | attribute/ and is expected to be passed to the generator. The consumed | |
36 | attribute is usually the value the generator is designed to emit | |
37 | output for. For primitive generators the normal C++ convertibility rules apply. | |
38 | Any data type convertible to the attribute type of a primitive generator can be | |
39 | used to provide the data to generate. The following example mirrors the one above; | |
40 | this time the consumed attribute of the `int_` generator (the `int` value) | |
41 | is passed to the API function `karma::generate`: | |
42 | ||
43 | int value = 123; | |
44 | std::string str; | |
45 | std::back_insert_iterator<std::string> out(str); | |
46 | karma::generate(out, int_, value); // str == "123" | |
47 | ||
48 | Other primitive generator components have similarly intuitive attribute types, | |
49 | closely matching those of the corresponding parser components. For instance, the | |
50 | `ascii::char_` generator has `char` as consumed attribute. For a full list of | |
51 | available parser and generator primitives and their attribute types please see | |
52 | the sections __sec_qi_primitive__ and __sec_karma_primitive__. | |
53 | ||
54 | [endsect] | |
55 | ||
56 | [/////////////////////////////////////////////////////////////////////////////] | |
57 | [section:compound_attributes Attributes of Compound Components] | |
58 | ||
59 | __qi__ and __karma__ implement well defined attribute type propagation rules | |
60 | for all compound parsers and generators, such as sequences, alternatives, | |
61 | Kleene star, etc. The main attribute propagation rule for sequences, for | |
62 | instance, is: | |
63 | ||
64 | [table | |
65 | [[Library] [Sequence attribute propagation rule]] | |
66 | [[Qi] [`a: A, b: B --> (a >> b): tuple<A, B>`]] | |
67 | [[Karma] [`a: A, b: B --> (a << b): tuple<A, B>`]] | |
68 | ] | |
69 | ||
70 | which reads as: | |
71 | ||
72 | [:Given `a` and `b` are parsers (generators), and `A` is the attribute type of | |
73 | `a`, and `B` is the attribute type of `b`, then the attribute type of | |
74 | `a >> b` (`a << b`) will be `tuple<A, B>`.] | |
75 | ||
76 | [note The notation `tuple<A, B>` is used as a placeholder expression for any | |
77 | fusion sequence holding the types `A` and `B`, such as | |
78 | `boost::fusion::tuple<A, B>` or `std::pair<A, B>` (for more information | |
79 | see __fusion__).] | |
80 | ||
81 | As you can see, in order for a type to be compatible with the attribute type | |
82 | of a compound expression it has to | |
83 | ||
84 | * either be convertible to the attribute type, | |
85 | * or expose certain functionality, i.e. conform to a | |
86 | concept compatible with the component. | |
87 | ||
88 | Each compound component implements its own set of attribute propagation rules. | |
89 | For a full list of how the different compound parsers and generators expose and consume attributes | |
90 | see the sections __sec_qi_compound__ and __sec_karma_compound__. | |
91 | ||
92 | [heading The Attribute of Sequence Parsers and Generators] | |
93 | ||
94 | Sequences require an attribute type to expose the concept of a fusion sequence, | |
95 | where all elements of that fusion sequence have to be compatible with the | |
96 | corresponding element of the component sequence. For example, the expression: | |
97 | ||
98 | [table | |
99 | [[Library] [Sequence expression]] | |
100 | [[Qi] [`double_ >> double_`]] | |
101 | [[Karma] [`double_ << double_`]] | |
102 | ] | |
103 | ||
104 | is compatible with any fusion sequence holding two types, where both types have | |
105 | to be compatible with `double`. The first element of the fusion sequence has to | |
106 | be compatible with the attribute of the first `double_`, and the second element | |
107 | of the fusion sequence has to be compatible with the attribute of the second | |
108 | `double_`. Given an instance of `std::pair<double, double>`, | |
109 | we can use the expressions above directly, both to parse input to fill the | |
110 | attribute: | |
111 | ||
112 | // the following parses "1.0 2.0" into a pair of double | |
113 | std::string input("1.0 2.0"); | |
114 | std::string::iterator strbegin = input.begin(); | |
115 | std::pair<double, double> p; | |
116 | qi::phrase_parse(strbegin, input.end(), | |
117 | qi::double_ >> qi::double_, // parser grammar | |
118 | qi::space, // delimiter grammar | |
119 | p); // attribute to fill while parsing | |
120 | ||
121 | and generate output for it: | |
122 | ||
123 | // the following generates: "1.0 2.0" from the pair filled above | |
124 | std::string str; | |
125 | std::back_insert_iterator<std::string> out(str); | |
126 | karma::generate_delimited(out, | |
127 | karma::double_ << karma::double_, // generator grammar (format description) | |
128 | karma::space, // delimiter grammar | |
129 | p); // data to use as the attribute | |
130 | ||
131 | (where the `qi::space` parser is used as the skipper and the `karma::space` generator as the | |
132 | delimiter, so that delimiting spaces are automatically skipped or inserted between all primitives). | |
133 | ||
134 | [tip *For sequences only:* __qi__ and __karma__ expose a set of API functions | |
135 | usable mainly with sequences. Very much like the functions of the `scanf` | |
136 | and `printf` families, these functions allow passing the attributes for | |
137 | each of the elements of the sequence separately. Using the corresponding | |
138 | overloads of /Qi's/ `phrase_parse()` and /Karma's/ `generate_delimited()`, the | |
139 | calls above could be rewritten as: | |
140 | `` | |
141 | double d1 = 0.0, d2 = 0.0; | |
142 | qi::phrase_parse(begin, end, qi::double_ >> qi::double_, qi::space, d1, d2); | |
143 | karma::generate_delimited(out, karma::double_ << karma::double_, karma::space, d1, d2); | |
144 | `` | |
145 | where the first attribute is used for the first `double_`, and | |
146 | the second attribute is used for the second `double_`. | |
147 | ] | |
148 | ||
149 | [heading The Attribute of Alternative Parsers and Generators] | |
150 | ||
151 | Alternative parsers and generators are all about - well - alternatives. In | |
152 | order to store possibly different result (attribute) types from the different | |
153 | alternatives we use the data type __boost_variant__. The main attribute | |
154 | propagation rule of these components is: | |
155 | ||
156 | a: A, b: B --> (a | b): variant<A, B> | |
157 | ||
158 | Alternatives have a second very important attribute propagation rule: | |
159 | ||
160 | a: A, b: A --> (a | b): A | |
161 | ||
162 | which often simplifies things significantly. If all sub expressions of | |
163 | an alternative expose the same attribute type, the overall alternative | |
164 | will expose exactly the same attribute type as well. | |
165 | ||
166 | [endsect] | |
167 | ||
168 | [/////////////////////////////////////////////////////////////////////////////] | |
169 | [section:more_compound_attributes More About Attributes of Compound Components] | |
170 | ||
171 | While parsing input or generating output it is often desirable to combine some | |
172 | constant elements with variable parts. For instance, let us look at the example | |
173 | of parsing or formatting a complex number, which is written as `(real, imag)`, | |
174 | where `real` and `imag` are the variables representing the real and imaginary | |
175 | parts of our complex number. This can be achieved by writing: | |
176 | ||
177 | [table | |
178 | [[Library] [Sequence expression]] | |
179 | [[Qi] [`'(' >> double_ >> ", " >> double_ >> ')'`]] | |
180 | [[Karma] [`'(' << double_ << ", " << double_ << ')'`]] | |
181 | ] | |
182 | ||
183 | Fortunately, literals (such as `'('` and `", "`) do /not/ expose any attribute | |
184 | (well actually, they do expose the special type `unused_type`, but in this | |
185 | context `unused_type` is interpreted as if the component does not expose any | |
186 | attribute at all). It is very important to understand that the literals don't | |
187 | consume any of the elements of a fusion sequence passed to this component | |
188 | sequence. As said, they just don't expose any attribute and don't produce | |
189 | (consume) any data. The following example shows this: | |
190 | ||
191 | // the following parses "(1.0, 2.0)" into a pair of double | |
192 | std::string input("(1.0, 2.0)"); | |
193 | std::string::iterator strbegin = input.begin(); | |
194 | std::pair<double, double> p; | |
195 | qi::parse(strbegin, input.end(), | |
196 | '(' >> qi::double_ >> ", " >> qi::double_ >> ')', // parser grammar | |
197 | p); // attribute to fill while parsing | |
198 | ||
199 | and here is the equivalent __karma__ code snippet: | |
200 | ||
201 | // the following generates: (1.0, 2.0) | |
202 | std::string str; | |
203 | std::back_insert_iterator<std::string> out(str); | |
204 | karma::generate(out, | |
205 | '(' << karma::double_ << ", " << karma::double_ << ')', // generator grammar (format description) | |
206 | p); // data to use as the attribute | |
207 | ||
208 | where the first element of the pair passed in as the data to generate is still | |
209 | associated with the first `double_`, and the second element is associated with | |
210 | the second `double_` generator. | |
211 | ||
212 | This behavior should be familiar as it conforms to the way other input and | |
213 | output formatting libraries such as `scanf`, `printf` or `boost::format` are | |
214 | handling their variable parts. In this context you can think of __qi__'s | |
215 | and __karma__'s primitive components (such as the `double_` above) as | |
216 | type-safe placeholders for the attribute values. | |
217 | ||
218 | [tip Similarly to the tip provided above, this example could be rewritten | |
219 | using /Spirit's/ multi-attribute API function: | |
220 | `` | |
221 | double d1 = 0.0, d2 = 0.0; | |
222 | qi::parse(begin, end, '(' >> qi::double_ >> ", " >> qi::double_ >> ')', d1, d2); | |
223 | karma::generate(out, '(' << karma::double_ << ", " << karma::double_ << ')', d1, d2); | |
224 | `` | |
225 | which provides a clear and comfortable syntax, closer to the | |
226 | placeholder-based syntax exposed by `printf` or `boost::format`. | |
227 | ] | |
228 | ||
229 | Let's take a look at this from a more formal perspective. The sequence attribute | |
230 | propagation rules define a special behavior if parsers or generators exposing `unused_type` | |
231 | as their attribute are involved (see __sec_qi_compound__ and __sec_karma_compound__): | |
232 | ||
233 | [table | |
234 | [[Library] [Sequence attribute propagation rule]] | |
235 | [[Qi] [`a: A, b: Unused --> (a >> b): A`]] | |
236 | [[Karma] [`a: A, b: Unused --> (a << b): A`]] | |
237 | ] | |
238 | ||
239 | which reads as: | |
240 | ||
241 | [:Given `a` and `b` are parsers (generators), and `A` is the attribute type of | |
242 | `a`, and `unused_type` is the attribute type of `b`, then the attribute type | |
243 | of `a >> b` (`a << b`) will be `A` as well. This rule applies regardless of | |
244 | the position the element exposing the `unused_type` is at.] | |
245 | ||
246 | This rule is the key to understanding attribute handling in | |
247 | sequences as soon as literals are involved. It is as if elements with | |
248 | `unused_type` attributes 'disappeared' during attribute propagation. Notably, | |
249 | this is not only true for sequences but for any compound component. For | |
250 | instance, for alternative generators the corresponding rule is: | |
251 | ||
252 | a: A, b: Unused --> (a | b): A | |
253 | ||
254 | which again simplifies the overall attribute type of an expression (for __qi__ alternatives, the reference lists the result as `optional<A>`; see __sec_qi_compound__). | |
255 | ||
256 | [endsect] | |
257 | ||
258 | [/////////////////////////////////////////////////////////////////////////////] | |
259 | [section:nonterminal_attributes Attributes of Rules and Grammars] | |
260 | ||
261 | Nonterminals are well known from parsers where they are used as the main means | |
262 | of constructing more complex parsers out of simpler ones. The nonterminals in | |
263 | the parser world are very similar to functions in an imperative programming | |
264 | language. They can be used to encapsulate parser expressions for a particular | |
265 | input sequence. After being defined, the nonterminals can be used as 'normal' | |
266 | parsers in more complex expressions whenever the encapsulated input needs to be | |
267 | recognized. Parser nonterminals in __qi__ may accept /parameters/ (inherited | |
268 | attributes) and usually return a value (the synthesized attribute). | |
269 | ||
270 | The types of both the inherited and the synthesized attributes have to be | |
271 | explicitly specified while defining the particular `grammar` or the `rule` | |
272 | (the Spirit __repo__ additionally has `subrules` which conform to a similar | |
273 | interface). As an example, the following code declares a __qi__ `rule` | |
274 | exposing an `int` as its synthesized attribute, while expecting a single | |
275 | `double` as its inherited attribute (see the section about the __qi__ __rule__ | |
276 | for more information): | |
277 | ||
278 | qi::rule<Iterator, int(double)> r; | |
279 | ||
280 | In the world of generators, nonterminals are just as useful as in the parser | |
281 | world. Generator nonterminals encapsulate a format description for a particular | |
282 | data type, and, whenever we need to emit output for this data type, the | |
283 | corresponding nonterminal is invoked in a similar way as the predefined | |
284 | __karma__ generator primitives. The __karma__ [karma_nonterminal nonterminals] | |
285 | are very similar to the __qi__ nonterminals. Generator nonterminals may accept | |
286 | /parameters/ as well, and we call those inherited attributes too. The main | |
287 | difference is that they do not expose a synthesized attribute (as parsers do), | |
288 | but they require a special /consumed attribute/. Usually the consumed attribute | |
289 | is the value the generator creates its output from. Even if the consumed | |
290 | attribute is not 'returned' from the generator we chose to use the same | |
291 | function style declaration syntax as used in __qi__. The example below declares | |
292 | a __karma__ `rule` consuming a `double` while not expecting any additional | |
293 | inherited attributes: | |
294 | ||
295 | karma::rule<OutputIterator, double()> r; | |
296 | ||
297 | The inherited attributes of nonterminal parsers and generators are normally | |
298 | passed to the component during its invocation. These are the /parameters/ the | |
299 | parser or generator may accept and they can be used to parameterize the | |
300 | component depending on the context they are invoked from. | |
301 | ||
302 | ||
303 | [/ | |
304 | * attribute propagation | |
305 | * explicit and operator%= | |
306 | ] | |
307 | ||
308 | [endsect] | |
309 | ||
310 | [endsect] [/ Attributes] |