[/==============================================================================
    Copyright (C) 2001-2015 Hartmut Kaiser
    Copyright (C) 2001-2011 Joel de Guzman

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]

[/////////////////////////////////////////////////////////////////////////////]
[section:primitive_attributes Attributes of Primitive Components]

Parsers in __spirit__ are fully attributed. __x3__ parsers always /expose/ an
attribute specific to their type. This is called the /synthesized attribute/, as
it is returned from a successful match and represents the matched input
sequence. For instance, numeric parsers, such as `int_` or `double_`, return the
`int` or `double` value converted from the matched input sequence. Other
primitive parser components have equally intuitive attribute types; for
instance, `ascii::char_` has the attribute type `char`. Primitive parsers apply
the normal C++ convertibility rules: you can use any C++ type to receive the
parsed value as long as the attribute type of the parser is convertible to the
type provided. The following example shows how a synthesized parser attribute
(the `int` value) is extracted by calling the API function `x3::parse`:

    int value = 0;
    std::string str("123");
    std::string::iterator strbegin = str.begin();
    x3::parse(strbegin, str.end(), x3::int_, value);   // value == 123

For a full list of available parser primitives and their attribute types, please
see the section __sec_x3_primitive__.

[endsect]

[/////////////////////////////////////////////////////////////////////////////]
[section:compound_attributes Attributes of Compound Components]

__x3__ implements well-defined attribute type propagation rules for all compound
parsers, such as sequences, alternatives, Kleene star, etc. For instance, the
main attribute propagation rule for sequences is:

    a: A, b: B --> (a >> b): tuple<A, B>

which reads as:

[:Given `a` and `b` are parsers, and `A` is the attribute type of `a`, and `B`
is the attribute type of `b`, then the attribute type of `a >> b` will be
`tuple<A, B>`.]

[note The notation `tuple<A, B>` is used as a placeholder expression for any
fusion sequence holding the types `A` and `B`, such as
`boost::fusion::tuple<A, B>` or `std::pair<A, B>` (for more information see
__fusion__).]

As you can see, in order for a type to be compatible with the attribute type
of a compound expression, it has to

* either be convertible to the attribute type,
* or expose certain functionality, i.e. it needs to conform to a
  concept compatible with the component.

Each compound component implements its own set of attribute propagation rules.
For a full list of how the different compound parsers consume attributes,
see the section __sec_x3_compound__.

[heading The Attribute of Sequence Parsers]

Sequences require an attribute type to expose the concept of a fusion sequence,
where all elements of that fusion sequence have to be compatible with the
corresponding element of the component sequence. For example, the expression:

    double_ >> double_

is compatible with any fusion sequence holding two types, where both types have
to be compatible with `double`. The first element of the fusion sequence has to
be compatible with the attribute of the first `double_`, and the second element
of the fusion sequence has to be compatible with the attribute of the second
`double_`. If we have an instance of `std::pair<double, double>`,
we can directly use the expression above to parse input and fill the attribute:

    // the following parses "1.0 2.0" into a pair of double
    std::string input("1.0 2.0");
    std::string::iterator strbegin = input.begin();
    std::pair<double, double> p;
    x3::phrase_parse(strbegin, input.end(),
        x3::double_ >> x3::double_,       // parser grammar
        x3::space,                        // skipper grammar
        p);                               // attribute to fill while parsing

[tip *For sequences only:* __x3__ exposes a set of API functions
     usable mainly with sequences. Very much like the functions of the `scanf`
     and `printf` families, these functions allow you to pass the attributes for
     each of the elements of the sequence separately. Using the corresponding
     overload of /X3's/ parse function, the expression above
     could be rewritten as:

        double d1 = 0.0, d2 = 0.0;
        x3::phrase_parse(begin, end, x3::double_ >> x3::double_, x3::space, d1, d2);

     where the first attribute is used for the first `double_`, and
     the second attribute is used for the second `double_`.
]

[heading The Attribute of Alternative Parsers]

Alternative parsers are all about, well, alternatives. To store the possibly
different result (attribute) types of the different alternatives, we use the
data type __boost_variant__. The main attribute propagation rule of these
components is:

    a: A, b: B --> (a | b): variant<A, B>

Alternatives have a second very important attribute propagation rule:

    a: A, b: A --> (a | b): A

which often simplifies things significantly. If all subexpressions of
an alternative expose the same attribute type, the overall alternative
exposes exactly that attribute type as well.

[endsect]

[/////////////////////////////////////////////////////////////////////////////]
[section:more_compound_attributes More About Attributes of Compound Components]

While parsing input, it is often desirable to combine some
constant elements with variable parts. For instance, let us look at the example
of parsing a complex number, which is written as `(real, imag)`,
where `real` and `imag` are the variables representing the real and imaginary
parts of our complex number. This can be achieved by writing:

    '(' >> double_ >> ", " >> double_ >> ')'

Literals (such as `'('` and `", "`) do /not/ expose any attribute
(strictly speaking, they expose the special type `unused_type`, but in this
context `unused_type` is interpreted as if the component does not expose any
attribute at all). It is very important to understand that the literals don't
consume any of the elements of a fusion sequence passed to this component
sequence. As said, they expose no attribute and fill none of the attribute's
elements. The following example shows this:

    // the following parses "(1.0, 2.0)" into a pair of double
    std::string input("(1.0, 2.0)");
    std::string::iterator strbegin = input.begin();
    std::pair<double, double> p;
    x3::parse(strbegin, input.end(),
        '(' >> x3::double_ >> ", " >> x3::double_ >> ')', // parser grammar
        p);                                               // attribute to fill while parsing

where the first element of the pair passed in as the attribute to fill is still
associated with the first `double_`, and the second element is associated with
the second `double_` parser.

This behavior should be familiar, as it conforms to the way other input and
output formatting libraries such as `scanf`, `printf` or `boost::format`
handle their variable parts. In this context you can think of __x3__'s
primitive components (such as the `double_` above) as
type-safe placeholders for the attribute values.

[tip Similarly to the tip provided above, this example could be rewritten
     using /Spirit's/ multi-attribute API function:

        double d1 = 0.0, d2 = 0.0;
        x3::parse(begin, end, '(' >> x3::double_ >> ", " >> x3::double_ >> ')', d1, d2);

     which provides a clear and comfortable syntax, more similar to the
     placeholder-based syntax exposed by `printf` or `boost::format`.
]

Let's take a look at this from a more formal perspective:

    a: A, b: Unused --> (a >> b): A

which reads as:

[:Given `a` and `b` are parsers, and `A` is the attribute type of
`a`, and `unused_type` is the attribute type of `b`, then the attribute type
of `a >> b` will be `A` as well. This rule applies regardless of
the position of the element exposing the `unused_type`.]

This rule is the key to understanding attribute handling in
sequences as soon as literals are involved. It is as if elements with
`unused_type` attributes 'disappeared' during attribute propagation. Notably,
this is true not only for sequences but for any compound component. For
instance, for alternative components the corresponding rule is:

    a: A, b: Unused --> (a | b): A

which again simplifies the overall attribute type of an expression.

[endsect]

[/////////////////////////////////////////////////////////////////////////////]
[section:nonterminal_attributes Attributes of Nonterminals]

Nonterminals are the main means of constructing more complex parsers out of
simpler ones. The nonterminals in the parser world are very similar to functions
in an imperative programming language. They can be used to encapsulate parser
expressions for a particular input sequence. After being defined, the
nonterminals can be used as 'normal' parsers in more complex expressions
whenever the encapsulated input needs to be recognized. Parser nonterminals in
__x3__ usually return a value (the synthesized attribute).

The type of the synthesized attribute has to be explicitly specified while
defining the particular nonterminal. Example (ignore ID for now):