]> git.proxmox.com Git - ceph.git/blob - ceph/src/boost/libs/spirit/doc/x3/tutorial/rexpr.qbk
add subtree-ish sources for 12.0.3
[ceph.git] / ceph / src / boost / libs / spirit / doc / x3 / tutorial / rexpr.qbk
1 [/==============================================================================
2 Copyright (C) 2015 Joel de Guzman
3
4 Distributed under the Boost Software License, Version 1.0. (See accompanying
5 file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
6 ===============================================================================/]
7
8 [section:rexpr RExpressions - ASTs!]
9
10 Stop and think about it... We're actually generating ASTs (abstract
11 syntax trees) in our previoius examples. We parsed a single structure and
12 generated an in-memory representation of it in the form of a struct: the struct
13 employee. If we changed the implementation to parse one or more employees, the
14 result would be a std::vector<employee>. We can go on and add more hierarchy:
15 teams, departments, corporations. Then we'll have an AST representation of it
16 all.
17
18 In this example, we'll explore more on how to create ASTs. We will parse a
19 minimalistic JSON-like language and compile the results into our data structures
20 in the form of a tree.
21
22 /rexpr/ is a parser for RExpressions, a language resembling a minimal subset
23 of json, limited to a dictionary (composed of key=value pairs) where the value
24 can itself be a string or a recursive dictionary.
25
26 Here's an Example:
27
28 {
29 "color" = "blue"
30 "size" = "29 cm."
31 "position" = {
32 "x" = "123"
33 "y" = "456"
34 }
35 }
36
37 This simple parser for X3 is intended as a minimal starting point. It is
38 minimal, yet complete with all things necessary parts that make up rules (and
39 grammars), except error handling and reporting.
40
41 The example can be found here: [@../../../example/x3/rexpr/rexpr_min/rexpr.cpp]
42
43 The same parser, but complete with error handling and reporting, better file
44 organization, separate compilation of grammars, and a full test harness, can
45 be found in this directory:
46
47 [@../../../example/x3/rexpr/rexpr_full/]
48
49 [note rexpr_full is the canonical structure proposed as best practice on how
50 parsers are written using Spirit X3.]
51
52 [heading The AST]
53
54 Here's the AST:
55
56 namespace client { namespace ast
57 {
58 namespace x3 = boost::spirit::x3;
59
60 struct rexpr;
61
62 struct rexpr_value : x3::variant<
63 std::string
64 , x3::forward_ast<rexpr>
65 >
66 {
67 using base_type::base_type;
68 using base_type::operator=;
69 };
70
71 typedef std::map<std::string, rexpr_value> rexpr_map;
72 typedef std::pair<std::string, rexpr_value> rexpr_key_value;
73
74 struct rexpr
75 {
76 rexpr_map entries;
77 };
78 }}
79
80 `x3::variant` is a support utility in Spirit X3 that extends __boost_variant__.
81 Typically, you use __boost_variant__ right out of the box and refer to a
82 particular template instantiation using a typedef. For example:
83
84 typedef boost::variant<std::string, int> my_variant;
85
86 Instead of doing that, we create a `class` (or `struct` in the case) that
87 subclasses from `x3::variant`. By making the variant a subclass, you have a
88 distinct type in your namespace. You also have control of the constructors
89 and assignment operators, as well as giving you the freedom to add more
90 functionality.
91
92 `rexpr_value` has two variant elements. It can either be a `std::string` or a
93 `rexpr`. Since `rexpr` recursively contains a `rexpr_value`, it has to be
94 forward declared. This recursive data structure requires `rexpr` to be wrapped
95 in a `x3::forward_ast` as shown in the declaration.
96
97 We need to tell fusion about our `rexpr` struct to make it a first-class fusion
98 citizen:
99
100 BOOST_FUSION_ADAPT_STRUCT(
101 client::ast::rexpr,
102 entries
103 )
104
105 So essentially, a `rexpr_value` value is either a `std::string` or a
106 `std::map<std::string, rexpr_value>`.
107
108 [heading Walking the AST]
109
110 We add a utility to print out the AST:
111
112 int const tabsize = 4;
113
114 struct rexpr_printer
115 {
116 typedef void result_type;
117
118 rexpr_printer(int indent = 0)
119 : indent(indent) {}
120
121 void operator()(rexpr const& ast) const
122 {
123 std::cout << '{' << std::endl;
124 for (auto const& entry : ast.entries)
125 {
126 tab(indent+tabsize);
127 std::cout << '"' << entry.first << "\" = ";
128 boost::apply_visitor(rexpr_printer(indent+tabsize), entry.second);
129 }
130 tab(indent);
131 std::cout << '}' << std::endl;
132 }
133
134 void operator()(std::string const& text) const
135 {
136 std::cout << '"' << text << '"' << std::endl;
137 }
138
139 void tab(int spaces) const
140 {
141 for (int i = 0; i < spaces; ++i)
142 std::cout << ' ';
143 }
144
145 int indent;
146 };
147
148 Traversing the AST is a recursive excercise. `rexpr_printer` is a function
149 object and the main entry point is `void operator()(rexpr const& ast)`. Notice
150 how it recursively calls an instance of itself for each of the key value pairs,
151 using `boost::apply_visitor`. Before and after iterating through all the
152 elements in the map, the braces are printed to surround the entire body, taking
153 care of correct indentation at each point in the recursive traversal.
154
155 The `operator()` for `std::string` should be self explanatory. It simply prints
156 the text inside double quotes.
157
158 [heading The Grammar]
159
160 namespace client { namespace parser
161 {
162 namespace x3 = boost::spirit::x3;
163 namespace ascii = boost::spirit::x3::ascii;
164
165 using x3::lit;
166 using x3::lexeme;
167
168 using ascii::char_;
169 using ascii::string;
170
171 x3::rule<class rexpr_value, ast::rexpr_value>
172 rexpr_value = "rexpr_value";
173
174 x3::rule<class rexpr, ast::rexpr>
175 rexpr = "rexpr";
176
177 x3::rule<class rexpr_key_value, ast::rexpr_key_value>
178 rexpr_key_value = "rexpr_key_value";
179
180 auto const quoted_string =
181 lexeme['"' >> *(char_ - '"') >> '"'];
182
183 auto const rexpr_value_def =
184 quoted_string | rexpr;
185
186 auto const rexpr_key_value_def =
187 quoted_string >> '=' >> rexpr_value;
188
189 auto const rexpr_def =
190 '{' >> *rexpr_key_value >> '}';
191
192 BOOST_SPIRIT_DEFINE(rexpr_value, rexpr, rexpr_key_value);
193 }}
194
195 We declare three rules `rexpr_value`, `rexpr` and `rexpr_key_value`. Same deal
196 as before. The attributes declared for each rule are the same structures we
197 declared in our AST above. These are the structures our rules will produce.
198
199 So, going top-down (from the start rule), `rexpr` is defined as zero or more
200 `rexpr_key_value`s enclosed inside the curly braces. That's the kleene star in
201 action there: `*rexpr_key_value`.
202
203 [note Take note of the convention: we use `rexpr` name as the rule name and
204 `rexpr_def` as the rule definition name.]
205
206 To make the grammar clean, we capture the `quoted_string` production using
207 C++ `auto`.
208
209 `rexpr_value` is a `quoted_string` followed by the assignment operator `'='`,
210 followed by a `rexpr_value`. `rexpr_value` is a `quoted_string` /OR/ a `rexpr`.
211 This uses the alternative operator `|`, and maps nicely to the variant AST the
212 rule is associated with.
213
214 Finally, we associate the rules and their definitions:
215
216 BOOST_SPIRIT_DEFINE(rexpr_value, rexpr, rexpr_key_value);
217
218 [endsect]