1 [/==============================================================================
2 Copyright (C) 2015 Joel de Guzman
4 Distributed under the Boost Software License, Version 1.0. (See accompanying
5 file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
6 ===============================================================================/]
8 [section:rexpr RExpressions - ASTs!]
10 Stop and think about it... We're actually generating ASTs (abstract
11 syntax trees) in our previoius examples. We parsed a single structure and
12 generated an in-memory representation of it in the form of a struct: the struct
13 employee. If we changed the implementation to parse one or more employees, the
14 result would be a std::vector<employee>. We can go on and add more hierarchy:
15 teams, departments, corporations. Then we'll have an AST representation of it
18 In this example, we'll explore more on how to create ASTs. We will parse a
19 minimalistic JSON-like language and compile the results into our data structures
20 in the form of a tree.
22 /rexpr/ is a parser for RExpressions, a language resembling a minimal subset
23 of json, limited to a dictionary (composed of key=value pairs) where the value
24 can itself be a string or a recursive dictionary.
37 This simple parser for X3 is intended as a minimal starting point. It is
38 minimal, yet complete with all things necessary parts that make up rules (and
39 grammars), except error handling and reporting.
41 The example can be found here: [@../../../example/x3/rexpr/rexpr_min/rexpr.cpp]
43 The same parser, but complete with error handling and reporting, better file
44 organization, separate compilation of grammars, and a full test harness, can
45 be found in this directory:
47 [@../../../example/x3/rexpr/rexpr_full/]
49 [note rexpr_full is the canonical structure proposed as best practice on how
50 parsers are written using Spirit X3.]
56 namespace client { namespace ast
58 namespace x3 = boost::spirit::x3;
62 struct rexpr_value : x3::variant<
64 , x3::forward_ast<rexpr>
67 using base_type::base_type;
68 using base_type::operator=;
71 typedef std::map<std::string, rexpr_value> rexpr_map;
72 typedef std::pair<std::string, rexpr_value> rexpr_key_value;
80 `x3::variant` is a support utility in Spirit X3 that extends __boost_variant__.
81 Typically, you use __boost_variant__ right out of the box and refer to a
82 particular template instantiation using a typedef. For example:
84 typedef boost::variant<std::string, int> my_variant;
86 Instead of doing that, we create a `class` (or `struct` in the case) that
87 subclasses from `x3::variant`. By making the variant a subclass, you have a
88 distinct type in your namespace. You also have control of the constructors
89 and assignment operators, as well as giving you the freedom to add more
92 `rexpr_value` has two variant elements. It can either be a `std::string` or a
93 `rexpr`. Since `rexpr` recursively contains a `rexpr_value`, it has to be
94 forward declared. This recursive data structure requires `rexpr` to be wrapped
95 in a `x3::forward_ast` as shown in the declaration.
97 We need to tell fusion about our `rexpr` struct to make it a first-class fusion
100 BOOST_FUSION_ADAPT_STRUCT(
105 So essentially, a `rexpr_value` value is either a `std::string` or a
106 `std::map<std::string, rexpr_value>`.
108 [heading Walking the AST]
110 We add a utility to print out the AST:
112 int const tabsize = 4;
116 typedef void result_type;
118 rexpr_printer(int indent = 0)
121 void operator()(rexpr const& ast) const
123 std::cout << '{' << std::endl;
124 for (auto const& entry : ast.entries)
127 std::cout << '"' << entry.first << "\" = ";
128 boost::apply_visitor(rexpr_printer(indent+tabsize), entry.second);
131 std::cout << '}' << std::endl;
134 void operator()(std::string const& text) const
136 std::cout << '"' << text << '"' << std::endl;
139 void tab(int spaces) const
141 for (int i = 0; i < spaces; ++i)
148 Traversing the AST is a recursive excercise. `rexpr_printer` is a function
149 object and the main entry point is `void operator()(rexpr const& ast)`. Notice
150 how it recursively calls an instance of itself for each of the key value pairs,
151 using `boost::apply_visitor`. Before and after iterating through all the
152 elements in the map, the braces are printed to surround the entire body, taking
153 care of correct indentation at each point in the recursive traversal.
155 The `operator()` for `std::string` should be self explanatory. It simply prints
156 the text inside double quotes.
158 [heading The Grammar]
160 namespace client { namespace parser
162 namespace x3 = boost::spirit::x3;
163 namespace ascii = boost::spirit::x3::ascii;
171 x3::rule<class rexpr_value, ast::rexpr_value>
172 rexpr_value = "rexpr_value";
174 x3::rule<class rexpr, ast::rexpr>
177 x3::rule<class rexpr_key_value, ast::rexpr_key_value>
178 rexpr_key_value = "rexpr_key_value";
180 auto const quoted_string =
181 lexeme['"' >> *(char_ - '"') >> '"'];
183 auto const rexpr_value_def =
184 quoted_string | rexpr;
186 auto const rexpr_key_value_def =
187 quoted_string >> '=' >> rexpr_value;
189 auto const rexpr_def =
190 '{' >> *rexpr_key_value >> '}';
192 BOOST_SPIRIT_DEFINE(rexpr_value, rexpr, rexpr_key_value);
195 We declare three rules `rexpr_value`, `rexpr` and `rexpr_key_value`. Same deal
196 as before. The attributes declared for each rule are the same structures we
197 declared in our AST above. These are the structures our rules will produce.
199 So, going top-down (from the start rule), `rexpr` is defined as zero or more
200 `rexpr_key_value`s enclosed inside the curly braces. That's the kleene star in
201 action there: `*rexpr_key_value`.
203 [note Take note of the convention: we use `rexpr` name as the rule name and
204 `rexpr_def` as the rule definition name.]
206 To make the grammar clean, we capture the `quoted_string` production using
209 `rexpr_value` is a `quoted_string` followed by the assignment operator `'='`,
210 followed by a `rexpr_value`. `rexpr_value` is a `quoted_string` /OR/ a `rexpr`.
211 This uses the alternative operator `|`, and maps nicely to the variant AST the
212 rule is associated with.
214 Finally, we associate the rules and their definitions:
216 BOOST_SPIRIT_DEFINE(rexpr_value, rexpr, rexpr_key_value);