Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | [/============================================================================== |
2 | Copyright (C) 2001-2011 Hartmut Kaiser | |
3 | Copyright (C) 2001-2011 Joel de Guzman | |
4 | ||
5 | Distributed under the Boost Software License, Version 1.0. (See accompanying | |
6 | file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) | |
7 | ===============================================================================/] | |
8 | ||
9 | [section Porting from Spirit 1.8.x] | |
10 | ||
11 | [import ../example/qi/porting_guide_classic.cpp] | |
12 | [import ../example/qi/porting_guide_qi.cpp] | |
13 | ||
14 | The current version of __spirit__ is a complete rewrite of earlier versions (we | |
15 | refer to earlier versions as __classic__). The parser generators are now only | |
16 | one part of the whole library. The parser submodule of __spirit__ is now called | |
17 | __qi__. It is conceptually different and exposes a completely different | |
18 | interface. Generally, there is no easy (or automated) way of converting parsers | |
19 | written for __classic__ to __qi__. Therefore this section can give only | |
20 | guidelines on how to approach porting your older parsers to the current version | |
21 | of __spirit__. | |
22 | ||
23 | [heading Include Files] | |
24 | ||
25 | The overall directory structure of the __spirit__ directories is described | |
26 | in the section __include_structure__ and the FAQ entry | |
27 | __include_structure_faq__. This should give you a good overview of how to find | |
28 | the needed header files for your new parsers. Moreover, each section in the | |
29 | __sec_qi_reference__ lists the include files needed for any particular | |
30 | component. | |
31 | ||
32 | It is possible to tell from the name of a header file which version it belongs | |
33 | to. While all main include files for __classic__ have the string 'classic_' in | |
34 | their name, for instance: | |
35 | ||
36 | #include <boost/spirit/include/classic_core.hpp> | |
37 | ||
38 | we named all main include files for __qi__ to have the string 'qi_' as part of | |
39 | their name, for instance: | |
40 | ||
41 | #include <boost/spirit/include/qi_core.hpp> | |
42 | ||
43 | The following table gives a rough list of corresponding header files between | |
44 | __classic__ and __qi__, but it can be used as a starting point only, as | |
45 | several components have either been moved to different submodules or might not | |
46 | exist in the newer version anymore. We list only include files for the topmost | |
47 | submodules. For header files required for lower-level components, please | |
48 | refer to the reference documentation of the corresponding component. | |
49 | ||
50 | [table | |
51 | [[Include file in /Spirit.Classic/] [Include file in /Spirit.Qi/]] | |
52 | [[`classic.hpp`] [`qi.hpp`]] | |
53 | [[`classic_actor.hpp`] [none, use __boost_phoenix__ for writing semantic actions]] | |
54 | [[`classic_attribute.hpp`] [none, use local variables for rules instead of closures, | |
55 | the primitive parsers now directly support lazy | |
56 | parameterization]] | |
57 | [[`classic_core.hpp`] [`qi_core.hpp`]] | |
58 | [[`classic_debug.hpp`] [`qi_debug.hpp`]] | |
59 | [[`classic_dynamic.hpp`] [none, use __qi__ predicates instead of if_p, while_p, for_p | |
60 | (included by `qi_core.hpp`), the equivalent for lazy_p | |
61 | is now included by `qi_auxiliary.hpp`]] | |
62 | [[`classic_error_handling.hpp`] [none, included in `qi_core.hpp`]] | |
63 | [[`classic_meta.hpp`] [none]] | |
64 | [[`classic_symbols.hpp`] [none, included in `qi_core.hpp`]] | |
65 | [[`classic_utility.hpp`] [none, not part of __qi__ anymore, these components | |
66 | will be added over time to the __repo__]] | |
67 | ] | |
68 | ||
69 | [heading The Free Parse Functions] | |
70 | ||
71 | The free parse functions (i.e. the main parser API) have been changed. This | |
72 | includes the names of the free functions as well as their interface. In | |
73 | __classic__ all free functions were named `parse`. In __qi__ they are named | |
74 | either `qi::parse` or `qi::phrase_parse` depending on whether the parsing should | |
75 | be done using a skipper (`qi::phrase_parse`) or not (`qi::parse`). All free | |
76 | functions now return a simple `bool`: `true` means success (i.e. the parser | |
77 | has matched) and `false` means failure (i.e. the parser didn't match). This is | |
78 | equivalent to the old `parse_info` member `hit`. __qi__ doesn't support | |
79 | tracking of the matched input length anymore. The old `parse_info` member | |
80 | `full` can be emulated by comparing the iterators after `qi::parse` has returned. | |
81 | ||
82 | All code examples in this section assume the following include statements and | |
83 | using directives to be inserted. For __classic__: | |
84 | ||
85 | [porting_guide_classic_includes] | |
86 | [porting_guide_classic_namespace] | |
87 | ||
88 | and for __qi__: | |
89 | ||
90 | [porting_guide_qi_includes] | |
91 | [porting_guide_qi_namespace] | |
92 | ||
93 | The following similar examples should clarify the differences. First the | |
94 | base example in __classic__: | |
95 | ||
96 | [porting_guide_classic_parse] | |
97 | ||
98 | And here is the equivalent piece of code using __qi__: | |
99 | ||
100 | [porting_guide_qi_parse] | |
101 | ||
102 | The changes required for phrase parsing (i.e. parsing using a skipper) are | |
103 | similar. Here is how phrase parsing works in __classic__: | |
104 | ||
105 | [porting_guide_classic_phrase_parse] | |
106 | ||
107 | And here is the equivalent example in __qi__: | |
108 | ||
109 | [porting_guide_qi_phrase_parse] | |
110 | ||
111 | Note how character parsers are in a separate namespace (here | |
112 | `boost::spirit::ascii::space`) as __qi__ now supports working with different | |
113 | character sets. See the section __char_encoding_namespace__ for more information. | |
114 | ||
115 | [heading Naming Conventions] | |
116 | ||
117 | In __classic__ all parser primitives have suffixes appended to their names, | |
118 | encoding their type: `"_p"` for parsers, `"_a"` for lazy actions, `"_d"` for | |
119 | directives, etc. In __qi__ we don't have anything similar. The only suffix | |
120 | is a single underscore `"_"`, applied where the name would otherwise | |
121 | conflict with a keyword or predefined name (such as `int_` for the | |
122 | integer parser). Overall, most, if not all, primitive parsers and directives | |
123 | have been renamed. Please see the __qi_quickref__ for an overview of the | |
124 | names of the available parser primitives, directives and operators. | |
125 | ||
126 | [heading Parser Attributes] | |
127 | ||
128 | In __classic__ most of the parser primitives don't expose a specific attribute | |
129 | type. Most parsers expose the pair of iterators pointing to the matched input | |
130 | sequence. In __qi__ all parsers expose a parser-specific attribute type, so it | |
131 | introduces a special directive, __qi_raw__`[]`, which achieves an effect similar | |
132 | to that of __classic__. The __qi_raw__`[]` directive exposes the pair of | |
133 | iterators pointing to the matching sequence of its embedded parser. Although we | |
134 | very much encourage you to rewrite your parsers to take advantage of the | |
135 | generated parser-specific attributes, sometimes it is helpful to get access to | |
136 | the underlying matched input sequence. | |
137 | ||
138 | [heading Grammars and Rules] | |
139 | ||
140 | The `grammar<>` and `rule<>` types are as important to __qi__ as they | |
141 | are to __classic__. Their main purpose is still the same: they allow you to | |
142 | define non-terminals and they are the main building blocks for more complex | |
143 | parsers. Nevertheless, both types have been redesigned and their interfaces | |
144 | have changed. Let's have a look at two examples first, we'll explain the | |
145 | differences afterwards. Here is a simple grammar and its usage in __classic__: | |
146 | ||
147 | [porting_guide_classic_grammar] | |
148 | [porting_guide_classic_use_grammar] | |
149 | ||
150 | And here is a similar grammar and its usage in __qi__: | |
151 | ||
152 | [porting_guide_qi_grammar] | |
153 | [porting_guide_qi_use_grammar] | |
154 | ||
155 | Both versions look similar enough, but we see several differences (we will | |
156 | cover each of those differences in more detail below): | |
157 | ||
158 | * Neither the grammars nor the rules depend on a scanner type anymore, both | |
159 | depend only on the underlying iterator type. That means the dreaded scanner | |
160 | business is no issue anymore! | |
161 | * Grammars have no embedded class `definition` anymore. | |
162 | * Grammars and rules may have an explicit attribute type specified in their | |
163 | definition. | |
164 | * Grammars do not have any explicit start rules anymore. Instead one of the | |
165 | contained rules is used as a start rule by default. | |
166 | ||
167 | The first two points are tightly interrelated. The scanner business (see the | |
168 | FAQ number one of __classic__ here: __scanner_business__) has been | |
169 | a problem for a long time. The grammar and rule types have been specifically | |
170 | redesigned to avoid this problem in the future. This also means that we don't | |
171 | need any delayed instantiation of the inner definition class in a grammar | |
172 | anymore. So the redesign not only helped fix a long-standing design problem, | |
173 | it also simplified things considerably. | |
174 | ||
175 | All __qi__ parser components have well defined attribute types. Grammars and | |
176 | rules are no exception. But since both need to be generic enough to be usable | |
177 | for any parser, their attribute type has to be explicitly specified. In the | |
178 | example above the `roman` grammar and the rule `first` both have an `unsigned` | |
179 | attribute: | |
180 | ||
181 | // grammar definition | |
182 | template <typename Iterator> | |
183 | struct roman : qi::grammar<Iterator, unsigned()> {...}; | |
184 | ||
185 | // rule definition | |
186 | qi::rule<Iterator, unsigned()> first; | |
187 | ||
188 | The notation used resembles the definition of a function type. This is very | |
189 | natural, as you can think of the synthesized attribute of the grammar and the | |
190 | rule as its 'return value'. In fact the rule and the grammar both 'return' | |
191 | an unsigned value - the value they matched. | |
192 | ||
193 | [note The function type notation allows you to specify parameters as well. These | |
194 | are interpreted as the types of inherited attributes the rule or | |
195 | grammar expects to be passed during parsing. For more information | |
196 | please see the section about inherited and synthesized attributes for | |
197 | rules and grammars (__sec_attributes__).] | |
198 | ||
199 | If no attribute is desired, none needs to be specified. The default attribute | |
200 | type for both grammars and rules is __unused_type__, which is a special | |
201 | placeholder type. Generally, using __unused_type__ as the attribute of a parser | |
202 | is interpreted as 'this parser has no attribute'. This is mostly used for | |
203 | parsers applied to parts of the input not carrying any significant information, | |
204 | rather being delimiters or structural elements needed for correct interpretation | |
205 | of the input. | |
206 | ||
207 | The last difference might seem to be rather cosmetic and insignificant. But it | |
208 | turns out that not having to specify which rule in a grammar is the start rule | |
209 | (by returning it from the function `start()`) also means that any rule in a | |
210 | grammar can be directly used as the start rule. Nevertheless, the grammar base | |
211 | class gets initialized with the rule it has to use as the start rule in case | |
212 | the grammar instance is directly used as a parser. | |
213 | ||
214 | [endsect] | |
215 |