[/==============================================================================
    Copyright (C) 2001-2011 Hartmut Kaiser
    Copyright (C) 2001-2011 Joel de Guzman

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]

[section Porting from Spirit 1.8.x]

[import ../example/qi/porting_guide_classic.cpp]
[import ../example/qi/porting_guide_qi.cpp]

The current version of __spirit__ is a complete rewrite of earlier versions (we
refer to earlier versions as __classic__). The parser generators are now only 
one part of the whole library. The parser submodule of __spirit__ is now called 
__qi__. It is conceptually different and exposes a completely different 
interface. Generally, there is no easy (or automated) way of converting parsers 
written for __classic__ to __qi__. Therefore this section can give only 
guidelines on how to approach porting your older parsers to the current version 
of __spirit__.

[heading Include Files]

The overall directory structure of the __spirit__ directories is described 
in the section __include_structure__ and the FAQ entry 
__include_structure_faq__. This should give you a good overview on how to find 
the needed header files for your new parsers. Moreover, each section in the 
__sec_qi_reference__ lists the required include files needed for any particular 
component.

The name of a header file tells you which version it belongs to. All main 
include files for __classic__ have the string 'classic_' in their name, for 
instance:

    #include <boost/spirit/include/classic_core.hpp>

we named all main include files for __qi__ to have the string 'qi_' as part of 
their name, for instance:

    #include <boost/spirit/include/qi_core.hpp>

The following table gives a rough list of corresponding header files in 
__classic__ and __qi__. Use it as a starting point only, as several components 
have either been moved to different submodules or no longer exist in the newer 
version. We list only the include files for the topmost submodules. For header 
files required for lower level components please refer to the reference 
documentation of the corresponding component.

[table
    [[Include file in /Spirit.Classic/] [Include file in /Spirit.Qi/]]
    [[`classic.hpp`]                    [`qi.hpp`]]
    [[`classic_actor.hpp`]              [none, use __boost_phoenix__ for writing semantic actions]]
    [[`classic_attribute.hpp`]          [none, use local variables for rules instead of closures, 
                                         the primitive parsers now directly support lazy
                                         parameterization]]
    [[`classic_core.hpp`]               [`qi_core.hpp`]]
    [[`classic_debug.hpp`]              [`qi_debug.hpp`]]
    [[`classic_dynamic.hpp`]            [none, use __qi__ predicates instead of if_p, while_p, for_p
                                         (included by `qi_core.hpp`), the equivalent for lazy_p 
                                         is now included by `qi_auxiliary.hpp`]]
    [[`classic_error_handling.hpp`]     [none, included in `qi_core.hpp`]]
    [[`classic_meta.hpp`]               [none]]
    [[`classic_symbols.hpp`]            [none, included in `qi_core.hpp`]]
    [[`classic_utility.hpp`]            [none, not part of __qi__ anymore, these components
                                         will be added over time to the __repo__]]
]

[heading The Free Parse Functions]

The free parse functions (i.e. the main parser API) have been changed. This 
includes the names of the free functions as well as their interface. In 
__classic__ all free functions were named `parse`. In __qi__ they are named 
either `qi::parse` or `qi::phrase_parse` depending on whether the parsing should 
be done using a skipper (`qi::phrase_parse`) or not (`qi::parse`). All free 
functions now return a simple `bool`. A returned `true` means success (i.e. the
parser has matched), a returned `false` means failure (i.e. the parser didn't 
match). This is equivalent to the old `parse_info` member `hit`. __qi__ doesn't 
support tracking of the matched input length anymore. The old `parse_info` 
member `full` can be emulated by comparing the iterators after `qi::parse` has 
returned: the input was consumed completely if the first iterator equals the 
end iterator.

All code examples in this section assume the following include statements and 
using directives to be inserted. For __classic__:

[porting_guide_classic_includes]
[porting_guide_classic_namespace]

and for __qi__:

[porting_guide_qi_includes]
[porting_guide_qi_namespace]

The following similar examples should clarify the differences. First the 
base example in __classic__:

[porting_guide_classic_parse]

And here is the equivalent piece of code using __qi__:

[porting_guide_qi_parse]

The changes required for phrase parsing (i.e. parsing using a skipper) are 
similar. Here is how phrase parsing works in __classic__:

[porting_guide_classic_phrase_parse]

And here is the equivalent example in __qi__:

[porting_guide_qi_phrase_parse]

Note how character parsers live in a separate namespace (here 
`boost::spirit::ascii::space`), as __qi__ now supports working with different 
character sets. See the section __char_encoding_namespace__ for more information.

[heading Naming Conventions]

In __classic__ all parser primitives have suffixes appended to their names, 
encoding their type: `"_p"` for parsers, `"_a"` for lazy actions, `"_d"` for 
directives, etc. In __qi__ we don't have anything similar. The only suffix
is a single trailing underscore `"_"`, applied where the name would otherwise 
conflict with a keyword or predefined name (such as `int_` for the
integer parser). Overall, most, if not all, primitive parsers and directives
have been renamed. Please see the __qi_quickref__ for an overview of the 
names of the available parser primitives, directives and operators.

[heading Parser Attributes]

In __classic__ most of the parser primitives don't expose a specific attribute 
type. Most parsers expose the pair of iterators pointing to the matched input
sequence. In __qi__ all parsers expose a parser specific attribute type, so 
__qi__ introduces a special directive __qi_raw__`[]` to achieve an effect 
similar to __classic__. The __qi_raw__`[]` directive exposes the pair of 
iterators pointing to the sequence matched by its embedded parser. Although we 
very much encourage you to rewrite your parsers to take advantage of the 
generated parser specific attributes, sometimes it is helpful to get access to 
the underlying matched input sequence.

[heading Grammars and Rules]

The `grammar<>` and `rule<>` types are as important to __qi__ as they 
are to __classic__. Their main purpose is still the same: they allow you to 
define non-terminals and they are the main building blocks for more complex 
parsers. Nevertheless, both types have been redesigned and their interfaces 
have changed. Let's have a look at two examples first; we'll explain the 
differences afterwards. Here is a simple grammar and its usage in __classic__:

[porting_guide_classic_grammar]
[porting_guide_classic_use_grammar]

And here is a similar grammar and its usage in __qi__:

[porting_guide_qi_grammar]
[porting_guide_qi_use_grammar]

Both versions look similar enough, but we see several differences (we will 
cover each of those differences in more detail below):

* Neither the grammars nor the rules depend on a scanner type anymore, both 
  depend only on the underlying iterator type. That means the dreaded scanner 
  business is no issue anymore!
* Grammars have no embedded class `definition` anymore.
* Grammars and rules may have an explicit attribute type specified in their 
  definition
* Grammars do not have any explicit start rules anymore. Instead one of the 
  contained rules is used as a start rule by default.

The first two points are tightly interrelated. The scanner business (see the 
first FAQ entry of __classic__: __scanner_business__) has been 
a problem for a long time. The grammar and rule types have been specifically 
redesigned to avoid this problem in the future. This also means that we don't 
need any delayed instantiation of the inner definition class in a grammar 
anymore. So the redesign not only helped to fix a long standing design problem, 
it also simplified things considerably.

All __qi__ parser components have well defined attribute types. Grammars and 
rules are no exception. But since both need to be generic enough to be usable 
for any parser, their attribute type has to be explicitly specified. In the
example above the `roman` grammar and the rule `first` both have an `unsigned` 
attribute:

    // grammar definition
    template <typename Iterator>
    struct roman : qi::grammar<Iterator, unsigned()> {...};

    // rule definition
    qi::rule<Iterator, unsigned()> first;

The notation used resembles the definition of a function type. This is very
natural as you can think of the synthesized attribute of the grammar and the 
rule as its 'return value'. In fact the rule and the grammar both 'return'
an unsigned value: the value they matched. 

[note    The function type notation allows you to specify parameters as well. 
         These are interpreted as the types of the inherited attributes the 
         rule or grammar expects to be passed during parsing. For more 
         information please see the section about inherited and synthesized 
         attributes for rules and grammars (__sec_attributes__).]

If no attribute is desired, none needs to be specified. The default attribute 
type for both grammars and rules is __unused_type__, which is a special 
placeholder type. Generally, using __unused_type__ as the attribute of a parser
is interpreted as 'this parser has no attribute'. This is mostly used for 
parsers applied to parts of the input that carry no significant information 
but act as delimiters or structural elements needed for correct interpretation 
of the input.

The last difference might seem rather cosmetic and insignificant. But it 
turns out that not having to specify which rule in a grammar is the start rule
(by returning it from the function `start()`) also means that any rule in a
grammar can be used directly as the start rule. Nevertheless, the grammar base 
class gets initialized with the rule it has to use as the start rule in case 
the grammar instance is used directly as a parser.

[endsect]