
[/==============================================================================
    Copyright (C) 2001-2011 Joel de Guzman
    Copyright (C) 2001-2011 Hartmut Kaiser

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]

[section Warming up]

We'll start by showing examples of parser expressions to give you a feel for how
to build parsers, beginning with the simplest parser and building up as we go.
When comparing EBNF to __spirit__, the expressions may seem awkward at first.
__spirit__ relies heavily on operator overloading to accomplish its magic.

[heading Trivial Example #1 Parsing a number]

Create a parser that will parse a floating-point number.

    double_

(You've got to admit, that's trivial!) The above code actually generates a
Spirit floating-point parser (a built-in parser). Spirit has many predefined
parsers, and consistent naming conventions help keep you from going insane!

[heading Trivial Example #2 Parsing two numbers]

Create a parser that will accept a line consisting of two floating-point numbers.

    double_ >> double_

Here you see the familiar floating-point numeric parser `double_` used twice,
once for each number. What's that `>>` operator doing in there? Well, the two
parsers had to be separated by something, and `>>` was chosen as the "followed
by" sequence operator. The above expression creates a parser from two simpler
parsers, gluing them together with the sequence operator. The result is a parser
that is a composition of smaller parsers. Whitespace between numbers can
implicitly be consumed depending on how the parser is invoked (see below).

[note When we combine parsers, we end up with a "bigger" parser, but
  it's still a parser. Parsers can get bigger and bigger, nesting more and more,
  but whenever you glue two parsers together, you end up with one bigger parser.
  This is an important concept.
]

[heading Trivial Example #3 Parsing zero or more numbers]

Create a parser that will accept zero or more floating-point numbers.

    *double_

This is like a regular-expression Kleene Star, though the syntax might look a
bit odd for a C++ programmer not used to seeing the `*` operator overloaded like
this. Actually, if you know regular expressions it may look odd too since the
star is before the expression it modifies. C'est la vie. Blame it on the fact
that we must work with the syntax rules of C++.

Any expression that evaluates to a parser may be used with the Kleene Star.
Keep in mind that C++ operator precedence rules may require you to put
complex expressions in parentheses. The Kleene Star is also known as a
Kleene Closure, but we call it the Star in most places.

[heading Trivial Example #4 Parsing a comma-delimited list of numbers]

This example will create a parser that accepts a comma-delimited list of
numbers.

    double_ >> *(char_(',') >> double_)

Notice `char_(',')`. It is a literal character parser that can recognize the
comma `','`. In this case, the Kleene Star is modifying a more complex parser,
namely, the one generated by the expression:

    (char_(',') >> double_)

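Note that this is a case where the parentheses are necessary. Unary `*` binds
more tightly than `>>`, so without them the star would apply to `char_(',')`
alone. With them, the Kleene Star applies to the complete expression above.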

[heading Let's Parse!]

We're done with defining the parser. The next step is to invoke this parser to
do its work. There are a couple of ways to do this. For now, we will use the
`phrase_parse` function. One overload of this function accepts four
arguments:

# An iterator pointing to the start of the input
# An iterator pointing to one past the end of the input
# The parser object
# Another parser called the skip parser

In our example, we wish to skip spaces and tabs. Spirit's repertoire of
predefined parsers includes one named `space`, a very simple parser that
recognizes whitespace. We will use `space` as our skip parser. The skip parser
is responsible for skipping characters between parser elements such as
`double_` and `char_`.

Ok, so now let's parse!

[import ../../example/qi/num_list1.cpp]
[tutorial_numlist1]

The `phrase_parse` function returns `true` or `false` depending on the result
of the parse. The first iterator is passed by reference. On a successful
parse, this iterator is repositioned to the rightmost position consumed
by the parser. If it becomes equal to `last`, then we have a full
match. If not, then we have a partial match. A partial match happens
when the parser is only able to parse a portion of the input.

Note that we inlined the parser directly in the call to `phrase_parse`. Upon
calling `phrase_parse`, the expression evaluates into a temporary, unnamed
parser which is passed into the function, used, and then destroyed.

Here, we opted to make the parser generic by making it a template, parameterized
by the iterator type. By doing so, it can take in data coming from any
STL-conforming sequence as long as its iterators are at least forward iterators.

You can find the full cpp file here: [@../../example/qi/num_list1.cpp]

[note `char` and `wchar_t` operands

The careful reader may notice that the parser expression has `','` instead of
`char_(',')` as the previous examples did. This is OK thanks to C++ operator
overloading. There are overloads of `>>` that accept a `char` or `wchar_t`
argument on the left or on the right (but not both), because an operator may be
overloaded only if at least one of its parameters is a user-defined type. In
this case, `double_` is the second argument to `operator>>`, so the proper
overload of `>>` is selected, converting `','` into a character literal parser.

The problem with omitting the `char_` should be obvious: `'a' >> 'b'` is not a
Spirit parser; it is a numeric expression that right-shifts the ASCII (or other
encoding) value of `'a'` by the value of `'b'`. However, both
`char_('a') >> 'b'` and `'a' >> char_('b')` are Spirit sequence parsers
for the letter `'a'` followed by `'b'`. You'll get used to it, sooner or later.
]

Finally, take note that we test for a full match (i.e. the parser fully parsed
the input) by checking if the first iterator, after parsing, is equal to the end
iterator. You may remove this check if partial matches are to be allowed.

[endsect] [/ Warming up]