[/==============================================================================
    Copyright (C) 2001-2011 Joel de Guzman
    Copyright (C) 2001-2011 Hartmut Kaiser

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]
[section:char Character Parsers]

This module provides parsers for single characters: literal chars (e.g.
`'x'`, `L'x'`), `char_` (single characters, ranges and character sets)
and the encoding-specific character classifiers (`alnum`, `alpha`,
`digit`, `xdigit`, etc.).

[heading Module Header]

    // forwards to <boost/spirit/home/qi/char.hpp>
    #include <boost/spirit/include/qi_char.hpp>

Also, see __include_structure__.

[/------------------------------------------------------------------------------]
[section:char Character Parser (`char_`, `lit`)]

[heading Description]

The `char_` parser matches single characters. It has an associated
__char_encoding_namespace__, which is needed for basic operations such
as case-insensitive matching and character ranges.

The `char_` parser comes in several forms, described below.

[heading char_]

The no-argument form of `char_` matches any character in the associated
__char_encoding_namespace__.

    char_               // matches any character

[heading char_(ch)]

The single-argument form of `char_` (with a character argument) matches
the supplied character.

    char_('x')          // matches 'x'
    char_(L'x')         // matches L'x'
    char_(x)            // matches x (a char)

[heading char_(first, last)]

`char_` with two arguments matches a range of characters.

    char_('a','z')      // lowercase letters 'a' through 'z'
    char_(L'0',L'9')    // digits

A range of characters is created from a low-high character pair. Such a
parser matches a single character that is in the range, including both
endpoints. Note, the first character must be /before/ the second,
according to the underlying __char_encoding_namespace__.

Character mapping is inherently platform dependent. The standard does
not guarantee, for example, that `'A' < 'Z'`. For this reason, Spirit2
purposely attaches a specific __char_encoding_namespace__ (such as ASCII
or ISO-8859-1) to the `char_` parser to eliminate such ambiguities.

[note *Sparse bit vectors*

To accommodate 16-, 32- and 64-bit characters, the char-set statically
switches from a `std::bitset` implementation, used when the character
type is not greater than 8 bits, to a sparse bit/boolean set that uses a
sorted vector of disjoint ranges (`range_run`). The set is constructed
from ranges such that adjacent or overlapping ranges are coalesced.

`range_runs` are very space-economical in situations where there are lots
of ranges and a few individual disjoint values. Searching is O(log n)
where n is the number of ranges.]
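
The idea behind such a sparse set can be sketched with the standard
library alone. The code below is an illustration under stated
assumptions, not Spirit's actual `range_run`: the names `char_range` and
`sparse_char_set` are hypothetical, and characters are modelled as
`std::uint32_t`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Illustrative only: one inclusive range of character codes.
struct char_range
{
    std::uint32_t first;
    std::uint32_t last;
};

// Illustrative only: a sorted vector of disjoint, non-adjacent ranges.
class sparse_char_set
{
public:
    // Add first..last, coalescing with overlapping or adjacent ranges.
    void set(std::uint32_t first, std::uint32_t last)
    {
        assert(first <= last);

        // Skip ranges that end before first - 1; they neither overlap
        // nor touch the new range.
        auto lo = std::partition_point(runs_.begin(), runs_.end(),
            [first](char_range const& r) {
                return first > 0 && r.last < first - 1;
            });

        // Absorb every following range that starts at or before last + 1.
        std::uint32_t const max = std::numeric_limits<std::uint32_t>::max();
        auto hi = lo;
        while (hi != runs_.end() && (last == max || hi->first <= last + 1))
        {
            first = std::min(first, hi->first);
            last = std::max(last, hi->last);
            ++hi;
        }

        lo = runs_.erase(lo, hi);
        runs_.insert(lo, char_range{first, last});
    }

    // Binary search: O(log n) in the number of stored ranges.
    bool test(std::uint32_t ch) const
    {
        auto it = std::lower_bound(runs_.begin(), runs_.end(), ch,
            [](char_range const& r, std::uint32_t c) { return r.last < c; });
        return it != runs_.end() && it->first <= ch;
    }

    std::size_t range_count() const { return runs_.size(); }

private:
    std::vector<char_range> runs_;
};
```

In this sketch, setting `'a'` to `'f'` and then `'g'` to `'z'` leaves a
single stored range, because the two inputs are adjacent and get
coalesced.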

[heading char_(def)]

Lastly, when given a string (a plain C string, a `std::basic_string`,
etc.), `char_` treats the string as a char-set definition string whose
syntax resembles POSIX-style regular expression character sets (except
that double quotes delimit the set elements instead of square brackets
and there is no special negation `^` character). Examples:

    char_("a-zA-Z")     // alphabetic characters
    char_("0-9a-fA-F")  // hexadecimal characters
    char_("actgACTG")   // DNA identifiers
    char_("\x7f\x7e")   // Hexadecimal 0x7F and 0x7E
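
To make the definition-string syntax concrete, here is a minimal sketch
of how such a string could be expanded into a set of 8-bit characters.
It is not Spirit's implementation: the function name `make_char_set` is
hypothetical, and treating a leading or trailing `-` as a literal
character is an assumption of this sketch.

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Hypothetical helper: expand a definition string such as "a-zA-Z"
// into a 256-entry bitset, one bit per 8-bit character code.
std::bitset<256> make_char_set(std::string const& def)
{
    std::bitset<256> set;
    for (std::size_t i = 0; i < def.size(); ++i)
    {
        unsigned char lo = static_cast<unsigned char>(def[i]);

        // "x-y" with a character on both sides of '-' is an inclusive range.
        if (i + 2 < def.size() && def[i + 1] == '-')
        {
            unsigned char hi = static_cast<unsigned char>(def[i + 2]);
            for (unsigned c = lo; c <= hi; ++c)
                set.set(c);
            i += 2; // skip the '-' and the upper endpoint
        }
        else
        {
            // Any other character, including a leading or trailing '-',
            // stands for itself (an assumption of this sketch).
            set.set(lo);
        }
    }
    return set;
}
```

Under these assumptions, `make_char_set("0-9a-fA-F")` contains the 22
hexadecimal digit characters, and `make_char_set("actgACTG")` contains
exactly the eight listed letters.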

[heading lit(ch)]

`lit`, when passed a single character, behaves like the single-argument
`char_` except that `lit` does not synthesize an attribute. A plain
`char` or `wchar_t` used in a parser expression is equivalent to a `lit`.

[note `lit` is reused by both the [qi_lit_string string parsers] and the
char parsers. In general, a char parser is created when you pass in a
character and a string parser is created when you pass in a string. The
exception is a single-element literal string, e.g. `lit("x")`, which the
library optimizes into a char parser instead of a string parser.]

Examples:

    'x'
    lit('x')
    lit(L'x')
    lit(c) // c is a char

[heading Header]

    // forwards to <boost/spirit/home/qi/char/char.hpp>
    #include <boost/spirit/include/qi_char_.hpp>

Also, see __include_structure__.

[heading Namespace]

[table
    [[Name]]
    [[`boost::spirit::lit // alias: boost::spirit::qi::lit` ]]
    [[`ns::char_`]]
]

In the table above, `ns` represents a __char_encoding_namespace__. 

[heading Model of]

[:__primitive_parser_concept__]

[variablelist Notation
    [[`c`, `f`, `l`]    [A literal char, e.g. `'x'`, `L'x'` or anything that can be
                        converted to a `char` or `wchar_t`, or a __qi_lazy_argument__ 
                        that evaluates to anything that can be converted to a `char` 
                        or `wchar_t`.]]
    [[`ns`]             [A __char_encoding_namespace__.]]
    [[`cs`]             [A __string__ or a __qi_lazy_argument__ that evaluates to a __string__
                        that specifies a char-set definition string following a syntax
                        that resembles POSIX-style regular expression character sets
                        (without the square brackets and without the negation `^` character).]]
    [[`cp`]             [A char parser, a char range parser or a char set parser.]]
]

[heading Expression Semantics]

Semantics of an expression is defined only where it differs from, or is
not defined in __primitive_parser_concept__.

[table
    [[Expression]       [Semantics]]
    [[`c`]              [Create a char parser from a char, `c`.]]
    [[`lit(c)`]         [Create a char parser from a char, `c`.]]
    [[`ns::char_`]      [Create a char parser that matches any character in the
                        `ns` encoding.]]
    [[`ns::char_(c)`]   [Create a char parser with `ns` encoding from a char, `c`.]]
    [[`ns::char_(f, l)`][Create a char-range parser that matches characters in the
                        range `f` to `l` (inclusive) with `ns` encoding.]]
    [[`ns::char_(cs)`]  [Create a char-set parser with `ns` encoding from a char-set
                        definition string, `cs`.]]
    [[`~cp`]            [Negate `cp`. The result is a negated char parser that
                        matches any character in the `ns` encoding except the
                        characters matched by `cp`.]]
]

[heading Attributes]

[table
    [[Expression]       [Attribute]]
    [[`c`]              [__unused__ or, if `c` is a __qi_lazy_argument__, the character
                        type returned by invoking it.]]
    [[`lit(c)`]         [__unused__ or, if `c` is a __qi_lazy_argument__, the character
                        type returned by invoking it.]]
    [[`ns::char_`]      [The character type of the __char_encoding_namespace__, `ns`.]]
    [[`ns::char_(c)`]   [The character type of the __char_encoding_namespace__, `ns`.]]
    [[`ns::char_(f, l)`][The character type of the __char_encoding_namespace__, `ns`.]]
    [[`ns::char_(cs)`]  [The character type of the __char_encoding_namespace__, `ns`.]]
    [[`~cp`]            [The attribute of `cp`.]]
]

[heading Complexity]

[:*O(N)*, except for char-sets with 16-bit (or more) characters (e.g.
`wchar_t`). These have *O(log N)* complexity, where N is the number of
distinct character ranges in the set.]

[heading Example]

[note The test harness for the example(s) below is presented in the
__qi_basics_examples__ section.]

Some using declarations:

[reference_using_declarations_lit_char]

Basic literals:

[reference_char_literals]

Range:

[reference_char_range]

Character set:

[reference_char_set]

Lazy `char_` using __phoenix__:

[reference_char_phoenix]

[endsect] [/ Char]

[/------------------------------------------------------------------------------]
[section:char_class Character Classification Parsers (`alnum`, `digit`, etc.)]

[heading Description]

The library has the full repertoire of single-character parsers for
character classification. This includes the usual `alnum`, `alpha`,
`digit`, `xdigit`, etc. parsers. These parsers have an associated
__char_encoding_namespace__, which is needed for basic operations such
as case-insensitive matching.

[heading Header]

    // forwards to <boost/spirit/home/qi/char/char_class.hpp>
    #include <boost/spirit/include/qi_char_class.hpp>

Also, see __include_structure__.

[heading Namespace]

[table
    [[Name]]
    [[`ns::alnum`]]
    [[`ns::alpha`]]
    [[`ns::blank`]]
    [[`ns::cntrl`]]
    [[`ns::digit`]]
    [[`ns::graph`]]
    [[`ns::lower`]]
    [[`ns::print`]]
    [[`ns::punct`]]
    [[`ns::space`]]
    [[`ns::upper`]]
    [[`ns::xdigit`]]
]

In the table above, `ns` represents a __char_encoding_namespace__. 

[heading Model of]

[:__primitive_parser_concept__]

[variablelist Notation
    [[`ns`]             [A __char_encoding_namespace__.]]
]

[heading Expression Semantics]

Semantics of an expression is defined only where it differs from, or is
not defined in __primitive_parser_concept__.

[table
    [[Expression]       [Semantics]]
    [[`ns::alnum`]      [Matches alpha-numeric characters]]
    [[`ns::alpha`]      [Matches alphabetic characters]]
    [[`ns::blank`]      [Matches spaces or tabs]]
    [[`ns::cntrl`]      [Matches control characters]]
    [[`ns::digit`]      [Matches numeric digits]]
    [[`ns::graph`]      [Matches non-space printing characters]]
    [[`ns::lower`]      [Matches lower case letters]]
    [[`ns::print`]      [Matches printable characters]]
    [[`ns::punct`]      [Matches punctuation symbols]]
    [[`ns::space`]      [Matches spaces, tabs, returns, and newlines]]
    [[`ns::upper`]      [Matches upper case letters]]
    [[`ns::xdigit`]     [Matches hexadecimal digits]]
]

[heading Attributes]

[:The character type of the __char_encoding_namespace__, `ns`.]

[heading Complexity]

[:O(N)]

[heading Example]

[note The test harness for the example(s) below is presented in the
__qi_basics_examples__ section.]

Some using declarations:

[reference_using_declarations_char_class]

Basic usage:

[reference_char_class]

[endsect] [/ Char Classification]

[endsect]