[/==============================================================================
    Copyright (C) 2001-2011 Joel de Guzman
    Copyright (C) 2001-2011 Hartmut Kaiser

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]
[section:char Character Parsers]

This module includes parsers for single characters. Currently, this
module includes literal chars (e.g. `'x'`, `L'x'`), `char_` (single
characters, ranges and character sets) and the encoding specific
character classifiers (`alnum`, `alpha`, `digit`, `xdigit`, etc.).

[heading Module Header]

    // forwards to <boost/spirit/home/qi/char.hpp>
    #include <boost/spirit/include/qi_char.hpp>

Also, see __include_structure__.

[/------------------------------------------------------------------------------]
[section:char Character Parser (`char_`, `lit`)]

[heading Description]

The `char_` parser matches single characters. The `char_` parser has an
associated __char_encoding_namespace__. This is needed when doing basic
operations such as inhibiting case sensitivity and dealing with
character ranges.

There are various forms of `char_`.

[heading char_]

The no argument form of `char_` matches any character in the associated
__char_encoding_namespace__.

    char_               // matches any character

[heading char_(ch)]

The single argument form of `char_` (with a character argument) matches
the supplied character.

    char_('x')          // matches 'x'
    char_(L'x')         // matches L'x'
    char_(x)            // matches x (a char)

[heading char_(first, last)]

`char_` with two arguments matches a range of characters.

    char_('a','z')      // lowercase alphabetic characters
    char_(L'0',L'9')    // digits

A range of characters is created from a low-high character pair. Such a
parser matches a single character that is in the range, including both
endpoints. Note that the first character must come /before/ the second,
according to the underlying __char_encoding_namespace__.

Character mapping is inherently platform dependent. The standard does not
guarantee, for example, that `'A' < 'Z'`. That is why, in Spirit2, we
purposely attach a specific __char_encoding_namespace__ (such as ASCII or
ISO-8859-1) to the `char_` parser to eliminate such ambiguities.

[note *Sparse bit vectors*

To accommodate 16-, 32- and 64-bit characters, the char-set statically
switches from a `std::bitset` implementation, used when the character type
is not greater than 8 bits, to a sparse bit/boolean set which uses a sorted
vector of disjoint ranges (`range_run`). The set is constructed from
ranges such that adjacent or overlapping ranges are coalesced.

`range_runs` are very space-economical in situations where there are lots
of ranges and a few individual disjoint values. Searching is O(log n)
where n is the number of ranges.]
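
The idea described in the note above can be sketched with the standard
library alone. The `range_set` class below is a hypothetical illustration
and not Spirit's actual `range_run` code: it keeps a sorted vector of
disjoint, inclusive ranges, coalesces adjacent or overlapping ranges on
insertion, and answers membership queries with a binary search. The names
`range_set`, `set`, `test` and `size` are assumptions made for this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical sketch of a sparse character set; not Spirit's range_run.
class range_set
{
public:
    typedef std::uint32_t char_type;
    typedef std::pair<char_type, char_type> range; // inclusive [first, second]

    // Insert [lo, hi], merging with every range it overlaps or touches.
    void set(char_type lo, char_type hi)
    {
        if (lo > hi)
            std::swap(lo, hi);

        std::vector<range> merged;
        std::size_t i = 0;

        // Keep ranges that end before lo and are not adjacent to it.
        while (i < ranges_.size() && ranges_[i].second < lo
               && lo - ranges_[i].second > 1)
            merged.push_back(ranges_[i++]);

        // Absorb ranges that overlap or are adjacent to [lo, hi].
        while (i < ranges_.size()
               && (ranges_[i].first <= hi || ranges_[i].first - hi == 1))
        {
            lo = std::min(lo, ranges_[i].first);
            hi = std::max(hi, ranges_[i].second);
            ++i;
        }
        merged.push_back(range(lo, hi));

        // Keep the remaining ranges unchanged.
        while (i < ranges_.size())
            merged.push_back(ranges_[i++]);

        ranges_.swap(merged);
    }

    // O(log n) membership test, where n is the number of stored ranges.
    bool test(char_type ch) const
    {
        // Find the first range that starts after ch, then check the one
        // immediately before it.
        std::vector<range>::const_iterator it = std::upper_bound(
            ranges_.begin(), ranges_.end(), ch, starts_after);
        return it != ranges_.begin() && ch <= (it - 1)->second;
    }

    std::size_t size() const { return ranges_.size(); }

private:
    static bool starts_after(char_type ch, range const& r)
    {
        return ch < r.first;
    }

    std::vector<range> ranges_;
};
```

With this sketch, inserting `'a'`..`'f'` and then `'g'`..`'z'` leaves a
single stored range, because the two ranges are adjacent. A large block
such as `0x4E00`..`0x9FFF` costs one more entry rather than thousands of
bits.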

[heading char_(def)]

Lastly, when given a string (a plain C string, a `std::basic_string`,
etc.), the string is regarded as a char-set definition string following
a syntax that resembles POSIX-style regular expression character sets
(except that double quotes delimit the set elements instead of square
brackets and there is no special negation `^` character). Examples:

    char_("a-zA-Z")     // alphabetic characters
    char_("0-9a-fA-F")  // hexadecimal characters
    char_("actgACTG")   // DNA identifiers
    char_("\x7f\x7e")   // Hexadecimal 0x7F and 0x7E
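
To make the definition-string syntax concrete, here is a minimal sketch
of how such a string could be expanded into a membership table for 8-bit
characters. It uses only the standard library. `make_char_set` and
`matches` are hypothetical helpers written for this illustration, not
part of Spirit, and the handling of a trailing `-` as a literal character
is an assumption of the sketch.

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Hypothetical helper: expand a definition string such as "a-zA-Z" into a
// 256-entry membership table. Not Spirit's implementation.
std::bitset<256> make_char_set(std::string const& def)
{
    std::bitset<256> set;
    std::size_t i = 0;
    while (i < def.size())
    {
        unsigned char lo = static_cast<unsigned char>(def[i]);
        if (i + 2 < def.size() && def[i + 1] == '-')
        {
            // "lo-hi": set every character in the inclusive range.
            unsigned char hi = static_cast<unsigned char>(def[i + 2]);
            for (unsigned c = lo; c <= hi; ++c)
                set.set(c);
            i += 3;
        }
        else
        {
            // A single character, e.g. 'c' in "actg" or a trailing '-'.
            set.set(lo);
            ++i;
        }
    }
    return set;
}

// Hypothetical helper: test one character against the table.
bool matches(std::bitset<256> const& set, char ch)
{
    return set.test(static_cast<unsigned char>(ch));
}
```

Under these assumptions, `make_char_set("0-9a-fA-F")` marks 22 characters,
and `make_char_set("actgACTG")` marks exactly the eight listed letters.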

[heading lit(ch)]

`lit`, when passed a single character, behaves like the single-argument
`char_` except that `lit` does not synthesize an attribute. A plain
`char` or `wchar_t` is equivalent to a `lit`.

[note `lit` is reused by both the [qi_lit_string string parsers] and the
char parsers. In general, a char parser is created when you pass in a
character and a string parser is created when you pass in a string. The
exception is when you pass a single-element literal string, e.g.
`lit("x")`. In this case, we optimize this to create a char parser
instead of a string parser.]

Examples:

    'x'
    lit('x')
    lit(L'x')
    lit(c)              // c is a char

[heading Header]

    // forwards to <boost/spirit/home/qi/char/char.hpp>
    #include <boost/spirit/include/qi_char_.hpp>

Also, see __include_structure__.

[heading Namespace]

[table
    [[Name]]
    [[`boost::spirit::lit // alias: boost::spirit::qi::lit` ]]
    [[`ns::char_`]]
]

In the table above, `ns` represents a __char_encoding_namespace__.

[heading Model of]

[:__primitive_parser_concept__]

[variablelist Notation
    [[`c`, `f`, `l`]    [A literal char, e.g. `'x'`, `L'x'` or anything that can be
                         converted to a `char` or `wchar_t`, or a __qi_lazy_argument__
                         that evaluates to anything that can be converted to a `char`
                         or `wchar_t`.]]
    [[`ns`]             [A __char_encoding_namespace__.]]
    [[`cs`]             [A __string__ or a __qi_lazy_argument__ that evaluates to a __string__
                         that specifies a char-set definition string following a syntax
                         that resembles POSIX-style regular expression character sets
                         (but without the square brackets and the negation `^` character).]]
    [[`cp`]             [A char parser, a char range parser or a char set parser.]]
]

[heading Expression Semantics]

The semantics of an expression are defined only where they differ from, or
are not defined in, __primitive_parser_concept__.

[table
    [[Expression]       [Semantics]]
    [[`c`]              [Create a char parser from a char, `c`.]]
    [[`lit(c)`]         [Create a char parser from a char, `c`.]]
    [[`ns::char_`]      [Create a char parser that matches any character in the
                         `ns` encoding.]]
    [[`ns::char_(c)`]   [Create a char parser with `ns` encoding from a char, `c`.]]
    [[`ns::char_(f, l)`][Create a char-range parser that matches characters from
                         range (`f` to `l`, inclusive) with `ns` encoding.]]
    [[`ns::char_(cs)`]  [Create a char-set parser with `ns` encoding from a char-set
                         definition string, `cs`.]]
    [[`~cp`]            [Negate `cp`. The result is a negated char parser that
                         matches any character in the `ns` encoding except the
                         characters matched by `cp`.]]
]

[heading Attributes]

[table
    [[Expression]       [Attribute]]
    [[`c`]              [__unused__ or, if `c` is a __qi_lazy_argument__, the character
                         type returned by invoking it.]]
    [[`lit(c)`]         [__unused__ or, if `c` is a __qi_lazy_argument__, the character
                         type returned by invoking it.]]
    [[`ns::char_`]      [The character type of the __char_encoding_namespace__, `ns`.]]
    [[`ns::char_(c)`]   [The character type of the __char_encoding_namespace__, `ns`.]]
    [[`ns::char_(f, l)`][The character type of the __char_encoding_namespace__, `ns`.]]
    [[`ns::char_(cs)`]  [The character type of the __char_encoding_namespace__, `ns`.]]
    [[`~cp`]            [The attribute of `cp`.]]
]

[heading Complexity]

[:*O(N)*, except for char-sets with 16-bit (or more) characters (e.g.
`wchar_t`). These have *O(log N)* complexity, where N is the number of
distinct character ranges in the set.]

[heading Example]

[note The test harness for the example(s) below is presented in the
__qi_basics_examples__ section.]

Some using declarations:

[reference_using_declarations_lit_char]

Basic literals:

[reference_char_literals]

Range:

[reference_char_range]

Character set:

[reference_char_set]

Lazy `char_` using __phoenix__:

[reference_char_phoenix]

[endsect] [/ Char]

[/------------------------------------------------------------------------------]
[section:char_class Character Classification Parsers (`alnum`, `digit`, etc.)]

[heading Description]

The library has the full repertoire of single character parsers for
character classification. This includes the usual `alnum`, `alpha`,
`digit`, `xdigit`, etc. parsers. These parsers have an associated
__char_encoding_namespace__. This is needed when doing basic operations
such as inhibiting case sensitivity.

[heading Header]

    // forwards to <boost/spirit/home/qi/char/char_class.hpp>
    #include <boost/spirit/include/qi_char_class.hpp>

Also, see __include_structure__.

[heading Namespace]

[table
    [[Name]]
    [[`ns::alnum`]]
    [[`ns::alpha`]]
    [[`ns::blank`]]
    [[`ns::cntrl`]]
    [[`ns::digit`]]
    [[`ns::graph`]]
    [[`ns::lower`]]
    [[`ns::print`]]
    [[`ns::punct`]]
    [[`ns::space`]]
    [[`ns::upper`]]
    [[`ns::xdigit`]]
]

In the table above, `ns` represents a __char_encoding_namespace__.

[heading Model of]

[:__primitive_parser_concept__]

[variablelist Notation
    [[`ns`]             [A __char_encoding_namespace__.]]
]

[heading Expression Semantics]

Semantics of an expression is defined only where it differs from, or is
not defined in __primitive_parser_concept__.

[table
    [[Expression]       [Semantics]]
    [[`ns::alnum`]      [Matches alpha-numeric characters]]
    [[`ns::alpha`]      [Matches alphabetic characters]]
    [[`ns::blank`]      [Matches spaces or tabs]]
    [[`ns::cntrl`]      [Matches control characters]]
    [[`ns::digit`]      [Matches numeric digits]]
    [[`ns::graph`]      [Matches non-space printing characters]]
    [[`ns::lower`]      [Matches lower case letters]]
    [[`ns::print`]      [Matches printable characters]]
    [[`ns::punct`]      [Matches punctuation symbols]]
    [[`ns::space`]      [Matches spaces, tabs, returns, and newlines]]
    [[`ns::upper`]      [Matches upper case letters]]
    [[`ns::xdigit`]     [Matches hexadecimal digits]]
]

[heading Attributes]

[:The character type of the __char_encoding_namespace__, `ns`.]

[heading Complexity]

[:O(N)]

[heading Example]

[note The test harness for the example(s) below is presented in the
__qi_basics_examples__ section.]

Some using declarations:

[reference_using_declarations_char_class]

Basic usage:

[reference_char_class]

[endsect] [/ Char Classification]

[endsect]