1 [/==============================================================================
2 Copyright (C) 2001-2011 Hartmut Kaiser
3 Copyright (C) 2001-2011 Joel de Guzman
5 Distributed under the Boost Software License, Version 1.0. (See accompanying
6 file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
7 ===============================================================================/]
9 [section:char Char Generators]
This module includes various character-oriented generators for emitting
single characters. Currently, it includes literal chars (e.g. `'x'`, `L'x'`),
`char_` (single characters, ranges and character sets) and the
encoding-specific character classifiers (`alnum`, `alpha`, `digit`, `xdigit`,
etc.).
16 [heading Module Header]
    // forwards to <boost/spirit/home/karma/char.hpp>
    #include <boost/spirit/include/karma_char.hpp>
21 Also, see __include_structure__.
23 [/////////////////////////////////////////////////////////////////////////////]
24 [section:char_generator Character Generators (`char_`, `lit`)]
28 The character generators described in this section are:
The `char_` generator emits single characters. It has an associated
__karma_char_encoding_namespace__, which is needed for basic operations such
as forcing lower or upper case and dealing with character ranges.
35 There are various forms of `char_`.
39 The no argument form of `char_` emits any character in the associated
40 __karma_char_encoding_namespace__.
    char_               // emits any character as supplied by the attribute
46 The single argument form of `char_` (with a character argument) emits
47 the supplied character.
    char_('x')          // emits 'x'
    char_(L'x')         // emits L'x'
    char_(x)            // emits x (a char)
53 [heading char_(first, last)]
`char_` with two arguments emits any character from a range of characters, as
supplied by the attribute.

    char_('a','z')      // alphabetic characters
    char_(L'0',L'9')    // digits
61 A range of characters is created from a low-high character pair. Such a
62 generator emits a single character that is in the range, including both
63 endpoints. Note, the first character must be /before/ the second,
64 according to the underlying __karma_char_encoding_namespace__.
Character mapping is inherently platform dependent. The standard does not
guarantee, for example, that `'A' < 'Z'`. That is why Spirit2 purposely
attaches a specific __karma_char_encoding_namespace__ (such as ASCII or
ISO-8859-1) to the `char_` generator to eliminate such ambiguities.
71 [note *Sparse bit vectors*
73 To accommodate 16/32 and 64 bit characters, the char-set statically
74 switches from a `std::bitset` implementation when the character type is
75 not greater than 8 bits, to a sparse bit/boolean set which uses a sorted
76 vector of disjoint ranges (`range_run`). The set is constructed from
77 ranges such that adjacent or overlapping ranges are coalesced.
79 `range_runs` are very space-economical in situations where there are lots
80 of ranges and a few individual disjoint values. Searching is O(log n)
81 where n is the number of ranges.]
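The idea behind a sorted vector of disjoint ranges can be sketched in plain
C++ as follows. This is an illustration only and not Spirit's actual
`range_run` implementation; the class name `range_set` and its members are
assumptions made for this example. Insertion coalesces adjacent or overlapping
ranges, and lookup is a binary search over the stored ranges.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrative sketch only (hypothetical class, not Spirit's range_run).
template <typename Char>
class range_set
{
public:
    struct range { Char first; Char last; };    // inclusive bounds

    // Insert [first, last] (requires first <= last), merging it with every
    // stored range that overlaps it or is directly adjacent to it.
    void insert(Char first, Char last)
    {
        typedef long long wide;     // avoids overflow when computing last + 1
        auto it = runs_.begin();
        // Skip ranges that end strictly before the new range, with a gap.
        while (it != runs_.end() && wide(it->last) + 1 < wide(first))
            ++it;
        // Absorb ranges that overlap or touch the new range.
        auto stop = it;
        while (stop != runs_.end() && wide(stop->first) <= wide(last) + 1)
        {
            first = std::min(first, stop->first);
            last = std::max(last, stop->last);
            ++stop;
        }
        it = runs_.erase(it, stop);
        range merged = { first, last };
        runs_.insert(it, merged);
    }

    // O(log n) membership test, where n is the number of stored ranges.
    bool contains(Char ch) const
    {
        // First range whose upper bound is not below ch.
        auto it = std::lower_bound(runs_.begin(), runs_.end(), ch,
            [](range const& r, Char c) { return r.last < c; });
        return it != runs_.end() && it->first <= ch;
    }

    std::size_t size() const { return runs_.size(); }

private:
    std::vector<range> runs_;       // sorted, pairwise disjoint, non-adjacent
};
```

Inserting `a-f`, `0-9`, `g-z` and `d-k` in that order leaves two stored
ranges, `0-9` and `a-z`, because the adjacent and overlapping ranges are
merged on insertion.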
85 Lastly, when given a string (a plain C string, a `std::basic_string`,
86 etc.), the string is regarded as a char-set definition string following
87 a syntax that resembles posix style regular expression character sets
88 (except that double quotes delimit the set elements instead of square
89 brackets and there is no special negation ^ character). Examples:
    char_("a-zA-Z")     // alphabetic characters
    char_("0-9a-fA-F")  // hexadecimal characters
    char_("actgACTG")   // DNA identifiers
    char_("\x7f\x7e")   // Hexadecimal 0x7F and 0x7E
These generators emit any character from the given character set, as
supplied by the attribute.
101 `lit`, when passed a single character, behaves like the single argument
102 `char_` except that `lit` does not consume an attribute. A plain
103 `char` or `wchar_t` is equivalent to a `lit`.
105 [note `lit` is reused by the [karma_string String Generators], the
106 char generators, and the Numeric Generators (see [signed_int signed integer],
107 [unsigned_int unsigned integer], and [real_number real number] generators). In
108 general, a char generator is created when you pass in a
109 character, a string generator is created when you pass in a string, and a
110 numeric generator is created when you use a numeric literal. The
111 exception is when you pass a single element literal string, e.g.
112 `lit("x")`. In this case, we optimize this to create a char generator
113 instead of a string generator.]
    lit(c)              // c is a char
    // forwards to <boost/spirit/home/karma/char/char.hpp>
    #include <boost/spirit/include/karma_char_.hpp>
127 Also, see __include_structure__.
133 [[`boost::spirit::lit // alias: boost::spirit::karma::lit` ]]
137 In the table above, `ns` represents a __karma_char_encoding_namespace__.
141 [:__primitive_generator_concept__]
143 [variablelist Notation
144 [[`ch`, `ch1`, `ch2`]
145 [Character-class specific character (See __char_class_types__),
146 or a __karma_lazy_argument__ that evaluates to a
147 character-class specific character value]]
148 [[`cs`] [Character-set specifier string (See
149 __char_class_types__), or a __karma_lazy_argument__ that
150 evaluates to a character-set specifier string, or a
151 pointer/reference to a null-terminated array of characters.
152 This string specifies a char-set definition string following
153 a syntax that resembles posix style regular expression character
154 sets (except the square brackets and the negation `^` character).]]
155 [[`ns`] [A __karma_char_encoding_namespace__.]]
156 [[`cg`] [A char generator, a char range generator, or a char set generator.]]]
158 [heading Expression Semantics]
160 Semantics of an expression is defined only where it differs from, or is
161 not defined in __primitive_generator_concept__.
164 [[Expression] [Description]]
    [[`ch`]                 [Generate the character literal `ch`. This generator
                             never fails (unless the underlying output stream
                             reports an error).]]
    [[`lit(ch)`]            [Generate the character literal `ch`. This generator
                             never fails (unless the underlying output stream
                             reports an error).]]
171 [[`ns::char_`] [Generate the character provided by a mandatory
172 attribute interpreted in the character set defined
173 by `ns`. This generator never fails (unless the
174 underlying output stream reports an error).]]
175 [[`ns::char_(ch)`] [Generate the character `ch` as provided by the
176 immediate literal value the generator is initialized
177 from. If this generator has an associated attribute
178 it succeeds only as long as the attribute is equal
179 to the immediate literal (unless the underlying
180 output stream reports an error). Otherwise this
181 generator fails and does not generate any output.]]
182 [[`ns::char_("c")`] [Generate the character `c` as provided by the
183 immediate literal value the generator is initialized
184 from. If this generator has an associated attribute
185 it succeeds only as long as the attribute is equal
186 to the immediate literal (unless the underlying
187 output stream reports an error). Otherwise this
188 generator fails and does not generate any output.]]
189 [[`ns::char_(ch1, ch2)`][Generate the character provided by a mandatory
190 attribute interpreted in the character set defined
191 by `ns`. The generator succeeds as long as the
192 attribute belongs to the character range `[ch1, ch2]`
193 (unless the underlying output stream reports an
194 error). Otherwise this generator fails and does not
195 generate any output.]]
196 [[`ns::char_(cs)`] [Generate the character provided by a mandatory
197 attribute interpreted in the character set defined
198 by `ns`. The generator succeeds as long as the
199 attribute belongs to the character set `cs`
200 (unless the underlying output stream reports an
201 error). Otherwise this generator fails and does not
202 generate any output.]]
203 [[`~cg`] [Negate `cg`. The result is a negated char generator
204 that inverts the test condition of the character
205 generator it is attached to.]]
A character `ch` is assumed to belong to the character range defined by
`ns::char_(ch1, ch2)` if its character value (binary representation)
interpreted in the character set defined by `ns` is not smaller than the
character value of `ch1` and not larger than the character value of `ch2` (i.e.
`ch1 <= ch <= ch2`).
The `charset` parameter passed to `ns::char_(charset)` must be a string
containing more than one character. Every character in this string is
assumed to belong to the character set defined by this expression. The
exception is the `'-'` character, which has a special meaning if it is neither
the first nor the last character in `charset`: when `'-'` appears between two
characters, it is interpreted as spanning a character range. A character `ch`
is considered to belong to the character set `charset` if it matches one of
the characters specified by the string parameter described above. For example:
    [[Example]              [Description]]
    [[`char_("abc")`]       ['a', 'b', and 'c']]
    [[`char_("a-z")`]       [all characters from 'a' to 'z', inclusive]]
    [[`char_("a-zA-Z")`]    [all characters from 'a' to 'z' and from 'A' to 'Z', inclusive]]
    [[`char_("-1-9")`]      ['-' and all characters from '1' to '9', inclusive]]
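The rules for reading a char-set definition string can be sketched in plain
C++ as a membership test. This is a simplified illustration of the documented
rules under the assumption of single-byte characters, not the parsing code
used by the library; the function name `in_charset` is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Illustrative sketch only (hypothetical function, not part of Spirit).
// Returns true if `ch` belongs to the set described by `def`, where a '-'
// between two characters spans an inclusive range and a '-' in first or
// last position stands for itself.
bool in_charset(std::string const& def, char ch)
{
    unsigned char const c = static_cast<unsigned char>(ch);
    for (std::size_t i = 0; i < def.size(); ++i)
    {
        unsigned char const lo = static_cast<unsigned char>(def[i]);
        if (i + 2 < def.size() && def[i + 1] == '-')
        {
            // "x-y": an inclusive range from def[i] to def[i + 2]
            unsigned char const hi = static_cast<unsigned char>(def[i + 2]);
            if (lo <= c && c <= hi)
                return true;
            i += 2;     // skip the '-' and the range end
        }
        else if (lo == c)
        {
            return true;    // a single listed character
        }
    }
    return false;
}
```

For the definition `"-1-9"`, the leading `'-'` is a plain member of the set
and `1-9` is a range, so `'-'` and `'4'` belong to the set while `'0'` does
not.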
235 [[Expression] [Attribute]]
236 [[`ch`] [__unused__]]
237 [[`lit(ch)`] [__unused__]]
238 [[`ns::char_`] [`Ch`, attribute is mandatory (otherwise compilation
239 will fail). `Ch` is the character type of the
240 __karma_char_encoding_namespace__, `ns`.]]
241 [[`ns::char_(ch)`] [`Ch`, attribute is optional, if it is supplied, the
242 generator compares the attribute with `ch` and
243 succeeds only if both are equal, failing otherwise.
244 `Ch` is the character type of the
245 __karma_char_encoding_namespace__, `ns`.]]
246 [[`ns::char_("c")`] [`Ch`, attribute is optional, if it is supplied, the
247 generator compares the attribute with `c` and
248 succeeds only if both are equal, failing otherwise.
249 `Ch` is the character type of the
250 __karma_char_encoding_namespace__, `ns`.]]
251 [[`ns::char_(ch1, ch2)`][`Ch`, attribute is mandatory (otherwise compilation
252 will fail), the generator succeeds if the attribute
253 belongs to the character range `[ch1, ch2]`
254 interpreted in the character set defined by `ns`.
255 `Ch` is the character type of the
256 __karma_char_encoding_namespace__, `ns`.]]
257 [[`ns::char_(cs)`] [`Ch`, attribute is mandatory (otherwise compilation
258 will fail), the generator succeeds if the attribute
259 belongs to the character set `cs`, interpreted
260 in the character set defined by `ns`.
261 `Ch` is the character type of the
262 __karma_char_encoding_namespace__, `ns`.]]
263 [[`~cg`] [Attribute of `cg`]]
266 [note In addition to their usual attribute of type `Ch` all listed generators
267 accept an instance of a `boost::optional<Ch>` as well. If the
268 `boost::optional<>` is initialized (holds a value) the generators behave
269 as if their attribute was an instance of `Ch` and emit the value stored
270 in the `boost::optional<>`. Otherwise the generators will fail.]
The complexity of `ch`, `lit(ch)`, `ns::char_`, `ns::char_(ch)`, and
`ns::char_("c")` is constant as all generators emit exactly one character per
invocation.
The character range generator (`ns::char_(ch1, ch2)`) additionally requires
constant lookup time for the verification whether the attribute belongs to
the character range.
284 The character set generator (`ns::char_(cs)`) additionally requires
285 O(log N) lookup time for the verification whether the attribute belongs to
286 the character set, where N is the number of characters in the character set.
290 [note The test harness for the example(s) below is presented in the
291 __karma_basics_examples__ section.]
295 [reference_karma_includes]
297 Some using declarations:
299 [reference_karma_using_declarations_char]
301 Basic usage of `char_` generators:
303 [reference_karma_char]
307 [/////////////////////////////////////////////////////////////////////////////]
308 [section:char_class Character Classification Generators (`alnum`, `digit`, etc.)]
310 [heading Description]
312 The library has the full repertoire of single character generators for
313 character classification. This includes the usual `alnum`, `alpha`,
314 `digit`, `xdigit`, etc. generators. These generators have an associated
315 __karma_char_encoding_namespace__. This is needed when doing basic operations
316 such as forcing lower or upper case.
    // forwards to <boost/spirit/home/karma/char/char_class.hpp>
    #include <boost/spirit/include/karma_char_class.hpp>
323 Also, see __include_structure__.
343 In the table above, `ns` represents a __karma_char_encoding_namespace__ used by the
344 corresponding character class generator. All listed generators have a mandatory
345 attribute `Ch` and will not compile if no attribute is associated.
350 [:__primitive_generator_concept__]
352 [variablelist Notation
353 [[`ns`] [A __karma_char_encoding_namespace__.]]]
355 [heading Expression Semantics]
357 Semantics of an expression is defined only where it differs from, or is
358 not defined in __primitive_generator_concept__.
361 [[Expression] [Semantics]]
362 [[`ns::alnum`] [If the mandatory attribute satisfies the concept of
363 `std::isalnum` in the __karma_char_encoding_namespace__
364 the generator succeeds after emitting
365 its attribute (unless the underlying output stream
366 reports an error). This generator fails otherwise
367 while not generating anything.]]
368 [[`ns::alpha`] [If the mandatory attribute satisfies the concept of
369 `std::isalpha` in the __karma_char_encoding_namespace__
370 the generator succeeds after emitting
371 its attribute (unless the underlying output stream
372 reports an error). This generator fails otherwise
373 while not generating anything.]]
374 [[`ns::blank`] [If the mandatory attribute satisfies the concept of
375 `std::isblank` in the __karma_char_encoding_namespace__
376 the generator succeeds after emitting
377 its attribute (unless the underlying output stream
378 reports an error). This generator fails otherwise
379 while not generating anything.]]
380 [[`ns::cntrl`] [If the mandatory attribute satisfies the concept of
381 `std::iscntrl` in the __karma_char_encoding_namespace__
382 the generator succeeds after emitting
383 its attribute (unless the underlying output stream
384 reports an error). This generator fails otherwise
385 while not generating anything.]]
386 [[`ns::digit`] [If the mandatory attribute satisfies the concept of
387 `std::isdigit` in the __karma_char_encoding_namespace__
388 the generator succeeds after emitting
389 its attribute (unless the underlying output stream
390 reports an error). This generator fails otherwise
391 while not generating anything.]]
392 [[`ns::graph`] [If the mandatory attribute satisfies the concept of
393 `std::isgraph` in the __karma_char_encoding_namespace__
394 the generator succeeds after emitting
395 its attribute (unless the underlying output stream
396 reports an error). This generator fails otherwise
397 while not generating anything.]]
398 [[`ns::print`] [If the mandatory attribute satisfies the concept of
399 `std::isprint` in the __karma_char_encoding_namespace__
400 the generator succeeds after emitting
401 its attribute (unless the underlying output stream
402 reports an error). This generator fails otherwise
403 while not generating anything.]]
404 [[`ns::punct`] [If the mandatory attribute satisfies the concept of
405 `std::ispunct` in the __karma_char_encoding_namespace__
406 the generator succeeds after emitting
407 its attribute (unless the underlying output stream
408 reports an error). This generator fails otherwise
409 while not generating anything.]]
410 [[`ns::xdigit`] [If the mandatory attribute satisfies the concept of
411 `std::isxdigit` in the __karma_char_encoding_namespace__
412 the generator succeeds after emitting
413 its attribute (unless the underlying output stream
414 reports an error). This generator fails otherwise
415 while not generating anything.]]
416 [[`ns::lower`] [If the mandatory attribute satisfies the concept of
417 `std::islower` in the __karma_char_encoding_namespace__
418 the generator succeeds after emitting
419 its attribute (unless the underlying output stream
420 reports an error). This generator fails otherwise
421 while not generating anything.]]
422 [[`ns::upper`] [If the mandatory attribute satisfies the concept of
423 `std::isupper` in the __karma_char_encoding_namespace__
424 the generator succeeds after emitting
425 its attribute (unless the underlying output stream
426 reports an error). This generator fails otherwise
427 while not generating anything.]]
    [[`ns::space`]      [If the optional attribute satisfies the concept of
                         `std::isspace` in the __karma_char_encoding_namespace__
                         the generator succeeds after emitting
                         its attribute (unless the underlying output stream
                         reports an error). This generator fails otherwise
                         while not generating anything. If no attribute is
                         supplied this generator emits a single space
                         character in the character set defined by `ns`.]]
438 Possible values for `ns` are described in the section __karma_char_encoding_namespace__.
[note   The generators `alpha` and `alnum` might seem to behave unexpectedly if
        used inside a `lower[]` or `upper[]` directive. Both directives
        additionally apply the semantics of `std::islower` or `std::isupper`
        to the respective character class. Some examples:
``
            std::string s;
            std::back_insert_iterator<std::string> out(s);
            generate(out, lower[alpha], 'a');     // succeeds emitting 'a'
            generate(out, lower[alpha], 'A');     // fails
``
        The generator directive `upper[]` behaves correspondingly.
]
[:All listed character class generators can take any attribute `Ch`. All
character class generators (except `space`) require an attribute and will
fail to compile otherwise.]
459 [note In addition to their usual attribute of type `Ch` all listed generators
460 accept an instance of a `boost::optional<Ch>` as well. If the
461 `boost::optional<>` is initialized (holds a value) the generators behave
462 as if their attribute was an instance of `Ch` and emit the value stored
463 in the `boost::optional<>`. Otherwise the generators will fail.]
The complexity is constant as the generators emit not more than one character
per invocation.
474 [note The test harness for the example(s) below is presented in the
475 __karma_basics_examples__ section.]
479 [reference_karma_includes]
481 Some using declarations:
483 [reference_karma_using_declarations_char_class]
485 Basic usage of an `alpha` generator:
487 [reference_karma_char_class]