Commit | Line | Data |
---|---|---|
1 | [/============================================================================== |
2 | Copyright (C) 2001-2011 Hartmut Kaiser | |
3 | Copyright (C) 2001-2011 Joel de Guzman | |
4 | ||
5 | Distributed under the Boost Software License, Version 1.0. (See accompanying | |
6 | file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) | |
7 | ===============================================================================/] | |
8 | ||
9 | [section:char Char Generators] | |
10 | ||
11 | This module includes different character-oriented generators for emitting | |
12 | single characters. Currently, it includes literal chars (e.g. `'x'`, `L'x'`), | |
13 | `char_` (single characters, ranges and character sets) and the | |
14 | encoding-specific character classifiers (`alnum`, `alpha`, `digit`, `xdigit`, etc.). | |
15 | ||
16 | [heading Module Header] | |
17 | ||
18 | // forwards to <boost/spirit/home/karma/char.hpp> | |
19 | #include <boost/spirit/include/karma_char.hpp> | |
20 | ||
21 | Also, see __include_structure__. | |
22 | ||
23 | [/////////////////////////////////////////////////////////////////////////////] | |
24 | [section:char_generator Character Generators (`char_`, `lit`)] | |
25 | ||
26 | [heading Description] | |
27 | ||
28 | This section describes the `char_` and `lit` character generators. | |
29 | ||
30 | The `char_` generator emits single characters. The `char_` generator has an | |
31 | associated __karma_char_encoding_namespace__. This is needed when doing basic | |
32 | operations such as forcing lower or upper case and dealing with | |
33 | character ranges. | |
34 | ||
35 | There are various forms of `char_`. | |
36 | ||
37 | [heading char_] | |
38 | ||
39 | The no-argument form of `char_` emits any character in the associated | |
40 | __karma_char_encoding_namespace__. | |
41 | ||
42 | char_ // emits any character as supplied by the attribute | |
43 | ||
44 | [heading char_(ch)] | |
45 | ||
46 | The single argument form of `char_` (with a character argument) emits | |
47 | the supplied character. | |
48 | ||
49 | char_('x') // emits 'x' | |
50 | char_(L'x') // emits L'x' | |
51 | char_(x) // emits x (a char) | |
52 | ||
53 | [heading char_(first, last)] | |
54 | ||
55 | `char_` with two arguments emits any character from a range of characters, as | |
56 | supplied by the attribute. | |
57 | ||
58 | char_('a','z') // lowercase alphabetic characters | |
59 | char_(L'0',L'9') // digits | |
60 | ||
61 | A range of characters is created from a low-high character pair. Such a | |
62 | generator emits a single character that is in the range, including both | |
63 | endpoints. Note, the first character must be /before/ the second, | |
64 | according to the underlying __karma_char_encoding_namespace__. | |
65 | ||
66 | Character mapping is inherently platform dependent. For example, the standard | |
67 | does not guarantee that `'A' < 'Z'`. That is why Spirit2 purposely attaches | |
68 | a specific __karma_char_encoding_namespace__ (such as ASCII or ISO-8859-1) | |
69 | to the `char_` generator: it eliminates such ambiguities. | |
70 | ||
71 | [note *Sparse bit vectors* | |
72 | ||
73 | To accommodate 16/32 and 64 bit characters, the char-set statically | |
74 | switches from a `std::bitset` implementation when the character type is | |
75 | not greater than 8 bits, to a sparse bit/boolean set which uses a sorted | |
76 | vector of disjoint ranges (`range_run`). The set is constructed from | |
77 | ranges such that adjacent or overlapping ranges are coalesced. | |
78 | ||
79 | `range_runs` are very space-economical in situations where there are lots | |
80 | of ranges and a few individual disjoint values. Searching is O(log n) | |
81 | where n is the number of ranges.] | |
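The idea behind such a sparse set can be sketched in a few lines of standard C++. This is only an illustration of the technique described in the note (a sorted vector of disjoint ranges with coalescing and binary search); the names `char_range` and `sparse_char_set` are hypothetical and this is not Spirit's actual `range_run` implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical types for illustration only; not part of Spirit.
struct char_range {
    std::uint32_t first;  // inclusive lower bound
    std::uint32_t last;   // inclusive upper bound
};

class sparse_char_set {
public:
    // Insert [first, last], merging it with overlapping or adjacent ranges.
    void add(std::uint32_t first, std::uint32_t last) {
        assert(first <= last);
        // First stored range that overlaps or touches the new one.
        auto it = std::lower_bound(
            ranges_.begin(), ranges_.end(), first,
            [](char_range const& r, std::uint32_t value) {
                return std::uint64_t(r.last) + 1 < value;
            });
        // Absorb every following range that overlaps or is adjacent.
        auto stop = it;
        while (stop != ranges_.end() &&
               stop->first <= std::uint64_t(last) + 1) {
            first = std::min(first, stop->first);
            last = std::max(last, stop->last);
            ++stop;
        }
        it = ranges_.erase(it, stop);
        ranges_.insert(it, char_range{first, last});
    }

    // Binary search: O(log n), where n is the number of stored ranges.
    bool contains(std::uint32_t ch) const {
        auto it = std::upper_bound(
            ranges_.begin(), ranges_.end(), ch,
            [](std::uint32_t value, char_range const& r) {
                return value < r.first;
            });
        return it != ranges_.begin() && ch <= std::prev(it)->last;
    }

    std::size_t range_count() const { return ranges_.size(); }

private:
    std::vector<char_range> ranges_;  // sorted, disjoint, non-adjacent
};
```

Adding `'a'..'f'` and then `'g'..'z'` to such a set leaves a single stored range, because adjacent ranges are coalesced on insertion.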
82 | ||
83 | [heading char_(def)] | |
84 | ||
85 | Lastly, when given a string (a plain C string, a `std::basic_string`, | |
86 | etc.), the string is regarded as a char-set definition string following | |
87 | a syntax that resembles posix style regular expression character sets | |
88 | (except that double quotes delimit the set elements instead of square | |
89 | brackets and there is no special negation ^ character). Examples: | |
90 | ||
91 | char_("a-zA-Z") // alphabetic characters | |
92 | char_("0-9a-fA-F") // hexadecimal characters | |
93 | char_("actgACTG") // DNA identifiers | |
94 | char_("\x7f\x7e") // Hexadecimal 0x7F and 0x7E | |
95 | ||
96 | These generators emit any character from the given set of characters, as | |
97 | supplied by the attribute. | |
98 | ||
99 | [heading lit(ch)] | |
100 | ||
101 | `lit`, when passed a single character, behaves like the single argument | |
102 | `char_` except that `lit` does not consume an attribute. A plain | |
103 | `char` or `wchar_t` is equivalent to a `lit`. | |
104 | ||
105 | [note `lit` is reused by the [karma_string String Generators], the | |
106 | char generators, and the Numeric Generators (see [signed_int signed integer], | |
107 | [unsigned_int unsigned integer], and [real_number real number] generators). In | |
108 | general, a char generator is created when you pass in a | |
109 | character, a string generator is created when you pass in a string, and a | |
110 | numeric generator is created when you use a numeric literal. The | |
111 | exception is when you pass a single element literal string, e.g. | |
112 | `lit("x")`. In this case, we optimize this to create a char generator | |
113 | instead of a string generator.] | |
114 | ||
115 | Examples: | |
116 | ||
117 | 'x' | |
118 | lit('x') | |
119 | lit(L'x') | |
120 | lit(c) // c is a char | |
121 | ||
122 | [heading Header] | |
123 | ||
124 | // forwards to <boost/spirit/home/karma/char/char.hpp> | |
125 | #include <boost/spirit/include/karma_char_.hpp> | |
126 | ||
127 | Also, see __include_structure__. | |
128 | ||
129 | [heading Namespace] | |
130 | ||
131 | [table | |
132 | [[Name]] | |
133 | [[`boost::spirit::lit // alias: boost::spirit::karma::lit` ]] | |
134 | [[`ns::char_`]] | |
135 | ] | |
136 | ||
137 | In the table above, `ns` represents a __karma_char_encoding_namespace__. | |
138 | ||
139 | [heading Model of] | |
140 | ||
141 | [:__primitive_generator_concept__] | |
142 | ||
143 | [variablelist Notation | |
144 | [[`ch`, `ch1`, `ch2`] | |
145 | [Character-class specific character (See __char_class_types__), | |
146 | or a __karma_lazy_argument__ that evaluates to a | |
147 | character-class specific character value]] | |
148 | [[`cs`] [Character-set specifier string (See | |
149 | __char_class_types__), or a __karma_lazy_argument__ that | |
150 | evaluates to a character-set specifier string, or a | |
151 | pointer/reference to a null-terminated array of characters. | |
152 | This string specifies a char-set definition string following | |
153 | a syntax that resembles posix style regular expression character | |
154 | sets (except the square brackets and the negation `^` character).]] | |
155 | [[`ns`] [A __karma_char_encoding_namespace__.]] | |
156 | [[`cg`] [A char generator, a char range generator, or a char set generator.]]] | |
157 | ||
158 | [heading Expression Semantics] | |
159 | ||
160 | The semantics of an expression are defined only where they differ from, or are | |
161 | not defined in, __primitive_generator_concept__. | |
162 | ||
163 | [table | |
164 | [[Expression] [Description]] | |
165 | [[`ch`] [Generate the character literal `ch`. This generator | |
166 | never fails (unless the underlying output stream | |
167 | reports an error).]] | |
168 | [[`lit(ch)`] [Generate the character literal `ch`. This generator | |
169 | never fails (unless the underlying output stream | |
170 | reports an error).]] | |
171 | [[`ns::char_`] [Generate the character provided by a mandatory | |
172 | attribute interpreted in the character set defined | |
173 | by `ns`. This generator never fails (unless the | |
174 | underlying output stream reports an error).]] | |
175 | [[`ns::char_(ch)`] [Generate the character `ch` as provided by the | |
176 | immediate literal value the generator is initialized | |
177 | from. If this generator has an associated attribute | |
178 | it succeeds only as long as the attribute is equal | |
179 | to the immediate literal (unless the underlying | |
180 | output stream reports an error). Otherwise this | |
181 | generator fails and does not generate any output.]] | |
182 | [[`ns::char_("c")`] [Generate the character `c` as provided by the | |
183 | immediate literal value the generator is initialized | |
184 | from. If this generator has an associated attribute | |
185 | it succeeds only as long as the attribute is equal | |
186 | to the immediate literal (unless the underlying | |
187 | output stream reports an error). Otherwise this | |
188 | generator fails and does not generate any output.]] | |
189 | [[`ns::char_(ch1, ch2)`][Generate the character provided by a mandatory | |
190 | attribute interpreted in the character set defined | |
191 | by `ns`. The generator succeeds as long as the | |
192 | attribute belongs to the character range `[ch1, ch2]` | |
193 | (unless the underlying output stream reports an | |
194 | error). Otherwise this generator fails and does not | |
195 | generate any output.]] | |
196 | [[`ns::char_(cs)`] [Generate the character provided by a mandatory | |
197 | attribute interpreted in the character set defined | |
198 | by `ns`. The generator succeeds as long as the | |
199 | attribute belongs to the character set `cs` | |
200 | (unless the underlying output stream reports an | |
201 | error). Otherwise this generator fails and does not | |
202 | generate any output.]] | |
203 | [[`~cg`] [Negate `cg`. The result is a negated char generator | |
204 | that inverts the test condition of the character | |
205 | generator it is attached to.]] | |
206 | ] | |
207 | ||
208 | A character `ch` is assumed to belong to the character range defined by | |
209 | `ns::char_(ch1, ch2)` if its character value (binary representation) | |
210 | interpreted in the character set defined by `ns` is not smaller than the | |
211 | character value of `ch1` and not larger than the character value of `ch2` (i.e. | |
212 | `ch1 <= ch <= ch2`). | |
213 | ||
214 | The `charset` parameter passed to `ns::char_(charset)` must be a string | |
215 | containing more than one character. Every single character in this string is | |
216 | assumed to belong to the character set defined by this expression. An exception | |
217 | to this is the `'-'` character, which has a special meaning if it is neither | |
218 | the first nor the last character in `charset`. If the `'-'` | |
219 | is used between two characters, it is interpreted as spanning a character | |
220 | range. A character `ch` is considered to belong to the defined character set | |
221 | `charset` if it matches one of the characters as specified by the string | |
222 | parameter described above. For example: | |
223 | ||
224 | [table | |
225 | [[Example] [Description]] | |
226 | [[`char_("abc")`] ['a', 'b', and 'c']] | |
227 | [[`char_("a-z")`] [all characters from 'a' to 'z', inclusive]] | |
228 | [[`char_("a-zA-Z")`] [all characters from 'a' to 'z' and from 'A' to 'Z', inclusive]] | |
229 | [[`char_("-1-9")`] ['-' and all characters from '1' to '9', inclusive]] | |
230 | ] | |
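The rules for the definition string can be illustrated with a small stand-alone interpreter. This is a simplified sketch for 8-bit characters, not the parser Spirit uses internally; `parse_char_set` is a hypothetical name, and corner cases such as chained ranges (`"a-c-e"`) may be treated differently by the library.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: expands a char-set definition string into a
// 256-entry membership table.
std::bitset<256> parse_char_set(std::string const& def) {
    std::bitset<256> set;
    for (std::size_t i = 0; i < def.size(); ++i) {
        unsigned char const lo = static_cast<unsigned char>(def[i]);
        // A '-' that sits between two characters spans a range.
        if (i + 2 < def.size() && def[i + 1] == '-') {
            unsigned char const hi = static_cast<unsigned char>(def[i + 2]);
            assert(lo <= hi);  // the first character must come before the second
            for (unsigned c = lo; c <= hi; ++c) {
                set.set(c);
            }
            i += 2;  // skip the '-' and the upper bound
        } else {
            // A plain character; this also covers a leading or trailing '-'.
            set.set(lo);
        }
    }
    return set;
}
```

With this sketch, `parse_char_set("-1-9")` contains `'-'` and the digits `'1'` to `'9'` but not `'0'`, matching the last row of the table above.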
231 | ||
232 | [heading Attributes] | |
233 | ||
234 | [table | |
235 | [[Expression] [Attribute]] | |
236 | [[`ch`] [__unused__]] | |
237 | [[`lit(ch)`] [__unused__]] | |
238 | [[`ns::char_`] [`Ch`, attribute is mandatory (otherwise compilation | |
239 | will fail). `Ch` is the character type of the | |
240 | __karma_char_encoding_namespace__, `ns`.]] | |
241 | [[`ns::char_(ch)`] [`Ch`, attribute is optional, if it is supplied, the | |
242 | generator compares the attribute with `ch` and | |
243 | succeeds only if both are equal, failing otherwise. | |
244 | `Ch` is the character type of the | |
245 | __karma_char_encoding_namespace__, `ns`.]] | |
246 | [[`ns::char_("c")`] [`Ch`, attribute is optional, if it is supplied, the | |
247 | generator compares the attribute with `c` and | |
248 | succeeds only if both are equal, failing otherwise. | |
249 | `Ch` is the character type of the | |
250 | __karma_char_encoding_namespace__, `ns`.]] | |
251 | [[`ns::char_(ch1, ch2)`][`Ch`, attribute is mandatory (otherwise compilation | |
252 | will fail), the generator succeeds if the attribute | |
253 | belongs to the character range `[ch1, ch2]` | |
254 | interpreted in the character set defined by `ns`. | |
255 | `Ch` is the character type of the | |
256 | __karma_char_encoding_namespace__, `ns`.]] | |
257 | [[`ns::char_(cs)`] [`Ch`, attribute is mandatory (otherwise compilation | |
258 | will fail), the generator succeeds if the attribute | |
259 | belongs to the character set `cs`, interpreted | |
260 | in the character set defined by `ns`. | |
261 | `Ch` is the character type of the | |
262 | __karma_char_encoding_namespace__, `ns`.]] | |
263 | [[`~cg`] [Attribute of `cg`]] | |
264 | ] | |
265 | ||
266 | [note In addition to their usual attribute of type `Ch` all listed generators | |
267 | accept an instance of a `boost::optional<Ch>` as well. If the | |
268 | `boost::optional<>` is initialized (holds a value) the generators behave | |
269 | as if their attribute was an instance of `Ch` and emit the value stored | |
270 | in the `boost::optional<>`. Otherwise the generators will fail.] | |
271 | ||
272 | [heading Complexity] | |
273 | ||
274 | [:O(1)] | |
275 | ||
276 | The complexity of `ch`, `lit(ch)`, `ns::char_`, `ns::char_(ch)`, and | |
277 | `ns::char_("c")` is constant as all generators emit exactly one character per | |
278 | invocation. | |
279 | ||
280 | The character range generator (`ns::char_(ch1, ch2)`) additionally requires | |
281 | constant lookup time for the verification whether the attribute belongs to | |
282 | the character range. | |
283 | ||
284 | The character set generator (`ns::char_(cs)`) additionally requires | |
285 | O(log N) lookup time for the verification whether the attribute belongs to | |
286 | the character set, where N is the number of characters in the character set. | |
287 | ||
288 | [heading Example] | |
289 | ||
290 | [note The test harness for the example(s) below is presented in the | |
291 | __karma_basics_examples__ section.] | |
292 | ||
293 | Some includes: | |
294 | ||
295 | [reference_karma_includes] | |
296 | ||
297 | Some using declarations: | |
298 | ||
299 | [reference_karma_using_declarations_char] | |
300 | ||
301 | Basic usage of `char_` generators: | |
302 | ||
303 | [reference_karma_char] | |
304 | ||
305 | [endsect] | |
306 | ||
307 | [/////////////////////////////////////////////////////////////////////////////] | |
308 | [section:char_class Character Classification Generators (`alnum`, `digit`, etc.)] | |
309 | ||
310 | [heading Description] | |
311 | ||
312 | The library has the full repertoire of single character generators for | |
313 | character classification. This includes the usual `alnum`, `alpha`, | |
314 | `digit`, `xdigit`, etc. generators. These generators have an associated | |
315 | __karma_char_encoding_namespace__. This is needed when doing basic operations | |
316 | such as forcing lower or upper case. | |
317 | ||
318 | [heading Header] | |
319 | ||
320 | // forwards to <boost/spirit/home/karma/char/char_class.hpp> | |
321 | #include <boost/spirit/include/karma_char_class.hpp> | |
322 | ||
323 | Also, see __include_structure__. | |
324 | ||
325 | [heading Namespace] | |
326 | ||
327 | [table | |
328 | [[Name]] | |
329 | [[`ns::alnum`]] | |
330 | [[`ns::alpha`]] | |
331 | [[`ns::blank`]] | |
332 | [[`ns::cntrl`]] | |
333 | [[`ns::digit`]] | |
334 | [[`ns::graph`]] | |
335 | [[`ns::lower`]] | |
336 | [[`ns::print`]] | |
337 | [[`ns::punct`]] | |
338 | [[`ns::space`]] | |
339 | [[`ns::upper`]] | |
340 | [[`ns::xdigit`]] | |
341 | ] | |
342 | ||
343 | In the table above, `ns` represents a __karma_char_encoding_namespace__ used by the | |
344 | corresponding character class generator. All listed generators except `space` have a | |
345 | mandatory attribute `Ch` and will not compile if no attribute is associated. | |
346 | ||
347 | ||
348 | [heading Model of] | |
349 | ||
350 | [:__primitive_generator_concept__] | |
351 | ||
352 | [variablelist Notation | |
353 | [[`ns`] [A __karma_char_encoding_namespace__.]]] | |
354 | ||
355 | [heading Expression Semantics] | |
356 | ||
357 | The semantics of an expression are defined only where they differ from, or are | |
358 | not defined in, __primitive_generator_concept__. | |
359 | ||
360 | [table | |
361 | [[Expression] [Semantics]] | |
362 | [[`ns::alnum`] [If the mandatory attribute satisfies the concept of | |
363 | `std::isalnum` in the __karma_char_encoding_namespace__ | |
364 | the generator succeeds after emitting | |
365 | its attribute (unless the underlying output stream | |
366 | reports an error). This generator fails otherwise | |
367 | while not generating anything.]] | |
368 | [[`ns::alpha`] [If the mandatory attribute satisfies the concept of | |
369 | `std::isalpha` in the __karma_char_encoding_namespace__ | |
370 | the generator succeeds after emitting | |
371 | its attribute (unless the underlying output stream | |
372 | reports an error). This generator fails otherwise | |
373 | while not generating anything.]] | |
374 | [[`ns::blank`] [If the mandatory attribute satisfies the concept of | |
375 | `std::isblank` in the __karma_char_encoding_namespace__ | |
376 | the generator succeeds after emitting | |
377 | its attribute (unless the underlying output stream | |
378 | reports an error). This generator fails otherwise | |
379 | while not generating anything.]] | |
380 | [[`ns::cntrl`] [If the mandatory attribute satisfies the concept of | |
381 | `std::iscntrl` in the __karma_char_encoding_namespace__ | |
382 | the generator succeeds after emitting | |
383 | its attribute (unless the underlying output stream | |
384 | reports an error). This generator fails otherwise | |
385 | while not generating anything.]] | |
386 | [[`ns::digit`] [If the mandatory attribute satisfies the concept of | |
387 | `std::isdigit` in the __karma_char_encoding_namespace__ | |
388 | the generator succeeds after emitting | |
389 | its attribute (unless the underlying output stream | |
390 | reports an error). This generator fails otherwise | |
391 | while not generating anything.]] | |
392 | [[`ns::graph`] [If the mandatory attribute satisfies the concept of | |
393 | `std::isgraph` in the __karma_char_encoding_namespace__ | |
394 | the generator succeeds after emitting | |
395 | its attribute (unless the underlying output stream | |
396 | reports an error). This generator fails otherwise | |
397 | while not generating anything.]] | |
398 | [[`ns::print`] [If the mandatory attribute satisfies the concept of | |
399 | `std::isprint` in the __karma_char_encoding_namespace__ | |
400 | the generator succeeds after emitting | |
401 | its attribute (unless the underlying output stream | |
402 | reports an error). This generator fails otherwise | |
403 | while not generating anything.]] | |
404 | [[`ns::punct`] [If the mandatory attribute satisfies the concept of | |
405 | `std::ispunct` in the __karma_char_encoding_namespace__ | |
406 | the generator succeeds after emitting | |
407 | its attribute (unless the underlying output stream | |
408 | reports an error). This generator fails otherwise | |
409 | while not generating anything.]] | |
410 | [[`ns::xdigit`] [If the mandatory attribute satisfies the concept of | |
411 | `std::isxdigit` in the __karma_char_encoding_namespace__ | |
412 | the generator succeeds after emitting | |
413 | its attribute (unless the underlying output stream | |
414 | reports an error). This generator fails otherwise | |
415 | while not generating anything.]] | |
416 | [[`ns::lower`] [If the mandatory attribute satisfies the concept of | |
417 | `std::islower` in the __karma_char_encoding_namespace__ | |
418 | the generator succeeds after emitting | |
419 | its attribute (unless the underlying output stream | |
420 | reports an error). This generator fails otherwise | |
421 | while not generating anything.]] | |
422 | [[`ns::upper`] [If the mandatory attribute satisfies the concept of | |
423 | `std::isupper` in the __karma_char_encoding_namespace__ | |
424 | the generator succeeds after emitting | |
425 | its attribute (unless the underlying output stream | |
426 | reports an error). This generator fails otherwise | |
427 | while not generating anything.]] | |
428 | [[`ns::space`] [If the optional attribute satisfies the concept of | |
429 | `std::isspace` in the __karma_char_encoding_namespace__ | |
430 | the generator succeeds after emitting | |
431 | its attribute (unless the underlying output stream | |
432 | reports an error). This generator fails otherwise | |
433 | while not generating anything. If no attribute is | |
434 | supplied this generator emits a single space | |
435 | character in the character set defined by `ns`.]] | |
436 | ] | |
437 | ||
438 | Possible values for `ns` are described in the section __karma_char_encoding_namespace__. | |
439 | ||
440 | [note The generators `alpha` and `alnum` might seem to behave unexpectedly if | |
441 | used inside a `lower[]` or `upper[]` directive. Both directives | |
442 | additionally apply the semantics of `std::islower` or `std::isupper` | |
443 | to the respective character class. Some examples: | |
444 | `` | |
445 | std::string s; | |
446 | std::back_insert_iterator<std::string> out(s); | |
447 | generate(out, lower[alpha], 'a'); // succeeds emitting 'a' | |
448 | generate(out, lower[alpha], 'A'); // fails | |
449 | `` | |
450 | The generator directive `upper[]` behaves correspondingly. | |
451 | ] | |
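The combined test described in the note can be pictured as the conjunction of two predicates. The sketch below uses only the C++ standard library and assumes 7-bit ASCII input in the default "C" locale; `emit_lower_alpha` and `emit_upper_alpha` are hypothetical stand-ins for `lower[alpha]` and `upper[alpha]`, not Karma code.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for generate(out, lower[alpha], ch):
// the character must satisfy both isalpha and islower to be emitted.
bool emit_lower_alpha(std::string& out, char ch) {
    unsigned char const u = static_cast<unsigned char>(ch);
    assert(u < 128);  // this sketch assumes 7-bit ASCII input
    if (!std::isalpha(u) || !std::islower(u)) {
        return false;  // fails without generating anything
    }
    out.push_back(ch);
    return true;
}

// Hypothetical stand-in for generate(out, upper[alpha], ch).
bool emit_upper_alpha(std::string& out, char ch) {
    unsigned char const u = static_cast<unsigned char>(ch);
    assert(u < 128);  // this sketch assumes 7-bit ASCII input
    if (!std::isalpha(u) || !std::isupper(u)) {
        return false;  // fails without generating anything
    }
    out.push_back(ch);
    return true;
}
```

Calling `emit_lower_alpha` with `'a'` succeeds and appends the character, while calling it with `'A'` fails and leaves the output untouched, mirroring the two `generate` calls in the note.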
452 | ||
453 | [heading Attributes] | |
454 | ||
455 | [:All listed character class generators can take any attribute `Ch`. All | |
456 | character class generators (except `space`) require an attribute and will | |
457 | fail to compile otherwise.] | |
458 | ||
459 | [note In addition to their usual attribute of type `Ch` all listed generators | |
460 | accept an instance of a `boost::optional<Ch>` as well. If the | |
461 | `boost::optional<>` is initialized (holds a value) the generators behave | |
462 | as if their attribute was an instance of `Ch` and emit the value stored | |
463 | in the `boost::optional<>`. Otherwise the generators will fail.] | |
464 | ||
465 | [heading Complexity] | |
466 | ||
467 | [:O(1)] | |
468 | ||
469 | The complexity is constant as the generators emit at most one character | |
470 | per invocation. | |
471 | ||
472 | [heading Example] | |
473 | ||
474 | [note The test harness for the example(s) below is presented in the | |
475 | __karma_basics_examples__ section.] | |
476 | ||
477 | Some includes: | |
478 | ||
479 | [reference_karma_includes] | |
480 | ||
481 | Some using declarations: | |
482 | ||
483 | [reference_karma_using_declarations_char_class] | |
484 | ||
485 | Basic usage of an `alpha` generator: | |
486 | ||
487 | [reference_karma_char_class] | |
488 | ||
489 | [endsect] | |
490 | ||
491 | [endsect] |