Commit | Line | Data |
---|---|---|
1 | [/ |
2 | Copyright 2006-2007 John Maddock. | |
3 | Distributed under the Boost Software License, Version 1.0. | |
4 | (See accompanying file LICENSE_1_0.txt or copy at | |
5 | http://www.boost.org/LICENSE_1_0.txt). | |
6 | ] | |
7 | ||
8 | ||
9 | [section:collating_names Collating Names] | |
10 | ||
11 | [section:digraphs Digraphs] | |
12 | ||
13 | The following are treated as valid digraphs when used as a collating name: | |
14 | ||
15 | "ae", "Ae", "AE", "ch", "Ch", "CH", "ll", "Ll", "LL", "ss", "Ss", "SS", "nj", "Nj", "NJ", "dz", "Dz", "DZ", "lj", "Lj", "LJ". | |
16 | ||
17 | So for example the expression: | |
18 | ||
19 | [pre \[\[.ae.\]-c\] ] | |
20 | ||
21 | will match any character that collates between the digraph "ae" and the character "c". | |
22 | ||
23 | [endsect] | |
24 | ||
25 | [section:posix_symbolic_names POSIX Symbolic Names] | |
26 | ||
27 | The following symbolic names are recognised as valid collating element names, | |
28 | in addition to any single character. This allows you to write, for example: | |
29 | ||
30 | [pre \[\[.left-square-bracket.\]\[.right-square-bracket.\]\]] | |
31 | ||
32 | if you wanted to match either "\[" or "\]". | |
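As a minimal sketch of the expression above, the following counts square brackets in a string. It uses `std::regex` as a dependency-free stand-in for `boost::regex`, which is an assumption: it relies on the standard library's `regex_traits::lookup_collatename` recognising the POSIX symbolic names (libstdc++ and libc++ ship such a table, other implementations may not). The helper name `count_square_brackets` is hypothetical and not part of any library.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: counts '[' and ']' characters in `text` by naming
// them symbolically instead of escaping them inside the bracket expression.
// Assumption: the regex traits in use recognise the POSIX symbolic names.
std::size_t count_square_brackets(const std::string& text)
{
    static const std::regex bracket(
        "[[.left-square-bracket.][.right-square-bracket.]]");

    // Walk over every non-overlapping match and count them.
    auto first = std::sregex_iterator(text.begin(), text.end(), bracket);
    auto last = std::sregex_iterator();
    return static_cast<std::size_t>(std::distance(first, last));
}
```

With Boost.Regex the same pattern string would be passed to `boost::regex`, and `boost::sregex_iterator` would take the place of `std::sregex_iterator`.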
33 | ||
34 | [table | |
35 | [[Name][Character]] | |
36 | [[NUL] [\\x00]] | |
37 | [[SOH] [\\x01]] | |
38 | [[STX] [\\x02]] | |
39 | [[ETX] [\\x03]] | |
40 | [[EOT] [\\x04]] | |
41 | [[ENQ] [\\x05]] | |
42 | [[ACK] [\\x06]] | |
43 | [[alert] [\\x07]] | |
44 | [[backspace] [\\x08]] | |
45 | [[tab] [\\t]] | |
46 | [[newline] [\\n]] | |
47 | [[vertical-tab] [\\v]] | |
48 | [[form-feed] [\\f]] | |
49 | [[carriage-return] [\\r]] | |
50 | [[SO] [\\x0E]] | |
51 | [[SI] [\\x0F]] | |
52 | [[DLE] [\\x10]] | |
53 | [[DC1] [\\x11]] | |
54 | [[DC2] [\\x12]] | |
55 | [[DC3] [\\x13]] | |
56 | [[DC4] [\\x14]] | |
57 | [[NAK] [\\x15]] | |
58 | [[SYN] [\\x16]] | |
59 | [[ETB] [\\x17]] | |
60 | [[CAN] [\\x18]] | |
61 | [[EM] [\\x19]] | |
62 | [[SUB] [\\x1A]] | |
63 | [[ESC] [\\x1B]] | |
64 | [[IS4] [\\x1C]] | |
65 | [[IS3] [\\x1D]] | |
66 | [[IS2] [\\x1E]] | |
67 | [[IS1] [\\x1F]] | |
68 | [[space] [\\x20]] | |
69 | [[exclamation-mark] [!]] | |
70 | [[quotation-mark] ["]] | |
71 | [[number-sign] [#]] | |
72 | [[dollar-sign] [$]] | |
73 | [[percent-sign] [%]] | |
74 | [[ampersand] [&]] | |
75 | [[apostrophe] [\']] | |
76 | [[left-parenthesis] [(]] | |
77 | [[right-parenthesis] [)]] | |
78 | [[asterisk] [\*]] | |
79 | [[plus-sign] [+]] | |
80 | [[comma] [,]] | |
81 | [[hyphen] [-]] | |
82 | [[period] [.]] | |
83 | [[slash] [ / ]] | |
84 | [[zero] [0]] | |
85 | [[one] [1]] | |
86 | [[two] [2]] | |
87 | [[three] [3]] | |
88 | [[four] [4]] | |
89 | [[five] [5]] | |
90 | [[six] [6]] | |
91 | [[seven] [7]] | |
92 | [[eight] [8]] | |
93 | [[nine] [9]] | |
94 | [[colon] [\:]] | |
95 | [[semicolon] [;]] | |
96 | [[less-than-sign] [<]] | |
97 | [[equals-sign] [=]] | |
98 | [[greater-than-sign] [>]] | |
99 | [[question-mark] [?]] | |
100 | [[commercial-at] [@]] | |
101 | [[left-square-bracket] [\[]] | |
102 | [[backslash][\\]] | |
103 | [[right-square-bracket][\]]] | |
104 | [[circumflex][^]] | |
105 | [[underscore][_]] | |
106 | [[grave-accent][`]] | |
107 | [[left-curly-bracket][{]] | |
108 | [[vertical-line][|]] | |
109 | [[right-curly-bracket][}]] | |
110 | [[tilde][~]] | |
111 | [[DEL][\\x7F]] | |
112 | ] | |
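One way to spot-check individual rows of the table is to build a one-element bracket expression from a symbolic name and test it against a single character. The sketch below again uses `std::regex` in place of `boost::regex`, on the assumption that the standard library's regex traits know these names. `matches_symbolic_name` is a hypothetical helper introduced only for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true when the collating element called
// `name` matches exactly the single character `ch`.
// Assumption: the regex traits recognise `name`; an unknown name makes
// the std::regex constructor throw std::regex_error.
bool matches_symbolic_name(const std::string& name, char ch)
{
    // Builds a pattern such as "[[.tab.]]" from the symbolic name.
    const std::regex element("[[." + name + ".]]");
    return std::regex_match(std::string(1, ch), element);
}
```

For example, `matches_symbolic_name("tab", '\t')` and `matches_symbolic_name("tilde", '~')` should both hold, while `matches_symbolic_name("period", 'x')` should not.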
113 | ||
114 | [endsect] | |
115 | ||
116 | [section:named_unicode Named Unicode Characters] | |
117 | ||
118 | When using [link boost_regex.unicode Unicode aware regular expressions] (with the `u32regex` type), all | |
119 | the normal symbolic names for Unicode characters (those given in UnicodeData.txt) | |
120 | are recognised. So for example: | |
121 | ||
122 | [pre \[\[.CYRILLIC CAPITAL LETTER I.\]\] ] | |
123 | ||
124 | would match the Unicode character 0x0418. | |
125 | ||
126 | [endsect] | |
127 | [endsect] | |
128 |