[/ 
  Copyright 2006-2007 John Maddock.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]


[section:collating_names Collating Names]

[section:digraphs Digraphs]

The following are treated as valid digraphs when used as a collating name:

"ae", "Ae", "AE", "ch", "Ch", "CH", "ll", "Ll", "LL", "ss", "Ss", "SS", "nj", "Nj", "NJ", "dz", "Dz", "DZ", "lj", "Lj", "LJ".

So for example the expression:

[pre \[\[.ae.\]-c\] ]

will match any character that collates between the digraph "ae" and the character "c".

[endsect]

[section:posix_symbolic_names POSIX Symbolic Names]

The following symbolic names are recognised as valid collating element names, 
in addition to any single character. This allows you to write, for example:

[pre \[\[.left-square-bracket.\]\[.right-square-bracket.\]\]]

if you wanted to match either "\[" or "\]".

[table
[[Name][Character]]
[[NUL] 	[\\x00]]
[[SOH] 	[\\x01]]
[[STX] 	[\\x02]]
[[ETX] 	[\\x03]]
[[EOT] 	[\\x04]]
[[ENQ] 	[\\x05]]
[[ACK] 	[\\x06]]
[[alert] 	[\\x07]]
[[backspace] 	[\\x08]]
[[tab] 	[\\t]]
[[newline] 	[\\n]]
[[vertical-tab] 	[\\v]]
[[form-feed] 	[\\f]]
[[carriage-return] 	[\\r]]
[[SO] 	[\\x0E]]
[[SI] 	[\\x0F]]
[[DLE] 	[\\x10]]
[[DC1] 	[\\x11]]
[[DC2] 	[\\x12]]
[[DC3] 	[\\x13]]
[[DC4] 	[\\x14]]
[[NAK] 	[\\x15]]
[[SYN] 	[\\x16]]
[[ETB] 	[\\x17]]
[[CAN] 	[\\x18]]
[[EM] 	[\\x19]]
[[SUB] 	[\\x1A]]
[[ESC] 	[\\x1B]]
[[IS4] 	[\\x1C]]
[[IS3] 	[\\x1D]]
[[IS2] 	[\\x1E]]
[[IS1] 	[\\x1F]]
[[space] 	[\\x20]]
[[exclamation-mark] 	[!]]
[[quotation-mark] 	["]]
[[number-sign] 	[#]]
[[dollar-sign] 	[$]]
[[percent-sign] 	[%]]
[[ampersand] 	[&]]
[[apostrophe] 	[\']]
[[left-parenthesis] 	[(]]
[[right-parenthesis] 	[)]]
[[asterisk] 	[\*]]
[[plus-sign] 	[+]]
[[comma] 	[,]]
[[hyphen] 	[-]]
[[period] 	[.]]
[[slash] 	[ / ]]
[[zero] 	[0]]
[[one] 	[1]]
[[two] 	[2]]
[[three] 	[3]]
[[four] 	[4]]
[[five] 	[5]]
[[six] 	[6]]
[[seven] 	[7]]
[[eight] 	[8]]
[[nine] 	[9]]
[[colon] 	[\:]]
[[semicolon] 	[;]]
[[less-than-sign] 	[<]]
[[equals-sign] 	[=]]
[[greater-than-sign] 	[>]]
[[question-mark] 	[?]]
[[commercial-at] 	[@]]
[[left-square-bracket] 	[\[]]
[[backslash][\\]]
[[right-square-bracket][\]]]
[[circumflex][^]]
[[underscore][_]]
[[grave-accent][`]]
[[left-curly-bracket][{]]
[[vertical-line][|]]
[[right-curly-bracket][}]]
[[tilde][~]]
[[DEL][\\x7F]]
]
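
The lookup rule described in this section can be pictured with a small sketch. 
This is an illustration only, not how Boost.Regex is implemented: the function 
`resolve_collating_name` is a hypothetical helper, and its table holds just a few 
of the entries listed above. It takes the text that appears between `[.` and `.]`, 
tries the symbolic names first, and otherwise accepts any single character as 
naming itself.

```cpp
#include <optional>
#include <string_view>
#include <unordered_map>

// Hypothetical helper, not part of Boost.Regex: maps the text found between
// "[." and ".]" to the character it names.
std::optional<char> resolve_collating_name(std::string_view name)
{
    // Excerpt only: a handful of rows from the table of symbolic names.
    static const std::unordered_map<std::string_view, char> names = {
        {"NUL", '\x00'},
        {"tab", '\t'},
        {"newline", '\n'},
        {"space", ' '},
        {"hyphen", '-'},
        {"period", '.'},
        {"left-square-bracket", '['},
        {"backslash", '\\'},
        {"right-square-bracket", ']'},
        {"tilde", '~'},
        {"DEL", '\x7F'},
    };

    // Symbolic names are checked first.
    if (auto it = names.find(name); it != names.end())
        return it->second;

    // Any single character is also a valid name and stands for itself.
    if (name.size() == 1)
        return name.front();

    // Anything else is unknown to this sketch.
    return std::nullopt;
}
```

With this sketch, `resolve_collating_name("left-square-bracket")` yields `'['`, 
`resolve_collating_name("x")` yields `'x'`, and an unrecognised multi-character 
name yields an empty `std::optional`. Digraphs are deliberately left out, since 
they name a two-character collating element, not a single character.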

[endsect]

[section:named_unicode Named Unicode Characters]

When using [link boost_regex.unicode Unicode-aware regular expressions] (with the `u32regex` type), all 
the normal symbolic names for Unicode characters (those given in UnicodeData.txt) 
are recognised. So, for example:

[pre \[\[.CYRILLIC CAPITAL LETTER I.\]\] ]

would match the Unicode character U+0418.

[endsect]
[endsect]