ceph/src/boost/libs/regex/doc/collating_names.qbk

   1 [/
   2   Copyright 2006-2007 John Maddock.
   3   Distributed under the Boost Software License, Version 1.0.
   4   (See accompanying file LICENSE_1_0.txt or copy at
   5   http://www.boost.org/LICENSE_1_0.txt).
   6 ]
   7
   8
   9 [section:collating_names Collating Names]
  10
  11 [section:digraphs Digraphs]
  12
  13 The following are treated as valid digraphs when used as a collating name:
  14
  15 "ae", "Ae", "AE", "ch", "Ch", "CH", "ll", "Ll", "LL", "ss", "Ss", "SS", "nj", "Nj", "NJ", "dz", "Dz", "DZ", "lj", "Lj", "LJ".
  16
  17 So, for example, the expression:
  18
  19 [pre \[\[.ae.\]-c\] ]
  20
  21 will match any character that collates between the digraph "ae" and the character "c".
  22
  23 [endsect]
  24
  25 [section:posix_symbolic_names POSIX Symbolic Names]
  26
  27 In addition to any single character, the following symbolic names are recognised
  28 as valid collating element names. This allows you to write, for example:
  29
  30 [pre \[\[.left-square-bracket.\]\[.right-square-bracket.\]\]]
  31
  32 to match either "\[" or "\]".
  33
  34 [table
  35 [[Name][Character]]
  36 [[NUL]  [\\x00]]
  37 [[SOH]  [\\x01]]
  38 [[STX]  [\\x02]]
  39 [[ETX]  [\\x03]]
  40 [[EOT]  [\\x04]]
  41 [[ENQ]  [\\x05]]
  42 [[ACK]  [\\x06]]
  43 [[alert]        [\\x07]]
  44 [[backspace]    [\\x08]]
  45 [[tab]  [\\t]]
  46 [[newline]      [\\n]]
  47 [[vertical-tab]         [\\v]]
  48 [[form-feed]    [\\f]]
  49 [[carriage-return]      [\\r]]
  50 [[SO]   [\\x0E]]
  51 [[SI]   [\\x0F]]
  52 [[DLE]  [\\x10]]
  53 [[DC1]  [\\x11]]
  54 [[DC2]  [\\x12]]
  55 [[DC3]  [\\x13]]
  56 [[DC4]  [\\x14]]
  57 [[NAK]  [\\x15]]
  58 [[SYN]  [\\x16]]
  59 [[ETB]  [\\x17]]
  60 [[CAN]  [\\x18]]
  61 [[EM]   [\\x19]]
  62 [[SUB]  [\\x1A]]
  63 [[ESC]  [\\x1B]]
  64 [[IS4]  [\\x1C]]
  65 [[IS3]  [\\x1D]]
  66 [[IS2]  [\\x1E]]
  67 [[IS1]  [\\x1F]]
  68 [[space]        [\\x20]]
  69 [[exclamation-mark]     [!]]
  70 [[quotation-mark]       ["]]
  71 [[number-sign]  [#]]
  72 [[dollar-sign]  [$]]
  73 [[percent-sign]         [%]]
  74 [[ampersand]    [&]]
  75 [[apostrophe]   [\']]
  76 [[left-parenthesis]     [(]]
  77 [[right-parenthesis]    [)]]
  78 [[asterisk]     [\*]]
  79 [[plus-sign]    [+]]
  80 [[comma]        [,]]
  81 [[hyphen]       [-]]
  82 [[period]       [.]]
  83 [[slash]        [ / ]]
  84 [[zero]         [0]]
  85 [[one]  [1]]
  86 [[two]  [2]]
  87 [[three]        [3]]
  88 [[four]         [4]]
  89 [[five]         [5]]
  90 [[six]  [6]]
  91 [[seven]        [7]]
  92 [[eight]        [8]]
  93 [[nine]         [9]]
  94 [[colon]        [\:]]
  95 [[semicolon]    [;]]
  96 [[less-than-sign]       [<]]
  97 [[equals-sign]  [=]]
  98 [[greater-than-sign]    [>]]
  99 [[question-mark]        [?]]
 100 [[commercial-at]        [@]]
 101 [[left-square-bracket]  [\[]]
 102 [[backslash][\\]]
 103 [[right-square-bracket][\]]]
 104 [[circumflex][^]]
 105 [[underscore][_]]
 106 [[grave-accent][`]]
 107 [[left-curly-bracket][{]]
 108 [[vertical-line][|]]
 109 [[right-curly-bracket][}]]
 110 [[tilde][~]]
 111 [[DEL][\\x7F]]
 112 ]
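
As a rough illustration of how the names in the table above are used, the sketch below wraps the square-bracket pattern from this section in a small helper. It deliberately uses `std::regex` as a stand-in so that it compiles without Boost; that the standard library recognises these POSIX names is an assumption about the implementation (it is expected to hold for libstdc++ and libc++, but is not guaranteed everywhere). The helper name `is_square_bracket` is hypothetical and not part of any library.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of Boost.Regex or the standard library):
// returns true when `text` consists of exactly one square bracket.
bool is_square_bracket(const std::string& text)
{
    // The same bracket expression as in the documentation above: two
    // collating elements, each referred to by its POSIX symbolic name.
    // Assumption: the regex traits in use know these symbolic names.
    static const std::regex re(
        "[[.left-square-bracket.][.right-square-bracket.]]");

    // regex_match requires the whole input to match the expression.
    return std::regex_match(text, re);
}
```

With Boost.Regex the same pattern string should be usable by including `boost/regex.hpp` and replacing `std::regex` and `std::regex_match` with `boost::regex` and `boost::regex_match`. Using the symbolic names avoids having to work out how to escape a literal bracket inside a bracket expression.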
 113
 114 [endsect]
 115
 116 [section:named_unicode Named Unicode Characters]
 117
 118 When using [link boost_regex.unicode Unicode-aware regular expressions] (with the `u32regex` type), all
 119 the normal symbolic names for Unicode characters (those given in UnicodeData.txt)
 120 are recognised.  So, for example:
 121
 122 [pre \[\[.CYRILLIC CAPITAL LETTER I.\]\] ]
 123
 124 would match the Unicode character 0x0418.
 125
 126 [endsect]
 127 [endsect]
 128