Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | [/ |
2 | / Copyright (c) 2008 Eric Niebler | |
3 | / | |
4 | / Distributed under the Boost Software License, Version 1.0. (See accompanying | |
5 | / file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) | |
6 | /] | |
7 | ||
8 | [section Localization and Regex Traits] | |
9 | ||
10 | [h2 Overview] | |
11 | ||
12 | Matching a regular expression against a string often requires locale-dependent information. For example, | |
13 | how are case-insensitive comparisons performed? The locale-sensitive behavior is captured in a traits class. | |
14 | xpressive provides three traits class templates: `cpp_regex_traits<>`, `c_regex_traits<>` and `null_regex_traits<>`. | |
15 | The first wraps a `std::locale`, the second wraps the global C locale, and the third is a stub traits type for | |
16 | use when searching non-character data. All traits templates conform to the | |
17 | [link boost_xpressive.user_s_guide.concepts.traits_requirements Regex Traits Concept]. | |
18 | ||
19 | [h2 Setting the Default Regex Trait] | |
20 | ||
21 | By default, xpressive uses `cpp_regex_traits<>` for all patterns. This causes all regex objects to use | |
22 | the global `std::locale`. If you compile with `BOOST_XPRESSIVE_USE_C_TRAITS` defined, then xpressive will use | |
23 | `c_regex_traits<>` by default. | |
24 | ||
25 | [h2 Using Custom Traits with Dynamic Regexes] | |
26 | ||
27 | To create a dynamic regex that uses a custom traits object, you must use _regex_compiler_. | |
28 | The basic steps are shown in the following example: | |
29 | ||
30 | // Declare a regex_compiler that uses the global C locale | |
31 | regex_compiler<char const *, c_regex_traits<char> > crxcomp; | |
32 | cregex crx = crxcomp.compile( "\\w+" ); | |
33 | ||
34 | // Declare a regex_compiler that uses a custom std::locale | |
35 | std::locale loc = /* ... create a locale here ... */; | |
36 | regex_compiler<char const *, cpp_regex_traits<char> > cpprxcomp(loc); | |
37 | cregex cpprx = cpprxcomp.compile( "\\w+" ); | |
38 | ||
39 | The `regex_compiler` objects act as regex factories. Once they have been imbued with a locale, | |
40 | every regex object they create will use that locale. | |
41 | ||
42 | [h2 Using Custom Traits with Static Regexes] | |
43 | ||
44 | If you want a particular static regex to use a different set of traits, you can use the special `imbue()` | |
45 | pattern modifier. For instance: | |
46 | ||
47 | // Define a regex that uses the global C locale | |
48 | c_regex_traits<char> ctraits; | |
49 | sregex crx = imbue(ctraits)( +_w ); | |
50 | ||
51 | // Define a regex that uses a customized std::locale | |
52 | std::locale loc = /* ... create a locale here ... */; | |
53 | cpp_regex_traits<char> cpptraits(loc); | |
54 | sregex cpprx1 = imbue(cpptraits)( +_w ); | |
55 | ||
56 | // Shorthand for the above | |
57 | sregex cpprx2 = imbue(loc)( +_w ); | |
58 | ||
59 | The `imbue()` pattern modifier must wrap the entire pattern. It is an error to `imbue` only | |
60 | part of a static regex. For example: | |
61 | ||
62 | // ERROR! Cannot imbue() only part of a regex | |
63 | sregex error = _w >> imbue(loc)( _w ); | |
64 | ||
65 | [h2 Searching Non-Character Data With [^null_regex_traits]] | |
66 | ||
67 | With xpressive static regexes, you are not limited to searching for patterns in character sequences. | |
68 | You can search for patterns in raw bytes, integers, or anything that conforms to the | |
69 | [link boost_xpressive.user_s_guide.concepts.chart_requirements Char Concept]. The `null_regex_traits<>` template makes this simple. It is a | |
70 | stub implementation of the [link boost_xpressive.user_s_guide.concepts.traits_requirements Regex Traits Concept]. It recognizes | |
71 | no character classes and performs no case mapping. | |
72 | ||
73 | For example, with `null_regex_traits<>`, you can write a static regex to find a pattern in a | |
74 | sequence of integers as follows: | |
75 | ||
76 | // some integral data to search | |
77 | int const data[] = {0, 1, 2, 3, 4, 5, 6}; | |
78 | ||
79 | // create a null_regex_traits<> object for searching integers ... | |
80 | null_regex_traits<int> nul; | |
81 | ||
82 | // imbue a regex object with the null_regex_traits ... | |
83 | basic_regex<int const *> rex = imbue(nul)(1 >> +((set= 2,3) | 4) >> 5); | |
84 | match_results<int const *> what; | |
85 | ||
86 | // search for the pattern in the array of integers ... | |
87 | regex_search(data, data + 7, what, rex); | |
88 | ||
89 | assert(what[0].matched); | |
90 | assert(*what[0].first == 1); | |
91 | assert(*what[0].second == 6); | |
92 | ||
93 | [endsect] |