Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | [/ |
2 | / Copyright (c) 2009 Eric Niebler | |
3 | / | |
4 | / Distributed under the Boost Software License, Version 1.0. (See accompanying | |
5 | / file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) | |
6 | /] | |
7 | ||
8 | [section Named Captures] | |
9 | ||
10 | [h2 Overview] | |
11 | ||
12 | For complicated regular expressions, dealing with numbered captures can be a | |
13 | pain. Counting left parentheses to figure out which capture to reference is | |
14 | no fun. Less fun is the fact that merely editing a regular expression could | |
15 | cause a capture to be assigned a new number, invalidating code that refers back | |
16 | to it by the old number. | |
17 | ||
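To make the renumbering problem concrete, here is a minimal sketch. It deliberately uses `std::regex` from the C++ standard library rather than xpressive, only so that it has no dependencies; the helper name `capture` is hypothetical and exists just for this illustration. Adding one optional group to the front of a pattern silently changes what capture 2 refers to:

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: search `input` for `pattern` and return the text of
// the numbered capture `index`, or an empty string if there is no match.
std::string capture(std::string const &pattern, std::string const &input, std::size_t index)
{
    std::smatch what;
    if(!std::regex_search(input, what, std::regex(pattern)))
    {
        return std::string();
    }
    return what[index].str();
}

// Original pattern: capture 2 is the number after the dash.
//   capture("(\\d+)-(\\d+)", "10-20", 2)          yields "20"
// Edited pattern with a new leading group for an optional sign:
// capture 2 is now the number before the dash.
//   capture("([+-]?)(\\d+)-(\\d+)", "10-20", 2)   yields "10"
```

Any code that still asks for capture 2 after the edit compiles and runs, but reads the wrong field. Referring to captures by name avoids this.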
18 | Other regular expression engines solve this problem with a feature called | |
19 | /named captures/. This feature allows you to assign a name to a capture, and | |
20 | to refer back to the capture by name rather than by number. Xpressive also supports | |
21 | named captures, both in dynamic and in static regexes. | |
22 | ||
23 | [h2 Dynamic Named Captures] | |
24 | ||
25 | For dynamic regular expressions, xpressive follows the lead of other popular | |
26 | regex engines with the syntax of named captures. You can create a named capture | |
27 | with `"(?P<xxx>...)"` and refer back to that capture with `"(?P=xxx)"`. Here, | |
28 | for instance, is a regular expression that creates a named capture and refers | |
29 | back to it: | |
30 | ||
31 | // Create a named capture called "char" that matches a single | |
32 | // character and refer back to that capture by name. | |
33 | sregex rx = sregex::compile("(?P<char>.)(?P=char)"); | |
34 | ||
35 | The effect of the above regular expression is to find the first doubled | |
36 | character. | |
37 | ||
38 | Once you have executed a match or search operation using a regex with named | |
39 | captures, you can access the named capture through the _match_results_ object | |
40 | using the capture's name. | |
41 | ||
42 | std::string str("tweet"); | |
43 | sregex rx = sregex::compile("(?P<char>.)(?P=char)"); | |
44 | smatch what; | |
45 | if(regex_search(str, what, rx)) | |
46 | { | |
47 | std::cout << "char = " << what["char"] << std::endl; | |
48 | } | |
49 | ||
50 | The above code displays: | |
51 | ||
52 | [pre | |
53 | char = e | |
54 | ] | |
55 | ||
56 | You can also refer back to a named capture from within a substitution string. | |
57 | The syntax for that is `"\\g<xxx>"`. Below is some code that demonstrates how | |
58 | to use named captures when doing string substitution. | |
59 | ||
60 | std::string str("tweet"); | |
61 | sregex rx = sregex::compile("(?P<char>.)(?P=char)"); | |
62 | str = regex_replace(str, rx, "**\\g<char>**", regex_constants::format_perl); | |
63 | std::cout << str << std::endl; | |
64 | ||
65 | Notice that you have to specify `format_perl` when using named captures in a | |
66 | substitution string. Only the Perl format syntax recognizes `"\\g<xxx>"`. The above code displays: | |
67 | ||
68 | [pre | |
69 | tw\*\*e\*\*t | |
70 | ] | |
71 | ||
72 | [h2 Static Named Captures] | |
73 | ||
74 | If you're using static regular expressions, creating and using named | |
75 | captures is even easier. You can use the _mark_tag_ type to create | |
76 | a variable that you can use like [globalref boost::xpressive::s1 `s1`], | |
77 | [globalref boost::xpressive::s2 `s2`] and friends, but with a name | |
78 | that is more meaningful. Below is how the above example would look | |
79 | using static regexes: | |
80 | ||
81 | mark_tag char_(1); // char_ is now a synonym for s1 | |
82 | sregex rx = (char_= _) >> char_; | |
83 | ||
84 | After a match operation, you can use the `mark_tag` to index into the | |
85 | _match_results_ to access the named capture: | |
86 | ||
87 | std::string str("tweet"); | |
88 | mark_tag char_(1); | |
89 | sregex rx = (char_= _) >> char_; | |
90 | smatch what; | |
91 | if(regex_search(str, what, rx)) | |
92 | { | |
93 | std::cout << "char = " << what[char_] << std::endl; | |
94 | } | |
95 | ||
96 | The above code displays: | |
97 | ||
98 | [pre | |
99 | char = e | |
100 | ] | |
101 | ||
102 | When doing string substitutions with _regex_replace_, you can use named | |
103 | captures to create /format expressions/ as below: | |
104 | ||
105 | std::string str("tweet"); | |
106 | mark_tag char_(1); | |
107 | sregex rx = (char_= _) >> char_; | |
108 | str = regex_replace(str, rx, "**" + char_ + "**"); | |
109 | std::cout << str << std::endl; | |
110 | ||
111 | The above code displays: | |
112 | ||
113 | [pre | |
114 | tw\*\*e\*\*t | |
115 | ] | |
116 | ||
117 | [note You need to include [^<boost/xpressive/regex_actions.hpp>] to | |
118 | use format expressions.] | |
119 | ||
120 | [endsect] |