<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<title>Partial Matches</title>
<link rel="stylesheet" href="../../../../../doc/src/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.77.1">
<link rel="home" href="../index.html" title="Boost.Regex 5.1.2">
<link rel="up" href="../index.html" title="Boost.Regex 5.1.2">
<link rel="prev" href="captures.html" title="Understanding Marked Sub-Expressions and Captures">
<link rel="next" href="syntax.html" title="Regular Expression Syntax">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../boost.png"></td>
<td align="center"><a href="../../../../../index.html">Home</a></td>
<td align="center"><a href="../../../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="captures.html"><img src="../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../index.html"><img src="../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="syntax.html"><img src="../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="boost_regex.partial_matches"></a><a class="link" href="partial_matches.html" title="Partial Matches">Partial Matches</a>
</h2></div></div></div>
<p>
      The <a class="link" href="ref/match_flag_type.html" title="match_flag_type"><code class="computeroutput"><span class="identifier">match_flag_type</span></code></a>
      <code class="computeroutput"><span class="identifier">match_partial</span></code> can be passed
      to the following algorithms: <a class="link" href="ref/regex_match.html" title="regex_match"><code class="computeroutput"><span class="identifier">regex_match</span></code></a>, <a class="link" href="ref/regex_search.html" title="regex_search"><code class="computeroutput"><span class="identifier">regex_search</span></code></a>, and <a class="link" href="ref/deprecated_interfaces/regex_grep.html" title="regex_grep (Deprecated)"><code class="computeroutput"><span class="identifier">regex_grep</span></code></a>, and used with the iterator
      <a class="link" href="ref/regex_iterator.html" title="regex_iterator"><code class="computeroutput"><span class="identifier">regex_iterator</span></code></a>.
      When used, it indicates that partial as well as full matches should be found.
      A partial match is one that matched one or more characters at the end of the
      text input, but did not match all of the regular expression (although it may
      have done so had more input been available). Partial matches are typically
      used either when validating data input (checking each character as it is entered
      on the keyboard), or when searching texts that are too long to load
      into memory (or even into a memory-mapped file) or are of indeterminate length
      (for example, the source may be a socket or similar). Partial and full matches
      can be differentiated as shown in the following table (the variable M represents
      an instance of <a class="link" href="ref/match_results.html" title="match_results"><code class="computeroutput"><span class="identifier">match_results</span></code></a> as filled in by <a class="link" href="ref/regex_match.html" title="regex_match"><code class="computeroutput"><span class="identifier">regex_match</span></code></a>,
      <a class="link" href="ref/regex_search.html" title="regex_search"><code class="computeroutput"><span class="identifier">regex_search</span></code></a>
      or <a class="link" href="ref/deprecated_interfaces/regex_grep.html" title="regex_grep (Deprecated)"><code class="computeroutput"><span class="identifier">regex_grep</span></code></a>):
    </p>
<div class="informaltable"><table class="table">
<colgroup>
<col>
<col>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th></th>
<th><p>Result</p></th>
<th><p>M[0].matched</p></th>
<th><p>M[0].first</p></th>
<th><p>M[0].second</p></th>
</tr></thead>
<tbody>
<tr>
<td><p>No match</p></td>
<td><p>False</p></td>
<td><p>Undefined</p></td>
<td><p>Undefined</p></td>
<td><p>Undefined</p></td>
</tr>
<tr>
<td><p>Partial match</p></td>
<td><p>True</p></td>
<td><p>False</p></td>
<td><p>Start of partial match.</p></td>
<td><p>End of partial match (end of text).</p></td>
</tr>
<tr>
<td><p>Full match</p></td>
<td><p>True</p></td>
<td><p>True</p></td>
<td><p>Start of full match.</p></td>
<td><p>End of full match.</p></td>
</tr>
</tbody>
</table></div>
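<p>
      The three rows of the table can be told apart with two checks: the value
      returned by the algorithm, then <code class="computeroutput">M[0].matched</code>.
      The following is a minimal sketch of that decision only. The names
      <code class="computeroutput">MatchState</code>, <code class="computeroutput">classify</code>
      and <code class="computeroutput">StubResults</code> are hypothetical and are
      not part of Boost.Regex; the template is written so that any type offering
      <code class="computeroutput">m[0].matched</code> can be passed, which is assumed
      to include <code class="computeroutput">match_results</code>.
    </p>

```cpp
#include <cassert>

// Hypothetical names: none of these are part of Boost.Regex.
enum class MatchState { none, partial, full };

// `found` is the bool returned by the algorithm that was called with
// match_partial; `m` is the results object that call filled in.
template <class Results>
MatchState classify(bool found, const Results& m)
{
    if (!found)
        return MatchState::none;   // m[0] is undefined here, so it is not read
    return m[0].matched ? MatchState::full : MatchState::partial;
}

// Stand-in for a results object, for illustration only.
struct StubSub { bool matched; };
struct StubResults {
    StubSub sub;
    const StubSub& operator[](int) const { return sub; }
};
```

<p>
      The return value is tested first because the table lists every member of
      <code class="computeroutput">M[0]</code> as undefined when no match was found.
    </p>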
<p>
      Be aware that using partial matches can sometimes produce surprising
      results:
    </p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
          There are some expressions, such as ".*abc", that will always
          produce a partial match. This problem can be reduced by careful construction
          of the regular expressions used, or by setting flags like match_not_dot_newline
          so that expressions like .* can't match past line boundaries.
        </li>
<li class="listitem">
          Boost.Regex currently prefers leftmost matches to full matches, so for
          example matching "abc|b" against "ab" produces a partial
          match against the "ab" rather than a full match against "b".
          It's more efficient to work this way, but it may not be the behavior you want
          in all situations.
        </li>
<li class="listitem">
          There are situations where full matches are found even though partial matches
          are also possible: for example, if the input so far ends with "abc"
          and the regular expression is "\w+", then a full match is found
          even though there may be more word characters to come. This particular
          case can be detected by checking whether the match found ends at the end
          of the current input string. However, there are situations where that is not
          possible: for example, an expression such as "abc.*123" may always
          have longer matches available, since it could conceivably match the entire
          input string (no matter how long it may be).
        </li>
</ul></div>
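<p>
      The end-of-input check mentioned in the last item above can be sketched as
      follows. This is an illustration under stated assumptions, not library code:
      <code class="computeroutput">may_extend</code> and
      <code class="computeroutput">StubMatch</code> are hypothetical names, and the
      function assumes the algorithm has already returned true, so that
      <code class="computeroutput">m[0]</code> is defined.
    </p>

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of Boost.Regex.
// Precondition: the matching algorithm returned true for `m`.
// Returns true when more input might change the match that was reported.
template <class Results, class Iterator>
bool may_extend(const Results& m, Iterator end_of_input)
{
    if (!m[0].matched)
        return true;                     // partial match: needs more input
    // Full match that runs up to the end of the text seen so far,
    // e.g. "\w+" matching the trailing "abc".
    return m[0].second == end_of_input;
}

// Stand-in for a results object, for illustration only.
struct StubMatchSub {
    std::string::const_iterator first, second;
    bool matched;
};
struct StubMatch {
    StubMatchSub sub;
    const StubMatchSub& operator[](int) const { return sub; }
};
```

<p>
      As the list item notes, this check cannot help with an expression such as
      "abc.*123": a full match that ends before the end of the input may still be
      superseded by a longer one once more text arrives.
    </p>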
<p>
      The following example tests whether the text could be a valid credit
      card number. As the user presses a key, the character entered is added
      to the string being built up, and the string is passed to <code class="computeroutput"><span class="identifier">is_possible_card_number</span></code>.
      If this returns true then the text could be a valid card number, so the user
      interface's OK button would be enabled. If it returns false, then this is not
      yet a valid card number, but could be with more input, so the user interface
      would disable the OK button. Finally, if the procedure throws an exception,
      the input could never become a valid number, so the character just entered must
      be discarded and a suitable error indication displayed to the user.
    </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">string</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">stdexcept</span><span class="special">&gt;</span> <span class="comment">// std::runtime_error</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">regex</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>

<span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span> <span class="identifier">e</span><span class="special">(</span><span class="string">"(\\d{3,4})[- ]?(\\d{4})[- ]?(\\d{4})[- ]?(\\d{4})"</span><span class="special">);</span>

<span class="keyword">bool</span> <span class="identifier">is_possible_card_number</span><span class="special">(</span><span class="keyword">const</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">&amp;</span> <span class="identifier">input</span><span class="special">)</span>
<span class="special">{</span>
   <span class="comment">//</span>
   <span class="comment">// return false for partial match, true for full match, or throw for</span>
   <span class="comment">// impossible match based on what we have so far...</span>
   <span class="identifier">boost</span><span class="special">::</span><span class="identifier">match_results</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">::</span><span class="identifier">const_iterator</span><span class="special">&gt;</span> <span class="identifier">what</span><span class="special">;</span>
   <span class="keyword">if</span><span class="special">(</span><span class="number">0</span> <span class="special">==</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex_match</span><span class="special">(</span><span class="identifier">input</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">e</span><span class="special">,</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">match_default</span> <span class="special">|</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">match_partial</span><span class="special">))</span>
   <span class="special">{</span>
      <span class="comment">// the input so far could not possibly be valid so reject it:</span>
      <span class="keyword">throw</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">runtime_error</span><span class="special">(</span>
         <span class="string">"Invalid data entered - this could not possibly be a valid card number"</span><span class="special">);</span>
   <span class="special">}</span>
   <span class="comment">// OK so far so good, but have we finished?</span>
   <span class="keyword">if</span><span class="special">(</span><span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">].</span><span class="identifier">matched</span><span class="special">)</span>
   <span class="special">{</span>
      <span class="comment">// excellent, we have a result:</span>
      <span class="keyword">return</span> <span class="keyword">true</span><span class="special">;</span>
   <span class="special">}</span>
   <span class="comment">// what we have so far is only a partial match...</span>
   <span class="keyword">return</span> <span class="keyword">false</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
      In the following example, text input is taken from a stream containing an unknown
      amount of text; this example simply counts the number of HTML tags encountered
      in the stream. The text is loaded into a buffer and searched a part at a time.
      If a partial match is encountered, the text of that partial match is carried
      over and searched a second time as the start of the next batch of text:
    </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">fstream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">sstream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">string</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">cstring</span><span class="special">&gt;</span> <span class="comment">// std::memmove</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">regex</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>

<span class="comment">// match some kind of html tag:</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">regex</span> <span class="identifier">e</span><span class="special">(</span><span class="string">"&lt;[^&gt;]*&gt;"</span><span class="special">);</span>
<span class="comment">// count how many:</span>
<span class="keyword">unsigned</span> <span class="keyword">int</span> <span class="identifier">tags</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>

<span class="keyword">void</span> <span class="identifier">search</span><span class="special">(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">istream</span><span class="special">&amp;</span> <span class="identifier">is</span><span class="special">)</span>
<span class="special">{</span>
   <span class="comment">// buffer we'll be searching in:</span>
   <span class="keyword">char</span> <span class="identifier">buf</span><span class="special">[</span><span class="number">4096</span><span class="special">];</span>
   <span class="comment">// saved position of start of partial match (end of buf if none):</span>
   <span class="keyword">const</span> <span class="keyword">char</span><span class="special">*</span> <span class="identifier">next_pos</span> <span class="special">=</span> <span class="identifier">buf</span> <span class="special">+</span> <span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">buf</span><span class="special">);</span>
   <span class="comment">// flag to indicate whether there is more input to come:</span>
   <span class="keyword">bool</span> <span class="identifier">have_more</span> <span class="special">=</span> <span class="keyword">true</span><span class="special">;</span>

   <span class="keyword">while</span><span class="special">(</span><span class="identifier">have_more</span><span class="special">)</span>
   <span class="special">{</span>
      <span class="comment">// how much do we copy forward from last try:</span>
      <span class="keyword">unsigned</span> <span class="identifier">leftover</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">buf</span> <span class="special">+</span> <span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">buf</span><span class="special">))</span> <span class="special">-</span> <span class="identifier">next_pos</span><span class="special">;</span>
      <span class="comment">// and how much is left to fill:</span>
      <span class="keyword">unsigned</span> <span class="identifier">size</span> <span class="special">=</span> <span class="identifier">next_pos</span> <span class="special">-</span> <span class="identifier">buf</span><span class="special">;</span>
      <span class="comment">// copy forward whatever we have left:</span>
      <span class="identifier">std</span><span class="special">::</span><span class="identifier">memmove</span><span class="special">(</span><span class="identifier">buf</span><span class="special">,</span> <span class="identifier">next_pos</span><span class="special">,</span> <span class="identifier">leftover</span><span class="special">);</span>
      <span class="comment">// fill the rest from the stream:</span>
      <span class="identifier">is</span><span class="special">.</span><span class="identifier">read</span><span class="special">(</span><span class="identifier">buf</span> <span class="special">+</span> <span class="identifier">leftover</span><span class="special">,</span> <span class="identifier">size</span><span class="special">);</span>
      <span class="keyword">unsigned</span> <span class="identifier">read</span> <span class="special">=</span> <span class="identifier">is</span><span class="special">.</span><span class="identifier">gcount</span><span class="special">();</span>
      <span class="comment">// check to see if we've run out of text:</span>
      <span class="identifier">have_more</span> <span class="special">=</span> <span class="identifier">read</span> <span class="special">==</span> <span class="identifier">size</span><span class="special">;</span>
      <span class="comment">// reset next_pos:</span>
      <span class="identifier">next_pos</span> <span class="special">=</span> <span class="identifier">buf</span> <span class="special">+</span> <span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">buf</span><span class="special">);</span>
      <span class="comment">// and then iterate:</span>
      <span class="identifier">boost</span><span class="special">::</span><span class="identifier">cregex_iterator</span> <span class="identifier">a</span><span class="special">(</span>
         <span class="identifier">buf</span><span class="special">,</span>
         <span class="identifier">buf</span> <span class="special">+</span> <span class="identifier">read</span> <span class="special">+</span> <span class="identifier">leftover</span><span class="special">,</span>
         <span class="identifier">e</span><span class="special">,</span>
         <span class="identifier">boost</span><span class="special">::</span><span class="identifier">match_default</span> <span class="special">|</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">match_partial</span><span class="special">);</span>
      <span class="identifier">boost</span><span class="special">::</span><span class="identifier">cregex_iterator</span> <span class="identifier">b</span><span class="special">;</span>

      <span class="keyword">while</span><span class="special">(</span><span class="identifier">a</span> <span class="special">!=</span> <span class="identifier">b</span><span class="special">)</span>
      <span class="special">{</span>
         <span class="keyword">if</span><span class="special">((*</span><span class="identifier">a</span><span class="special">)[</span><span class="number">0</span><span class="special">].</span><span class="identifier">matched</span> <span class="special">==</span> <span class="keyword">false</span><span class="special">)</span>
         <span class="special">{</span>
            <span class="comment">// Partial match, save position and break. This assumes no tag is</span>
            <span class="comment">// longer than buf: otherwise next_pos == buf and nothing new is read.</span>
            <span class="identifier">next_pos</span> <span class="special">=</span> <span class="special">(*</span><span class="identifier">a</span><span class="special">)[</span><span class="number">0</span><span class="special">].</span><span class="identifier">first</span><span class="special">;</span>
            <span class="keyword">break</span><span class="special">;</span>
         <span class="special">}</span>
         <span class="keyword">else</span>
         <span class="special">{</span>
            <span class="comment">// full match:</span>
            <span class="special">++</span><span class="identifier">tags</span><span class="special">;</span>
         <span class="special">}</span>

         <span class="comment">// move to next match:</span>
         <span class="special">++</span><span class="identifier">a</span><span class="special">;</span>
      <span class="special">}</span>
   <span class="special">}</span>
<span class="special">}</span>
</pre>
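<p>
      The buffer handling in the example above (carry the unmatched tail to the
      front, then top the buffer up from the stream) can be isolated and exercised
      without any regular expression at all. The sketch below is one way to do
      that; <code class="computeroutput">refill</code> and its parameters are
      hypothetical names chosen for illustration, not part of Boost.Regex or of the
      example above.
    </p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper. `buf` currently holds `valid` bytes; everything from
// `keep_from` up to buf + valid is moved to the front of the buffer, and the
// space after it is filled from the stream. Returns the new number of valid
// bytes, which is less than `capacity` once the stream has run dry.
std::size_t refill(char* buf, std::size_t capacity, std::size_t valid,
                   const char* keep_from, std::istream& is)
{
    const std::size_t leftover =
        static_cast<std::size_t>((buf + valid) - keep_from);
    // memmove rather than memcpy: source and destination may overlap.
    std::memmove(buf, keep_from, leftover);
    is.read(buf + leftover, static_cast<std::streamsize>(capacity - leftover));
    return leftover + static_cast<std::size_t>(is.gcount());
}
```

<p>
      In a search loop, <code class="computeroutput">keep_from</code> would be the
      <code class="computeroutput">first</code> iterator of a partial match, or the
      end of the valid data when the last batch ended cleanly. Tracking the valid
      byte count explicitly is a design choice of this sketch; the example above
      instead measures everything from the end of the fixed-size buffer.
    </p>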
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright &#169; 1998-2013 John Maddock<p>
        Distributed under the Boost Software License, Version 1.0. (See accompanying
        file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
      </p>
</div></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="captures.html"><img src="../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../index.html"><img src="../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="syntax.html"><img src="../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
</body>
</html>