<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<TITLE>Tutorial</TITLE>
<LINK REL="stylesheet" HREF="../../../../boost.css">
<LINK REL="stylesheet" HREF="../theme/iostreams.css">
<H1 CLASS="title">Tutorial</H1>
<A HREF='tab_expanding_filters.html'><IMG BORDER=0 WIDTH=19 HEIGHT=19 SRC='../../../../doc/src/images/prev.png'></A>
<A HREF='tutorial.html'><IMG BORDER=0 WIDTH=19 HEIGHT=19 SRC='../../../../doc/src/images/up.png'></A>
<A HREF='unix2dos_filters.html'><IMG BORDER=0 WIDTH=19 HEIGHT=19 SRC='../../../../doc/src/images/next.png'></A>
<H2>2.2.6. Dictionary Filters</H2>
<P>
    A <SPAN CLASS='term'>dictionary filter</SPAN> is a Filter which performs text substitution in the following manner. It maintains a collection of pairs of strings whose first components are words and whose second components represent replacement text. I'll call such a collection a <SPAN CLASS='term'>dictionary</SPAN>, and refer to the pairs it contains as <SPAN CLASS='term'>definitions</SPAN>. When a dictionary filter encounters a word which appears as the first component of a definition, it forwards the replacement text instead of the original word. Other words, whitespace and punctuation are forwarded unchanged.
</P>
<P>
    The basic algorithm is as follows: You examine characters one at a time, appending them to a string which I'll call the <SPAN CLASS='term'>current word</SPAN>. When you encounter a non-alphabetic character, you consult the dictionary to determine whether the current word appears as the first component of a definition. If it does, you forward the replacement text followed by the non-alphabetic character. Otherwise, you forward the current word followed by the non-alphabetic character. In either case, you then clear the current word. When the end-of-stream is reached, you consult the dictionary again and forward either the current word or its replacement, as appropriate.
</P>
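<P>
    As a rough illustration of the algorithm just described, here is a minimal, self-contained sketch that applies a dictionary to a whole string at once. The names <CODE>lookup</CODE> and <CODE>apply_dictionary</CODE> are hypothetical and do not come from the example header. The sketch also assumes a plain <CODE>std::map</CODE> with case-sensitive lookup, which is simpler than the <CODE>dictionary</CODE> class used in the rest of this section.
</P>

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper: returns the replacement text for `word` if `dict`
// contains a definition for it, or `word` itself otherwise.
inline std::string lookup(const std::map<std::string, std::string>& dict,
                          const std::string& word)
{
    std::map<std::string, std::string>::const_iterator it = dict.find(word);
    return it != dict.end() ? it->second : word;
}

// Hypothetical helper: applies `dict` to `input` one character at a time.
inline std::string apply_dictionary(const std::map<std::string, std::string>& dict,
                                    const std::string& input)
{
    std::string output;
    std::string current_word;
    for (std::string::size_type i = 0; i < input.size(); ++i) {
        const char c = input[i];
        if (std::isalpha(static_cast<unsigned char>(c))) {
            current_word += c;                     // still inside a word
        } else {
            output += lookup(dict, current_word);  // word ended: substitute if defined
            output += c;                           // forward the delimiter unchanged
            current_word.erase();
        }
    }
    output += lookup(dict, current_word);          // end-of-stream: flush the last word
    return output;
}
```

<P>
    The final call to <CODE>lookup</CODE> corresponds to the end-of-stream step: without it, a word that ends the input would never be forwarded.
</P>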
<P>
    In the following sections, I'll express this algorithm as a <A HREF="../classes/stdio_filter.html"><CODE>stdio_filter</CODE></A>, an <A HREF="../concepts/input_filter.html">InputFilter</A> and an <A HREF="../concepts/output_filter.html">OutputFilter</A>. The source code can be found in the header <A HREF="../../example/dictionary_filter.hpp"><CODE>&lt;libs/iostreams/example/dictionary_filter.hpp&gt;</CODE></A>.
</P>
<A NAME="dictionary"></A>
<H4><CODE>dictionary</CODE></H4>

<P>You can represent a dictionary using the following class:</P>

<PRE class="broken_ie"><SPAN CLASS='preprocessor'>#include</SPAN> <SPAN CLASS="literal">&lt;map&gt;</SPAN>
<SPAN CLASS="preprocessor">#include</SPAN> <SPAN CLASS="literal">&lt;string&gt;</SPAN>

<SPAN CLASS='keyword'>namespace</SPAN> boost { <SPAN CLASS='keyword'>namespace</SPAN> iostreams { <SPAN CLASS='keyword'>namespace</SPAN> example {

<SPAN CLASS="keyword">class</SPAN> dictionary {
<SPAN CLASS="keyword">public</SPAN>:
    <SPAN CLASS="keyword">void</SPAN> add(std::string key, <SPAN CLASS="keyword">const</SPAN> std::string&amp; value);
    <SPAN CLASS="keyword">void</SPAN> replace(std::string&amp; key);

    <SPAN CLASS='comment'>/* ... */</SPAN>
};

} } } <SPAN CLASS="comment">// End namespace boost::iostreams::example</SPAN></PRE>
<P>
    The member function <CODE>add</CODE> converts <CODE>key</CODE> to lower case and adds the pair (<CODE>key</CODE>, <CODE>value</CODE>) to the dictionary. The member function <CODE>replace</CODE> searches for a definition whose first component is equal to the result of converting <CODE>key</CODE> to lower case. If it finds such a definition, it assigns the replacement text to <CODE>key</CODE>, adjusting the case of the first character to match the case of the first character of the original <CODE>key</CODE>. Otherwise, it does nothing.
</P>
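<P>
    The class body is elided above. As an illustration only, the following sketch shows one way the behavior described for <CODE>add</CODE> and <CODE>replace</CODE> could be implemented. The class name <CODE>dictionary_sketch</CODE>, the helper <CODE>to_lower</CODE> and the member <CODE>map_</CODE> are assumptions made for this sketch. It is not the code from the example header, and it assumes that "matching the case" means both raising and lowering the first character as needed.
</P>

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Hypothetical stand-in for the elided class body.
class dictionary_sketch {
public:
    // Stores the definition under the lower-cased key.
    void add(std::string key, const std::string& value)
    {
        to_lower(key);
        map_[key] = value;
    }

    // Replaces `key` with its definition, if there is one, giving the first
    // character of the result the same case as the first character of `key`.
    void replace(std::string& key)
    {
        if (key.empty())
            return;
        std::string lower(key);
        to_lower(lower);
        std::map<std::string, std::string>::const_iterator it = map_.find(lower);
        if (it == map_.end())
            return;                                   // no definition: key is untouched
        std::string value = it->second;
        if (!value.empty()) {
            const unsigned char first = static_cast<unsigned char>(value[0]);
            value[0] = static_cast<char>(
                std::isupper(static_cast<unsigned char>(key[0]))
                    ? std::toupper(first)
                    : std::tolower(first));
        }
        key.swap(value);
    }

private:
    // Converts every character of `s` to lower case in place.
    static void to_lower(std::string& s)
    {
        for (std::string::size_type i = 0; i < s.size(); ++i)
            s[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(s[i])));
    }

    std::map<std::string, std::string> map_;
};
```

<P>
    Passing each character through <CODE>unsigned char</CODE> before calling the <CODE>&lt;cctype&gt;</CODE> functions avoids undefined behavior for characters with negative values, which is the same precaution the filters in this section take when calling <CODE>std::isalpha</CODE>.
</P>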
<A NAME="dictionary_stdio_filter"></A>
<H4><CODE>dictionary_stdio_filter</CODE></H4>

<P>You can express a dictionary filter as a <A HREF="../classes/stdio_filter.html"><CODE>stdio_filter</CODE></A> as follows:</P>

<PRE class="broken_ie"><SPAN CLASS='preprocessor'>#include</SPAN> <SPAN CLASS="literal">&lt;cctype&gt;</SPAN>    <SPAN CLASS="comment">// isalpha</SPAN>
<SPAN CLASS='preprocessor'>#include</SPAN> <SPAN CLASS="literal">&lt;cstdio&gt;</SPAN>    <SPAN CLASS="comment">// EOF</SPAN>
<SPAN CLASS="preprocessor">#include</SPAN> <SPAN CLASS="literal">&lt;iostream&gt;</SPAN>  <SPAN CLASS="comment">// cin, cout</SPAN>
<SPAN CLASS="preprocessor">#include</SPAN> <A CLASS="header" HREF="../../../../boost/iostreams/filter/stdio.hpp"><SPAN CLASS="literal">&lt;boost/iostreams/filter/stdio.hpp&gt;</SPAN></A>

<SPAN CLASS='keyword'>namespace</SPAN> boost { <SPAN CLASS='keyword'>namespace</SPAN> iostreams { <SPAN CLASS='keyword'>namespace</SPAN> example {

<SPAN CLASS="keyword">class</SPAN> dictionary_stdio_filter : <SPAN CLASS="keyword">public</SPAN> stdio_filter {
<SPAN CLASS="keyword">public</SPAN>:
    dictionary_stdio_filter(dictionary&amp; d) : dictionary_(d) { }
<SPAN CLASS="keyword">private</SPAN>:
    <SPAN CLASS="keyword">void</SPAN> do_filter()
    {
        <SPAN CLASS="keyword">using</SPAN> <SPAN CLASS="keyword">namespace</SPAN> std;
        <SPAN CLASS="keyword">while</SPAN> (<SPAN CLASS="keyword">true</SPAN>) {
            <SPAN CLASS="keyword">int</SPAN> c = std::cin.get();
            <SPAN CLASS="keyword">if</SPAN> (c == <SPAN CLASS="numeric_literal">EOF</SPAN> || !std::isalpha((<SPAN CLASS="keyword">unsigned</SPAN> <SPAN CLASS="keyword">char</SPAN>) c)) {
                dictionary_.replace(current_word_);
                cout.write( current_word_.data(),
                            <SPAN CLASS="keyword">static_cast</SPAN>&lt;streamsize&gt;(current_word_.size()) );
                current_word_.erase();
                <SPAN CLASS="keyword">if</SPAN> (c == <SPAN CLASS="numeric_literal">EOF</SPAN>)
                    <SPAN CLASS="keyword">break</SPAN>;
                cout.put(c);
            } <SPAN CLASS="keyword">else</SPAN> {
                current_word_ += c;
            }
        }
    }
    dictionary&amp;  dictionary_;
    std::string  current_word_;
};

} } } <SPAN CLASS="comment">// End namespace boost::iostreams::example</SPAN></PRE>
<P>
    The implementation of <CODE>do_filter</CODE> simply loops, reading characters from <CODE>std::cin</CODE> and appending them to the member variable <CODE>current_word_</CODE> until a non-alphabetic character or end-of-stream indication is encountered. When this occurs it uses its dictionary, stored in the member variable <CODE>dictionary_</CODE>, to replace the current word if necessary. Finally, it writes the current word, followed by the non-alphabetic character, if any, to <CODE>std::cout</CODE>.
</P>
<A NAME="dictionary_input_filter"></A>
<H4><CODE>dictionary_input_filter</CODE></H4>

<P>You can express a dictionary filter as an <A HREF="../concepts/input_filter.html">InputFilter</A> as follows:</P>

<PRE class="broken_ie"><SPAN CLASS='preprocessor'>#include</SPAN> <SPAN CLASS="literal">&lt;cctype&gt;</SPAN>                           <SPAN CLASS="comment">// isalpha</SPAN>
<SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../../../boost/iostreams/char_traits.hpp"><SPAN CLASS="literal">&lt;boost/iostreams/char_traits.hpp&gt;</SPAN></A> <SPAN CLASS="comment">// EOF, WOULD_BLOCK</SPAN>
<SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../../../boost/iostreams/concepts.hpp"><SPAN CLASS="literal">&lt;boost/iostreams/concepts.hpp&gt;</SPAN></A>    <SPAN CLASS="comment">// input_filter</SPAN>
<SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../../../boost/iostreams/operations.hpp"><SPAN CLASS="literal">&lt;boost/iostreams/operations.hpp&gt;</SPAN></A>  <SPAN CLASS="comment">// get</SPAN>

<SPAN CLASS='keyword'>namespace</SPAN> boost { <SPAN CLASS='keyword'>namespace</SPAN> iostreams { <SPAN CLASS='keyword'>namespace</SPAN> example {

<SPAN CLASS="keyword">class</SPAN> dictionary_input_filter : <SPAN CLASS="keyword">public</SPAN> input_filter {
<SPAN CLASS="keyword">public</SPAN>:
    dictionary_input_filter(dictionary&amp; d)
        : dictionary_(d), off_(std::string::npos), eof_(<SPAN CLASS="keyword">false</SPAN>)
        { }

    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Source&gt;
    <SPAN CLASS="keyword">int</SPAN> get(Source&amp; src);

    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Source&gt;
    <SPAN CLASS="keyword">void</SPAN> close(Source&amp;);
<SPAN CLASS="keyword">private</SPAN>:
    dictionary&amp;             dictionary_;
    std::string             current_word_;
    std::string::size_type  off_;
    <SPAN CLASS="keyword">bool</SPAN>                    eof_;
};

} } } <SPAN CLASS="comment">// End namespace boost::iostreams::example</SPAN></PRE>
<P>The function <CODE>get</CODE> is implemented as follows:</P>

<PRE class="broken_ie">    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Source&gt;
    <SPAN CLASS="keyword">int</SPAN> get(Source&amp; src)
    {
        <SPAN CLASS='comment'>// Handle unfinished business.</SPAN>
        <SPAN CLASS="keyword">if</SPAN> (off_ != std::string::npos &amp;&amp; off_ &lt; current_word_.size())
            <SPAN CLASS="keyword">return</SPAN> current_word_[off_++];
        <SPAN CLASS="keyword">if</SPAN> (off_ == current_word_.size()) {
            current_word_.erase();
            off_ = std::string::npos;
        }
        <SPAN CLASS="keyword">if</SPAN> (eof_)
            <SPAN CLASS="keyword">return</SPAN> <SPAN CLASS="numeric_literal">EOF</SPAN>;

        <SPAN CLASS='comment'>// Compute current word.</SPAN>
        <SPAN CLASS="keyword">while</SPAN> (<SPAN CLASS="keyword">true</SPAN>) {
            <SPAN CLASS="keyword">int</SPAN> c;
            <SPAN CLASS="keyword">if</SPAN> ((c = iostreams::get(src)) == WOULD_BLOCK)
                <SPAN CLASS="keyword">return</SPAN> WOULD_BLOCK;

            <SPAN CLASS="keyword">if</SPAN> (c == <SPAN CLASS="numeric_literal">EOF</SPAN> || !std::isalpha((<SPAN CLASS="keyword">unsigned</SPAN> <SPAN CLASS="keyword">char</SPAN>) c)) {
                dictionary_.replace(current_word_);
                off_ = 0;
                <SPAN CLASS="keyword">if</SPAN> (c == <SPAN CLASS="numeric_literal">EOF</SPAN>)
                    eof_ = <SPAN CLASS="keyword">true</SPAN>;
                <SPAN CLASS="keyword">else</SPAN>
                    current_word_ += c;
                <SPAN CLASS="keyword">break</SPAN>;
            } <SPAN CLASS="keyword">else</SPAN> {
                current_word_ += c;
            }
        }

        <SPAN CLASS="keyword">return</SPAN> this-&gt;get(src); <SPAN CLASS='comment'>// Note: current_word_ is not empty unless EOF was reached.</SPAN>
    }</PRE>
<P>
    You first check to see whether there are any characters which remain from a previous invocation of <CODE>get</CODE>. If so, you update some bookkeeping information and return the first such character.
</P>

<P>
    The <CODE>while</CODE> loop is very similar to that of <A HREF="#dictionary_stdio_filter"><CODE>dictionary_stdio_filter::do_filter</CODE></A>: it reads characters from the <A HREF="../concepts/source.html">Source</A> <CODE>src</CODE>, appending them to <CODE>current_word_</CODE> until a non-alphabetic character, <CODE>EOF</CODE> or <CODE>WOULD_BLOCK</CODE> is encountered. The value <CODE>WOULD_BLOCK</CODE> is passed on to the caller. In the remaining cases, the dictionary is consulted to determine the appropriate replacement text.
</P>
<P>Finally, <CODE>get</CODE> is called recursively to return the first character of the current word.</P>
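<P>
    To see how this state machine resumes after <CODE>WOULD_BLOCK</CODE>, here is a self-contained sketch that does not depend on the library. Everything in it is an assumption made for illustration: <CODE>kEof</CODE> and <CODE>kWouldBlock</CODE> stand in for <CODE>EOF</CODE> and <CODE>WOULD_BLOCK</CODE>, <CODE>toy_source</CODE> is a made-up source that reports <CODE>kWouldBlock</CODE> wherever its script contains <CODE>'~'</CODE>, and upper-casing the word stands in for the dictionary lookup.
</P>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Stand-ins for the library's EOF and WOULD_BLOCK (values assumed for this sketch).
const int kEof        = -1;
const int kWouldBlock = -2;

// Toy source: yields the script one character at a time; '~' means "no data yet".
struct toy_source {
    explicit toy_source(const std::string& script) : script_(script), pos_(0) { }
    int get()
    {
        if (pos_ == script_.size())
            return kEof;
        const char c = script_[pos_++];
        return c == '~' ? kWouldBlock : static_cast<unsigned char>(c);
    }
    std::string script_;
    std::size_t pos_;
};

// Pull-style filter using the same off_/eof_ bookkeeping as described above.
class upcase_word_input_filter {
public:
    upcase_word_input_filter() : off_(std::string::npos), eof_(false) { }

    int get(toy_source& src)
    {
        // Hand out characters left over from an earlier call.
        if (off_ != std::string::npos && off_ < current_word_.size())
            return static_cast<unsigned char>(current_word_[off_++]);
        if (off_ == current_word_.size()) {
            current_word_.erase();
            off_ = std::string::npos;
        }
        if (eof_)
            return kEof;

        for (;;) {
            const int c = src.get();
            if (c == kWouldBlock)
                return kWouldBlock;        // the partial word stays in current_word_
            if (c == kEof || !std::isalpha(c)) {
                for (std::string::size_type i = 0; i < current_word_.size(); ++i)
                    current_word_[i] = static_cast<char>(
                        std::toupper(static_cast<unsigned char>(current_word_[i])));
                off_ = 0;                  // the word is now ready to be handed out
                if (c == kEof)
                    eof_ = true;
                else
                    current_word_ += static_cast<char>(c);
                break;
            }
            current_word_ += static_cast<char>(c);
        }
        return get(src);
    }

private:
    std::string             current_word_;
    std::string::size_type  off_;
    bool                    eof_;
};

// Drains the filter, simply retrying whenever it reports kWouldBlock.
inline std::string read_all(upcase_word_input_filter& f, toy_source& src)
{
    std::string out;
    for (;;) {
        const int c = f.get(src);
        if (c == kEof)
            break;
        if (c != kWouldBlock)
            out += static_cast<char>(c);
    }
    return out;
}
```

<P>
    Because the partially read word is kept in a member variable rather than a local, a call that returns early loses nothing: the next call continues appending to the same word.
</P>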
<P>As usual, the function <CODE>close</CODE> resets the Filter's state:</P>

<PRE class="broken_ie">    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Source&gt;
    <SPAN CLASS="keyword">void</SPAN> close(Source&amp;)
    {
        current_word_.erase();
        off_ = std::string::npos;
        eof_ = <SPAN CLASS="keyword">false</SPAN>;
    }</PRE>
<A NAME="dictionary_output_filter"></A>
<H4><CODE>dictionary_output_filter</CODE></H4>

<P>You can express a dictionary filter as an <A HREF="../concepts/output_filter.html">OutputFilter</A> as follows:</P>

<PRE class="broken_ie"><SPAN CLASS='preprocessor'>#include</SPAN> <SPAN CLASS='literal'>&lt;algorithm&gt;</SPAN>                      <SPAN CLASS='comment'>// swap</SPAN>
<SPAN CLASS='preprocessor'>#include</SPAN> <SPAN CLASS='literal'>&lt;cctype&gt;</SPAN>                         <SPAN CLASS='comment'>// isalpha</SPAN>
<SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../../../boost/iostreams/concepts.hpp"><SPAN CLASS="literal">&lt;boost/iostreams/concepts.hpp&gt;</SPAN></A>   <SPAN CLASS="comment">// output_filter</SPAN>
<SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../../../boost/iostreams/operations.hpp"><SPAN CLASS="literal">&lt;boost/iostreams/operations.hpp&gt;</SPAN></A> <SPAN CLASS="comment">// write</SPAN>

<SPAN CLASS='keyword'>namespace</SPAN> boost { <SPAN CLASS='keyword'>namespace</SPAN> iostreams { <SPAN CLASS='keyword'>namespace</SPAN> example {

<SPAN CLASS="keyword">class</SPAN> dictionary_output_filter : <SPAN CLASS="keyword">public</SPAN> output_filter {
<SPAN CLASS="keyword">public</SPAN>:
    <SPAN CLASS="keyword">typedef</SPAN> std::map&lt;std::string, std::string&gt; map_type;
    dictionary_output_filter(dictionary&amp; d)
        : dictionary_(d), off_(std::string::npos)
        { }

    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Sink&gt;
    <SPAN CLASS="keyword">bool</SPAN> put(Sink&amp; dest, <SPAN CLASS="keyword">int</SPAN> c);

    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Sink&gt;
    <SPAN CLASS="keyword">void</SPAN> close(Sink&amp; dest);
<SPAN CLASS="keyword">private</SPAN>:
    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Sink&gt;
    <SPAN CLASS="keyword">bool</SPAN> write_current_word(Sink&amp; dest);
    dictionary&amp;             dictionary_;
    std::string             current_word_;
    std::string::size_type  off_;
};

} } } <SPAN CLASS="comment">// End namespace boost::iostreams::example</SPAN></PRE>
<P>Let's look first at the helper function <CODE>write_current_word</CODE>:</P>

<PRE class="broken_ie">    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Sink&gt;
    <SPAN CLASS="keyword">bool</SPAN> write_current_word(Sink&amp; dest)
    {
        <SPAN CLASS="keyword">using</SPAN> <SPAN CLASS="keyword">namespace</SPAN> std;
        streamsize amt = <SPAN CLASS="keyword">static_cast</SPAN>&lt;streamsize&gt;(current_word_.size() - off_);
        streamsize result =
            iostreams::write(dest, current_word_.data() + off_, amt);
        <SPAN CLASS="keyword">if</SPAN> (result == amt) {
            current_word_.erase();
            off_ = string::npos;
            <SPAN CLASS="keyword">return</SPAN> <SPAN CLASS="keyword">true</SPAN>;
        } <SPAN CLASS="keyword">else</SPAN> {
            off_ += <SPAN CLASS="keyword">static_cast</SPAN>&lt;string::size_type&gt;(result);
            <SPAN CLASS="keyword">return</SPAN> <SPAN CLASS="keyword">false</SPAN>;
        }
    }</PRE>
<P>
    This function attempts to write <CODE>current_word_</CODE>, beginning at the offset <CODE>off_</CODE>, to the provided <A HREF="../concepts/sink.html">Sink</A>. If the entire sequence is successfully written, <CODE>current_word_</CODE> is cleared and the function returns <CODE>true</CODE>. Otherwise the member variable <CODE>off_</CODE> is updated to point to the first unwritten character and the function returns <CODE>false</CODE>.
</P>
<P>Using <CODE>write_current_word</CODE> you can implement <CODE>put</CODE> as follows:</P>

<PRE class="broken_ie">    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Sink&gt;
    <SPAN CLASS="keyword">bool</SPAN> put(Sink&amp; dest, <SPAN CLASS="keyword">int</SPAN> c)
    {
        <SPAN CLASS="keyword">if</SPAN> (off_ != std::string::npos &amp;&amp; !write_current_word(dest))
            <SPAN CLASS="keyword">return</SPAN> <SPAN CLASS="keyword">false</SPAN>;
        <SPAN CLASS="keyword">if</SPAN> (!std::isalpha((<SPAN CLASS="keyword">unsigned</SPAN> <SPAN CLASS="keyword">char</SPAN>) c)) {
            dictionary_.replace(current_word_);
            off_ = 0;
        }

        current_word_ += c;
        <SPAN CLASS="keyword">return</SPAN> <SPAN CLASS="keyword">true</SPAN>;
    }</PRE>
<P>
    As in the implementation of <A HREF="#dictionary_input_filter"><CODE>dictionary_input_filter::get</CODE></A>, you first check to see whether there are any characters from a previous invocation of <CODE>put</CODE> which remain to be written. If so, you attempt to write these characters using <CODE>write_current_word</CODE>. If this fails, you return <CODE>false</CODE> without consuming <CODE>c</CODE>. If it succeeds, you next examine the given character <CODE>c</CODE>. If it is a non-alphabetic character, you consult the dictionary to determine the appropriate replacement text and set <CODE>off_</CODE> to zero, marking the current word as ready to be written. In any case, you append <CODE>c</CODE> to <CODE>current_word_</CODE> and return <CODE>true</CODE>.
</P>
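<P>
    The interplay between <CODE>put</CODE> and <CODE>write_current_word</CODE> is easiest to see with a sink that accepts only a few characters at a time. The following self-contained sketch is an illustration under stated assumptions, not library code: <CODE>toy_sink</CODE>, <CODE>upcase_word_output_filter</CODE> and <CODE>put_all</CODE> are hypothetical names, and upper-casing the word stands in for the dictionary lookup.
</P>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Toy sink: accepts at most `limit` characters per call and reports how many it took.
struct toy_sink {
    explicit toy_sink(std::size_t limit) : limit_(limit) { assert(limit > 0); }
    std::size_t write(const char* s, std::size_t n)
    {
        const std::size_t amt = n < limit_ ? n : limit_;
        data_.append(s, amt);
        return amt;
    }
    std::string data_;
    std::size_t limit_;
};

// Push-style filter using the same off_ bookkeeping as described above.
class upcase_word_output_filter {
public:
    upcase_word_output_filter() : off_(std::string::npos) { }

    bool put(toy_sink& dest, int c)
    {
        // Finish writing the previous word before accepting c.
        if (off_ != std::string::npos && !write_current_word(dest))
            return false;                  // c was not consumed
        if (!std::isalpha(static_cast<unsigned char>(c))) {
            for (std::string::size_type i = 0; i < current_word_.size(); ++i)
                current_word_[i] = static_cast<char>(
                    std::toupper(static_cast<unsigned char>(current_word_[i])));
            off_ = 0;                      // mark the word as ready to be written
        }
        current_word_ += static_cast<char>(c);
        return true;
    }

private:
    // Writes as much of the pending word as the sink will take.
    bool write_current_word(toy_sink& dest)
    {
        const std::size_t amt = current_word_.size() - off_;
        const std::size_t result = dest.write(current_word_.data() + off_, amt);
        if (result == amt) {
            current_word_.erase();
            off_ = std::string::npos;
            return true;
        }
        off_ += result;                    // remember where to resume
        return false;
    }

    std::string             current_word_;
    std::string::size_type  off_;
};

// Feeds `text` to the filter, retrying each character until it is accepted.
inline void put_all(upcase_word_output_filter& f, toy_sink& dest,
                    const std::string& text)
{
    for (std::string::size_type i = 0; i < text.size(); ++i)
        while (!f.put(dest, static_cast<unsigned char>(text[i]))) { }
}
```

<P>
    Note that a finished word is written only when the next character arrives, so whatever follows the last write is still buffered when the input ends. In this sketch, feeding <CODE>"abc de!"</CODE> leaves only <CODE>"ABC "</CODE> in the sink.
</P>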
<P>The function <CODE>close</CODE> has more work to do in this case than simply resetting the Filter's state. Unless the last character of the unfiltered sequence happened to be a non-alphabetic character, the contents of <CODE>current_word_</CODE> will not yet have been written:</P>

<PRE class="broken_ie">    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Sink&gt;
    <SPAN CLASS="keyword">void</SPAN> close(Sink&amp; dest)
    {
        <SPAN CLASS='comment'>// Reset current_word_ and off_, saving old values.</SPAN>
        std::string             current_word;
        std::string::size_type  off = <SPAN CLASS='numeric_literal'>0</SPAN>;
        current_word.swap(current_word_);
        std::swap(off, off_);

        <SPAN CLASS='comment'>// Write remaining characters to dest.</SPAN>
        <SPAN CLASS="keyword">if</SPAN> (off == std::string::npos) {
            dictionary_.replace(current_word);
            off = <SPAN CLASS='numeric_literal'>0</SPAN>;
        }
        <SPAN CLASS="keyword">if</SPAN> (!current_word.empty())
            iostreams::write(
                dest,
                current_word.data() + off,
                <SPAN CLASS="keyword">static_cast</SPAN>&lt;std::streamsize&gt;(current_word.size() - off)
            );
    }</PRE>
<P>Note that you may assume that the template argument is a <A HREF="../concepts/blocking.html">Blocking</A> <A HREF="../concepts/sink.html">Sink</A>. Note also that you must reset the values of <CODE>current_word_</CODE> and <CODE>off_</CODE> before calling <A HREF="../functions/write.html"><CODE>write</CODE></A>, so that the Filter's state is already reset if <A HREF="../functions/write.html"><CODE>write</CODE></A> throws an exception.
</P>
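<P>
    The value of resetting before writing can be demonstrated with a sink that always throws. The sketch below is a simplified illustration, not the tutorial's filter: <CODE>buffering_filter</CODE>, <CODE>throwing_sink</CODE>, <CODE>string_sink</CODE>, <CODE>buffer</CODE> and <CODE>is_reset</CODE> are hypothetical names, the dictionary lookup is omitted, and the sketch resets <CODE>off_</CODE> to <CODE>npos</CODE>, the value its own constructor uses.
</P>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Sink whose write always fails.
struct throwing_sink {
    void write(const char*, std::size_t)
    {
        throw std::runtime_error("write failed");
    }
};

// Sink that collects everything written to it.
struct string_sink {
    void write(const char* s, std::size_t n) { data_.append(s, n); }
    std::string data_;
};

class buffering_filter {
public:
    buffering_filter() : off_(std::string::npos) { }

    // Appends text to the pending word (stands in for a series of put calls).
    void buffer(const std::string& s) { current_word_ += s; }

    // True if the filter is back in its freshly constructed state.
    bool is_reset() const
    {
        return current_word_.empty() && off_ == std::string::npos;
    }

    template<typename Sink>
    void close(Sink& dest)
    {
        // Move the state into locals first, so the members are already
        // reset when write is called.
        std::string             current_word;
        std::string::size_type  off = std::string::npos;
        current_word.swap(current_word_);
        std::swap(off, off_);

        if (off == std::string::npos)
            off = 0;                       // nothing was partially written
        if (!current_word.empty())
            dest.write(current_word.data() + off, current_word.size() - off);
    }

private:
    std::string             current_word_;
    std::string::size_type  off_;
};
```

<P>
    If <CODE>close</CODE> instead wrote from the members and reset them afterwards, an exception from <CODE>write</CODE> would skip the reset and leave stale text to be emitted the next time the filter was used.
</P>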
<A HREF='tab_expanding_filters.html'><IMG BORDER=0 WIDTH=19 HEIGHT=19 SRC='../../../../doc/src/images/prev.png'></A>
<A HREF='tutorial.html'><IMG BORDER=0 WIDTH=19 HEIGHT=19 SRC='../../../../doc/src/images/up.png'></A>
<A HREF='unix2dos_filters.html'><IMG BORDER=0 WIDTH=19 HEIGHT=19 SRC='../../../../doc/src/images/next.png'></A>
<!-- Begin Footer -->

<P CLASS="copyright">© Copyright 2008 <a href="http://www.coderage.com/" target="_top">CodeRage, LLC</a><br/>© Copyright 2004-2007 <a href="http://www.coderage.com/turkanis/" target="_top">Jonathan Turkanis</a></P>
<P CLASS="copyright">
    Use, modification, and distribution are subject to the Boost Software License, Version 1.0. (See accompanying file <A HREF="../../../../LICENSE_1_0.txt">LICENSE_1_0.txt</A> or copy at <A HREF="http://www.boost.org/LICENSE_1_0.txt">http://www.boost.org/LICENSE_1_0.txt</A>)
</P>