2.2.6. Dictionary Filters
<P>A <SPAN CLASS='term'>dictionary filter</SPAN> is a Filter which performs text substitution in the following manner. It maintains a collection of pairs of strings whose first components are words and whose second components represent replacement text. I'll call such a collection a <SPAN CLASS='term'>dictionary</SPAN>, and refer to the pairs it contains as <SPAN CLASS='term'>definitions</SPAN>. When a dictionary filter encounters a word which appears as the first component of a definition, it forwards the replacement text instead of the original word. Other words, whitespace and punctuation are forwarded unchanged.</P>
<P>The basic algorithm is as follows. You examine characters one at a time, appending them to a string which I'll call the <SPAN CLASS='term'>current word</SPAN>. When you encounter a non-alphabetic character, you consult the dictionary to determine whether the current word appears as the first component of a definition. If it does, you forward the replacement text followed by the non-alphabetic character. Otherwise, you forward the current word followed by the non-alphabetic character. When the end-of-stream is reached, you consult the dictionary again and forward either the current word or its replacement, as appropriate.</P>
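<P>The steps above can be sketched without any library support. The function below is a minimal illustration, not the code from the example header: the name <CODE>substitute</CODE> and the use of a plain <CODE>std::map</CODE> as the dictionary are assumptions made for this sketch, and lookups here are case-sensitive.</P>

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper (not part of Boost.Iostreams): applies the
// word-by-word substitution to a whole string at once.
std::string substitute(const std::string& input,
                       const std::map<std::string, std::string>& dict)
{
    std::string output;
    std::string current_word;

    for (std::string::size_type i = 0; i <= input.size(); ++i) {
        bool at_end = (i == input.size());
        if (!at_end && std::isalpha(static_cast<unsigned char>(input[i]))) {
            current_word += input[i];   // still inside a word
            continue;
        }
        // Non-alphabetic character or end of input: forward the replacement
        // text if the current word is defined, otherwise the word itself.
        std::map<std::string, std::string>::const_iterator it =
            dict.find(current_word);
        output += (it != dict.end()) ? it->second : current_word;
        current_word.clear();
        if (!at_end)
            output += input[i];         // forward the separator unchanged
    }
    return output;
}

// Small self-check of the sketch.
void check_substitute()
{
    std::map<std::string, std::string> dict;
    dict["cat"] = "dog";
    assert(substitute("a cat, a cow", dict) == "a dog, a cow");
}
```

<P>Processing a whole string at once hides the main difficulty addressed in the following sections: a Filter sees its input incrementally and must remember the current word between calls.</P>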
<P>In the following sections, I'll express this algorithm as a <A HREF="../classes/stdio_filter.html"><CODE>stdio_filter</CODE></A>, an <A HREF="../concepts/input_filter.html">InputFilter</A> and an <A HREF="../concepts/output_filter.html">OutputFilter</A>. The source code can be found in the header <A HREF="../../example/dictionary_filter.hpp"><CODE>&lt;libs/iostreams/example/dictionary_filter.hpp&gt;</CODE></A>.</P>
<P>You can represent a dictionary using the following class:</P>

<PRE class="broken_ie"><SPAN CLASS='preprocessor'>#include</SPAN> <SPAN CLASS="literal">&lt;map&gt;</SPAN>
<SPAN CLASS="preprocessor">#include</SPAN> <SPAN CLASS="literal">&lt;string&gt;</SPAN>

<SPAN CLASS='keyword'>namespace</SPAN> boost { <SPAN CLASS='keyword'>namespace</SPAN> iostreams { <SPAN CLASS='keyword'>namespace</SPAN> example {

<SPAN CLASS="keyword">class</SPAN> dictionary {
<SPAN CLASS="keyword">public</SPAN>:
    void add(std::string key, const std::string& value);
    void replace(std::string& key);
    <SPAN CLASS="comment">// Storage for the definitions is omitted here.</SPAN>
};

} } } <SPAN CLASS="comment">// End namespace boost::iostreams::example</SPAN></PRE>
<P>The member function <CODE>add</CODE> converts <CODE>key</CODE> to lower case and adds the pair <CODE>key</CODE>, <CODE>value</CODE> to the dictionary. The member function <CODE>replace</CODE> searches for a definition whose first component is equal to the result of converting <CODE>key</CODE> to lower case. If it finds such a definition, it assigns the replacement text to <CODE>key</CODE>, adjusting the case of the first character to match the case of the first character of <CODE>key</CODE>. Otherwise, it does nothing.</P>
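<P>To make these semantics concrete, here is one way such a class might be written. This is a sketch under stated assumptions rather than the implementation in the example header: the class name <CODE>dictionary_sketch</CODE>, the <CODE>std::map</CODE> member and the helper <CODE>to_lower</CODE> are all hypothetical, and "matching the case" is taken to mean forcing the first character of the replacement to upper or lower case according to the first character of the original word.</P>

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Hypothetical stand-in for the dictionary class described above.
class dictionary_sketch {
public:
    // Stores the definition under the lower-cased key.
    void add(std::string key, const std::string& value)
    {
        to_lower(key);
        map_[key] = value;
    }

    // Replaces key with its definition, if any, preserving the case of the
    // first character of the original word.
    void replace(std::string& key) const
    {
        if (key.empty())
            return;
        std::string lower(key);
        to_lower(lower);
        std::map<std::string, std::string>::const_iterator it = map_.find(lower);
        if (it == map_.end())
            return;                     // no definition: leave key untouched
        std::string value(it->second);  // copy, so the stored text is never modified
        if (!value.empty()) {
            unsigned char first = static_cast<unsigned char>(value[0]);
            bool upper = std::isupper(static_cast<unsigned char>(key[0])) != 0;
            value[0] = static_cast<char>(
                upper ? std::toupper(first) : std::tolower(first));
        }
        key = value;
    }

private:
    static void to_lower(std::string& s)
    {
        for (std::string::size_type i = 0; i < s.size(); ++i)
            s[i] = static_cast<char>(
                std::tolower(static_cast<unsigned char>(s[i])));
    }

    std::map<std::string, std::string> map_;
};

// Small self-check of the sketch.
void check_dictionary_sketch()
{
    dictionary_sketch d;
    d.add("Cat", "dog");
    std::string word = "Cat";
    d.replace(word);
    assert(word == "Dog");
}
```

<P>Copying the replacement text before adjusting its case keeps the stored definition unchanged, so a capitalized occurrence does not affect later lookups.</P>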
<A NAME="dictionary_stdio_filter"></A>
dictionary_stdio_filter
<P>You can express a dictionary filter as a <A HREF="../classes/stdio_filter.html"><CODE>stdio_filter</CODE></A> as follows:</P>
<PRE class="broken_ie"><SPAN CLASS='preprocessor'>#include</SPAN> <SPAN CLASS="literal">&lt;cctype&gt;</SPAN>   <SPAN CLASS="comment">// isalpha</SPAN>
<SPAN CLASS='preprocessor'>#include</SPAN> <SPAN CLASS="literal">&lt;cstdio&gt;</SPAN>   <SPAN CLASS="comment">// EOF</SPAN>
<SPAN CLASS="preprocessor">#include</SPAN> <SPAN CLASS="literal">&lt;iostream&gt;</SPAN> <SPAN CLASS="comment">// cin, cout</SPAN>
<SPAN CLASS="preprocessor">#include</SPAN> <A CLASS="header" HREF="../../../../boost/iostreams/filter/stdio.hpp"><SPAN CLASS="literal">&lt;boost/iostreams/filter/stdio.hpp&gt;</SPAN></A>

<SPAN CLASS='keyword'>namespace</SPAN> boost { <SPAN CLASS='keyword'>namespace</SPAN> iostreams { <SPAN CLASS='keyword'>namespace</SPAN> example {

<SPAN CLASS="keyword">class</SPAN> dictionary_stdio_filter : <SPAN CLASS="keyword">public</SPAN> stdio_filter {
<SPAN CLASS="keyword">public</SPAN>:
    dictionary_stdio_filter(dictionary& d) : dictionary_(d) { }
<SPAN CLASS="keyword">private</SPAN>:
    void do_filter()
    {
        using namespace std;
        while (true) {
            int c = std::cin.get();
            if (c == EOF || !std::isalpha((unsigned char) c)) {
                <SPAN CLASS="comment">// End of a word: substitute it if defined, then forward it.</SPAN>
                dictionary_.replace(current_word_);
                cout.write( current_word_.data(),
                            static_cast&lt;streamsize&gt;(current_word_.size()) );
                current_word_.erase();
                if (c == EOF)
                    break;
                cout.put(c);
            } else {
                current_word_ += c;
            }
        }
    }
    dictionary&  dictionary_;
    std::string  current_word_;
};

} } } <SPAN CLASS="comment">// End namespace boost::iostreams::example</SPAN></PRE>
<P>The implementation of <CODE>do_filter</CODE> simply loops, reading characters from <CODE>std::cin</CODE> and appending them to the member variable <CODE>current_word_</CODE> until a non-alphabetic character or end-of-stream indication is encountered. When this occurs, it uses its dictionary, stored in the member variable <CODE>dictionary_</CODE>, to replace the current word if necessary. Finally, it writes the current word, followed by the non-alphabetic character, if any, to <CODE>std::cout</CODE>.</P>
<A NAME="dictionary_input_filter"></A>
dictionary_input_filter
<P>You can express a dictionary filter as an <A HREF="../concepts/input_filter.html">InputFilter</A> as follows:</P>
<PRE class="broken_ie"><SPAN CLASS='preprocessor'>#include</SPAN> <SPAN CLASS="literal">&lt;cctype&gt;</SPAN>                        <SPAN CLASS="comment">// isalpha</SPAN>
<SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../../../boost/iostreams/char_traits.hpp"><SPAN CLASS="literal">&lt;boost/iostreams/char_traits.hpp&gt;</SPAN></A> <SPAN CLASS="comment">// EOF, WOULD_BLOCK</SPAN>
<SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../../../boost/iostreams/concepts.hpp"><SPAN CLASS="literal">&lt;boost/iostreams/concepts.hpp&gt;</SPAN></A>    <SPAN CLASS="comment">// input_filter</SPAN>
<SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../../../boost/iostreams/operations.hpp"><SPAN CLASS="literal">&lt;boost/iostreams/operations.hpp&gt;</SPAN></A>  <SPAN CLASS="comment">// get</SPAN>

<SPAN CLASS='keyword'>namespace</SPAN> boost { <SPAN CLASS='keyword'>namespace</SPAN> iostreams { <SPAN CLASS='keyword'>namespace</SPAN> example {

<SPAN CLASS="keyword">class</SPAN> dictionary_input_filter : <SPAN CLASS="keyword">public</SPAN> input_filter {
<SPAN CLASS="keyword">public</SPAN>:
    dictionary_input_filter(dictionary& d)
        : dictionary_(d), off_(std::string::npos), eof_(false)
        { }

    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Source&gt;
    int get(Source& src);

    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Source&gt;
    void close(Source&);
<SPAN CLASS="keyword">private</SPAN>:
    dictionary&             dictionary_;
    std::string             current_word_;
    std::string::size_type  off_;   <SPAN CLASS="comment">// offset of next buffered character, or npos</SPAN>
    bool                    eof_;   <SPAN CLASS="comment">// true once the Source has reported EOF</SPAN>
};

} } } <SPAN CLASS="comment">// End namespace boost::iostreams::example</SPAN></PRE>
<PRE class="broken_ie">    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Source&gt;
    int get(Source& src)
    {
        <SPAN CLASS='comment'>// Handle unread characters.</SPAN>
        if (off_ != std::string::npos && off_ &lt; current_word_.size())
            return current_word_[off_++];
        if (off_ == current_word_.size()) {
            current_word_.erase();
            off_ = std::string::npos;
        }
        if (eof_)
            return EOF;

        <SPAN CLASS='comment'>// Compute current word.</SPAN>
        while (true) {
            int c;
            if ((c = iostreams::get(src)) == WOULD_BLOCK)
                return WOULD_BLOCK;

            if (c == EOF || !std::isalpha((unsigned char) c)) {
                dictionary_.replace(current_word_);
                off_ = 0;
                if (c == EOF)
                    eof_ = true;
                else
                    current_word_ += c;
                break;
            } else {
                current_word_ += c;
            }
        }

        <SPAN CLASS='comment'>// Return the first buffered character, or EOF if nothing was buffered.</SPAN>
        return this-&gt;get(src);
    }
</PRE>
<P>You first check to see whether there are any characters which remain from a previous invocation of <CODE>get</CODE>. If so, you update some bookkeeping information and return the first such character.</P>
<P>The <CODE>while</CODE> loop is very similar to that of <A HREF="#dictionary_stdio_filter"><CODE>dictionary_stdio_filter::do_filter</CODE></A>: it reads characters from the <A HREF="../concepts/source.html">Source</A> <CODE>src</CODE>, appending them to <CODE>current_word_</CODE> until a non-alphabetic character, <CODE>EOF</CODE> or <CODE>WOULD_BLOCK</CODE> is encountered. The value <CODE>WOULD_BLOCK</CODE> is passed on to the caller. In the remaining cases, the dictionary is consulted to determine the appropriate replacement text.</P>
<P>Finally, <CODE>get</CODE> is called recursively to return the first character of the current word.</P>
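<P>The interplay between buffered characters and <CODE>WOULD_BLOCK</CODE> is easier to see in a self-contained simulation. The sketch below does not use Boost.Iostreams: <CODE>chunked_source</CODE>, <CODE>pull_filter</CODE>, <CODE>drain</CODE> and the constants <CODE>eof_code</CODE> and <CODE>would_block_code</CODE> are hypothetical stand-ins invented for this illustration, and a plain <CODE>std::map</CODE> plays the role of the dictionary.</P>

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Stand-ins for the library's EOF and WOULD_BLOCK (values are assumptions).
const int eof_code = -1;
const int would_block_code = -2;

// Hypothetical source that stalls once between its two chunks.
class chunked_source {
public:
    chunked_source(const std::string& first, const std::string& second)
        : data_(first + second), split_(first.size()), pos_(0), stalled_(false)
        { }
    int get()
    {
        if (pos_ == split_ && !stalled_) {
            stalled_ = true;
            return would_block_code;
        }
        if (pos_ == data_.size())
            return eof_code;
        return static_cast<unsigned char>(data_[pos_++]);
    }
private:
    std::string data_;
    std::string::size_type split_, pos_;
    bool stalled_;
};

// Pull-style filter following the same three steps as get above.
class pull_filter {
public:
    explicit pull_filter(const std::map<std::string, std::string>& dict)
        : dict_(dict), off_(std::string::npos), eof_(false) { }

    int get(chunked_source& src)
    {
        // 1. Return characters buffered by an earlier call.
        if (off_ != std::string::npos && off_ < word_.size())
            return static_cast<unsigned char>(word_[off_++]);
        if (off_ == word_.size()) {
            word_.clear();
            off_ = std::string::npos;
        }
        if (eof_)
            return eof_code;

        // 2. Accumulate a word; a stall leaves the partial word in place.
        for (;;) {
            int c = src.get();
            if (c == would_block_code)
                return would_block_code;
            if (c == eof_code || !std::isalpha(c)) {
                std::map<std::string, std::string>::const_iterator it =
                    dict_.find(word_);
                if (it != dict_.end())
                    word_ = it->second;
                off_ = 0;
                if (c == eof_code)
                    eof_ = true;
                else
                    word_ += static_cast<char>(c);
                break;
            }
            word_ += static_cast<char>(c);
        }

        // 3. Recurse to hand out the first buffered character (or EOF).
        return get(src);
    }
private:
    std::map<std::string, std::string> dict_;
    std::string word_;
    std::string::size_type off_;
    bool eof_;
};

// Reads everything, retrying whenever the filter reports a stall.
std::string drain(pull_filter& f, chunked_source& src)
{
    std::string out;
    for (;;) {
        int c = f.get(src);
        if (c == eof_code)
            break;
        if (c == would_block_code)
            continue;   // a real caller would wait before retrying
        out += static_cast<char>(c);
    }
    return out;
}

// Small self-check: the word "cat" is split across the stall.
void check_pull_filter()
{
    std::map<std::string, std::string> dict;
    dict["cat"] = "dog";
    chunked_source src("a ca", "t sat");
    pull_filter f(dict);
    assert(drain(f, src) == "a dog sat");
}
```

<P>Because the partial word lives in a member variable rather than a local one, the stall in the middle of "cat" loses nothing: the next call simply continues appending to it.</P>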
<P>As usual, the function <CODE>close</CODE> resets the Filter's state:</P>
<PRE class="broken_ie">    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Source&gt;
    void close(Source&)
    {
        current_word_.erase();
        off_ = std::string::npos;
        eof_ = false;
    }
</PRE>
<A NAME="dictionary_output_filter"></A>
dictionary_output_filter
<P>You can express a dictionary filter as an <A HREF="../concepts/output_filter.html">OutputFilter</A> as follows:</P>
<PRE class="broken_ie"><SPAN CLASS='preprocessor'>#include</SPAN> <SPAN CLASS='literal'>&lt;algorithm&gt;</SPAN>                      <SPAN CLASS='comment'>// swap</SPAN>
<SPAN CLASS='preprocessor'>#include</SPAN> <SPAN CLASS='literal'>&lt;cctype&gt;</SPAN>                         <SPAN CLASS='comment'>// isalpha</SPAN>
<SPAN CLASS='preprocessor'>#include</SPAN> <SPAN CLASS='literal'>&lt;map&gt;</SPAN>                            <SPAN CLASS='comment'>// map</SPAN>
<SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../../../boost/iostreams/concepts.hpp"><SPAN CLASS="literal">&lt;boost/iostreams/concepts.hpp&gt;</SPAN></A>    <SPAN CLASS="comment">// output_filter</SPAN>
<SPAN CLASS='preprocessor'>#include</SPAN> <A CLASS="header" HREF="../../../../boost/iostreams/operations.hpp"><SPAN CLASS="literal">&lt;boost/iostreams/operations.hpp&gt;</SPAN></A>  <SPAN CLASS="comment">// write</SPAN>

<SPAN CLASS='keyword'>namespace</SPAN> boost { <SPAN CLASS='keyword'>namespace</SPAN> iostreams { <SPAN CLASS='keyword'>namespace</SPAN> example {

<SPAN CLASS="keyword">class</SPAN> dictionary_output_filter : <SPAN CLASS="keyword">public</SPAN> output_filter {
<SPAN CLASS="keyword">public</SPAN>:
    typedef std::map&lt;std::string, std::string&gt; map_type;
    dictionary_output_filter(dictionary& d)
        : dictionary_(d), off_(std::string::npos)
        { }

    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Sink&gt;
    bool put(Sink& dest, int c);

    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Sink&gt;
    void close(Sink& dest);
<SPAN CLASS="keyword">private</SPAN>:
    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Sink&gt;
    bool write_current_word(Sink& dest);
    dictionary&             dictionary_;
    std::string             current_word_;
    std::string::size_type  off_;   <SPAN CLASS="comment">// offset of first unwritten character, or npos</SPAN>
};

} } } <SPAN CLASS="comment">// End namespace boost::iostreams::example</SPAN></PRE>
<P>Let's look first at the helper function <CODE>write_current_word</CODE>:</P>
<PRE class="broken_ie">    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Sink&gt;
    bool write_current_word(Sink& dest)
    {
        using namespace std;
        streamsize amt = static_cast&lt;streamsize&gt;(current_word_.size() - off_);
        streamsize result =
            iostreams::write(dest, current_word_.data() + off_, amt);
        if (result == amt) {
            current_word_.erase();
            off_ = string::npos;
            return true;
        } else {
            off_ += static_cast&lt;string::size_type&gt;(result);
            return false;
        }
    }
</PRE>
<P>This function attempts to write <CODE>current_word_</CODE>, beginning at the offset <CODE>off_</CODE>, to the provided <A HREF="../concepts/sink.html">Sink</A>. If the entire sequence is successfully written, <CODE>current_word_</CODE> is cleared and the function returns <CODE>true</CODE>. Otherwise the member variable <CODE>off_</CODE> is updated to point to the first unwritten character and the function returns <CODE>false</CODE>.</P>
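<P>The offset bookkeeping can be exercised in isolation with a sink that accepts only a few characters per call. The following is a sketch with invented names: <CODE>limited_sink</CODE>, <CODE>pending_word</CODE> and <CODE>write_pending</CODE> are hypothetical and merely mimic the behaviour described above without using the library's <CODE>write</CODE> function.</P>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sink accepting at most `capacity` characters per call,
// standing in for a Sink that cannot always take everything at once.
class limited_sink {
public:
    explicit limited_sink(std::size_t capacity) : capacity_(capacity) { }
    std::size_t write(const char* s, std::size_t n)
    {
        std::size_t amt = std::min(n, capacity_);
        received_.append(s, amt);
        return amt;
    }
    const std::string& received() const { return received_; }
private:
    std::size_t capacity_;
    std::string received_;
};

// Pending output together with the offset of its first unwritten character.
struct pending_word {
    std::string text;
    std::string::size_type off;
};

// Writes as much as the sink accepts; returns true once nothing is pending.
bool write_pending(limited_sink& dest, pending_word& w)
{
    assert(w.off <= w.text.size());
    std::size_t amt = w.text.size() - w.off;
    std::size_t result = dest.write(w.text.data() + w.off, amt);
    if (result == amt) {
        w.text.clear();
        w.off = std::string::npos;   // nothing left to write
        return true;
    }
    w.off += result;                 // resume here on the next attempt
    return false;
}
```

<P>With a capacity of three, writing the word <CODE>hello</CODE> takes two calls: the first returns <CODE>false</CODE> and advances the offset to 3, and the second writes the remaining two characters and returns <CODE>true</CODE>.</P>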
<P>Using <CODE>write_current_word</CODE> you can implement <CODE>put</CODE> as follows:</P>
<PRE class="broken_ie">    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Sink&gt;
    bool put(Sink& dest, int c)
    {
        if (off_ != std::string::npos && !write_current_word(dest))
            return false;
        if (!std::isalpha((unsigned char) c)) {
            dictionary_.replace(current_word_);
            off_ = 0;
        }

        current_word_ += c;
        return true;
    }
</PRE>
<P>As in the implementation of <A HREF="#dictionary_input_filter"><CODE>dictionary_input_filter::get</CODE></A>, you first check to see whether there are any characters from a previous invocation of <CODE>put</CODE> which remain to be written. If so, you attempt to write these characters using <CODE>write_current_word</CODE>. If successful, you next examine the given character <CODE>c</CODE>. If it is a non-alphabetic character, you consult the dictionary to determine the appropriate replacement text. In any case, you append <CODE>c</CODE> to <CODE>current_word_</CODE> and return <CODE>true</CODE>.</P>
<P>The function <CODE>close</CODE> has more work to do in this case than simply resetting the Filter's state. Unless the last character of the unfiltered sequence happened to be a non-alphabetic character, the contents of <CODE>current_word_</CODE> will not yet have been written:</P>
<PRE class="broken_ie">    <SPAN CLASS="keyword">template</SPAN>&lt;<SPAN CLASS="keyword">typename</SPAN> Sink&gt;
    void close(Sink& dest)
    {
        <SPAN CLASS='comment'>// Reset current_word_ and off_, saving old values.</SPAN>
        std::string             current_word;
        std::string::size_type  off = 0;
        current_word.swap(current_word_);
        std::swap(off, off_);

        <SPAN CLASS='comment'>// Write remaining characters to dest.</SPAN>
        if (off == std::string::npos) {
            dictionary_.replace(current_word);
            off = 0;
        }
        if (!current_word.empty())
            iostreams::write(
                dest,
                current_word.data() + off,
                static_cast&lt;std::streamsize&gt;(current_word.size() - off)
            );
    }
</PRE>
<P>Note that you may assume that the template argument is a <A HREF="../concepts/blocking.html">Blocking</A> <A HREF="../concepts/sink.html">Sink</A>, and that you must reset the values of <CODE>current_word_</CODE> and <CODE>off_</CODE> before calling <A HREF="../functions/write.html"><CODE>write</CODE></A>, in case <A HREF="../functions/write.html"><CODE>write</CODE></A> throws an exception.</P>
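<P>The reason for resetting the state before writing can be demonstrated with a sink that always throws. The sketch below is a simplified illustration, not the library's filter: <CODE>word_buffer</CODE>, <CODE>throwing_sink</CODE> and <CODE>reset_survives_failed_write</CODE> are hypothetical names, and the sink's <CODE>write</CODE> member takes a <CODE>std::string</CODE> purely for brevity.</P>

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical sink whose write always fails.
struct throwing_sink {
    void write(const std::string&)
    {
        throw std::runtime_error("write failed");
    }
};

// Simplified buffer using the same swap-then-write pattern as close above.
class word_buffer {
public:
    word_buffer() : off_(std::string::npos) { }

    void put(char c) { current_word_ += c; }

    template<typename Sink>
    void close(Sink& dest)
    {
        // Move the state into locals, resetting the members in the process.
        std::string current_word;
        std::string::size_type off = std::string::npos;
        current_word.swap(current_word_);
        std::swap(off, off_);

        // From here on, an exception leaves the members already reset.
        if (off == std::string::npos)
            off = 0;
        assert(off <= current_word.size());
        if (!current_word.empty())
            dest.write(current_word.substr(off));
    }

    bool is_reset() const
    {
        return current_word_.empty() && off_ == std::string::npos;
    }

private:
    std::string current_word_;
    std::string::size_type off_;
};

// Returns true if the buffer is back in its initial state after a failed write.
bool reset_survives_failed_write()
{
    word_buffer buffer;
    buffer.put('h');
    buffer.put('i');
    throwing_sink sink;
    try {
        buffer.close(sink);
    } catch (const std::runtime_error&) {
        return buffer.is_reset();
    }
    return false;   // write was expected to throw
}
```

<P>Had the members been cleared only after the call to <CODE>write</CODE>, the exception would have skipped the cleanup and the stale word would have leaked into the next use of the filter.</P>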
