Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | <!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
2 | <html> | |
3 | <!-- | |
4 | (C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com . | |
5 | Use, modification and distribution is subject to the Boost Software | |
6 | License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at | |
7 | http://www.boost.org/LICENSE_1_0.txt) | |
8 | --> | |
9 | <head> | |
10 | <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> | |
11 | <link rel="stylesheet" type="text/css" href="../../../boost.css"> | |
12 | <link rel="stylesheet" type="text/css" href="style.css"> | |
13 | <title>Serialization - Dataflow Iterators</title> | |
14 | </head> | |
15 | <body link="#0000ff" vlink="#800080"> | |
16 | <table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header"> | |
17 | <tr> | |
18 | <td valign="top" width="300"> | |
19 | <h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3> | |
20 | </td> | |
21 | <td valign="top"> | |
22 | <h1 align="center">Serialization</h1> | |
23 | <h2 align="center">Dataflow Iterators</h2> | |
24 | </td> | |
25 | </tr> | |
26 | </table> | |
27 | <hr> | |
28 | <h3>Motivation</h3> | |
29 | Consider the problem of translating an arbitrary length sequence of 8 bit bytes | |
30 | to base64 text. Such a process can be summarized as: | |
31 | <p> | |
32 | source => 8 bit bytes => 6 bit integers => encode to base64 characters => insert line breaks => destination | |
33 | <p> | |
34 | We would prefer a solution that is: | |
35 | <ul> | |
36 | <li>Decomposable, so we can code, test, verify and use each (simple) stage of the conversion | |
37 | independently. | |
38 | <li>Composable, so we can use this composite as a new component somewhere else. | |
39 | <li>Efficient, so we're not required to re-implement it. | |
40 | <li>Scalable, so that it works well for short and arbitrarily long sequences. | |
41 | </ul> | |
42 | The approach that comes closest to meeting these requirements is that described | |
43 | and implemented with <a href="../../iterator/doc/index.html">Iterator Adaptors</a>. | |
44 | The fundamental feature of an Iterator Adaptor template that makes it interesting to | |
45 | us is that it takes as a parameter a base iterator from which it derives its | |
46 | input. This suggests that something like the following might be possible. | |
47 | <pre><code> | |
48 | typedef | |
49 | insert_linebreaks< // insert line breaks every 76 characters | |
50 | base64_from_binary< // convert binary values to base64 characters | |
51 | transform_width< // retrieve 6 bit integers from a sequence of 8 bit bytes | |
52 | const char *, | |
53 | 6, | |
54 | 8 | |
55 | > | |
56 | > | |
57 | ,76 | |
58 | > | |
59 | base64_text; // compose all the above operations into a new iterator | |
60 | ||
61 | std::copy( | |
62 | base64_text(address), | |
63 | base64_text(address + count), | |
64 | ostream_iterator<CharType>(os) | |
65 | ); | |
66 | </code></pre> | |
67 | Indeed, this seems to be exactly the kind of problem that iterator adaptors are | |
68 | intended to address. The Iterator Adaptor library already includes | |
69 | modules which can be configured to implement some of the operations above. For example, | |
70 | included is <a target="transform_iterator" href="../../iterator/doc/transform_iterator.html"> | |
71 | transform_iterator</a>, which can be used to implement 6 bit integer => base64 code. | |
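<p>
The middle stages of the pipeline can be tried out in isolation before committing to iterators. The following is a minimal sketch using plain functions, not the library's implementation: the names <code style="white-space: normal">to_sextets</code>, <code style="white-space: normal">sextet_to_base64</code> and <code style="white-space: normal">encode_base64_unpadded</code> are hypothetical, and the sketch assumes that a trailing partial group is filled with zero bits and that no <code style="white-space: normal">=</code> padding or line breaks are emitted.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: regroup 8 bit bytes into 6 bit integers,
// most significant bits first. A trailing partial group is
// filled out with zero bits (an assumption of this sketch).
std::vector<unsigned> to_sextets(const std::string& bytes) {
    std::vector<unsigned> out;
    unsigned buffer = 0;  // bits received but not yet emitted (low end)
    int bits = 0;         // number of pending bits in buffer
    for (unsigned char c : bytes) {
        buffer = (buffer << 8) | c;
        bits += 8;
        while (bits >= 6) {
            bits -= 6;
            out.push_back((buffer >> bits) & 0x3Fu);
        }
    }
    if (bits > 0)
        out.push_back((buffer << (6 - bits)) & 0x3Fu);
    return out;
}

// Hypothetical helper: map one 6 bit integer to its base64 character.
char sextet_to_base64(unsigned v) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789+/";
    return table[v & 0x3Fu];
}

// Compose the two stages; no '=' padding, no line breaks.
std::string encode_base64_unpadded(const std::string& bytes) {
    std::string text;
    for (unsigned v : to_sextets(bytes))
        text.push_back(sextet_to_base64(v));
    return text;
}
```

Each function here corresponds to one arrow in the pipeline, which is what makes the stages testable on their own. Unlike the iterator formulation, this version materializes an intermediate vector, so it illustrates the decomposition but not the scalability goal.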
72 | ||
73 | <h3>Dataflow Iterators</h3> | |
74 | Unfortunately, not all iterators which inherit from Iterator Adaptors are guaranteed | |
75 | to meet the composability goals stated above. To accomplish this purpose, they have | |
76 | to be written with some additional considerations in mind. | |
77 | ||
78 | We define a Dataflow Iterator as a class inherited from <code style="white-space: normal">iterator_adaptor</code> which | |
79 | fulfills a small set of additional requirements. | |
80 | ||
81 | <h4>Templated Constructors</h4> | |
82 | <p> | |
83 | Templated constructors have the form: | |
84 | <pre><code> | |
85 | template<class T> | |
86 | dataflow_iterator(T start) : | |
87 | iterator_adaptor(Base(start)) | |
88 | {} | |
89 | </code></pre> | |
90 | When these constructors are applied to our example above, the following code is generated: | |
91 | <pre><code> | |
92 | std::copy( | |
93 | insert_linebreaks( | |
94 | base64_from_binary( | |
95 | transform_width( | |
96 | address | |
97 | ) | |
98 | ) | |
99 | ), | |
100 | insert_linebreaks( | |
101 | base64_from_binary( | |
102 | transform_width( | |
103 | address + count | |
104 | ) | |
105 | ) | |
106 | ), | |
107 | ostream_iterator<char>(os) | |
108 | ); | |
109 | </code></pre> | |
110 | The recursive application of this template is what automatically generates the | |
111 | constructor <code style="white-space: normal">base64_text(const char *)</code> in our example above. The original | |
112 | Iterator Adaptors include <code style="white-space: normal">make_xxx_iterator</code> helper functions for this purpose. | |
113 | However, I believe these are unwieldy to use compared to the above solution using | |
114 | templated constructors. | |
115 | <p> | |
116 | Unfortunately, some systems which fail to properly support partial function template | |
117 | ordering cannot support the concept of a templated constructor as implemented above. | |
118 | A special "wrapper" macro has been created to work around this problem. With this "wrapper" | |
119 | the above example is modified to: | |
120 | <pre><code> | |
121 | std::copy( | |
122 | base64_text(BOOST_MAKE_PFTO_WRAPPER(address)), | |
123 | base64_text(BOOST_MAKE_PFTO_WRAPPER(address + count)), | |
124 | ostream_iterator<char>(os) | |
125 | ); | |
126 | </code></pre> | |
127 | This macro is defined in <a target="pfto" href="../../../boost/serialization/pfto.hpp"><boost/serialization/pfto.hpp></a>. | |
128 | For more information about this topic, check the source. | |
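<p>
The templated constructor pattern from this section can be seen at work without any library support. The sketch below is illustrative only: it does not derive from <code style="white-space: normal">iterator_adaptor</code>, it omits the PFTO wrapper, and the stage names <code style="white-space: normal">to_upper_iter</code>, <code style="white-space: normal">dash_for_space_iter</code> and the helper <code style="white-space: normal">run</code> are hypothetical. What it shows is how constructing the outermost type from a raw pointer forwards the argument inward, one stage at a time.

```cpp
#include <cctype>
#include <string>

// Hypothetical stage: replaces each space read through it with '-'.
template<class Base>
class dash_for_space_iter {
    Base base_;
public:
    // Templated constructor: accept anything Base can be built from.
    template<class T>
    dash_for_space_iter(T start) : base_(Base(start)) {}
    char operator*() const { return *base_ == ' ' ? '-' : *base_; }
    dash_for_space_iter& operator++() { ++base_; return *this; }
    bool operator!=(const dash_for_space_iter& rhs) const {
        return base_ != rhs.base_;
    }
};

// Hypothetical stage: upper-cases each character read through it.
template<class Base>
class to_upper_iter {
    Base base_;
public:
    template<class T>
    to_upper_iter(T start) : base_(Base(start)) {}
    char operator*() const {
        return static_cast<char>(
            std::toupper(static_cast<unsigned char>(*base_)));
    }
    to_upper_iter& operator++() { ++base_; return *this; }
    bool operator!=(const to_upper_iter& rhs) const {
        return base_ != rhs.base_;
    }
};

// Compose the stages into one type, as with base64_text.
typedef to_upper_iter<dash_for_space_iter<const char*> > pipeline;

// pipeline(first) forwards first through both constructors.
std::string run(const char* first, const char* last) {
    std::string out;
    for (pipeline it(first), end(last); it != end; ++it)
        out.push_back(*it);
    return out;
}
```

Note that neither stage names the type at the bottom of the stack; each constructor only knows how to build its own <code style="white-space: normal">Base</code>, which is why the composite can be constructed directly from a <code style="white-space: normal">const char *</code>.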
129 | ||
130 | <h4>Dereferencing</h4> | |
131 | Dereferencing some iterators can cause problems. For example, a natural | |
132 | way to write a <code style="white-space: normal">remove_whitespace</code> iterator is to increment past the leading | |
133 | whitespace when the iterator is constructed. This will fail if the iterator passed to the | |
134 | constructor "points" to the end of a string, because skipping whitespace requires dereferencing it. The | |
135 | <a target="filter_iterator" href="../../iterator/doc/filter_iterator.html"> | |
136 | <code style="white-space: normal">filter_iterator</code></a> is implemented | |
137 | in this way, so it can't be used in our context. For our implementation of this iterator, | |
138 | space removal is therefore deferred until the iterator is actually dereferenced. | |
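<p>
The deferral described above can be sketched as follows. This is a simplified illustration and not the library's <code style="white-space: normal">remove_whitespace</code>: the class name <code style="white-space: normal">lazy_skip_ws</code> and the helper <code style="white-space: normal">strip</code> are hypothetical, and the sketch assumes the input does not end in whitespace, since skipping trailing whitespace would run into the end position.

```cpp
#include <cctype>
#include <string>

// Hypothetical iterator that skips whitespace lazily: nothing is
// read from the base at construction time, only on dereference.
template<class Base>
class lazy_skip_ws {
    mutable Base base_;
    mutable bool settled_;  // true once base_ rests on a non-space
    void settle() const {
        if (!settled_) {
            while (std::isspace(static_cast<unsigned char>(*base_)))
                ++base_;
            settled_ = true;
        }
    }
public:
    // Safe even when start is the end position: no dereference here.
    template<class T>
    lazy_skip_ws(T start) : base_(Base(start)), settled_(false) {}
    char operator*() const { settle(); return *base_; }
    lazy_skip_ws& operator++() {
        settle();           // make sure we step past a real character
        ++base_;
        settled_ = false;   // the next position has not been examined
        return *this;
    }
    bool operator!=(const lazy_skip_ws& rhs) const {
        return base_ != rhs.base_;
    }
};

// Copy a string with its whitespace removed
// (assumes s does not end in whitespace).
std::string strip(const std::string& s) {
    typedef lazy_skip_ws<const char*> iter;
    const char* first = s.c_str();
    std::string out;
    for (iter it(first), end(first + s.size()); it != end; ++it)
        out.push_back(*it);
    return out;
}
```

The end iterator in <code style="white-space: normal">strip</code> is constructed but never dereferenced, which is exactly the case that eager skipping in the constructor would get wrong.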
139 | ||
140 | <h4>Comparison</h4> | |
141 | The default implementation of iterator equality of <code style="white-space: normal">iterator_adaptor</code> just | |
142 | invokes the equality operator on the base iterators. Generally this is satisfactory. | |
143 | However, this implies that other operations (e.g. dereference) do not prematurely | |
144 | increment the base iterator. Avoiding this can be surprisingly tricky in some cases | |
145 | (e.g. <code style="white-space: normal">transform_width</code>). | |
146 | ||
147 | <p> | |
148 | Iterators which fulfill the above requirements should be composable and the above sample | |
149 | code should implement our binary to base64 conversion. | |
150 | ||
151 | <h3>Iterators Included in the Library</h3> | |
152 | Dataflow iterators for the serialization library are all defined in the namespace | |
153 | <code style="white-space: normal">boost::archive::iterators</code>. Included here are: | |
154 | <dl class="index"> | |
155 | <dt><a target="base64_from_binary" href="../../../boost/archive/iterators/base64_from_binary.hpp"> | |
156 | base64_from_binary</a></dt> | |
157 | <dd>transforms a sequence of integers to base64 text</dd> | |
158 | ||
159 | <dt><a target="binary_from_base64" href="../../../boost/archive/iterators/binary_from_base64.hpp"> | |
160 | binary_from_base64</a></dt> | |
161 | <dd>transforms a sequence of base64 characters to a sequence of integers</dd> | |
162 | ||
163 | <dt><a target="insert_linebreaks" href="../../../boost/archive/iterators/insert_linebreaks.hpp"> | |
164 | insert_linebreaks</a></dt> | |
165 | <dd>given a sequence, creates a sequence with newline characters inserted</dd> | |
166 | ||
167 | <dt><a target="mb_from_wchar" href="../../../boost/archive/iterators/mb_from_wchar.hpp"> | |
168 | mb_from_wchar</a></dt> | |
169 | <dd>transforms a sequence of wide characters to a sequence of multi-byte characters</dd> | |
170 | ||
171 | <dt><a target="remove_whitespace" href="../../../boost/archive/iterators/remove_whitespace.hpp"> | |
172 | remove_whitespace</a></dt> | |
173 | <dd>given a sequence of characters, returns a sequence with the whitespace characters | |
174 | removed. This is a derivation from the <code style="white-space: normal">boost::filter_iterator</code></dd> | |
175 | ||
176 | <dt><a target="transform_width" href="../../../boost/archive/iterators/transform_width.hpp"> | |
177 | transform_width</a></dt> | |
178 | <dd>transforms a sequence of x bit elements into a sequence of y bit elements. This | |
179 | is a key component in iterators which translate to and from base64 text.</dd> | |
180 | ||
181 | <dt><a target="wchar_from_mb" href="../../../boost/archive/iterators/wchar_from_mb.hpp"> | |
182 | wchar_from_mb</a></dt> | |
183 | <dd>transforms a sequence of multi-byte characters in the current locale to wide characters.</dd> | |
184 | ||
185 | <dt><a target="xml_escape" href="../../../boost/archive/iterators/xml_escape.hpp"> | |
186 | xml_escape</a></dt> | |
187 | <dd>escapes xml meta-characters from xml text</dd> | |
188 | ||
189 | <dt><a target="xml_unescape" href="../../../boost/archive/iterators/xml_unescape.hpp"> | |
190 | xml_unescape</a></dt> | |
191 | <dd>unescapes xml escape sequences to create a sequence of normal text</dd> | |
192 | </dl> | |
193 | <p> | |
194 | The standard stream iterators don't quite work for us. On systems which implement <code style="white-space: normal">wchar_t</code> | |
195 | as unsigned short integers (e.g. VC 6), they didn't function as I expected. I also made some | |
196 | adjustments to be consistent with our concept of Dataflow Iterators. Like the rest of our | |
197 | iterators, they are found in the namespace <code style="white-space: normal">boost::archive::iterators</code> to avoid | |
198 | conflicts with the standard library versions. | |
199 | <dl class = "index"> | |
200 | <dt><a target="istream_iterator" href="../../../boost/archive/iterators/istream_iterator.hpp"> | |
201 | istream_iterator</a></dt> | |
202 | <dt><a target="ostream_iterator" href="../../../boost/archive/iterators/ostream_iterator.hpp"> | |
203 | ostream_iterator</a></dt> | |
204 | </dl> | |
205 | ||
206 | <hr> | |
207 | <p><i>© Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004. | |
208 | Distributed under the Boost Software License, Version 1.0. (See | |
209 | accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) | |
210 | </i></p> | |
211 | </body> | |
212 | </html> |