
<html>

<head>
<meta name="GENERATOR" content="Microsoft FrontPage 5.0">
<meta name="ProgId" content="FrontPage.Editor.Document">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Choosing Approach</title>
<link href="styles.css" rel="stylesheet">
</head>

<body>

<table border="0" cellpadding="5" cellspacing="0" style="border-collapse: collapse" bordercolor="#111111" width="100%">
  <tr>
    <td width="339">
<a href="../../../index.html">
<img src="../../../boost.png" alt="Boost logo" align="middle" border="0" width="277" height="86"></a></td>
    <td align="middle" width="1253">
    <font size="6"><b>Choosing the Approach</b></font></td>
  </tr>
</table>

<table border="0" cellpadding="5" cellspacing="0" style="border-collapse: collapse"
  bordercolor="#111111" bgcolor="#D7EEFF" width="100%">
  <tr>
    <td><b>
    <a href="index.html">Endian Home</a>&nbsp;&nbsp;&nbsp;&nbsp;
    <a href="conversion.html">Conversion Functions</a>&nbsp;&nbsp;&nbsp;&nbsp;
    <a href="arithmetic.html">Arithmetic Types</a>&nbsp;&nbsp;&nbsp;&nbsp;
    <a href="buffers.html">Buffer Types</a>&nbsp;&nbsp;&nbsp;&nbsp;
    <a href="choosing_approach.html">Choosing Approach</a></b></td>
  </tr>
</table>
<p></p>

<table border="1" cellpadding="5" cellspacing="0" style="border-collapse: collapse" bordercolor="#111111" align="right">
  <tr>
    <td width="100%" bgcolor="#D7EEFF" align="center">
      <i><b>Contents</b></i></td>
  </tr>
  <tr>
    <td width="100%" bgcolor="#E8F5FF">
<a href="#Introduction">Introduction</a><br>
<a href="#Choosing">Choosing between conversion functions,</a><br>
  &nbsp;  <a href="#Choosing">buffer types, and  arithmetic types</a><br>
&nbsp;&nbsp;&nbsp;<a href="#Characteristics">Characteristics</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Endianness-invariants">Endianness invariants</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Conversion-explicitness">Conversion explicitness</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Arithmetic-operations">Arithmetic operations</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Sizes">Sizes</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Alignments">Alignments</a><br>
&nbsp;&nbsp;&nbsp;<a href="#Design-patterns">Design patterns</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#As-needed">Convert only as needed (i.e. lazy)</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Anticipating-need">Convert in anticipation of need</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Convert-generally-as-needed-locally-in-anticipation">Generally 
as needed, locally in anticipation</a><br>
&nbsp;&nbsp;&nbsp;<a href="#Use-cases">Use case examples</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Porting-endian-unaware-codebase">Porting endian unaware codebase</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Porting-endian-aware-codebase">Porting endian aware codebase</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Reliability-arithmetic-speed">Reliability and arithmetic-speed</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Reliability-ease-of-use">Reliability and ease-of-use</a></td>
  </tr>
  </table>

<h2><a name="Introduction">Introduction</a></h2>

<p>Deciding which is the best endianness approach (conversion functions, buffer 
types, or arithmetic types) for a particular application involves complex 
engineering trade-offs. It is hard to assess those trade-offs without some 
understanding of the different interfaces, so you might want to read the
<a href="conversion.html">conversion functions</a>, <a href="buffers.html">
buffer types</a>, and <a href="arithmetic.html">arithmetic types</a> pages 
before diving into this page.</p>

<h2><a name="Choosing">Choosing</a> between  conversion functions,  buffer types, 
and  arithmetic types</h2>

<p>The best approach to endianness for a particular application  depends on  the interaction between 
the application&#39;s needs and the characteristics of each of the three  approaches.</p>

<p><b>Recommendation:</b> If you are new to endianness, uncertain, or don&#39;t want to invest 
the time to study the engineering trade-offs, use <a href="arithmetic.html">endian arithmetic types</a>. They are safe, easy 
to use, and easy to maintain. If needed, use the
<a href="#Anticipating-need"><i>anticipating need</i></a> design pattern locally around performance hot spots 
such as lengthy loops.</p>

<h3><a name="Background">Background</a> </h3>

<p>Dealing with endianness usually implies a program portability or a data 
portability requirement, and often both. That means real programs dealing with 
endianness are usually complex, so the examples shown here would really be 
written as multiple functions spread across multiple translation units. They 
would involve interfaces that cannot be altered because they are supplied by 
third parties or the standard library. </p>

<h3><a name="Characteristics">Characteristics</a></h3>

<p>The characteristics that differentiate the three approaches to endianness are the endianness 
invariants, conversion explicitness, arithmetic operations, sizes available, and 
alignment requirements.</p>

<h4><a name="Endianness-invariants">Endianness invariants</a></h4>

<blockquote>

<p><b>Endian conversion functions</b> use objects of the ordinary C++ arithmetic 
types like <code>int</code> or <code>unsigned short</code> to hold values. That 
breaks the implicit invariant that the C++ language rules apply. The usual 
language rules only apply if the endianness of the object is currently set to the native endianness for the platform. That can 
make it very hard to reason about logic flow, and can result in hard-to-find 
bugs.</p>

<p>For example:</p>

<blockquote>
  <pre>struct data_t  // big endian
{
  int32_t   v1;  // description ...
  int32_t   v2;  // description ...
  ... additional character data members (i.e. non-endian)
  int32_t   v3;  // description ...
};

data_t data;

read(data);
big_to_native_inplace(data.v1);
big_to_native_inplace(data.v2);

... 

++data.v1;
third_party::func(data.v2);

... 

native_to_big_inplace(data.v1);
native_to_big_inplace(data.v2);
write(data);
</pre>
  <p>The programmer didn&#39;t bother to convert <code>data.v3</code> to native 
  endianness because that member isn&#39;t used. A later maintainer needs to pass
  <code>data.v3</code> to the third-party function, so adds <code>third_party::func(data.v3);</code> 
  somewhere deep in the code. This causes a silent failure because the usual 
  invariant that an object of type <code>int32_t</code> holds a value as 
  described by the C++ core language does not apply.</p>
</blockquote>
<p><b>Endian buffer and arithmetic types</b> hold values internally as arrays of 
characters with an invariant that the endianness of the array never changes. 
That makes these types easier to use and programs easier to maintain. </p>
<p>Here is the same example, using an endian arithmetic type:</p>
<blockquote>
  <pre>struct data_t
{
  big_int32_t   v1;  // description ...
  big_int32_t   v2;  // description ...
  ... additional character data members (i.e. non-endian)
  big_int32_t   v3;  // description ...
};

data_t data;

read(data);

... 

++data.v1;
third_party::func(data.v2);

... 

write(data);
</pre>
  <p>A later maintainer can add <code>third_party::func(data.v3)</code> and it 
  will just work.</p>
</blockquote>
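<p>As a rough illustration of that invariant, the sketch below uses a hypothetical 
stand-in type, <code>be_int32</code>, written with only the standard library. It is 
not the library&#39;s <code>big_int32_t</code>, and <code>third_party_func</code> is 
likewise an assumed placeholder for an interface that expects a native integer. The 
stored bytes are always most-significant first, and conversion to a native
<code>int32_t</code> happens on every read.</p>

```cpp
#include <cstdint>

// Hypothetical stand-in for an endian arithmetic type; not the Boost class.
// Invariant: bytes_ always holds the value in big endian order.
class be_int32
{
  unsigned char bytes_[4];

public:
  be_int32(std::int32_t v = 0) { *this = v; }

  // Storing a native value always writes the bytes most-significant first.
  be_int32& operator=(std::int32_t v)
  {
    const std::uint32_t u = static_cast<std::uint32_t>(v);
    bytes_[0] = static_cast<unsigned char>((u >> 24) & 0xFF);
    bytes_[1] = static_cast<unsigned char>((u >> 16) & 0xFF);
    bytes_[2] = static_cast<unsigned char>((u >> 8) & 0xFF);
    bytes_[3] = static_cast<unsigned char>(u & 0xFF);
    return *this;
  }

  // Implicit conversion to the native representation on every read.
  operator std::int32_t() const
  {
    const std::uint32_t u = (std::uint32_t(bytes_[0]) << 24)
                          | (std::uint32_t(bytes_[1]) << 16)
                          | (std::uint32_t(bytes_[2]) << 8)
                          |  std::uint32_t(bytes_[3]);
    return static_cast<std::int32_t>(u);
  }

  // Read-only view of the stored (external) representation.
  const unsigned char* data() const { return bytes_; }
};

// Placeholder for a third-party interface that expects a native integer.
inline std::int32_t third_party_func(std::int32_t x) { return x + 1; }
```

<p>With a type like this, <code>third_party_func(v)</code> receives a correct native 
value for any <code>be_int32 v</code>, whether or not earlier code remembered to 
convert it, because there is no separate "converted" state to track.</p>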

</blockquote>

<h4><a name="Conversion-explicitness">Conversion explicitness</a></h4>

<blockquote>

<p><b>Endian conversion functions</b> and <b>buffer types</b> never perform 
implicit conversions. This gives users explicit control of when conversion 
occurs, and may help avoid unnecessary conversions.</p>

<p><b>Endian arithmetic types</b> perform conversion implicitly. That makes 
these types very easy to use, but can result in unnecessary conversions. Failure 
to hoist conversions out of inner loops can bring a performance penalty.</p>
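<p>The difference can be sketched with a hypothetical buffer-style stand-in,
<code>be_uint16_buf</code>, that converts only when its <code>value()</code> member 
is called. The type and the <code>scaled</code> function are assumptions made for 
illustration and use only the standard library. Because the conversion is a visible 
call, it is easy to see that it happens once, outside the loop.</p>

```cpp
#include <cstdint>

// Hypothetical buffer-style type; not the Boost class.
// No implicit conversions: callers must ask for the native value.
class be_uint16_buf
{
  unsigned char bytes_[2];

public:
  explicit be_uint16_buf(std::uint16_t v = 0) { store(v); }

  // Explicit conversion from native to big endian storage.
  void store(std::uint16_t v)
  {
    bytes_[0] = static_cast<unsigned char>(v >> 8);
    bytes_[1] = static_cast<unsigned char>(v & 0xFF);
  }

  // Explicit conversion from big endian storage to native.
  std::uint16_t value() const
  {
    return static_cast<std::uint16_t>((bytes_[0] << 8) | bytes_[1]);
  }
};

// Adds the buffer's value n times, converting once before the loop.
inline std::uint32_t scaled(const be_uint16_buf& buf, std::uint32_t n)
{
  const std::uint16_t native = buf.value();  // the single explicit conversion
  std::uint32_t total = 0;
  for (std::uint32_t i = 0; i < n; ++i)
    total += native;                         // no conversion inside the loop
  return total;
}
```

<p>An arithmetic-style type would allow <code>total += buf</code> directly, which is 
more convenient but hides a conversion in every iteration.</p>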

</blockquote>

<h4><a name="Arithmetic-operations">Arithmetic operations</a></h4>

<blockquote>

<p><b>Endian conversion functions</b> do not supply arithmetic 
operations, but this is not a concern since this approach uses ordinary C++ 
arithmetic types to hold values.</p>

<p><b>Endian buffer types</b> do not supply arithmetic operations. Although this 
approach avoids unnecessary conversions, it can result in the introduction of 
additional variables and confuse maintenance programmers.</p>

<p><b>Endian arithmetic types</b> do supply arithmetic operations. They 
are very easy to use if lots of arithmetic is involved. </p>

</blockquote>

<h4><a name="Sizes">Sizes</a></h4>

<blockquote>

<p><b>Endianness conversion functions</b> only support 1, 2, 4, and 8 byte 
integers. That&#39;s sufficient for many applications.</p>

<p><b>Endian buffer and arithmetic types</b> support 1, 2, 3, 4, 5, 6, 7, and 8 
byte integers. For an application where memory use or I/O speed is the limiting 
factor, using sizes tailored to application needs can be  useful.</p>
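<p>To show what a tailored size looks like in practice, here is a minimal sketch of 
a 3-byte big endian unsigned integer. The names <code>be_uint24</code>,
<code>store24</code>, and <code>load24</code> are hypothetical and are not the 
library&#39;s types or functions. The size check assumes a mainstream ABI where a 
struct holding only a <code>unsigned char</code> array has no padding.</p>

```cpp
#include <cstdint>

// Hypothetical 3-byte big endian unsigned integer; not the Boost type.
struct be_uint24
{
  unsigned char bytes[3];
};

// Stores the low 24 bits of v, most-significant byte first.
inline void store24(be_uint24& dst, std::uint32_t v)
{
  dst.bytes[0] = static_cast<unsigned char>((v >> 16) & 0xFF);
  dst.bytes[1] = static_cast<unsigned char>((v >> 8) & 0xFF);
  dst.bytes[2] = static_cast<unsigned char>(v & 0xFF);
}

// Reassembles the native value from the three stored bytes.
inline std::uint32_t load24(const be_uint24& src)
{
  return (std::uint32_t(src.bytes[0]) << 16)
       | (std::uint32_t(src.bytes[1]) << 8)
       |  std::uint32_t(src.bytes[2]);
}

// Assumed to hold on mainstream ABIs: three bytes, no padding.
static_assert(sizeof(be_uint24) == 3, "expected a 3-byte object");
```

<p>A record with many such fields occupies 25% less space for each of them than it 
would with 4-byte integers, which is the kind of saving the paragraph above refers 
to.</p>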

</blockquote>

<h4><a name="Alignments">Alignments</a></h4>

<blockquote>

<p><b>Endianness conversion functions</b> only support aligned integer and 
floating-point types. That&#39;s sufficient for most applications.</p>

<p><b>Endian buffer and arithmetic types</b> support both aligned and unaligned 
integer and floating-point types. Unaligned types are rarely needed, but when 
needed they are often very useful and workarounds are painful. For example,</p>

<blockquote>
  <p>Non-portable code like this:</p>
    <blockquote>
      <pre>struct S {
  uint16_t a;&nbsp; // big endian
  uint32_t b;&nbsp; // big endian
} __attribute__ ((packed));</pre>
    </blockquote>
    <p>Can be replaced with portable code like this:</p>
    <blockquote>
      <pre>struct S {
  big_uint16_ut a;
  big_uint32_ut b;
};</pre>
    </blockquote>
      </blockquote>
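<p>The reason byte-array-based types avoid the packing attribute can be sketched 
with hypothetical stand-ins <code>be_u16</code> and <code>be_u32</code> (assumed 
names, not the library&#39;s types). Since they contain only
<code>unsigned char</code> arrays, their alignment requirement is 1 on mainstream 
ABIs, so the compiler has no reason to insert padding between members.</p>

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins built from byte arrays; not the Boost types.
struct be_u16 { unsigned char b[2]; };
struct be_u32 { unsigned char b[4]; };

// Native members: the compiler typically pads after 'a' to align 'b'.
struct S_native
{
  std::uint16_t a;
  std::uint32_t b;
};

// Byte-array members: no alignment requirement, so no padding is needed.
struct S_bytes
{
  be_u16 a;
  be_u32 b;
};

// These checks are assumed to hold on mainstream ABIs.
static_assert(alignof(S_bytes) == 1, "byte arrays need no alignment");
static_assert(sizeof(S_bytes) == 6, "matches the 6-byte external layout");
static_assert(offsetof(S_bytes, b) == 2, "b follows a immediately");
```

<p>On common platforms <code>sizeof(S_native)</code> is 8 rather than 6, which is 
why the original code needed a compiler-specific packing attribute.</p>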

</blockquote>

<h3><a name="Design-patterns">Design patterns</a></h3>

<p>Applications often traffic in endian data as records or packets containing 
multiple endian data elements. For simplicity, we will just call them records.</p>

<p>If desired endianness differs from native endianness, a conversion has to be 
performed. When should that conversion occur? Three design patterns have 
evolved.</p>

<h4><a name="As-needed">Convert only as needed</a> (i.e. lazy)</h4>

<p>This pattern defers conversion to the point in the code where the data 
element is actually used.</p>

<p>This pattern is appropriate when which endian element is actually used varies 
greatly according to record content or other circumstances.</p>
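<p>A minimal sketch of the lazy pattern follows. The record layout, the
<code>kind</code> tag, and the helper <code>load_be32</code> are all hypothetical, 
chosen only to show a case where the record content decides which field is used. 
Only the field that is actually needed gets converted.</p>

```cpp
#include <cstdint>

// Hypothetical helper: read a big endian 32-bit value from 4 raw bytes.
inline std::uint32_t load_be32(const unsigned char* p)
{
  return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16)
       | (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
}

// Hypothetical record: a one-byte kind tag followed by two big endian fields.
struct record
{
  unsigned char kind;       // 0: only 'length' matters; otherwise only 'offset'
  unsigned char length[4];  // big endian
  unsigned char offset[4];  // big endian
};

// Converts only the field that this kind of record actually uses.
inline std::uint32_t relevant_field(const record& r)
{
  return r.kind == 0 ? load_be32(r.length) : load_be32(r.offset);
}
```

<p>The unused field is never converted, so records with many rarely used fields 
cost no more than the fields that are touched.</p>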

<h4><a name="Anticipating-need">Convert in anticipation of need</a></h4>

<p>This pattern performs conversion to native endianness in anticipation of use, 
such as immediately after reading records. If needed, conversion to the output 
endianness is performed after all possible needs have passed, such as just 
before writing records.</p>

<p>One implementation of this pattern is to create a proxy record with 
endianness converted to native in a read function, and expose only that proxy to 
the rest of the implementation. A write function, if needed, handles the 
conversion from native to the desired output endianness.</p>

<p>This pattern is appropriate when all endian elements in a record are 
typically used regardless of record content or other circumstances.</p>
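<p>The proxy-record implementation described above might look roughly like the 
following sketch. All names here (<code>wire_record</code>,
<code>native_record</code>, <code>from_wire</code>, <code>to_wire</code>, and the 
two byte helpers) are hypothetical, and the code uses only the standard library 
rather than the library&#39;s conversion functions.</p>

```cpp
#include <cstdint>

// Hypothetical helpers for 32-bit big endian byte sequences.
inline std::uint32_t read_be32(const unsigned char* p)
{
  return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16)
       | (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
}

inline void write_be32(unsigned char* p, std::uint32_t v)
{
  p[0] = static_cast<unsigned char>((v >> 24) & 0xFF);
  p[1] = static_cast<unsigned char>((v >> 16) & 0xFF);
  p[2] = static_cast<unsigned char>((v >> 8) & 0xFF);
  p[3] = static_cast<unsigned char>(v & 0xFF);
}

// External layout: two big endian fields, exactly as read or written.
struct wire_record
{
  unsigned char id[4];
  unsigned char count[4];
};

// Proxy exposed to the rest of the program: native endianness only.
struct native_record
{
  std::uint32_t id;
  std::uint32_t count;
};

// Called right after reading: converts every field up front.
inline native_record from_wire(const wire_record& w)
{
  return native_record{ read_be32(w.id), read_be32(w.count) };
}

// Called just before writing: converts every field back.
inline wire_record to_wire(const native_record& n)
{
  wire_record w;
  write_be32(w.id, n.id);
  write_be32(w.count, n.count);
  return w;
}
```

<p>Code between <code>from_wire</code> and <code>to_wire</code> works on
<code>native_record</code> alone, so no conversion appears in the main logic.</p>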

<h4><a name="Convert-generally-as-needed-locally-in-anticipation">Convert 
only as needed, except locally in anticipation of need</a></h4>

<p>This pattern in general defers conversion but for specific local needs does 
anticipatory conversion. Although particularly appropriate when coupled with the endian buffer 
or arithmetic types, it also works well with the conversion functions.</p>

<p>Example:</p>

<blockquote>
  <pre>struct data_t
{
  big_int32_t   v1;
  big_int32_t   v2;
  big_int32_t   v3;
};

data_t data;

read(data);

...
++data.v1;
...

int32_t v3_temp = data.v3;  // hoist conversion out of loop

for (int32_t i = 0; i &lt; <i><b>large-number</b></i>; ++i)
{
  ... <i><b>lengthy computation that accesses </b></i>v3_temp<i><b> many times</b></i> ...
}
data.v3 = v3_temp; 

write(data);
</pre>
</blockquote>

<p dir="ltr">In general the above pseudo-code leaves conversion up to the endian 
arithmetic type <code>big_int32_t</code>. But to avoid conversion inside the 
loop, a temporary is created before the loop is entered, and then used to set 
the new value of <code>data.v3</code> after the loop is complete.</p>

<blockquote>

<p dir="ltr">Question: Won&#39;t the compiler&#39;s optimizer hoist the conversion out 
of the loop anyhow?</p>

<p dir="ltr">Answer: VC++ 2015 Preview, and probably others, does not, even for 
a toy test program. Although the savings is small (two register <code>
<span style="font-size: 85%">bswap</span></code> instructions), the cost might 
be significant if the loop is repeated enough times. On the other hand, the 
program may be so dominated by I/O time that even a lengthy loop will be 
immaterial.</p>

</blockquote>

<h3><a name="Use-cases">Use case examples</a></h3>

<h4><a name="Porting-endian-unaware-codebase">Porting endian unaware codebase</a></h4>

<p>An existing codebase runs on big endian systems. It does not 
currently deal with endianness. The codebase needs to be modified so it can run 
on little endian systems under various operating systems. To ease the 
transition and protect the value of existing files, external data will continue to 
be maintained as big endian.</p>

<p dir="ltr">The <a href="arithmetic.html">endian 
arithmetic approach</a> is recommended to meet these needs. A relatively small 
number of header files dealing with binary I/O layouts need to change types. For 
example,&nbsp;
<code>short</code> or <code>int16_t</code> would change to <code>big_int16_t</code>. No 
changes are required for <code>.cpp</code> files.</p>

<h4><a name="Porting-endian-aware-codebase">Porting endian aware codebase</a></h4>

<p>An existing codebase runs on little-endian Linux systems. It already 
deals with endianness via
<a href="http://man7.org/linux/man-pages/man3/endian.3.html">Linux provided 
functions</a>. Because of a business merger, the codebase has to be quickly 
modified for Windows and possibly other operating systems, while still 
supporting Linux. The codebase is reliable and the programmers are all 
well-aware of endian issues. </p>

<p dir="ltr">These factors all argue for an <a href="conversion.html">endian conversion 
approach</a> that just mechanically changes the calls to <code>htobe32</code>, 
etc. to <code>boost::endian::native_to_big</code>, etc. and replaces <code>&lt;endian.h&gt;</code> 
with <code>&lt;boost/endian/conversion.hpp&gt;</code>.</p>

<h4><a name="Reliability-arithmetic-speed">Reliability and arithmetic-speed</a></h4>

<p>A new, complex, multi-threaded application is to be developed that must run 
on little endian machines, but do big endian network I/O. The developers believe 
computational speed for endian variables is critical but have seen numerous bugs 
result from inability to reason about endian conversion state. They are also 
worried that future maintenance changes could inadvertently introduce a lot of 
slow conversions if full-blown endian arithmetic types are used.</p>

<p>The <a href="buffers.html">endian buffers</a> approach is made-to-order for 
this use case.</p>

<h4><a name="Reliability-ease-of-use">Reliability and ease-of-use</a></h4>

<p>A new, complex, multi-threaded application is to be developed that must run 
on little endian machines, but do big endian network I/O. The developers believe 
computational speed for endian variables is <b>not critical</b> but have seen 
numerous bugs result from inability to reason about endian conversion state. 
They are also concerned about ease-of-use both during development and long-term 
maintenance.</p>

<p>Removing concern about conversion speed and adding concern about ease-of-use 
tips the balance strongly in favor of the <a href="arithmetic.html">endian 
arithmetic approach</a>.</p>

<hr>
<p>Last revised:
<!--webbot bot="Timestamp" s-type="EDITED" s-format="%d %B, %Y" startspan -->19 January, 2015<!--webbot bot="Timestamp" endspan i-checksum="38903" --></p>
<p>© Copyright Beman Dawes, 2011, 2013, 2014</p>
<p>Distributed under the Boost Software License, Version 1.0. See
<a href="http://www.boost.org/LICENSE_1_0.txt">www.boost.org/LICENSE_1_0.txt</a></p>

<p>&nbsp;</p>

</body>

</html>