]>
Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" |
2 | "http://www.w3.org/TR/html4/loose.dtd"> | |
3 | ||
4 | <html> | |
5 | <head> | |
6 | <meta http-equiv="Content-Language" content="en-us"> | |
7 | <meta http-equiv="Content-Type" content="text/html; charset=us-ascii"> | |
8 | <link rel="stylesheet" type="text/css" href="../../../../boost.css"> | |
9 | ||
10 | <title>Rounding Policies</title> | |
11 | </head> | |
12 | ||
13 | <body lang="en"> | |
14 | <h1>Rounding Policies</h1> | |
15 | ||
16 | <p>In order to be as general as possible, the library uses a class to | |
17 | compute all the necessary functions rounded upward or downward. This class | |
18 | is the first parameter of <code>policies</code>, it is also the type named | |
19 | <code>rounding</code> in the policy definition of | |
20 | <code>interval</code>.</p> | |
21 | ||
22 | <p>By default, it is <code>interval_lib::rounded_math<T></code>. The | |
23 | class <code>interval_lib::rounded_math</code> is already specialized for | |
24 | the standard floating types (<code>float</code> , <code>double</code> and | |
25 | <code>long double</code>). So if the base type of your intervals is not one | |
26 | of these, a good solution would probably be to provide a specialization of | |
27 | this class. But if the default specialization of | |
28 | <code>rounded_math<T></code> for <code>float</code>, | |
29 | <code>double</code>, or <code>long double</code> is not what you seek, or | |
30 | you do not want to specialize | |
31 | <code>interval_lib::rounded_math<T></code> (say because you prefer to | |
32 | work in your own namespace) you can also define your own rounding policy | |
33 | and pass it directly to <code>interval_lib::policies</code>.</p> | |
34 | ||
35 | <h2>Requirements</h2> | |
36 | ||
37 | <p>Here comes what the class is supposed to provide. The domains are | |
38 | written next to their respective functions (as you can see, the functions | |
39 | do not have to worry about invalid values, but they have to handle infinite | |
40 | arguments).</p> | |
41 | <pre> | |
42 | /* Rounding requirements */ | |
43 | struct rounding { | |
44 | // default constructor, destructor | |
45 | rounding(); | |
46 | ~rounding(); | |
47 | // mathematical operations | |
48 | T add_down(T, T); // [-∞;+∞][-∞;+∞] | |
49 | T add_up (T, T); // [-∞;+∞][-∞;+∞] | |
50 | T sub_down(T, T); // [-∞;+∞][-∞;+∞] | |
51 | T sub_up (T, T); // [-∞;+∞][-∞;+∞] | |
52 | T mul_down(T, T); // [-∞;+∞][-∞;+∞] | |
53 | T mul_up (T, T); // [-∞;+∞][-∞;+∞] | |
54 | T div_down(T, T); // [-∞;+∞]([-∞;+∞]-{0}) | |
55 | T div_up (T, T); // [-∞;+∞]([-∞;+∞]-{0}) | |
56 | T sqrt_down(T); // ]0;+∞] | |
57 | T sqrt_up (T); // ]0;+∞] | |
58 | T exp_down(T); // [-∞;+∞] | |
59 | T exp_up (T); // [-∞;+∞] | |
60 | T log_down(T); // ]0;+∞] | |
61 | T log_up (T); // ]0;+∞] | |
62 | T cos_down(T); // [0;2π] | |
63 | T cos_up (T); // [0;2π] | |
64 | T tan_down(T); // ]-π/2;π/2[ | |
65 | T tan_up (T); // ]-π/2;π/2[ | |
66 | T asin_down(T); // [-1;1] | |
67 | T asin_up (T); // [-1;1] | |
68 | T acos_down(T); // [-1;1] | |
69 | T acos_up (T); // [-1;1] | |
70 | T atan_down(T); // [-∞;+∞] | |
71 | T atan_up (T); // [-∞;+∞] | |
72 | T sinh_down(T); // [-∞;+∞] | |
73 | T sinh_up (T); // [-∞;+∞] | |
74 | T cosh_down(T); // [-∞;+∞] | |
75 | T cosh_up (T); // [-∞;+∞] | |
76 | T tanh_down(T); // [-∞;+∞] | |
77 | T tanh_up (T); // [-∞;+∞] | |
78 | T asinh_down(T); // [-∞;+∞] | |
79 | T asinh_up (T); // [-∞;+∞] | |
80 | T acosh_down(T); // [1;+∞] | |
81 | T acosh_up (T); // [1;+∞] | |
82 | T atanh_down(T); // [-1;1] | |
83 | T atanh_up (T); // [-1;1] | |
84 | T median(T, T); // [-∞;+∞][-∞;+∞] | |
85 | T int_down(T); // [-∞;+∞] | |
86 | T int_up (T); // [-∞;+∞] | |
87 | // conversion functions | |
88 | T conv_down(U); | |
89 | T conv_up (U); | |
90 | // unprotected rounding class | |
91 | typedef ... unprotected_rounding; | |
92 | }; | |
93 | </pre> | |
94 | ||
95 | <p>The constructor and destructor of the rounding class have a very | |
96 | important semantic requirement: they are responsible for setting and | |
97 | resetting the rounding modes of the computation on T. For instance, if T is | |
98 | a standard floating point type and floating point computation is performed | |
99 | according to the Standard IEEE 754, the constructor can save the current | |
100 | rounding state, each <code>_up</code> (resp. <code>_down</code>) function | |
101 | will round up (resp. down), and the destructor will restore the saved | |
102 | rounding state. Indeed this is the behavior of the default rounding | |
103 | policy.</p> | |
104 | ||
105 | <p>The meaning of all the mathematical functions up until | |
106 | <code>atanh_up</code> is clear: each function returns number representable | |
107 | in the type <code>T</code> which is a lower bound (for <code>_down</code>) | |
108 | or upper bound (for <code>_up</code>) on the true mathematical result of | |
109 | the corresponding function. The function <code>median</code> computes the | |
110 | average of its two arguments rounded to its nearest representable number. | |
111 | The functions <code>int_down</code> and <code>int_up</code> compute the | |
112 | nearest integer smaller or bigger than their argument. Finally, | |
113 | <code>conv_down</code> and <code>conv_up</code> are responsible of the | |
114 | conversions of values of other types to the base number type: the first one | |
115 | must round down the value and the second one must round it up.</p> | |
116 | ||
117 | <p>The type <code>unprotected_rounding</code> allows to remove all | |
118 | controls. For reasons why one might to do this, see the <a href= | |
119 | "#Protection">protection</a> paragraph below.</p> | |
120 | ||
121 | <h2>Overview of the provided classes</h2> | |
122 | ||
123 | <p>A lot of classes are provided. The classes are organized by level. At | |
124 | the bottom is the class <code>rounding_control</code>. At the next level | |
125 | come <code>rounded_arith_exact</code>, <code>rounded_arith_std</code> and | |
126 | <code>rounded_arith_opp</code>. Then there are | |
127 | <code>rounded_transc_dummy</code>, <code>rounded_transc_exact</code>, | |
128 | <code>rounded_transc_std</code> and <code>rounded_transc_opp</code>. And | |
129 | finally are <code>save_state</code> and <code>save_state_nothing</code>. | |
130 | Each of these classes provide a set of members that are required by the | |
131 | classes of the next level. For example, a <code>rounded_transc_...</code> | |
132 | class needs the members of a <code>rounded_arith_...</code> class.</p> | |
133 | ||
134 | <p>When they exist in two versions <code>_std</code> and <code>_opp</code>, | |
135 | the first one does switch the rounding mode each time, and the second one | |
136 | tries to keep it oriented toward plus infinity. The main purpose of the | |
137 | <code>_opp</code> version is to speed up the computations through the use | |
138 | of the "opposite trick" (see the <a href="#perf">performance notes</a>). | |
139 | This version requires the rounding mode to be upward before entering any | |
140 | computation functions of the class. It guarantees that the rounding mode | |
141 | will still be upward at the exit of the functions.</p> | |
142 | ||
143 | <p>Please note that it is really a very bad idea to mix the | |
144 | <code>_opp</code> version with the <code>_std</code> since they do not have | |
145 | compatible properties.</p> | |
146 | ||
147 | <p>There is a third version named <code>_exact</code> which computes the | |
148 | functions without changing the rounding mode. It is an "exact" version | |
149 | because it is intended for a base type that produces exact results.</p> | |
150 | ||
151 | <p>The last version is the <code>_dummy</code> version. It does not do any | |
152 | computations but still produces compatible results.</p> | |
153 | ||
154 | <p>Please note that it is possible to use the "exact" version for an | |
155 | inexact base type, e.g. <code>float</code> or <code>double</code>. In that | |
156 | case, the inclusion property is no longer guaranteed, but this can be | |
157 | useful to speed up the computation when the inclusion property is not | |
158 | desired strictly. For instance, in computer graphics, a small error due to | |
159 | floating-point roundoff is acceptable as long as an approximate version of | |
160 | the inclusion property holds.</p> | |
161 | ||
162 | <p>Here comes what each class defines. Later, when they will be described | |
163 | more thoroughly, these members will not be repeated. Please come back here | |
164 | in order to see them. Inheritance is also used to avoid repetitions.</p> | |
165 | <pre> | |
166 | template <class T> | |
167 | struct rounding_control | |
168 | { | |
169 | typedef ... rounding_mode; | |
170 | void set_rounding_mode(rounding_mode); | |
171 | void get_rounding_mode(rounding_mode&); | |
172 | void downward (); | |
173 | void upward (); | |
174 | void to_nearest(); | |
175 | T to_int(T); | |
176 | T force_rounding(T); | |
177 | }; | |
178 | ||
179 | template <class T, class Rounding> | |
180 | struct rounded_arith_... : Rounding | |
181 | { | |
182 | void init(); | |
183 | T add_down(T, T); | |
184 | T add_up (T, T); | |
185 | T sub_down(T, T); | |
186 | T sub_up (T, T); | |
187 | T mul_down(T, T); | |
188 | T mul_up (T, T); | |
189 | T div_down(T, T); | |
190 | T div_up (T, T); | |
191 | T sqrt_down(T); | |
192 | T sqrt_up (T); | |
193 | T median(T, T); | |
194 | T int_down(T); | |
195 | T int_up (T); | |
196 | }; | |
197 | ||
198 | template <class T, class Rounding> | |
199 | struct rounded_transc_... : Rounding | |
200 | { | |
201 | T exp_down(T); | |
202 | T exp_up (T); | |
203 | T log_down(T); | |
204 | T log_up (T); | |
205 | T cos_down(T); | |
206 | T cos_up (T); | |
207 | T tan_down(T); | |
208 | T tan_up (T); | |
209 | T asin_down(T); | |
210 | T asin_up (T); | |
211 | T acos_down(T); | |
212 | T acos_up (T); | |
213 | T atan_down(T); | |
214 | T atan_up (T); | |
215 | T sinh_down(T); | |
216 | T sinh_up (T); | |
217 | T cosh_down(T); | |
218 | T cosh_up (T); | |
219 | T tanh_down(T); | |
220 | T tanh_up (T); | |
221 | T asinh_down(T); | |
222 | T asinh_up (T); | |
223 | T acosh_down(T); | |
224 | T acosh_up (T); | |
225 | T atanh_down(T); | |
226 | T atanh_up (T); | |
227 | }; | |
228 | ||
229 | template <class Rounding> | |
230 | struct save_state_... : Rounding | |
231 | { | |
232 | save_state_...(); | |
233 | ~save_state_...(); | |
234 | typedef ... unprotected_rounding; | |
235 | }; | |
236 | </pre> | |
237 | ||
238 | <h2>Synopsis</h2> | |
239 | <pre> | |
240 | namespace boost { | |
241 | namespace numeric { | |
242 | namespace interval_lib { | |
243 | ||
244 | <span style="color: #FF0000">/* basic rounding control */</span> | |
245 | template <class T> struct rounding_control; | |
246 | ||
247 | <span style="color: #FF0000">/* arithmetic functions rounding */</span> | |
248 | template <class T, class Rounding = rounding_control<T> > struct rounded_arith_exact; | |
249 | template <class T, class Rounding = rounding_control<T> > struct rounded_arith_std; | |
250 | template <class T, class Rounding = rounding_control<T> > struct rounded_arith_opp; | |
251 | ||
252 | <span style="color: #FF0000">/* transcendental functions rounding */</span> | |
253 | template <class T, class Rounding> struct rounded_transc_dummy; | |
254 | template <class T, class Rounding = rounded_arith_exact<T> > struct rounded_transc_exact; | |
255 | template <class T, class Rounding = rounded_arith_std<T> > struct rounded_transc_std; | |
256 | template <class T, class Rounding = rounded_arith_opp<T> > struct rounded_transc_opp; | |
257 | ||
258 | <span style="color: #FF0000">/* rounding-state-saving classes */</span> | |
259 | template <class Rounding> struct save_state; | |
260 | template <class Rounding> struct save_state_nothing; | |
261 | ||
262 | <span style="color: #FF0000">/* default policy for type T */</span> | |
263 | template <class T> struct rounded_math; | |
264 | template <> struct rounded_math<float>; | |
265 | template <> struct rounded_math<double>; | |
266 | ||
267 | <span style= | |
268 | "color: #FF0000">/* some metaprogramming to convert a protected to unprotected rounding */</span> | |
269 | template <class I> struct unprotect; | |
270 | ||
271 | } // namespace interval_lib | |
272 | } // namespace numeric | |
273 | } // namespace boost | |
274 | </pre> | |
275 | ||
276 | <h2>Description of the provided classes</h2> | |
277 | ||
278 | <p>We now describe each class in the order they appear in the definition of | |
279 | a rounding policy (this outermost-to-innermost order is the reverse order | |
280 | from the synopsis).</p> | |
281 | ||
282 | <h3 id="Protection">Protection control</h3> | |
283 | ||
284 | <p>Protection refers to the fact that the interval operations will be | |
285 | surrounded by rounding mode controls. Unprotecting a class means to remove | |
286 | all the rounding controls. Each rounding policy provides a type | |
287 | <code>unprotected_rounding</code>. The required type | |
288 | <code>unprotected_rounding</code> gives another rounding class that enables | |
289 | to work when nested inside rounding. For example, the first three lines | |
290 | below should all produce the same result (because the first operation is | |
291 | the rounding constructor, and the last is its destructor, which take care | |
292 | of setting the rounding modes); and the last line is allowed to have an | |
293 | undefined behavior (since no rounding constructor or destructor is ever | |
294 | called).</p> | |
295 | <pre> | |
296 | T c; { rounding rnd; c = rnd.add_down(a, b); } | |
297 | T c; { rounding rnd1; { rounding rnd2; c = rnd2.add_down(a, b); } } | |
298 | T c; { rounding rnd1; { rounding::unprotected_rounding rnd2; c = rnd2.add_down(a, b); } } | |
299 | T d; { rounding::unprotected_rounding rnd; d = rnd.add_down(a, b); } | |
300 | </pre> | |
301 | ||
302 | <p>Naturally <code>rounding::unprotected_rounding</code> may simply be | |
303 | <code>rounding</code> itself. But it can improve performance if it is a | |
304 | simplified version with empty constructor and destructor. In order to avoid | |
305 | undefined behaviors, in the library, an object of type | |
306 | <code>rounding::unprotected_rounding</code> is guaranteed to be created | |
307 | only when an object of type <code>rounding</code> is already alive. See the | |
308 | <a href="#perf">performance notes</a> for some additional details.</p> | |
309 | ||
310 | <p>The support library defines a metaprogramming class template | |
311 | <code>unprotect</code> which takes an interval type <code>I</code> and | |
312 | returns an interval type <code>unprotect<I>::type</code> where the | |
313 | rounding policy has been unprotected. Some information about the types: | |
314 | <code>interval<T, interval_lib::policies<Rounding, _> | |
315 | >::traits_type::rounding</code> <b>is</b> the same type as | |
316 | <code>Rounding</code>, and <code>unprotect<interval<T, | |
317 | interval_lib::policies<Rounding, _> > >::type</code> <b>is</b> | |
318 | the same type as <code>interval<T, | |
319 | interval_lib::policies<Rounding::unprotected, _> ></code>.</p> | |
320 | ||
321 | <h3>State saving</h3> | |
322 | ||
323 | <p>First comes <code>save_state</code>. This class is responsible for | |
324 | saving the current rounding mode and calling init in its constructor, and | |
325 | for restoring the saved rounding mode in its destructor. This class also | |
326 | defines the <code>unprotected_rounding</code> type.</p> | |
327 | ||
328 | <p>If the rounding mode does not require any state-saving or | |
329 | initialization, <code>save_state_nothing</code> can be used instead of | |
330 | <code>save_state</code>.</p> | |
331 | ||
332 | <h3>Transcendental functions</h3> | |
333 | ||
334 | <p>The classes <code>rounded_transc_exact</code>, | |
335 | <code>rounded_transc_std</code> and <code>rounded_transc_opp</code> expect | |
336 | the std namespace to provide the functions exp log cos tan acos asin atan | |
337 | cosh sinh tanh acosh asinh atanh. For the <code>_std</code> and | |
338 | <code>_opp</code> versions, all these functions should respect the current | |
339 | rounding mode fixed by a call to downward or upward.</p> | |
340 | ||
341 | <p><strong>Please note:</strong> Unfortunately, the latter is rarely the | |
342 | case. It is the reason why a class <code>rounded_transc_dummy</code> is | |
343 | provided which does not depend on the functions from the std namespace. | |
344 | There is no magic, however. The functions of | |
345 | <code>rounded_transc_dummy</code> do not compute anything. They only return | |
346 | valid values. For example, <code>cos_down</code> always returns -1. In this | |
347 | way, we do verify the inclusion property for the default implementation, | |
348 | even if this has strictly no value for the user. In order to have useful | |
349 | values, another policy should be used explicitely, which will most likely | |
350 | lead to a violation of the inclusion property. In this way, we ensure that | |
351 | the violation is clearly pointed out to the user who then knows what he | |
352 | stands against. This class could have been used as the default | |
353 | transcendental rounding class, but it was decided it would be better for | |
354 | the compilation to fail due to missing declarations rather than succeed | |
355 | thanks to valid but unusable functions.</p> | |
356 | ||
357 | <h3>Basic arithmetic functions</h3> | |
358 | ||
359 | <p>The classes <code>rounded_arith_std</code> and | |
360 | <code>rounded_arith_opp</code> expect the operators + - * / and the | |
361 | function <code>std::sqrt</code> to respect the current rounding mode.</p> | |
362 | ||
363 | <p>The class <code>rounded_arith_exact</code> requires | |
364 | <code>std::floor</code> and <code>std::ceil</code> to be defined since it | |
365 | can not rely on <code>to_int</code>.</p> | |
366 | ||
367 | <h3>Rounding control</h3> | |
368 | ||
369 | <p>The functions defined by each of the previous classes did not need any | |
370 | explanation. For example, the behavior of <code>add_down</code> is to | |
371 | compute the sum of two numbers rounded downward. For | |
372 | <code>rounding_control</code>, the situation is a bit more complex.</p> | |
373 | ||
374 | <p>The basic function is <code>force_rounding</code> which returns its | |
375 | argument correctly rounded accordingly to the current rounding mode if it | |
376 | was not already the case. This function is necessary to handle delayed | |
377 | rounding. Indeed, depending on the way the computations are done, the | |
378 | intermediate results may be internally stored in a more precise format and | |
379 | it can lead to a wrong rounding. So the function enforces the rounding. | |
380 | <a href="#extreg">Here</a> is an example of what happens when the rounding | |
381 | is not enforced.</p> | |
382 | ||
383 | <p>The function <code>get_rounding_mode</code> returns the current rounding | |
384 | mode, <code>set_rounding_mode</code> sets the rounding mode back to a | |
385 | previous value returned by <code>get_rounding_mode</code>. | |
386 | <code>downward</code>, <code>upward</code> and <code>to_nearest</code> sets | |
387 | the rounding mode in one of the three directions. This rounding mode should | |
388 | be global to all the functions that use the type <code>T</code>. For | |
389 | example, after a call to <code>downward</code>, | |
390 | <code>force_rounding(x+y)</code> is expected to return the sum rounded | |
391 | toward -∞.</p> | |
392 | ||
393 | <p>The function <code>to_int</code> computes the nearest integer | |
394 | accordingly to the current rounding mode.</p> | |
395 | ||
396 | <p>The non-specialized version of <code>rounding_control</code> does not do | |
397 | anything. The functions for the rounding mode are empty, and | |
398 | <code>to_int</code> and <code>force_rounding</code> are identity functions. | |
399 | The <code>pi_</code> constant functions return suitable integers (for | |
400 | example, <code>pi_up</code> returns <code>T(4)</code>).</p> | |
401 | ||
402 | <p>The class template <code>rounding_control</code> is specialized for | |
403 | <code>float</code>, <code>double</code> and <code>long double</code> in | |
404 | order to best use the floating point unit of the computer.</p> | |
405 | ||
406 | <h2>Template class <tt>rounded_math</tt></h2> | |
407 | ||
408 | <p>The default policy (aka <code>rounded_math<T></code>) is simply | |
409 | defined as:</p> | |
410 | <pre> | |
411 | template <class T> struct rounded_math<T> : save_state_nothing<rounded_arith_exact<T> > {}; | |
412 | </pre> | |
413 | ||
414 | <p>and the specializations for <code>float</code>, <code>double</code> and | |
415 | <code>long double</code> use <code>rounded_arith_opp</code>, as in:</p> | |
416 | <pre> | |
417 | template <> struct rounded_math<float> : save_state<rounded_arith_opp<float> > {}; | |
418 | template <> struct rounded_math<double> : save_state<rounded_arith_opp<double> > {}; | |
419 | template <> struct rounded_math<long double> : save_state<rounded_arith_opp<long double> > {}; | |
420 | </pre> | |
421 | ||
422 | <h2 id="perf">Performance Issues</h2> | |
423 | ||
424 | <p>This paragraph deals mostly with the performance of the library with | |
425 | intervals using the floating-point unit (FPU) of the computer. Let's | |
426 | consider the sum of [<i>a</i>,<i>b</i>] and [<i>c</i>,<i>d</i>] as an | |
427 | example. The result is [<code>down</code>(<i>a</i>+<i>c</i>), | |
428 | <code>up</code>(<i>b</i>+<i>d</i>)], where <code>down</code> and | |
429 | <code>up</code> indicate the rounding mode needed.</p> | |
430 | ||
431 | <h3>Rounding Mode Switch</h3> | |
432 | ||
433 | <p>If the FPU is able to use a different rounding mode for each operation, | |
434 | there is no problem. For example, it's the case for the Alpha processor: | |
435 | each floating-point instruction can specify a different rounding mode. | |
436 | However, the IEEE-754 Standard does not require such a behavior. So most of | |
437 | the FPUs only provide some instructions to set the rounding mode for all | |
438 | subsequent operations. And generally, these instructions need to flush the | |
439 | pipeline of the FPU.</p> | |
440 | ||
441 | <p>In this situation, the time needed to sum [<i>a</i>,<i>b</i>] and | |
442 | [<i>c</i>,<i>d</i>] is far worse than the time needed to calculate | |
443 | <i>a</i>+<i>b</i> and <i>c</i>+<i>d</i> since the two additions cannot be | |
444 | parallelized. Consequently, the objective is to diminish the number of | |
445 | rounding mode switches.</p> | |
446 | ||
447 | <p>If this library is not used to provide exact computations, but only for | |
448 | pair arithmetic, the solution is quite simple: do not use rounding. In that | |
449 | case, doing the sum [<i>a</i>,<i>b</i>] and [<i>c</i>,<i>d</i>] will be as | |
450 | fast as computing <i>a</i>+<i>b</i> and <i>c</i>+<i>d</i>. Everything is | |
451 | perfect.</p> | |
452 | ||
453 | <p>However, if exact computations are required, such a solution is totally | |
454 | unthinkable. So, are we penniless? No, there is still a trick available. | |
455 | Indeed, down(<i>a</i>+<i>c</i>) = -up(-<i>a</i>-<i>c</i>) if the unary | |
456 | minus is an exact operation. It is now possible to calculate the whole sum | |
457 | with the same rounding mode. Generally, the cost of the mode switching is | |
458 | worse than the cost of the sign changes.</p> | |
459 | ||
460 | <h4>Speeding up consecutive operations</h4> | |
461 | ||
462 | <p>The interval addition is not the only operation; most of the interval | |
463 | operations can be computed by setting the rounding direction of the FPU | |
464 | only once. So the operations of the floating point rounding policy assume | |
465 | that the direction is correctly set. This assumption is usually not true in | |
466 | a program (the user and the standard library expect the rounding direction | |
467 | to be to nearest), so these operations have to be enclosed in a shell that | |
468 | sets the floating point environment. This protection is done by the | |
469 | constructor and destructor of the rounding policy.</p> | |
470 | ||
471 | <p>Les us now consider the case of two consecutive interval additions: | |
472 | [<i>a</i>,<i>b</i>] + [<i>c</i>,<i>d</i>] + [<i>e</i>,<i>f</i>]. The | |
473 | generated code should look like:</p> | |
474 | <pre> | |
475 | init_rounding_mode(); // rounding object construction during the first addition | |
476 | t1 = -(-a - c); | |
477 | t2 = b + d; | |
478 | restore_rounding_mode(); // rounding object destruction | |
479 | init_rounding_mode(); // rounding object construction during the second addition | |
480 | x = -(-t1 - e); | |
481 | y = t2 + f; | |
482 | restore_rounding_mode(); // rounding object destruction | |
483 | // the result is the interval [x,y] | |
484 | </pre> | |
485 | ||
486 | <p>Between the two operations, the rounding direction is restored, and then | |
487 | initialized again. Ideally, compilers should be able to optimize this | |
488 | useless code away. But unfortunately they are not, and this slows the code | |
489 | down by an order of magnitude. In order to avoid this bottleneck, the user | |
490 | can tell to the interval operations that they do not need to be protected | |
491 | anymore. It will then be up to the user to protect the interval | |
492 | computations. The compiler will then be able to generate such a code:</p> | |
493 | <pre> | |
494 | init_rounding_mode(); // done by the user | |
495 | x = -(-a - c - e); | |
496 | y = b + d + f; | |
497 | restore_rounding_mode(); // done by the user | |
498 | </pre> | |
499 | ||
500 | <p>The user will have to create a rounding object. And as long as this | |
501 | object is alive, unprotected versions of the interval operations can be | |
502 | used. They are selected by using an interval type with a specific rounding | |
503 | policy. If the initial interval type is <code>I</code>, then | |
504 | <code>I::traits_type::rounding</code> is the type of the rounding object, | |
505 | and <code>interval_lib::unprotect<I>::type</code> is the type of the | |
506 | unprotected interval type.</p> | |
507 | ||
508 | <p>Because the rounding mode of the FPU is changed during the life of the | |
509 | rounding object, any arithmetic floating point operation that does not | |
510 | involve the interval library can lead to unexpected results. And | |
511 | reciprocally, using unprotected interval operation when no rounding object | |
512 | is alive will produce intervals that are not guaranteed anymore to contain | |
513 | the real result.</p> | |
514 | ||
515 | <h4 id="perfexp">Example</h4> | |
516 | ||
517 | <p>Here is an example of Horner's scheme to compute the value of a polynom. | |
518 | The rounding mode switches are disabled for the whole computation.</p> | |
519 | <pre> | |
520 | // I is an interval class, the polynom is a simple array | |
521 | template<class I> | |
522 | I horner(const I& x, const I p[], int n) { | |
523 | ||
524 | // save and initialize the rounding mode | |
525 | typename I::traits_type::rounding rnd; | |
526 | ||
527 | // define the unprotected version of the interval type | |
528 | typedef typename boost::numeric::interval_lib::unprotect<I>::type R; | |
529 | ||
530 | const R& a = x; | |
531 | R y = p[n - 1]; | |
532 | for(int i = n - 2; i >= 0; i--) { | |
533 | y = y * a + (const R&)(p[i]); | |
534 | } | |
535 | return y; | |
536 | ||
537 | // restore the rounding mode with the destruction of rnd | |
538 | } | |
539 | </pre> | |
540 | ||
541 | <p>Please note that a rounding object is specially created in order to | |
542 | protect all the interval computations. Each interval of type I is converted | |
543 | in an interval of type R before any operations. If this conversion is not | |
544 | done, the result is still correct, but the interest of this whole | |
545 | optimization has disappeared. Whenever possible, it is good to convert to | |
546 | <code>const R&</code> instead of <code>R</code>: indeed, the function | |
547 | could already be called inside an unprotection block so the types | |
548 | <code>R</code> and <code>I</code> would be the same interval, no need for a | |
549 | conversion.</p> | |
550 | ||
551 | <h4>Uninteresting remark</h4> | |
552 | ||
553 | <p>It was said at the beginning that the Alpha processors can use a | |
554 | specific rounding mode for each operation. However, due to the instruction | |
555 | format, the rounding toward plus infinity is not available. Only the | |
556 | rounding toward minus infinity can be used. So the trick using the change | |
557 | of sign becomes essential, but there is no need to save and restore the | |
558 | rounding mode on both sides of an operation.</p> | |
559 | ||
560 | <h3 id="extreg">Extended Registers</h3> | |
561 | ||
562 | <p>There is another problem besides the cost of the rounding mode switch. | |
563 | Some FPUs use extended registers (for example, float computations will be | |
564 | done with double registers, or double computations with long double | |
565 | registers). Consequently, many problems can arise.</p> | |
566 | ||
567 | <p>The first one is due to to the extended precision of the mantissa. The | |
568 | rounding is also done on this extended precision. And consequently, we | |
569 | still have down(<i>a</i>+<i>b</i>) = -up(-<i>a</i>-<i>b</i>) in the | |
570 | extended registers. But back to the standard precision, we now have | |
571 | down(<i>a</i>+<i>b</i>) < -up(-<i>a</i>-<i>b</i>) instead of an | |
572 | equality. A solution could be not to use this method. But there still are | |
573 | other problems, with the comparisons between numbers for example.</p> | |
574 | ||
575 | <p>Naturally, there is also a problem with the extended precision of the | |
576 | exponent. To illustrate this problem, let <i>m</i> be the biggest number | |
577 | before +<i>inf</i>. If we calculate 2*[<i>m</i>,<i>m</i>], the answer | |
578 | should be [<i>m</i>,<i>inf</i>]. But due to the extended registers, the FPU | |
579 | will first store [<i>2m</i>,<i>2m</i>] and then convert it to | |
580 | [<i>inf</i>,<i>inf</i>] at the end of the calculus (when the rounding mode | |
581 | is toward +<i>inf</i>). So the answer is no more accurate.</p> | |
582 | ||
583 | <p>There is only one solution: to force the FPU to convert the extended | |
584 | values back to standard precision after each operation. Some FPUs provide | |
585 | an instruction able to do this conversion (for example the PowerPC | |
586 | processors). But for the FPUs that do not provide it (the x86 processors), | |
587 | the only solution is to write the values to memory and read them back. Such | |
588 | an operation is obviously very expensive.</p> | |
589 | ||
590 | <h2>Some Examples</h2> | |
591 | ||
592 | <p>Here come several cases:</p> | |
593 | ||
594 | <ul> | |
595 | <li>if you need precise computations with the <code>float</code> or | |
596 | <code>double</code> types, use the default | |
597 | <code>rounded_math<T></code>;</li> | |
598 | ||
599 | <li>for fast wide intervals without any rounding nor precision, use | |
600 | <code>save_state_nothing<rounded_transc_exact<T> | |
601 | ></code>;</li> | |
602 | ||
603 | <li>for an exact type (like int or rational with a little help for | |
604 | infinite and NaN values) without support for transcendental functions, | |
605 | the solution could be | |
606 | <code>save_state_nothing<rounded_transc_dummy<T, | |
607 | rounded_arith_exact<T> > ></code> or directly | |
608 | <code>save_state_nothing<rounded_arith_exact<T> | |
609 | ></code>;</li> | |
610 | ||
611 | <li>if it is a more complex case than the previous ones, the best thing | |
612 | is probably to directly define your own policy.</li> | |
613 | </ul> | |
614 | <hr> | |
615 | ||
616 | <p><a href="http://validator.w3.org/check?uri=referer"><img border="0" src= | |
617 | "../../../../doc/images/valid-html401.png" alt="Valid HTML 4.01 Transitional" | |
618 | height="31" width="88"></a></p> | |
619 | ||
620 | <p>Revised | |
621 | <!--webbot bot="Timestamp" s-type="EDITED" s-format="%Y-%m-%d" startspan -->2006-12-24<!--webbot bot="Timestamp" endspan i-checksum="12172" --></p> | |
622 | ||
623 | <p><i>Copyright © 2002 Guillaume Melquiond, Sylvain Pion, Hervé | |
624 | Brönnimann, Polytechnic University<br> | |
625 | Copyright © 2004-2005 Guillaume Melquiond, ENS Lyon</i></p> | |
626 | ||
627 | <p><i>Distributed under the Boost Software License, Version 1.0. (See | |
628 | accompanying file <a href="../../../../LICENSE_1_0.txt">LICENSE_1_0.txt</a> | |
629 | or copy at <a href= | |
630 | "http://www.boost.org/LICENSE_1_0.txt">http://www.boost.org/LICENSE_1_0.txt</a>)</i></p> | |
631 | </body> | |
632 | </html> |