3 <meta http-equiv=
"Content-Type" content=
"text/html; charset=US-ASCII">
4 <title>Comparison of Cube Root Finding Algorithms
</title>
5 <link rel=
"stylesheet" href=
"../../../math.css" type=
"text/css">
6 <meta name=
"generator" content=
"DocBook XSL Stylesheets V1.77.1">
7 <link rel=
"home" href=
"../../../index.html" title=
"Math Toolkit 2.5.1">
8 <link rel=
"up" href=
"../root_comparison.html" title=
"Comparison of Root Finding Algorithms">
9 <link rel=
"prev" href=
"../root_comparison.html" title=
"Comparison of Root Finding Algorithms">
10 <link rel=
"next" href=
"root_n_comparison.html" title=
"Comparison of Nth-root Finding Algorithms">
12 <body bgcolor=
"white" text=
"black" link=
"#0000FF" vlink=
"#840084" alink=
"#0000FF">
13 <table cellpadding=
"2" width=
"100%"><tr>
14 <td valign=
"top"><img alt=
"Boost C++ Libraries" width=
"277" height=
"86" src=
"../../../../../../../boost.png"></td>
15 <td align=
"center"><a href=
"../../../../../../../index.html">Home
</a></td>
16 <td align=
"center"><a href=
"../../../../../../../libs/libraries.htm">Libraries
</a></td>
17 <td align=
"center"><a href=
"http://www.boost.org/users/people.html">People
</a></td>
18 <td align=
"center"><a href=
"http://www.boost.org/users/faq.html">FAQ
</a></td>
19 <td align=
"center"><a href=
"../../../../../../../more/index.htm">More
</a></td>
22 <div class=
"spirit-nav">
23 <a accesskey=
"p" href=
"../root_comparison.html"><img src=
"../../../../../../../doc/src/images/prev.png" alt=
"Prev"></a><a accesskey=
"u" href=
"../root_comparison.html"><img src=
"../../../../../../../doc/src/images/up.png" alt=
"Up"></a><a accesskey=
"h" href=
"../../../index.html"><img src=
"../../../../../../../doc/src/images/home.png" alt=
"Home"></a><a accesskey=
"n" href=
"root_n_comparison.html"><img src=
"../../../../../../../doc/src/images/next.png" alt=
"Next"></a>
26 <div class=
"titlepage"><div><div><h4 class=
"title">
27 <a name=
"math_toolkit.roots.root_comparison.cbrt_comparison"></a><a class=
"link" href=
"cbrt_comparison.html" title=
"Comparison of Cube Root Finding Algorithms">Comparison
28 of Cube Root Finding Algorithms
</a>
29 </h4></div></div></div>
31 In the table below, the cube root of
28 was computed for three
<a href=
"http://en.cppreference.com/w/cpp/language/types" target=
"_top">fundamental
32 types
</a> floating-point types, and one
<a href=
"../../../../../../../libs/multiprecision/doc/html/index.html" target=
"_top">Boost.Multiprecision
</a>
33 type
<a href=
"../../../../../../../libs/multiprecision/doc/html/boost_multiprecision/tut/floats/cpp_bin_float.html" target=
"_top">cpp_bin_float
</a>
34 using
50 decimal digit precision, using four algorithms.
37 The 'exact' answer was computed using a
100 decimal digit type:
39 <pre class=
"programlisting"><span class=
"identifier">cpp_bin_float_100
</span> <span class=
"identifier">full_answer
</span> <span class=
"special">(
</span><span class=
"string">"3.036588971875662519420809578505669635581453977248111123242141654169177268411884961770250390838097895"</span><span class=
"special">);
</span>
42 Times were measured using
<a href=
"../../../../../../../libs/timer/doc/index.html" target=
"_top">Boost.Timer
</a>
43 using
<code class=
"computeroutput"><span class=
"keyword">class
</span> <span class=
"identifier">cpu_timer
</span></code>.
45 <div class=
"itemizedlist"><ul class=
"itemizedlist" style=
"list-style-type: disc; ">
47 <span class=
"emphasis"><em>Its
</em></span> is the number of iterations taken to find
51 <span class=
"emphasis"><em>Times
</em></span> is the CPU time-taken in arbitrary units.
54 <span class=
"emphasis"><em>Norm
</em></span> is a normalized time, in comparison to the
55 quickest algorithm (with value
1.00).
58 <span class=
"emphasis"><em>Dis
</em></span> is the distance from the nearest representation
59 of the 'exact' root in bits. Distance from the 'exact' answer is measured
60 by using function
<a href=
"../../../../../../../libs/math/doc/html/math_toolkit/next_float/float_distance.html" target=
"_top">Boost.Math
61 float_distance
</a>. One or two bits distance means that all results
62 are effectively 'correct'. Zero means 'exact' - the nearest
<a href=
"http://en.wikipedia.org/wiki/Floating_point#Representable_numbers.2C_conversion_and_rounding" target=
"_top">representable
</a>
63 value for the floating-point type.
67 The cube-root function is a simple function, and is a contrived example
68 for root-finding. It does allow us to investigate some of the factors controlling
69 efficiency that may be extrapolated to more complex functions.
72 The program used was
<a href=
"../../../../../example/root_finding_algorithms.cpp" target=
"_top">root_finding_algorithms.cpp
</a>.
73 100000 evaluations of each floating-point type and algorithm were used
74 and the CPU times were judged from repeat runs to have an uncertainty of
75 10 %. Comparing MSVC for
<code class=
"computeroutput"><span class=
"keyword">double
</span></code>
76 and
<code class=
"computeroutput"><span class=
"keyword">long
</span> <span class=
"keyword">double
</span></code>
77 (which are identical on this patform) may give a guide to uncertainty of
81 The requested precision was set as follows:
83 <div class=
"informaltable"><table class=
"table">
109 numeric_limits
<T
>::digits -
2
121 floor(numeric_limits
<T
>::digits *
0.6)
133 floor(numeric_limits
<T
>::digits *
0.4)
145 floor(numeric_limits
<T
>::digits *
0.4)
151 <div class=
"itemizedlist"><ul class=
"itemizedlist" style=
"list-style-type: disc; ">
152 <li class=
"listitem">
153 The C++ Standard cube root function
<a href=
"http://en.cppreference.com/w/cpp/numeric/math/cbrt" target=
"_top">std::cbrt
</a>
154 is only defined for built-in or fundamental types, so cannot be used
155 with any User-Defined floating-point types like
<a href=
"../../../../../../../libs/multiprecision/doc/html/index.html" target=
"_top">Boost.Multiprecision
</a>.
156 This, and that the cube function is so impeccably-behaved, allows the
157 implementer to use many tricks to achieve a fast computation. On some
158 platforms,
<code class=
"computeroutput"><span class=
"identifier">std
</span><span class=
"special">::
</span><span class=
"identifier">cbrt
</span></code> appeared several times as quick
159 as the more general
<code class=
"computeroutput"><span class=
"identifier">boost
</span><span class=
"special">::
</span><span class=
"identifier">math
</span><span class=
"special">::
</span><span class=
"identifier">cbrt
</span></code>,
160 on other platforms / compiler options
<code class=
"computeroutput"><span class=
"identifier">boost
</span><span class=
"special">::
</span><span class=
"identifier">math
</span><span class=
"special">::
</span><span class=
"identifier">cbrt
</span></code>
161 is noticeably faster. In general, the results are highly dependent
162 on the code-generation / processor architecture selection compiler
163 options used. One can assume that the standard library will have been
164 compiled with options
<span class=
"emphasis"><em>nearly
</em></span> optimal for the platform
165 it was installed on, where as the user has more choice over the options
166 used for Boost.Math. Pick something too general/conservative and performance
167 suffers, while selecting options that make use of the latest instruction
168 set opcodes speed's things up noticeably.
170 <li class=
"listitem">
171 Two compilers in optimise mode were compared: GCC
4.9.1 using Netbeans
172 IDS and Microsoft Visual Studio
2013 (Update
1) on the same hardware.
173 The number of iterations seemed consistent, but the relative run-times
174 surprisingly different.
176 <li class=
"listitem">
177 <code class=
"computeroutput"><span class=
"identifier">boost
</span><span class=
"special">::
</span><span class=
"identifier">math
</span><span class=
"special">::
</span><span class=
"identifier">cbrt
</span></code> allows use with
<span class=
"emphasis"><em>any
178 user-defined floating-point type
</em></span>, conveniently
<a href=
"../../../../../../../libs/multiprecision/doc/html/index.html" target=
"_top">Boost.Multiprecision
</a>.
179 It too can take some advantage of the good-behaviour of the cube function,
180 compared to the more general implementation in the nth root-finding
181 examples. For example, it uses a polynomial approximation to generate
182 a better guess than dividing the exponent by three, and can avoid the
183 complex checks in
<a class=
"link" href=
"../roots_deriv.html#math_toolkit.roots.roots_deriv.newton">Newton-Raphson
184 iteration
</a> required to prevent the search going wildly off-track.
185 For a known precision, it may also be possible to fix the number of
186 iterations, allowing inlining and loop unrolling. It also algebraically
187 simplifies the Halley steps leading to a big reduction in the number
188 of floating point operations required compared to a
"black box"
189 implementation that calculates the derivatives seperately and then
190 combines them in the Halley code. Typically, it was found that computation
191 using type
<code class=
"computeroutput"><span class=
"keyword">double
</span></code> took
192 a few times longer when using the various root-finding algorithms directly
193 rather than the hand coded/optimized
<code class=
"computeroutput"><span class=
"identifier">cbrt
</span></code>
196 <li class=
"listitem">
197 The importance of getting a good guess can be seen by the iteration
198 count for the multiprecision case: here we
"cheat" a little
199 and use the cube-root calculated to double precision as the initial
200 guess. The limitation of this tactic is that the range of possible
201 (exponent) values may be less than the multiprecision type.
203 <li class=
"listitem">
204 For
<a href=
"http://en.cppreference.com/w/cpp/language/types" target=
"_top">fundamental
205 types
</a>, there was little to choose between the three derivative
206 methods, but for
<a href=
"../../../../../../../libs/multiprecision/doc/html/boost_multiprecision/tut/floats/cpp_bin_float.html" target=
"_top">cpp_bin_float
</a>,
207 <a class=
"link" href=
"../roots_deriv.html#math_toolkit.roots.roots_deriv.newton">Newton-Raphson
208 iteration
</a> was twice as fast. Note that the cube-root is an extreme
209 test case as the cost of calling the functor is so cheap that the runtimes
210 are largely dominated by the complexity of the iteration code.
212 <li class=
"listitem">
213 Compiling with optimisation halved computation times, and any differences
214 between algorithms became nearly negligible. The optimisation speed-up
215 of the
<a href=
"http://portal.acm.org/citation.cfm?id=210111" target=
"_top">TOMS
216 Algorithm
748: enclosing zeros of continuous functions
</a> was
217 especially noticable.
219 <li class=
"listitem">
220 Using a multiprecision type like
<code class=
"computeroutput"><span class=
"identifier">cpp_bin_float_50
</span></code>
221 for a precision of
50 decimal digits took a lot longer, as expected
222 because most computation uses software rather than
64-bit floating-point
223 hardware. Speeds are often more than
50 times slower.
225 <li class=
"listitem">
226 Using
<code class=
"computeroutput"><span class=
"identifier">cpp_bin_float_50
</span></code>,
227 <a href=
"http://portal.acm.org/citation.cfm?id=210111" target=
"_top">TOMS Algorithm
228 748: enclosing zeros of continuous functions
</a> was much slower
229 showing the benefit of using derivatives.
<a class=
"link" href=
"../roots_deriv.html#math_toolkit.roots.roots_deriv.newton">Newton-Raphson
230 iteration
</a> was found to be twice as quick as either of the second-derivative
231 methods: this is an extreme case though, the function and its derivatives
232 are so cheap to compute that we're really measuring the complexity
233 of the boilerplate root-finding code.
235 <li class=
"listitem">
236 For multiprecision types only one or two extra
<span class=
"emphasis"><em>iterations
</em></span>
237 are needed to get the remaining
35 digits, whatever the algorithm used.
238 (The time taken was of course much greater for these types).
240 <li class=
"listitem">
241 Using a
100 decimal-digit type only doubled the time and required only
242 a very few more iterations, so the cost of extra precision is mainly
243 the underlying cost of computing more digits, not in the way the algorithm
244 works. This confirms previous observations using
<a href=
"http://www.shoup.net/ntl/" target=
"_top">NTL
245 A Library for doing Number Theory
</a> high-precision types.
249 <a name=
"math_toolkit.roots.root_comparison.cbrt_comparison.h0"></a>
250 <span class=
"phrase"><a name=
"math_toolkit.roots.root_comparison.cbrt_comparison.program_root_finding_algorithms_"></a></span><a class=
"link" href=
"cbrt_comparison.html#math_toolkit.roots.root_comparison.cbrt_comparison.program_root_finding_algorithms_">Program
251 root_finding_algorithms.cpp, Microsoft Visual C++ version
12.0, Dinkumware
252 standard library version
610, Win32, x64
<br> 1000000 evaluations of each
253 of
5 root_finding algorithms.
</a>
256 <a name=
"math_toolkit.roots.root_comparison.cbrt_comparison.cbrt_4"></a><p class=
"title"><b>Table
 12.1.
 Cube root(
28) for float, double, long double and cpp_bin_float_50
</b></p>
257 <div class=
"table-contents"><table class=
"table" summary=
"Cube root(28) for float, double, long double and cpp_bin_float_50">
332 <td class=
"auto-generated"> </td>
333 <td class=
"auto-generated"> </td>
449 <span class=
"blue">1.0</span>
471 <span class=
"blue">1.0</span>
493 <span class=
"blue">1.0</span>
544 <span class=
"red">5.0</span>
566 <span class=
"red">9.3</span>
588 <span class=
"red">9.3</span>
610 <span class=
"red">15.
</span>
705 <span class=
"blue">1.0</span>
873 <span class=
"red">4.3</span>
909 <br class=
"table-break"><h6>
910 <a name=
"math_toolkit.roots.root_comparison.cbrt_comparison.h1"></a>
911 <span class=
"phrase"><a name=
"math_toolkit.roots.root_comparison.cbrt_comparison.program_root_finding_algorithms0"></a></span><a class=
"link" href=
"cbrt_comparison.html#math_toolkit.roots.root_comparison.cbrt_comparison.program_root_finding_algorithms0">Program
912 root_finding_algorithms.cpp, GNU C++ version
4.9.2, GNU libstdc++ version
913 20141030, Win32, x64
<br> 1000000 evaluations of each of
5 root_finding
917 <a name=
"math_toolkit.roots.root_comparison.cbrt_comparison.cbrt_4_0"></a><p class=
"title"><b>Table
 12.2.
 Cube root(
28) for float, double, long double and cpp_bin_float_50
</b></p>
918 <div class=
"table-contents"><table class=
"table" summary=
"Cube root(28) for float, double, long double and cpp_bin_float_50">
993 <td class=
"auto-generated"> </td>
994 <td class=
"auto-generated"> </td>
1110 <span class=
"blue">1.0</span>
1132 <span class=
"blue">1.0</span>
1154 <span class=
"blue">1.0</span>
1227 <span class=
"red">8.7</span>
1249 <span class=
"red">13.
</span>
1271 <span class=
"red">14.
</span>
1366 <span class=
"blue">1.0</span>
1439 <span class=
"red">4.7</span>
1534 <span class=
"red">6.0</span>
1570 <br class=
"table-break">
1572 <table xmlns:
rev=
"http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width=
"100%"><tr>
1573 <td align=
"left"></td>
1574 <td align=
"right"><div class=
"copyright-footer">Copyright
© 2006-
2010,
2012-
2014 Nikhar Agrawal,
1575 Anton Bikineev, Paul A. Bristow, Marco Guazzone, Christopher Kormanyos, Hubert
1576 Holin, Bruno Lalande, John Maddock, Jeremy Murphy, Johan R
åde, Gautam Sewani,
1577 Benjamin Sobotta, Thijs van den Berg, Daryle Walker and Xiaogang Zhang
<p>
1578 Distributed under the Boost Software License, Version
1.0. (See accompanying
1579 file LICENSE_1_0.txt or copy at
<a href=
"http://www.boost.org/LICENSE_1_0.txt" target=
"_top">http://www.boost.org/LICENSE_1_0.txt
</a>)
1584 <div class=
"spirit-nav">
1585 <a accesskey=
"p" href=
"../root_comparison.html"><img src=
"../../../../../../../doc/src/images/prev.png" alt=
"Prev"></a><a accesskey=
"u" href=
"../root_comparison.html"><img src=
"../../../../../../../doc/src/images/up.png" alt=
"Up"></a><a accesskey=
"h" href=
"../../../index.html"><img src=
"../../../../../../../doc/src/images/home.png" alt=
"Home"></a><a accesskey=
"n" href=
"root_n_comparison.html"><img src=
"../../../../../../../doc/src/images/next.png" alt=
"Next"></a>