3 <meta http-equiv=
"Content-Type" content=
"text/html; charset=US-ASCII">
4 <title>Comparison of Nth-root Finding Algorithms
</title>
5 <link rel=
"stylesheet" href=
"../../../math.css" type=
"text/css">
6 <meta name=
"generator" content=
"DocBook XSL Stylesheets V1.77.1">
7 <link rel=
"home" href=
"../../../index.html" title=
"Math Toolkit 2.5.1">
8 <link rel=
"up" href=
"../root_comparison.html" title=
"Comparison of Root Finding Algorithms">
9 <link rel=
"prev" href=
"cbrt_comparison.html" title=
"Comparison of Cube Root Finding Algorithms">
10 <link rel=
"next" href=
"elliptic_comparison.html" title=
"Comparison of Elliptic Integral Root Finding Algoritghms">
12 <body bgcolor=
"white" text=
"black" link=
"#0000FF" vlink=
"#840084" alink=
"#0000FF">
13 <table cellpadding=
"2" width=
"100%"><tr>
14 <td valign=
"top"><img alt=
"Boost C++ Libraries" width=
"277" height=
"86" src=
"../../../../../../../boost.png"></td>
15 <td align=
"center"><a href=
"../../../../../../../index.html">Home
</a></td>
16 <td align=
"center"><a href=
"../../../../../../../libs/libraries.htm">Libraries
</a></td>
17 <td align=
"center"><a href=
"http://www.boost.org/users/people.html">People
</a></td>
18 <td align=
"center"><a href=
"http://www.boost.org/users/faq.html">FAQ
</a></td>
19 <td align=
"center"><a href=
"../../../../../../../more/index.htm">More
</a></td>
22 <div class=
"spirit-nav">
23 <a accesskey=
"p" href=
"cbrt_comparison.html"><img src=
"../../../../../../../doc/src/images/prev.png" alt=
"Prev"></a><a accesskey=
"u" href=
"../root_comparison.html"><img src=
"../../../../../../../doc/src/images/up.png" alt=
"Up"></a><a accesskey=
"h" href=
"../../../index.html"><img src=
"../../../../../../../doc/src/images/home.png" alt=
"Home"></a><a accesskey=
"n" href=
"elliptic_comparison.html"><img src=
"../../../../../../../doc/src/images/next.png" alt=
"Next"></a>
26 <div class=
"titlepage"><div><div><h4 class=
"title">
27 <a name=
"math_toolkit.roots.root_comparison.root_n_comparison"></a><a class=
"link" href=
"root_n_comparison.html" title=
"Comparison of Nth-root Finding Algorithms">Comparison
28 of Nth-root Finding Algorithms
</a>
29 </h4></div></div></div>
31 A second example compares four generalized nth-root finding algorithms
32 for various n-th roots (
5,
7 and
13) of a single value
28.0, for four floating-point
33 types,
<code class=
"computeroutput"><span class=
"keyword">float
</span></code>,
<code class=
"computeroutput"><span class=
"keyword">double
</span></code>,
<code class=
"computeroutput"><span class=
"keyword">long
</span>
34 <span class=
"keyword">double
</span></code> and a
<a href=
"../../../../../../../libs/multiprecision/doc/html/index.html" target=
"_top">Boost.Multiprecision
</a>
35 type
<code class=
"computeroutput"><span class=
"identifier">cpp_bin_float_50
</span></code>.
36 In each case the target accuracy was set using our
"recomended"
37 accuracy limits (or at least limits that make a good starting point - which
38 is likely to give close to full accuracy without resorting to unnecessary
41 <div class=
"informaltable"><table class=
"table">
67 numeric_limits
<T
>::digits -
2
79 floor(numeric_limits
<T
>::digits *
0.6)
91 floor(numeric_limits
<T
>::digits *
0.4)
103 floor(numeric_limits
<T
>::digits *
0.4)
110 Tests used Microsoft Visual Studio
2013 (Update
1) and GCC
4.9.1 using
111 source code
<a href=
"../../../../../example/root_n_finding_algorithms.cpp" target=
"_top">root_n_finding_algorithms.cpp
</a>.
114 The timing uncertainty (especially using MSVC) is at least
5% of normalized
118 To pick out the 'best' and 'worst' algorithms are highlighted in blue and
119 red. More than one result can be 'best' when normalized times are indistinguishable
120 within the uncertainty.
123 <a name=
"math_toolkit.roots.root_comparison.root_n_comparison.h0"></a>
124 <span class=
"phrase"><a name=
"math_toolkit.roots.root_comparison.root_n_comparison.program_root_n_finding_algorithm"></a></span><a class=
"link" href=
"root_n_comparison.html#math_toolkit.roots.root_comparison.root_n_comparison.program_root_n_finding_algorithm">Program
125 root_n_finding_algorithms.cpp, Microsoft Visual C++ version
12.0, Dinkumware
126 standard library version
610, Win32 Compiled in optimise mode., _X86_SSE2
</a>
129 Fraction of full accuracy
1
132 <a name=
"math_toolkit.roots.root_comparison.root_n_comparison.root_5"></a><p class=
"title"><b>Table
 12.3.
 5th root(
28) for float, double, long double and cpp_bin_float_50
133 types, using _X86_SSE2
</b></p>
134 <div class=
"table-contents"><table class=
"table" summary=
"5th root(28) for float, double, long double and cpp_bin_float_50
135 types, using _X86_SSE2">
210 <td class=
"auto-generated"> </td>
211 <td class=
"auto-generated"> </td>
393 <span class=
"red">7.52</span>
422 <span class=
"blue">1.00</span>
444 <span class=
"blue">1.00</span>
466 <span class=
"blue">1.00</span>
488 <span class=
"blue">1.00</span>
517 <span class=
"blue">1.02</span>
612 <span class=
"blue">1.04</span>
692 <br class=
"table-break"><div class=
"table">
693 <a name=
"math_toolkit.roots.root_comparison.root_n_comparison.root_7"></a><p class=
"title"><b>Table
 12.4.
 7th root(
28) for float, double, long double and cpp_bin_float_50
694 types, using _X86_SSE2
</b></p>
695 <div class=
"table-contents"><table class=
"table" summary=
"7th root(28) for float, double, long double and cpp_bin_float_50
696 types, using _X86_SSE2">
771 <td class=
"auto-generated"> </td>
772 <td class=
"auto-generated"> </td>
954 <span class=
"red">7.09</span>
983 <span class=
"blue">1.00</span>
1005 <span class=
"blue">1.00</span>
1027 <span class=
"blue">1.00</span>
1049 <span class=
"blue">1.00</span>
1253 <br class=
"table-break"><div class=
"table">
1254 <a name=
"math_toolkit.roots.root_comparison.root_n_comparison.root_11"></a><p class=
"title"><b>Table
 12.5.
 11th root(
28) for float, double, long double and cpp_bin_float_50
1255 types, using _X86_SSE2
</b></p>
1256 <div class=
"table-contents"><table class=
"table" summary=
"11th root(28) for float, double, long double and cpp_bin_float_50
1257 types, using _X86_SSE2">
1332 <td class=
"auto-generated"> </td>
1333 <td class=
"auto-generated"> </td>
1515 <span class=
"red">8.88</span>
1544 <span class=
"blue">1.00</span>
1566 <span class=
"blue">1.00</span>
1588 <span class=
"blue">1.00</span>
1610 <span class=
"blue">1.00</span>
1639 <span class=
"blue">1.02</span>
1814 <br class=
"table-break"><h5>
1815 <a name=
"math_toolkit.roots.root_comparison.root_n_comparison.h1"></a>
1816 <span class=
"phrase"><a name=
"math_toolkit.roots.root_comparison.root_n_comparison.program_root_n_finding_algorith0"></a></span><a class=
"link" href=
"root_n_comparison.html#math_toolkit.roots.root_comparison.root_n_comparison.program_root_n_finding_algorith0">Program
1817 root_n_finding_algorithms.cpp, Microsoft Visual C++ version
12.0, Dinkumware
1818 standard library version
610, Win32 Compiled in optimise mode., _X64_AVX
</a>
1821 Fraction of full accuracy
1
1824 <a name=
"math_toolkit.roots.root_comparison.root_n_comparison.root_5_0"></a><p class=
"title"><b>Table
 12.6.
 5th root(
28) for float, double, long double and cpp_bin_float_50
1825 types, using _X64_AVX
</b></p>
1826 <div class=
"table-contents"><table class=
"table" summary=
"5th root(28) for float, double, long double and cpp_bin_float_50
1827 types, using _X64_AVX">
1902 <td class=
"auto-generated"> </td>
1903 <td class=
"auto-generated"> </td>
2085 <span class=
"red">7.51</span>
2114 <span class=
"blue">1.00</span>
2136 <span class=
"blue">1.00</span>
2158 <span class=
"blue">1.00</span>
2180 <span class=
"blue">1.00</span>
2384 <br class=
"table-break"><div class=
"table">
2385 <a name=
"math_toolkit.roots.root_comparison.root_n_comparison.root_7_0"></a><p class=
"title"><b>Table
 12.7.
 7th root(
28) for float, double, long double and cpp_bin_float_50
2386 types, using _X64_AVX
</b></p>
2387 <div class=
"table-contents"><table class=
"table" summary=
"7th root(28) for float, double, long double and cpp_bin_float_50
2388 types, using _X64_AVX">
2463 <td class=
"auto-generated"> </td>
2464 <td class=
"auto-generated"> </td>
2646 <span class=
"red">6.81</span>
2675 <span class=
"blue">1.00</span>
2697 <span class=
"blue">1.00</span>
2719 <span class=
"blue">1.00</span>
2741 <span class=
"blue">1.00</span>
2945 <br class=
"table-break"><div class=
"table">
2946 <a name=
"math_toolkit.roots.root_comparison.root_n_comparison.root_11_0"></a><p class=
"title"><b>Table
 12.8.
 11th root(
28) for float, double, long double and cpp_bin_float_50
2947 types, using _X64_AVX
</b></p>
2948 <div class=
"table-contents"><table class=
"table" summary=
"11th root(28) for float, double, long double and cpp_bin_float_50
2949 types, using _X64_AVX">
3024 <td class=
"auto-generated"> </td>
3025 <td class=
"auto-generated"> </td>
3207 <span class=
"red">8.85</span>
3236 <span class=
"blue">1.00</span>
3258 <span class=
"blue">1.00</span>
3280 <span class=
"blue">1.00</span>
3302 <span class=
"blue">1.00</span>
3506 <br class=
"table-break"><h5>
3507 <a name=
"math_toolkit.roots.root_comparison.root_n_comparison.h2"></a>
3508 <span class=
"phrase"><a name=
"math_toolkit.roots.root_comparison.root_n_comparison.program_root_n_finding_algorith1"></a></span><a class=
"link" href=
"root_n_comparison.html#math_toolkit.roots.root_comparison.root_n_comparison.program_root_n_finding_algorith1">Program
3509 root_n_finding_algorithms.cpp, GNU C++ version
4.9.2, GNU libstdc++ version
3510 20141030, Win32 Compiled in optimise mode., _X64_SSE2
</a>
3513 Fraction of full accuracy
1
3516 <a name=
"math_toolkit.roots.root_comparison.root_n_comparison.root_5_1"></a><p class=
"title"><b>Table
 12.9.
 5th root(
28) for float, double, long double and cpp_bin_float_50
3517 types, using _X64_SSE2
</b></p>
3518 <div class=
"table-contents"><table class=
"table" summary=
"5th root(28) for float, double, long double and cpp_bin_float_50
3519 types, using _X64_SSE2">
3594 <td class=
"auto-generated"> </td>
3595 <td class=
"auto-generated"> </td>
3777 <span class=
"red">7.56</span>
3806 <span class=
"blue">1.00</span>
3828 <span class=
"blue">1.00</span>
3850 <span class=
"blue">1.00</span>
3872 <span class=
"blue">1.00</span>
4076 <br class=
"table-break"><div class=
"table">
4077 <a name=
"math_toolkit.roots.root_comparison.root_n_comparison.root_7_1"></a><p class=
"title"><b>Table
 12.10.
 7th root(
28) for float, double, long double and cpp_bin_float_50
4078 types, using _X64_SSE2
</b></p>
4079 <div class=
"table-contents"><table class=
"table" summary=
"7th root(28) for float, double, long double and cpp_bin_float_50
4080 types, using _X64_SSE2">
4155 <td class=
"auto-generated"> </td>
4156 <td class=
"auto-generated"> </td>
4338 <span class=
"red">7.10</span>
4367 <span class=
"blue">1.00</span>
4389 <span class=
"blue">1.00</span>
4411 <span class=
"blue">1.00</span>
4433 <span class=
"blue">1.00</span>
4637 <br class=
"table-break"><div class=
"table">
4638 <a name=
"math_toolkit.roots.root_comparison.root_n_comparison.root_11_1"></a><p class=
"title"><b>Table
 12.11.
 11th root(
28) for float, double, long double and cpp_bin_float_50
4639 types, using _X64_SSE2
</b></p>
4640 <div class=
"table-contents"><table class=
"table" summary=
"11th root(28) for float, double, long double and cpp_bin_float_50
4641 types, using _X64_SSE2">
4716 <td class=
"auto-generated"> </td>
4717 <td class=
"auto-generated"> </td>
4899 <span class=
"red">8.83</span>
4928 <span class=
"blue">1.00</span>
4950 <span class=
"blue">1.00</span>
4972 <span class=
"blue">1.00</span>
4994 <span class=
"blue">1.00</span>
5023 <span class=
"blue">1.02</span>
5198 <br class=
"table-break"><p>
5199 Some tentative conclusions can be drawn from this limited exercise.
5201 <div class=
"itemizedlist"><ul class=
"itemizedlist" style=
"list-style-type: disc; ">
5202 <li class=
"listitem">
5203 Perhaps surprisingly, there is little difference between the various
5204 algorithms for
<a href=
"http://en.cppreference.com/w/cpp/language/types" target=
"_top">fundamental
5205 types
</a> floating-point types. Using the first derivatives (
<a class=
"link" href=
"../roots_deriv.html#math_toolkit.roots.roots_deriv.newton">Newton-Raphson iteration
</a>)
5206 is usually the best, but while the improvement over the no-derivative
5207 <a href=
"http://portal.acm.org/citation.cfm?id=210111" target=
"_top">TOMS Algorithm
5208 748: enclosing zeros of continuous functions
</a> is considerable
5209 in number of iterations, but little in execution time. This reflects
5210 the fact that the function we are finding the root for is trivial to
5211 evaluate, so runtimetimes are dominated by the time taken by the boilerplate
5212 code in each method.
5214 <li class=
"listitem">
5215 The extra cost of evaluating the second derivatives (
<a class=
"link" href=
"../roots_deriv.html#math_toolkit.roots.roots_deriv.halley">Halley
</a>
5216 or
<a class=
"link" href=
"../roots_deriv.html#math_toolkit.roots.roots_deriv.schroder">Schr
öder
</a>)
5217 is usually too much for any net benefit: as with the cube root, these
5218 functors are so cheap to evaluate that the runtime is largely dominated
5219 by the complexity of the root finding method.
5221 <li class=
"listitem">
5222 For a
<a href=
"../../../../../../../libs/multiprecision/doc/html/index.html" target=
"_top">Boost.Multiprecision
</a>
5223 floating-point type, the
<a class=
"link" href=
"../roots_deriv.html#math_toolkit.roots.roots_deriv.newton">Newton-Raphson
5224 iteration
</a> is a clear winner with a several-fold gain over
<a href=
"http://portal.acm.org/citation.cfm?id=210111" target=
"_top">TOMS Algorithm
748:
5225 enclosing zeros of continuous functions
</a>, and again no improvement
5226 from the second-derivative algorithms.
5228 <li class=
"listitem">
5229 The run-time of
50 decimal-digit
<a href=
"../../../../../../../libs/multiprecision/doc/html/index.html" target=
"_top">Boost.Multiprecision
</a>
5230 is about
30-fold greater than
<code class=
"computeroutput"><span class=
"keyword">double
</span></code>.
5232 <li class=
"listitem">
5233 The column 'dis' showing the number of bits distance from the correct
5234 result. The Newton-Raphson algorithm shows a bit or two better accuracy
5235 than
<a href=
"http://portal.acm.org/citation.cfm?id=210111" target=
"_top">TOMS
5236 Algorithm
748: enclosing zeros of continuous functions
</a>.
5238 <li class=
"listitem">
5239 The goodness of the 'guess' is especially crucial for
<a href=
"../../../../../../../libs/multiprecision/doc/html/index.html" target=
"_top">Boost.Multiprecision
</a>.
5240 Separate experiments show that evaluating the 'guess' using
<code class=
"computeroutput"><span class=
"keyword">double
</span></code> allows convergence to the final
5241 exact result in one or two iterations. So in this contrived example,
5242 crudely dividing the exponent by N for a 'guess', it would be far better
5243 to use a
<code class=
"computeroutput"><span class=
"identifier">pow
</span><span class=
"special"><</span><span class=
"keyword">double
</span><span class=
"special">></span></code>
5244 or , if more precise
<code class=
"computeroutput"><span class=
"identifier">pow
</span><span class=
"special"><</span><span class=
"keyword">long
</span> <span class=
"keyword">double
</span><span class=
"special">></span></code>,
5245 function to estimate a 'guess'. The limitation of this tactic is that
5246 the range of possible (exponent) values may be less than the multiprecision
5249 <li class=
"listitem">
5250 Using floating-point extension
<a href=
"http://en.wikipedia.org/wiki/SSE2" target=
"_top">SSE2
5251 instructions
</a> made a modest ten-percent speedup.
5253 <li class=
"listitem">
5254 Using MSVC, there was some improvement using
64-bit, markedly for
5255 <a href=
"../../../../../../../libs/multiprecision/doc/html/index.html" target=
"_top">Boost.Multiprecision
</a>.
5257 <li class=
"listitem">
5258 The GCC compiler
4.9.1 using
64-bit was at least five-folder faster
5259 that
32-bit, apparently reflecting better optimization.
5263 Clearly, your mileage
<span class=
"bold"><strong>will vary
</strong></span>, but in
5264 summary,
<a class=
"link" href=
"../roots_deriv.html#math_toolkit.roots.roots_deriv.newton">Newton-Raphson
5265 iteration
</a> seems the first choice of algorithm, and effort to find
5266 a good 'guess' the first speed-up target, especially for
<a href=
"../../../../../../../libs/multiprecision/doc/html/index.html" target=
"_top">Boost.Multiprecision
</a>.
5267 And of course, compiler optimisation is crucial for speed.
5270 <table xmlns:
rev=
"http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width=
"100%"><tr>
5271 <td align=
"left"></td>
5272 <td align=
"right"><div class=
"copyright-footer">Copyright
© 2006-
2010,
2012-
2014 Nikhar Agrawal,
5273 Anton Bikineev, Paul A. Bristow, Marco Guazzone, Christopher Kormanyos, Hubert
5274 Holin, Bruno Lalande, John Maddock, Jeremy Murphy, Johan R
åde, Gautam Sewani,
5275 Benjamin Sobotta, Thijs van den Berg, Daryle Walker and Xiaogang Zhang
<p>
5276 Distributed under the Boost Software License, Version
1.0. (See accompanying
5277 file LICENSE_1_0.txt or copy at
<a href=
"http://www.boost.org/LICENSE_1_0.txt" target=
"_top">http://www.boost.org/LICENSE_1_0.txt
</a>)
5282 <div class=
"spirit-nav">
5283 <a accesskey=
"p" href=
"cbrt_comparison.html"><img src=
"../../../../../../../doc/src/images/prev.png" alt=
"Prev"></a><a accesskey=
"u" href=
"../root_comparison.html"><img src=
"../../../../../../../doc/src/images/up.png" alt=
"Up"></a><a accesskey=
"h" href=
"../../../index.html"><img src=
"../../../../../../../doc/src/images/home.png" alt=
"Home"></a><a accesskey=
"n" href=
"elliptic_comparison.html"><img src=
"../../../../../../../doc/src/images/next.png" alt=
"Next"></a>