]>
Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | [section:root_finding_examples Examples of Root-Finding (with and without derivatives)] |
2 | ||
3 | [import ../../example/root_finding_example.cpp] | |
4 | [import ../../example/root_finding_n_example.cpp] | |
5 | [import ../../example/root_finding_multiprecision_example.cpp] | |
6 | ||
7 | The examples demonstrate how to use the various tools for | |
8 | [@http://en.wikipedia.org/wiki/Root-finding_algorithm root finding]. | |
9 | ||
10 | We start with the simple cube root function `cbrt` ( C++ standard function name | |
11 | [@http://en.cppreference.com/w/cpp/numeric/math/cbrt cbrt]) | |
12 | showing root finding __cbrt_no_derivatives. | |
13 | ||
14 | We then show how use of derivatives can improve the speed of convergence. | |
15 | ||
16 | (But these examples are only a demonstration and do not try to make | |
17 | the ultimate improvements of an 'industrial-strength' | |
18 | implementation, for example, of `boost::math::cbrt`, mainly by using a better computed initial 'guess' | |
19 | at [@boost:/libs/math/include/boost/math/special_functions/cbrt.hpp cbrt.hpp]). | |
20 | ||
21 | Then we show how a higher root (__fifth_root) [super 5][radic] can be computed, | |
22 | and in | |
23 | [@../../example/root_finding_n_example.cpp root_finding_n_example.cpp] | |
24 | a generic method for the __nth_root that constructs the derivatives at compile-time. | |
25 | ||
26 | These methods should be applicable to other functions that can be differentiated easily. | |
27 | ||
28 | [section:cbrt_eg Finding the Cubed Root With and Without Derivatives] | |
29 | ||
30 | First some `#includes` that will be needed. | |
31 | ||
32 | [root_finding_include_1] | |
33 | ||
34 | [tip For clarity, `using` statements are provided to list what functions are being used in this example: | |
35 | you can, of course, partly or fully qualify the names in other ways. | |
36 | (For your application, you may wish to extract some parts into header files, | |
37 | but you should never use `using` statements globally in header files).] | |
38 | ||
39 | Let's suppose we want to find the root of a number ['a], and to start, compute the cube root. | |
40 | ||
41 | So the equation we want to solve is: | |
42 | ||
43 | __spaces ['f(x) = x[cubed] -a] | |
44 | ||
45 | We will first solve this without using any information | |
46 | about the slope or curvature of the cube root function. | |
47 | ||
48 | Fortunately, the cube-root function is 'Really Well Behaved' in that it is monotonic | |
49 | and has only one root (we leave negative values 'as an exercise for the student'). | |
50 | ||
51 | We then show how adding what we can know about this function, first just the slope | |
52 | or 1st derivative ['f'(x)], will speed homing in on the solution. | |
53 | ||
54 | Lastly, we show how adding the curvature ['f''(x)] too will speed convergence even more. | |
55 | ||
56 | [h3:cbrt_no_derivatives Cube root function without derivatives] | |
57 | ||
58 | First we define a function object (functor): | |
59 | ||
60 | [root_finding_noderiv_1] | |
61 | ||
62 | Implementing the cube-root function itself is fairly trivial now: | |
63 | the hardest part is finding a good approximation to begin with. | |
64 | In this case we'll just divide the exponent by three. | |
65 | (There are better but more complex guess algorithms used in 'real life'.) | |
66 | ||
67 | [root_finding_noderiv_2] | |
68 | ||
69 | This snippet from `main()` in [@../../example/root_finding_example.cpp root_finding_example.cpp] | |
70 | shows how it can be used. | |
71 | ||
72 | [root_finding_main_1] | |
73 | ||
74 | [pre | |
75 | cbrt_noderiv(27) = 3 | |
76 | cbrt_noderiv(28) = 3.0365889718756618 | |
77 | ] | |
78 | ||
79 | The result of `bracket_and_solve_root` is a [@http://www.cplusplus.com/reference/utility/pair/ pair] | |
80 | of values that could be displayed. | |
81 | ||
82 | The number of bits separating them can be found using `float_distance(r.first, r.second)`. | |
83 | The distance is zero (closest representable) for 3[super 3] = 27 | |
84 | but `float_distance(r.first, r.second) = 3` for cube root of 28 with this function. | |
85 | The result (avoiding overflow) is midway between these two values. | |
86 | ||
87 | [h3:cbrt_1st_derivative Cube root function with 1st derivative (slope)] | |
88 | ||
89 | We now solve the same problem, but using more information about the function, | |
90 | to show how this can speed up finding the best estimate of the root. | |
91 | ||
92 | For the root function, the 1st differential (the slope of the tangent to a curve at any point) is known. | |
93 | ||
94 | This algorithm is similar to this [@http://en.wikipedia.org/wiki/Nth_root_algorithm nth root algorithm]. | |
95 | ||
96 | If you need some reminders, then | |
97 | [@http://en.wikipedia.org/wiki/Derivative#Derivatives_of_elementary_functions derivatives of elementary functions] | |
98 | may help. | |
99 | ||
100 | Using the rule that the derivative of ['x[super n]] for positive n (actually all nonzero n) is ['n x[super n-1]], | |
101 | allows us to get the 1st differential as ['3x[super 2]]. | |
102 | ||
103 | To see how this extra information is used to find a root, view | |
104 | [@http://en.wikipedia.org/wiki/Newton%27s_method Newton-Raphson iterations] | |
105 | and the [@http://en.wikipedia.org/wiki/Newton%27s_method#mediaviewer/File:NewtonIteration_Ani.gif animation]. | |
106 | ||
107 | We define a better functor `cbrt_functor_deriv` that returns | |
108 | both the evaluation of the function to solve, along with its first derivative: | |
109 | ||
110 | To '['return]' two values, we use a [@http://en.cppreference.com/w/cpp/utility/pair std::pair] | |
111 | of floating-point values. | |
112 | ||
113 | [root_finding_1_deriv_1] | |
114 | ||
115 | The result of [@boost:/libs/math/include/boost/math/tools/roots.hpp `newton_raphson_iterate`] | |
116 | function is a single value. | |
117 | ||
118 | [tip There is a compromise between accuracy and speed when chosing the value of `digits`. | |
119 | It is tempting to simply chose `std::numeric_limits<T>::digits`, | |
120 | but this may mean some inefficient and unnecessary iterations as the function thrashes around | |
121 | trying to locate the last bit. In theory, since the precision doubles with each step | |
122 | it is sufficient to stop when half the bits are correct: as the last step will have doubled | |
123 | that to full precision. Of course the function has no way to tell if that is actually the case | |
124 | unless it does one more step to be sure. In practice setting the precision to slightly more | |
125 | than `std::numeric_limits<T>::digits / 2` is a good choice.] | |
126 | ||
127 | Note that it is up to the caller of the function to check the iteration count | |
128 | after the call to see if iteration stoped as a result of running out of iterations | |
129 | rather than meeting the required precision. | |
130 | ||
131 | Using the test data in [@../../test/test_cbrt.cpp /test/test_cbrt.cpp] this found the cube root | |
132 | exact to the last digit in every case, and in no more than 6 iterations at double | |
133 | precision. However, you will note that a high precision was used in this | |
134 | example, exactly what was warned against earlier on in these docs! In this | |
135 | particular case it is possible to compute ['f(x)] exactly and without undue | |
136 | cancellation error, so a high limit is not too much of an issue. | |
137 | ||
138 | However, reducing the limit to `std::numeric_limits<T>::digits * 2 / 3` gave full | |
139 | precision in all but one of the test cases (and that one was out by just one bit). | |
140 | The maximum number of iterations remained 6, but in most cases was reduced by one. | |
141 | ||
142 | Note also that the above code omits a probable optimization by computing z[sup2] | |
143 | and reusing it, omits error handling, and does not handle | |
144 | negative values of z correctly. (These are left as the customary exercise for the reader!) | |
145 | ||
146 | The `boost::math::cbrt` function also includes these and other improvements: | |
147 | most importantly it uses a much better initial guess which reduces the iteration count to | |
148 | just 1 in almost all cases. | |
149 | ||
150 | [h3:cbrt_2_derivatives Cube root with 1st & 2nd derivative (slope & curvature)] | |
151 | ||
152 | Next we define yet another even better functor `cbrt_functor_2deriv` that returns | |
153 | both the evaluation of the function to solve, | |
154 | along with its first [*and second] derivative: | |
155 | ||
156 | __spaces['f''(x) = 6x] | |
157 | ||
158 | using information about both slope and curvature to speed convergence. | |
159 | ||
160 | To [''return'] three values, we use a `tuple` of three floating-point values: | |
161 | [root_finding_2deriv_1] | |
162 | ||
163 | The function `halley_iterate` also returns a single value, | |
164 | and the number of iterations will reveal if it met the convergence criterion set by `get_digits`. | |
165 | ||
166 | The no-derivative method gives a result of | |
167 | ||
168 | cbrt_noderiv(28) = 3.0365889718756618 | |
169 | ||
170 | with a 3 bits distance between the bracketed values, whereas the derivative methods both converge to a single value | |
171 | ||
172 | cbrt_2deriv(28) = 3.0365889718756627 | |
173 | ||
174 | which we can compare with the [@boost:/libs/math/doc/html/math_toolkit/powers/cbrt.html boost::math::cbrt] | |
175 | ||
176 | cbrt(28) = 3.0365889718756627 | |
177 | ||
178 | Note that the iterations are set to stop at just one-half of full precision, | |
179 | and yet, even so, not one of the test cases had a single bit wrong. | |
180 | What's more, the maximum number of iterations was now just 4. | |
181 | ||
182 | Just to complete the picture, we could have called | |
183 | [link math_toolkit.roots.roots_deriv.schroder `schroder_iterate`] in the last | |
184 | example: and in fact it makes no difference to the accuracy or number of iterations | |
185 | in this particular case. However, the relative performance of these two methods | |
186 | may vary depending upon the nature of ['f(x)], and the accuracy to which the initial | |
187 | guess can be computed. There appear to be no generalisations that can be made | |
188 | except "try them and see". | |
189 | ||
190 | Finally, had we called `cbrt` with [@http://shoup.net/ntl/doc/RR.txt NTL::RR] | |
191 | set to 1000 bit precision (about 300 decimal digits), | |
192 | then full precision can be obtained with just 7 iterations. | |
193 | To put that in perspective, | |
194 | an increase in precision by a factor of 20, has less than doubled the number of | |
195 | iterations. That just goes to emphasise that most of the iterations are used | |
196 | up getting the first few digits correct: after that these methods can churn out | |
197 | further digits with remarkable efficiency. | |
198 | ||
199 | Or to put it another way: ['nothing beats a really good initial guess!] | |
200 | ||
201 | Full code of this example is at | |
202 | [@../../example/root_finding_example.cpp root_finding_example.cpp], | |
203 | ||
204 | [endsect] | |
205 | ||
206 | [section:lambda Using C++11 Lambda's] | |
207 | ||
208 | Since all the root finding functions accept a function-object, they can be made to | |
209 | work (often in a lot less code) with C++11 lambda's. Here's the much reduced code for our "toy" cube root function: | |
210 | ||
211 | [root_finding_2deriv_lambda] | |
212 | ||
213 | Full code of this example is at | |
214 | [@../../example/root_finding_example.cpp root_finding_example.cpp], | |
215 | ||
216 | [endsect] | |
217 | ||
218 | [section:5th_root_eg Computing the Fifth Root] | |
219 | ||
220 | Let's now suppose we want to find the [*fifth root] of a number ['a]. | |
221 | ||
222 | The equation we want to solve is : | |
223 | ||
224 | __spaces['f](x) = ['x[super 5] -a] | |
225 | ||
226 | If your differentiation is a little rusty | |
227 | (or you are faced with an function whose complexity makes differentiation daunting), | |
228 | then you can get help, for example, from the invaluable | |
229 | [@http://www.wolframalpha.com/ WolframAlpha site.] | |
230 | ||
231 | For example, entering the commmand: `differentiate x ^ 5` | |
232 | ||
233 | or the Wolfram Language command: ` D[x ^ 5, x]` | |
234 | ||
235 | gives the output: `d/dx(x ^ 5) = 5 x ^ 4` | |
236 | ||
237 | and to get the second differential, enter: `second differentiate x ^ 5` | |
238 | ||
239 | or the Wolfram Language command: `D[x ^ 5, { x, 2 }]` | |
240 | ||
241 | to get the output: `d ^ 2 / dx ^ 2(x ^ 5) = 20 x ^ 3` | |
242 | ||
243 | To get a reference value, we can enter: [^fifth root 3126] | |
244 | ||
245 | or: `N[3126 ^ (1 / 5), 50]` | |
246 | ||
247 | to get a result with a precision of 50 decimal digits: | |
248 | ||
249 | 5.0003199590478625588206333405631053401128722314376 | |
250 | ||
251 | (We could also get a reference value using __multiprecision_root). | |
252 | ||
253 | The 1st and 2nd derivatives of x[super 5] are: | |
254 | ||
255 | __spaces['f]\'(x) = 5x[super 4] | |
256 | ||
257 | __spaces['f]\'\'(x) = 20x[super 3] | |
258 | ||
259 | [root_finding_fifth_functor_2deriv] | |
260 | [root_finding_fifth_2deriv] | |
261 | ||
262 | Full code of this example is at | |
263 | [@../../example/root_finding_example.cpp root_finding_example.cpp] and | |
264 | [@../../example/root_finding_n_example.cpp root_finding_n_example.cpp]. | |
265 | ||
266 | [endsect] | |
267 | ||
268 | [section:multiprecision_root Root-finding using Boost.Multiprecision] | |
269 | ||
270 | The apocryphally astute reader might, by now, be asking "How do we know if this computes the 'right' answer?". | |
271 | ||
272 | For most values, there is, sadly, no 'right' answer. | |
273 | This is because values can only rarely be ['exactly represented] by C++ floating-point types. | |
274 | What we do want is the 'best' representation - one that is the nearest __representable value. | |
275 | (For more about how numbers are represented see __floating_point). | |
276 | ||
277 | Of course, we might start with finding an external reference source like | |
278 | __WolframAlpha, as above, but this is not always possible. | |
279 | ||
280 | Another way to reassure is to compute 'reference' values at higher precision | |
281 | with which to compare the results of our iterative computations using built-in like `double`. | |
282 | They should agree within the tolerance that was set. | |
283 | ||
284 | The result of `static_cast`ing to `double` from a higher-precision type like `cpp_bin_float_50` is guaranteed | |
285 | to be the [*nearest representable] `double` value. | |
286 | ||
287 | For example, the cube root functions in our example for `cbrt(28.)` compute | |
288 | ||
289 | `std::cbrt<double>(28.) = 3.0365889718756627` | |
290 | ||
291 | WolframAlpha says `3.036588971875662519420809578505669635581453977248111123242141...` | |
292 | ||
293 | `static_cast<double>(3.03658897187566251942080957850) = 3.0365889718756627` | |
294 | ||
295 | This example `cbrt(28.) = 3.0365889718756627` | |
296 | ||
297 | [tip To ensure that all potentially significant decimal digits are displayed use `std::numeric_limits<T>::max_digits10` | |
298 | (or if not available on older platforms or compilers use `2+std::numeric_limits<double>::digits*3010/10000`).[br] | |
299 | ||
300 | Ideally, values should agree to `std::numeric-limits<T>::digits10` decimal digits. | |
301 | ||
302 | This also means that a 'reference' value to be [*input] or `static_cast` should have | |
303 | at least `max_digits10` decimal digits (17 for 64-bit `double`). | |
304 | ] | |
305 | ||
306 | If we wish to compute [*higher-precision values] then, on some platforms, we may be able to use `long double` | |
307 | with a higher precision than `double` to compare with the very common `double` | |
308 | and/or a more efficient built-in quad floating-point type like `__float128`. | |
309 | ||
310 | Almost all platforms can easily use __multiprecision, | |
311 | for example, __cpp_dec_float or a binary type __cpp_bin_float types, | |
312 | to compute values at very much higher precision. | |
313 | ||
314 | [note With multiprecision types, it is debatable whether to use the type `T` for computing the initial guesses. | |
315 | Type `double` is like to be accurate enough for the method used in these examples. | |
316 | This would limit the exponent range of possible values to that of `double`. | |
317 | There is also the cost of conversion to and from type `T` to consider. | |
318 | In these examples, `double` is used via `typedef double guess_type`.] | |
319 | ||
320 | Since the functors and functions used above are templated on the value type, | |
321 | we can very simply use them with any of the __multiprecision types. As a reminder, | |
322 | here's our toy cube root function using 2 derivatives and C++11 lambda functions to find the root: | |
323 | ||
324 | [root_finding_2deriv_lambda] | |
325 | ||
326 | Some examples below are 50 decimal digit decimal and binary types | |
327 | (and on some platforms a much faster `float128` or `quad_float` type ) | |
328 | that we can use with these includes: | |
329 | ||
330 | [root_finding_multiprecision_include_1] | |
331 | ||
332 | Some using statements simplify their use: | |
333 | ||
334 | [root_finding_multiprecision_example_1] | |
335 | ||
336 | They can be used thus: | |
337 | ||
338 | [root_finding_multiprecision_example_2] | |
339 | ||
340 | A reference value computed by __WolframAlpha is | |
341 | ||
342 | N[2^(1/3), 50] 1.2599210498948731647672106072782283505702514647015 | |
343 | ||
344 | which agrees exactly. | |
345 | ||
346 | To [*show] values to their full precision, it is necessary to adjust the `std::ostream` `precision` to suit the type, for example: | |
347 | ||
348 | [root_finding_multiprecision_show_1] | |
349 | ||
350 | [root_finding_multiprecision_example_3] | |
351 | ||
352 | which outputs: | |
353 | ||
354 | [pre | |
355 | cbrt(2) = 1.2599210498948731647672106072782283505702514647015 | |
356 | ||
357 | value = 2, cube root =1.25992104989487 | |
358 | value = 2, cube root =1.25992104989487 | |
359 | value = 2, cube root =1.2599210498948731647672106072782283505702514647015 | |
360 | ] | |
361 | ||
362 | [tip Be [*very careful] about the floating-point type `T` that is passed to the root-finding function. | |
363 | Carelessly passing a integer by writing | |
364 | `cpp_dec_float_50 r = cbrt_2deriv(2);` or `show_cube_root(2);` | |
365 | will provoke many warnings and compile errors. | |
366 | ||
367 | Even `show_cube_root(2.F);` will produce warnings because `typedef double guess_type` defines the type | |
368 | used to compute the guess and bracket values as `double`. | |
369 | ||
370 | Even more treacherous is passing a `double` as in `cpp_dec_float_50 r = cbrt_2deriv(2.);` | |
371 | which silently gives the 'wrong' result, computing a `double` result and [*then] converting to `cpp_dec_float_50`! | |
372 | All digits beyond `max_digits10` will be incorrect. | |
373 | Making the `cbrt` type explicit with `cbrt_2deriv<cpp_dec_float_50>(2.);` will give you the desired 50 decimal digit precision result. | |
374 | ] [/tip] | |
375 | ||
376 | Full code of this example is at | |
377 | [@../../example/root_finding_multiprecision_example.cpp root_finding_multiprecision_example.cpp]. | |
378 | ||
379 | [endsect] | |
380 | ||
381 | [section:nth_root Generalizing to Compute the nth root] | |
382 | ||
383 | If desired, we can now further generalize to compute the ['n]th root by computing the derivatives [*at compile-time] | |
384 | using the rules for differentiation and `boost::math::pow<N>` | |
385 | where template parameter `N` is an integer and a compile time constant. Our functor and function now have an additional template parameter `N`, | |
386 | for the root required. | |
387 | ||
388 | [note Since the powers and derivatives are fixed at compile time, the resulting code is as efficient as as if hand-coded as the cube and fifth-root examples above. | |
389 | A good compiler should also optimise any repeated multiplications.] | |
390 | ||
391 | Our ['n]th root functor is | |
392 | ||
393 | [root_finding_nth_functor_2deriv] | |
394 | ||
395 | and our ['n]th root function is | |
396 | ||
397 | [root_finding_nth_function_2deriv] | |
398 | ||
399 | [root_finding_n_example_2] | |
400 | ||
401 | produces an output similar to this | |
402 | ||
403 | [root_finding_example_output_1] | |
404 | ||
405 | [tip Take care with the type passed to the function. It is best to pass a `double` or greater-precision floating-point type. | |
406 | ||
407 | Passing an integer value, for example, `nth_2deriv<5>(2)` will be rejected, while `nth_2deriv<5, double>(2)` converts the integer to `double`. | |
408 | ||
409 | Avoid passing a `float` value that will provoke warnings (actually spurious) from the compiler about potential loss of data, | |
410 | as noted above.] | |
411 | ||
412 | [warning Asking for unreasonable roots, for example, `show_nth_root<1000000>(2.);` may lead to | |
413 | [@http://en.wikipedia.org/wiki/Loss_of_significance Loss of significance] like | |
414 | `Type double value = 2, 1000000th root = 1.00000069314783`. | |
415 | Use of the the `pow` function is more sensible for this unusual need. | |
416 | ] | |
417 | ||
418 | Full code of this example is at | |
419 | [@../../example/root_finding_n_example.cpp root_finding_n_example.cpp]. | |
420 | ||
421 | [endsect] | |
422 | ||
423 | [section:elliptic_eg A More complex example - Inverting the Elliptic Integrals] | |
424 | ||
425 | The arc length of an ellipse with radii ['a] and ['b] is given by: | |
426 | ||
427 | [pre L(a, b) = 4aE(k)] | |
428 | ||
429 | with: | |
430 | ||
431 | [pre k = [sqrt](1 - b[super 2]/a[super 2])] | |
432 | ||
433 | where ['E(k)] is the complete elliptic integral of the second kind - see __ellint_2. | |
434 | ||
435 | Let's suppose we know the arc length and one radii, we can then calculate the other | |
436 | radius by inverting the formula above. We'll begin by encoding the above formula | |
437 | into a functor that our root-finding algorithms can call. | |
438 | ||
439 | Note that while not | |
440 | completely obvious from the formula above, the function is completely symmetrical | |
441 | in the two radii - which can be interchanged at will - in this case we need to | |
442 | make sure that `a >= b` so that we don't accidentally take the square root of a negative number: | |
443 | ||
444 | [import ../../example/root_elliptic_finding.cpp] | |
445 | ||
446 | [elliptic_noderv_func] | |
447 | ||
448 | We'll also need a decent estimate to start searching from, the approximation: | |
449 | ||
450 | [pre L(a, b) [approx] 4[sqrt](a[super 2] + b[super 2])] | |
451 | ||
452 | Is easily inverted to give us what we need, which using derivative-free root | |
453 | finding leads to the algorithm: | |
454 | ||
455 | [elliptic_root_noderiv] | |
456 | ||
457 | This function generally finds the root within 8-10 iterations, so given that the runtime | |
458 | is completely dominated by the cost of calling the ellliptic integral it would be nice to | |
459 | reduce that count somewhat. We'll try to do that by using a derivative-based method; | |
460 | the derivatives of this function are rather hard to work out by hand, but fortunately | |
461 | [@http://www.wolframalpha.com/input/?i=d%2Fda+\[4+*+a+*+EllipticE%281+-+b^2%2Fa^2%29\] | |
462 | Wolfram Alpha] can do the grunt work for us to give: | |
463 | ||
464 | [pre d/da L(a, b) = 4(a[super 2]E(k) - b[super 2]K(k)) / (a[super 2] - b[super 2])] | |
465 | ||
466 | Note that now we have [*two] elliptic integral calls to get the derivative, so our | |
467 | functor will be at least twice as expensive to call as the derivative-free one above: | |
468 | we'll have to reduce the iteration count quite substantially to make a difference! | |
469 | ||
470 | Here's the revised functor: | |
471 | ||
472 | [elliptic_1deriv_func] | |
473 | ||
474 | The root-finding code is now almost the same as before, but we'll make use of | |
475 | Newton-iteration to get the result: | |
476 | ||
477 | [elliptic_1deriv] | |
478 | ||
479 | The number of iterations required for `double` precision is now usually around 4 - | |
480 | so we've slightly more than halved the number of iterations, but made the | |
481 | functor twice as expensive to call! | |
482 | ||
483 | Interestingly though, the second derivative requires no more expensive | |
484 | elliptic integral calls than the first does, in other words it comes | |
485 | essentially "for free", in which case we might as well make use of it | |
486 | and use Halley-iteration. This is quite a typical situation when | |
487 | inverting special-functions. Here's the revised functor: | |
488 | ||
489 | [elliptic_2deriv_func] | |
490 | ||
491 | The actual root-finding code is almost the same as before, except we can | |
492 | use Halley, rather than Newton iteration: | |
493 | ||
494 | [elliptic_2deriv] | |
495 | ||
496 | While this function uses only slightly fewer iterations (typically around 3) | |
497 | to find the root, compared to the original derivative-free method, we've moved from | |
498 | 8-10 elliptic integral calls to 6. | |
499 | ||
500 | Full code of this example is at | |
501 | [@../../example/root_elliptic_finding.cpp root_elliptic_finding.cpp]. | |
502 | ||
503 | [endsect] | |
504 | ||
505 | ||
506 | [endsect] [/section:root_examples Examples of Root Finding (with and without derivatives)] | |
507 | ||
508 | [section:bad_guess The Effect of a Poor Initial Guess] | |
509 | ||
510 | It's instructive to take our "toy" example algorithms, and use deliberately bad initial guesses to see how the | |
511 | various root finding algorithms fair. We'll start with the cubed root, and using the cube root of 500 as the test case: | |
512 | ||
513 | [table | |
514 | [[Initial Guess=][-500% ([approx]1.323)][-100% ([approx]3.97)][-50% ([approx]3.96)][-20% ([approx]6.35)][-10% ([approx]7.14)][-5% ([approx]7.54)][5% ([approx]8.33)][10% ([approx]8.73)][20% ([approx]9.52)][50% ([approx]11.91)][100% ([approx]15.87)][500 ([approx]47.6)]] | |
515 | [[bracket_and_solve_root][12][8][8][10][11][11][11][11][11][11][7][13]] | |
516 | [[newton_iterate][12][7][7][5][5][4][4][5][5][6][7][9]] | |
517 | [[halley_iterate][7][4][4][3][3][3][3][3][3][4][4][6]] | |
518 | [[schroder_iterate][11][6][6][4][3][3][3][3][4][5][5][8]] | |
519 | ] | |
520 | ||
521 | As you can see `bracket_and_solve_root` is relatively insensitive to starting location - as long as you don't start many orders of magnitude away from the root it will | |
522 | take roughly the same number of steps to bracket the root and solve it. On the other hand the derivative-based methods are slow to start, but once they have some digits | |
523 | correct they increase precision exceptionally fast: they are therefore quite sensitive to the initial starting location. | |
524 | ||
525 | The next table shows the number of iterations required to find the second radius of an ellipse with first radius 50 and arc-length 500: | |
526 | ||
527 | [table | |
528 | [[Initial Guess=][-500% ([approx]20.6)][-100% ([approx]61.81)][-50% ([approx]61.81)][-20% ([approx]98.9)][-10% ([approx]111.3)][-5% ([approx]117.4)][5% ([approx]129.8)][10% ([approx]136)][20% ([approx]148.3)][50% ([approx]185.4)][100% ([approx]247.2)][500 ([approx]741.7)]] | |
529 | [[bracket_and_solve_root][11][5][5][8][8][7][7][8][9][8][6][10]] | |
530 | [[newton_iterate][4][4][4][3][3][3][3][3][3][4][4][4]] | |
531 | [[halley_iterate][4][3][3][3][3][2][2][3][3][3][3][3]] | |
532 | [[schroder_iterate][4][3][3][3][3][2][2][3][3][3][3][3]] | |
533 | ] | |
534 | ||
535 | Interestingly this function is much more resistant to a poor initial guess when using derivatives. | |
536 | ||
537 | [endsect] | |
538 | ||
539 | [section:bad_roots Examples Where Root Finding Goes Wrong] | |
540 | ||
541 | There are many reasons why root root finding can fail, here are just a few of the more common examples: | |
542 | ||
543 | [h3 Local Minima] | |
544 | ||
545 | If you start in the wrong place, such as z[sub 0] here: | |
546 | ||
547 | [$../roots/bad_root_1.svg] | |
548 | ||
549 | Then almost any root-finding algorithm will descend into a local minima rather than find the root. | |
550 | ||
551 | [h3 Flatlining] | |
552 | ||
553 | In this example, we're starting from a location (z[sub 0]) where the first derivative is essentially zero: | |
554 | ||
555 | [$../roots/bad_root_2.svg] | |
556 | ||
557 | In this situation the next iteration will shoot off to infinity (assuming we're using derivatives that is). Our | |
558 | code guards against this by insisting that the root is always bracketed, and then never stepping outside those bounds. | |
559 | In a case like this, no root finding algorithm can do better than bisecting until the root is found. | |
560 | ||
561 | Note that there is no scale on the graph, we have seen examples of this situation occur in practice ['even when | |
562 | several decimal places of the initial guess z[sub 0] are correct.] | |
563 | ||
564 | This is really a special case of a more common situation where root finding with derivatives is ['divergent]. Consider | |
565 | starting at z[sub 0] in this case: | |
566 | ||
567 | [$../roots/bad_root_4.svg] | |
568 | ||
569 | An initial Newton step would take you further from the root than you started, as will all subsequent steps. | |
570 | ||
571 | [h3 Micro-stepping / Non-convergence] | |
572 | ||
573 | Consider starting at z[sub 0] in this situation: | |
574 | ||
575 | [$../roots/bad_root_3.svg] | |
576 | ||
577 | The first derivative is essentially infinite, and the second close to zero (and so offers no correction if we use it), | |
578 | as a result we take a very small first step. In the worst case situation, the first step is so small | |
579 | - perhaps even so small that subtracting from z[sub 0] has no effect at the current working precision - that our algorithm | |
580 | will assume we are at the root already and terminate. Otherwise we will take lot's of very small steps which never converge | |
581 | on the root: our algorithms will protect against that by reverting to bisection. | |
582 | ||
583 | An example of this situation would be trying to find the root of e[super -1/z[super 2]] - this function has a single | |
584 | root at ['z = 0], but for ['z[sub 0] < 0] neither Newton nor Halley steps will ever converge on the root, and for ['z[sub 0] > 0] | |
585 | the steps are actually divergent. | |
586 | ||
587 | [endsect] | |
588 | ||
589 | [/ | |
590 | Copyright 2015 John Maddock and Paul A. Bristow. | |
591 | Distributed under the Boost Software License, Version 1.0. | |
592 | (See accompanying file LICENSE_1_0.txt or copy at | |
593 | http://www.boost.org/LICENSE_1_0.txt). | |
594 | ] | |
595 |