ceph/src/boost/doc/test/remez.qbk

   1 [section:remez Sample Article (The Remez Method)]
   2
   3 The [@http://en.wikipedia.org/wiki/Remez_algorithm Remez algorithm]
   4 is a methodology for locating the minimax rational approximation
   5 to a function.  This short article gives a brief overview of the method, but
   6 it should not be regarded as a thorough theoretical treatment, for that you
   7 should consult your favorite textbook.
   8
   9 Imagine that you want to approximate some function f(x) by way of a rational
  10 function R(x), where R(x) may be either a polynomial P(x) or a ratio of two
  11 polynomials P(x)/Q(x) (a rational function).  Initially we'll concentrate on the
  12 polynomial case, as it's by far the easier to deal with, later we'll extend
  13 to the full rational function case.
  14
  15 We want to find the "best" rational approximation, where
  16 "best" is defined to be the approximation that has the least deviation
  17 from f(x).  We can measure the deviation by way of an error function:
  18
  19 E[sub abs](x) = f(x) - R(x)
  20
  21 which is expressed in terms of absolute error, but we can equally use
  22 relative error:
  23
  24 E[sub rel](x) = (f(x) - R(x)) / |f(x)|
  25
  26 And indeed in general we can scale the error function in any way we want, it
  27 makes no difference to the maths, although the two forms above cover almost
  28 every practical case that you're likely to encounter.
  29
  30 The minimax rational function R(x) is then defined to be the function that
  31 yields the smallest maximal value of the error function.  Chebyshev showed
  32 that there is a unique minimax solution for R(x) that has the following
  33 properties:
  34
  35 * If R(x) is a polynomial of degree N, then there are N+2 unknowns:
  36 the N+1 coefficients of the polynomial, and maximal value of the error
  37 function.
  38 * The error function has N+1 roots, and N+2 extrema (minima and maxima).
  39 * The extrema alternate in sign, and all have the same magnitude.
  40
  41 That means that if we know the location of the extrema of the error function
  42 then we can write N+2 simultaneous equations:
  43
  44 R(x[sub i]) + (-1)[super i]E = f(x[sub i])
  45
  46 where E is the maximal error term, and x[sub i] are the abscissa values of the
  47 N+2 extrema of the error function.  It is then trivial to solve the simultaneous
  48 equations to obtain the polynomial coefficients and the error term.
  49
  50 ['Unfortunately we don't know where the extrema of the error function are located!]
  51
  52 [h4 The Remez Method]
  53
  54 The Remez method is an iterative technique which, given a broad range of
  55 assumptions, will converge on the extrema of the error function, and therefore
  56 the minimax solution.
  57
  58 In the following discussion we'll use a concrete example to illustrate
  59 the Remez method: an approximation to the function e[super x][space] over
  60 the range \[-1, 1\].
  61
  62 Before we can begin the Remez method, we must obtain an initial value
  63 for the location of the extrema of the error function.  We could "guess"
  64 these, but a much closer first approximation can be obtained by first
  65 constructing an interpolated polynomial approximation to f(x).
  66
  67 In order to obtain the N+1 coefficients of the interpolated polynomial
  68 we need N+1 points (x[sub 0]...x[sub N]): with our interpolated form
  69 passing through each of those points
  70 that yields N+1 simultaneous equations:
  71
  72 f(x[sub i]) = P(x[sub i]) = c[sub 0] + c[sub 1]x[sub i] ... + c[sub N]x[sub i][super N]
  73
  74 Which can be solved for the coefficients c[sub 0]...c[sub N] in P(x).
  75
  76 Obviously this is not a minimax solution, indeed our only guarantee is that f(x) and
  77 P(x) touch at N+1 locations, away from those points the error may be arbitrarily
  78 large.  However, we would clearly like this initial approximation to be as close to
  79 f(x) as possible, and it turns out that using the zeros of an orthogonal polynomial
  80 as the initial interpolation points is a good choice.  In our example we'll use the
  81 zeros of a Chebyshev polynomial as these are particularly easy to calculate,
  82 interpolating for a polynomial of degree 4, and measuring /relative error/
  83 we get the following error function:
  84
  85 [$images/remez-2.png]
  86
  87 Which has a peak relative error of 1.2x10[super -3].
  88
  89 While this is a pretty good approximation already, judging by the
  90 shape of the error function we can clearly do better.  Before starting
  91 on the Remez method propper, we have one more step to perform: locate
  92 all the extrema of the error function, and store
  93 these locations as our initial ['Chebyshev control points].
  94
  95 [note
  96 In the simple case of a polynomial approximation, by interpolating through
  97 the roots of a Chebyshev polynomial we have in fact created a ['Chebyshev
  98 approximation] to the function: in terms of /absolute error/
  99 this is the best a priori choice for the interpolated form we can
 100 achieve, and typically is very close to the minimax solution.
 101
 102 However, if we want to optimise for /relative error/, or if the approximation
 103 is a rational function, then the initial Chebyshev solution can be quite far
 104 from the ideal minimax solution.
 105
 106 A more technical discussion of the theory involved can be found in this
 107 [@http://math.fullerton.edu/mathews/n2003/ChebyshevPolyMod.html online course].]
 108
 109 [h4 Remez Step 1]
 110
 111 The first step in the Remez method, given our current set of
 112 N+2 Chebyshev control points x[sub i], is to solve the N+2 simultaneous
 113 equations:
 114
 115 P(x[sub i]) + (-1)[super i]E = f(x[sub i])
 116
 117 To obtain the error term E, and the coefficients of the polynomial P(x).
 118
 119 This gives us a new approximation to f(x) that has the same error /E/ at
 120 each of the control points, and whose error function ['alternates in sign]
 121 at the control points.  This is still not necessarily the minimax
 122 solution though: since the control points may not be at the extrema of the error
 123 function.  After this first step here's what our approximation's error
 124 function looks like:
 125
 126 [$images/remez-3.png]
 127
 128 Clearly this is still not the minimax solution since the control points
 129 are not located at the extrema, but the maximum relative error has now
 130 dropped to 5.6x10[super -4].
 131
 132 [h4 Remez Step 2]
 133
 134 The second step is to locate the extrema of the new approximation, which we do
 135 in two stages:  first, since the error function changes sign at each
 136 control point, we must have N+1 roots of the error function located between
 137 each pair of N+2 control points.  Once these roots are found by standard root finding
 138 techniques, we know that N extrema are bracketed between each pair of
 139 roots, plus two more between the endpoints of the range and the first and last roots.
 140 The N+2 extrema can then be found using standard function minimisation techniques.
 141
 142 We now have a choice: multi-point exchange, or single point exchange.
 143
 144 In single point exchange, we move the control point nearest to the largest extrema to
 145 the absissa value of the extrema.
 146
 147 In multi-point exchange we swap all the current control points, for the locations
 148 of the extrema.
 149
 150 In our example we perform multi-point exchange.
 151
 152 [h4 Iteration]
 153
 154 The Remez method then performs steps 1 and 2 above iteratively until the control
 155 points are located at the extrema of the error function: this is then
 156 the minimax solution.
 157
 158 For our current example, two more iterations converges on a minimax
 159 solution with a peak relative error of
 160 5x10[super -4] and an error function that looks like:
 161
 162 [$images/remez-4.png]
 163
 164 [h4 Rational Approximations]
 165
 166 If we wish to extend the Remez method to a rational approximation of the form
 167
 168 f(x) = R(x) = P(x) / Q(x)
 169
 170 where P(x) and Q(x) are polynomials, then we proceed as before, except that now
 171 we have N+M+2 unknowns if P(x) is of order N and Q(x) is of order M.  This assumes
 172 that Q(x) is normalised so that it's leading coefficient is 1, giving
 173 N+M+1 polynomial coefficients in total, plus the error term E.
 174
 175 The simultaneous equations to be solved are now:
 176
 177 P(x[sub i]) / Q(x[sub i]) + (-1)[super i]E = f(x[sub i])
 178
 179 Evaluated at the N+M+2 control points x[sub i].
 180
 181 Unfortunately these equations are non-linear in the error term E: we can only
 182 solve them if we know E, and yet E is one of the unknowns!
 183
 184 The method usually adopted to solve these equations is an iterative one: we guess the
 185 value of E, solve the equations to obtain a new value for E (as well as the polynomial
 186 coefficients), then use the new value of E as the next guess.  The method is
 187 repeated until E converges on a stable value.
 188
 189 These complications extend the running time required for the development
 190 of rational approximations quite considerably. It is often desirable
 191 to obtain a rational rather than polynomial approximation none the less:
 192 rational approximations will often match more difficult to approximate
 193 functions, to greater accuracy, and with greater efficiency, than their
 194 polynomial alternatives.  For example, if we takes our previous example
 195 of an approximation to e[super x], we obtained 5x10[super -4] accuracy
 196 with an order 4 polynomial.  If we move two of the unknowns into the denominator
 197 to give a pair of order 2 polynomials, and re-minimise, then the peak relative error drops
 198 to 8.7x10[super -5].  That's a 5 fold increase in accuracy, for the same number
 199 of terms overall.
 200
 201 [h4 Practical Considerations]
 202
 203 Most treatises on approximation theory stop at this point.  However, from
 204 a practical point of view, most of the work involves finding the right
 205 approximating form, and then persuading the Remez method to converge
 206 on a solution.
 207
 208 So far we have used a direct approximation:
 209
 210 f(x) = R(x)
 211
 212 But this will converge to a useful approximation only if f(x) is smooth.  In
 213 addition round-off errors when evaluating the rational form mean that this
 214 will never get closer than within a few epsilon of machine precision.
 215 Therefore this form of direct approximation is often reserved for situations
 216 where we want efficiency, rather than accuracy.
 217
 218 The first step in improving the situation is generally to split f(x) into
 219 a dominant part that we can compute accurately by another method, and a
 220 slowly changing remainder which can be approximated by a rational approximation.
 221 We might be tempted to write:
 222
 223 f(x) = g(x) + R(x)
 224
 225 where g(x) is the dominant part of f(x), but if f(x)\/g(x) is approximately
 226 constant over the interval of interest then:
 227
 228 f(x) = g(x)(c + R(x))
 229
 230 Will yield a much better solution: here /c/ is a constant that is the approximate
 231 value of f(x)\/g(x) and R(x) is typically tiny compared to /c/.  In this situation
 232 if R(x) is optimised for absolute error, then as long as its error is small compared
 233 to the constant /c/, that error will effectively get wiped out when R(x) is added to
 234 /c/.
 235
 236 The difficult part is obviously finding the right g(x) to extract from your
 237 function: often the asymptotic behaviour of the function will give a clue, so
 238 for example the function __erfc becomes proportional to
 239 e[super -x[super 2]]\/x as x becomes large.  Therefore using:
 240
 241 erfc(z) = (C + R(x)) e[super -x[super 2]]/x
 242
 243 as the approximating form seems like an obvious thing to try, and does indeed
 244 yield a useful approximation.
 245
 246 However, the difficulty then becomes one of converging the minimax solution.
 247 Unfortunately, it is known that for some functions the Remez method can lead
 248 to divergent behaviour, even when the initial starting approximation is quite good.
 249 Furthermore, it is not uncommon for the solution obtained in the first Remez step
 250 above to be a bad one: the equations to be solved are generally "stiff", often
 251 very close to being singular, and assuming a solution is found at all, round-off
 252 errors and a rapidly changing error function, can lead to a situation where the
 253 error function does not in fact change sign at each control point as required.
 254 If this occurs, it is fatal to the Remez method.  It is also possible to
 255 obtain solutions that are perfectly valid mathematically, but which are
 256 quite useless computationally: either because there is an unavoidable amount
 257 of roundoff error in the computation of the rational function, or because
 258 the denominator has one or more roots over the interval of the approximation.
 259 In the latter case while the approximation may have the correct limiting value at
 260 the roots, the approximation is nonetheless useless.
 261
 262 Assuming that the approximation does not have any fatal errors, and that the only
 263 issue is converging adequately on the minimax solution, the aim is to
 264 get as close as possible to the minimax solution before beginning the Remez method.
 265 Using the zeros of a Chebyshev polynomial for the initial interpolation is a
 266 good start, but may not be ideal when dealing with relative errors and\/or
 267 rational (rather than polynomial) approximations.  One approach is to skew
 268 the initial interpolation points to one end: for example if we raise the
 269 roots of the Chebyshev polynomial to a positive power greater than 1
 270 then the roots will be skewed towards the middle of the \[-1,1\] interval,
 271 while a positive power less than one
 272 will skew them towards either end.  More usefully, if we initially rescale the
 273 points over \[0,1\] and then raise to a positive power, we can skew them to the left
 274 or right.  Returning to our example of e[super x][space] over \[-1,1\], the initial
 275 interpolated form was some way from the minimax solution:
 276
 277 [$images/remez-2.png]
 278
 279 However, if we first skew the interpolation points to the left (rescale them
 280 to \[0, 1\], raise to the power 1.3, and then rescale back to \[-1,1\]) we
 281 reduce the error from 1.3x10[super -3][space]to 6x10[super -4]:
 282
 283 [$images/remez-5.png]
 284
 285 It's clearly still not ideal, but it is only a few percent away from
 286 our desired minimax solution (5x10[super -4]).
 287
 288 [h4 Remez Method Checklist]
 289
 290 The following lists some of the things to check if the Remez method goes wrong,
 291 it is by no means an exhaustive list, but is provided in the hopes that it will
 292 prove useful.
 293
 294 * Is the function smooth enough?  Can it be better separated into
 295 a rapidly changing part, and an asymptotic part?
 296 * Does the function being approximated have any "blips" in it?  Check
 297 for problems as the function changes computation method, or
 298 if a root, or an infinity has been divided out.  The telltale
 299 sign is if there is a narrow region where the Remez method will
 300 not converge.
 301 * Check you have enough accuracy in your calculations: remember that
 302 the Remez method works on the difference between the approximation
 303 and the function being approximated: so you must have more digits of
 304 precision available than the precision of the approximation
 305 being constructed.  So for example at double precision, you
 306 shouldn't expect to be able to get better than a float precision
 307 approximation.
 308 * Try skewing the initial interpolated approximation to minimise the
 309 error before you begin the Remez steps.
 310 * If the approximation won't converge or is ill-conditioned from one starting
 311 location, try starting from a different location.
 312 * If a rational function won't converge, one can minimise a polynomial
 313 (which presents no problems), then rotate one term from the numerator to
 314 the denominator and minimise again.  In theory one can continue moving
 315 terms one at a time from numerator to denominator, and then re-minimising,
 316 retaining the last set of control points at each stage.
 317 * Try using a smaller interval.  It may also be possible to optimise over
 318 one (small) interval, rescale the control points over a larger interval,
 319 and then re-minimise.
 320 * Keep absissa values small: use a change of variable to keep the abscissa
 321 over, say \[0, b\], for some smallish value /b/.
 322
 323 [h4 References]
 324
 325 The original references for the Remez Method and it's extension
 326 to rational functions are unfortunately in Russian:
 327
 328 Remez, E.Ya., ['Fundamentals of numerical methods for Chebyshev approximations],
 329 "Naukova Dumka", Kiev, 1969.
 330
 331 Remez, E.Ya., Gavrilyuk, V.T., ['Computer development of certain approaches
 332 to the approximate construction of solutions of Chebyshev problems
 333 nonlinearly depending on parameters], Ukr. Mat. Zh. 12 (1960), 324-338.
 334
 335 Gavrilyuk, V.T., ['Generalization of the first polynomial algorithm of
 336 E.Ya.Remez for the problem of constructing rational-fractional
 337 Chebyshev approximations], Ukr. Mat. Zh. 16 (1961), 575-585.
 338
 339 Some English language sources include:
 340
 341 Fraser, W., Hart, J.F., ['On the computation of rational approximations
 342 to continuous functions], Comm. of the ACM 5 (1962), 401-403, 414.
 343
 344 Ralston, A., ['Rational Chebyshev approximation by Remes' algorithms],
 345 Numer.Math. 7 (1965), no. 4, 322-330.
 346
 347 A. Ralston, ['Rational Chebyshev approximation, Mathematical
 348 Methods for Digital Computers v. 2] (Ralston A., Wilf H., eds.),
 349 Wiley, New York, 1967, pp. 264-284.
 350
 351 Hart, J.F. e.a., ['Computer approximations], Wiley, New York a.o., 1968.
 352
 353 Cody, W.J., Fraser, W., Hart, J.F., ['Rational Chebyshev approximation
 354 using linear equations], Numer.Math. 12 (1968), 242-251.
 355
 356 Cody, W.J., ['A survey of practical rational and polynomial
 357 approximation of functions], SIAM Review 12 (1970), no. 3, 400-423.
 358
 359 Barrar, R.B., Loeb, H.J., ['On the Remez algorithm for non-linear
 360 families], Numer.Math. 15 (1970), 382-391.
 361
 362 Dunham, Ch.B., ['Convergence of the Fraser-Hart algorithm for rational
 363 Chebyshev approximation], Math. Comp. 29 (1975), no. 132, 1078-1082.
 364
 365 G. L. Litvinov, ['Approximate construction of rational
 366 approximations and the effect of error autocorrection],
 367 Russian Journal of Mathematical Physics, vol.1, No. 3, 1994.
 368
 369 [endsect][/section:remez The Remez Method]
 370
 371
 372