ceph/src/boost/libs/math/doc/background/lanczos.qbk

   1 [section:lanczos The Lanczos Approximation]
   2
   3 [h4 Motivation]
   4
   5 ['Why base gamma and gamma-like functions on the Lanczos approximation?]
   6
   7 First of all I should make clear that for the gamma function
   8 over real numbers (as opposed to complex ones)
   9 the Lanczos approximation (See [@http://en.wikipedia.org/wiki/Lanczos_approximation Wikipedia or ]
  10 [@http://mathworld.wolfram.com/LanczosApproximation.html Mathworld])
  11 appears to offer no clear advantage over more traditional methods such as
  12 [@http://en.wikipedia.org/wiki/Stirling_approximation Stirling's approximation].
  13 __pugh carried out an extensive comparison of the various methods available
  14 and discovered that they were all very similar in terms of complexity
  15 and relative error.  However, the Lanczos approximation does have a couple of
  16 properties that make it worthy of further consideration:
  17
  18 * The approximation has an easy to compute truncation error that holds for
  19 all /z > 0/.  In practice that means we can use the same approximation for all
  20 /z > 0/, and be certain that no matter how large or small /z/ is, the truncation
  21 error will /at worst/ be bounded by some finite value.
  22 * The approximation has a form that is particularly amenable to analytic
  23 manipulation, in particular ratios of gamma or gamma-like functions
  24 are particularly easy to compute without resorting to logarithms.
  25
  26 It is the combination of these two properties that make the approximation
  27 attractive: Stirling's approximation is highly accurate for large z, and
  28 has some of the same analytic properties as the Lanczos approximation, but
  29 can't easily be used across the whole range of z.
  30
  31 As the simplest example, consider the ratio of two gamma functions: one could
  32 compute the result via lgamma:
  33
  34    exp(lgamma(a) - lgamma(b));
  35
  36 However, even if lgamma is uniformly accurate to 0.5ulp, the worst case
  37 relative error in the above can easily be shown to be:
  38
  39    Erel > a * log(a)/2 + b * log(b)/2
  40
  41 For small /a/ and /b/ that's not a problem, but to put the relationship another
  42 way: ['each time a and b increase in magnitude by a factor of 10, at least one
  43 decimal digit of precision will be lost.]
  44
  45 In contrast, by analytically combining like power
  46 terms in a ratio of Lanczos approximation's, these errors can be virtually eliminated
  47 for small /a/ and /b/, and kept under control for very large (or very small
  48 for that matter) /a/ and /b/.  Of course, computing large powers is itself a
  49 notoriously hard problem, but even so, analytic combinations of Lanczos
  50 approximations can make the difference between obtaining a valid result, or
  51 simply garbage.  Refer to the implementation notes for the __beta function for
  52 an example of this method in practice.  The incomplete
  53 [link math_toolkit.sf_gamma.igamma gamma_p gamma] and
  54 [link math_toolkit.sf_beta.ibeta_function beta] functions
  55 use similar analytic combinations of power terms, to combine gamma and beta
  56 functions divided by large powers into single (simpler) expressions.
  57
  58 [h4 The Approximation]
  59
  60 The Lanczos Approximation to the Gamma Function is given by:
  61
  62 [equation lanczos0]
  63
  64 Where S[sub g](z) is an infinite sum, that is convergent for all z > 0,
  65 and /g/ is an arbitrary parameter that controls the "shape" of the
  66 terms in the sum which is given by:
  67
  68 [equation lanczos0a]
  69
  70 With individual coefficients defined in closed form by:
  71
  72 [equation lanczos0b]
  73
  74 However, evaluation of the sum in that form can lead to numerical instability
  75 in the computation of the ratios of rising and falling factorials (effectively
  76 we're multiplying by a series of numbers very close to 1, so roundoff errors
  77 can accumulate quite rapidly).
  78
  79 The Lanczos approximation is therefore often written in partial fraction form
  80 with the leading constants absorbed by the coefficients in the sum:
  81
  82 [equation lanczos1]
  83
  84 where:
  85
  86 [equation lanczos2]
  87
  88 Again parameter /g/ is an arbitrarily chosen constant, and /N/ is an arbitrarily chosen
  89 number of terms to evaluate in the "Lanczos sum" part.
  90
  91 [note
  92 Some authors
  93 choose to define the sum from k=1 to N, and hence end up with N+1 coefficients.
  94 This happens to confuse both the following discussion and the code (since C++
  95 deals with half open array ranges, rather than the closed range of the sum).
  96 This convention is consistent with __godfrey, but not __pugh, so take care
  97 when referring to the literature in this field.]
  98
  99 [h4 Computing the Coefficients]
 100
 101 The coefficients C0..CN-1 need to be computed from /N/ and /g/
 102 at high precision, and then stored as part of the program.
 103 Calculation of the coefficients is performed via the method of __godfrey;
 104 let the constants be contained in a column vector P, then:
 105
 106 P = D B C F
 107
 108 where B is an NxN matrix:
 109
 110 [equation lanczos4]
 111
 112 D is an NxN matrix:
 113
 114 [equation lanczos3]
 115
 116 C is an NxN matrix:
 117
 118 [equation lanczos5]
 119
 120 and F is an N element column vector:
 121
 122 [equation lanczos6]
 123
 124 Note than the matrices B, D and C contain all integer terms and depend
 125 only on /N/, this product should be computed first, and then multiplied
 126 by /F/ as the last step.
 127
 128 [h4 Choosing the Right Parameters]
 129
 130 The trick is to choose
 131 /N/ and /g/ to give the desired level of accuracy: choosing a small value for
 132 /g/ leads to a strictly convergent series, but one which converges only slowly.
 133 Choosing a larger value of /g/ causes the terms in the series to be large
 134 and\/or divergent for about the first /g-1/ terms, and to then suddenly converge
 135 with a "crunch".
 136
 137 __pugh has determined the optimal
 138 value of /g/ for /N/ in the range /1 <= N <= 60/: unfortunately in practice choosing
 139 these values leads to cancellation errors in the Lanczos sum as the largest
 140 term in the (alternating) series is approximately 1000 times larger than the result.
 141 These optimal values appear not to be useful in practice unless the evaluation
 142 can be done with a number of guard digits /and/ the coefficients are stored
 143 at higher precision than that desired in the result.  These values are best
 144 reserved for say, computing to float precision with double precision arithmetic.
 145
 146 [table Optimal choices for N and g when computing with guard digits (source: Pugh)
 147 [[Significand Size] [N] [g][Max Error]]
 148 [[24] [6] [5.581][9.51e-12]]
 149 [[53][13][13.144565][9.2213e-23]]
 150 ]
 151
 152 The alternative described by __godfrey is to perform an exhaustive
 153 search of the /N/ and /g/ parameter space to determine the optimal combination for
 154 a given /p/ digit floating-point type.  Repeating this work found a good
 155 approximation for double precision arithmetic (close to the one __godfrey found),
 156 but failed to find really
 157 good approximations for 80 or 128-bit long doubles.  Further it was observed
 158 that the approximations obtained tended to optimised for the small values
 159 of z (1 < z < 200) used to test the implementation against the factorials.
 160 Computing ratios of gamma functions with large arguments were observed to
 161 suffer from error resulting from the truncation of the Lancozos series.
 162
 163 __pugh identified all the locations where the theoretical error of the
 164 approximation were at a minimum, but unfortunately has published only the largest
 165 of these minima.  However, he makes the observation that the minima
 166 coincide closely with the location where the first neglected term (a[sub N]) in the
 167 Lanczos series S[sub g](z) changes sign.  These locations are quite easy to
 168 locate, albeit with considerable computer time.  These "sweet spots" need
 169 only be computed once, tabulated, and then searched when required for an
 170 approximation that delivers the required precision for some fixed precision
 171 type.
 172
 173 Unfortunately, following this path failed to find a really good approximation
 174 for 128-bit long doubles, and those found for 64 and 80-bit reals required an
 175 excessive number of terms.  There are two competing issues here: high precision
 176 requires a large value of /g/, but avoiding cancellation errors in the evaluation
 177 requires a small /g/.
 178
 179 At this point note that the Lanczos sum can be converted into rational form
 180 (a ratio of two polynomials, obtained from the partial-fraction form using
 181 polynomial arithmetic),
 182 and doing so changes the coefficients so that /they are all positive/.  That
 183 means that the sum in rational form can be evaluated without cancellation
 184 error, albeit with double the number of coefficients for a given N.  Repeating
 185 the search of the "sweet spots", this time evaluating the Lanczos sum in
 186 rational form, and testing only those "sweet spots" whose theoretical error
 187 is less than the machine epsilon for the type being tested, yielded good
 188 approximations for all the types tested.  The optimal values found were quite
 189 close to the best cases reported by __pugh (just slightly larger /N/ and slightly
 190 smaller /g/ for a given precision than __pugh reports), and even though converting
 191 to rational form doubles the number of stored coefficients, it should be
 192 noted that half of them are integers (and therefore require less storage space)
 193 and the approximations require a smaller /N/ than would otherwise be required,
 194 so fewer floating point operations may be required overall.
 195
 196 The following table shows the optimal values for /N/ and /g/ when computing
 197 at fixed precision.  These should be taken as work in progress: there are no
 198 values for 106-bit significand machines (Darwin long doubles & NTL quad_float),
 199 and further optimisation of the values of /g/ may be possible.
 200 Errors given in the table
 201 are estimates of the error due to truncation of the Lanczos infinite series
 202 to /N/ terms.  They are calculated from the sum of the first five neglected
 203 terms - and are known to be rather pessimistic estimates - although it is noticeable
 204 that the best combinations of /N/ and /g/ occurred when the estimated truncation error
 205 almost exactly matches the machine epsilon for the type in question.
 206
 207 [table Optimum value for N and g when computing at fixed precision
 208 [[Significand Size][Platform/Compiler Used][N][g][Max Truncation Error]]
 209 [[24][Win32, VC++ 7.1] [6] [1.428456135094165802001953125][9.41e-007]]
 210 [[53][Win32, VC++ 7.1] [13] [6.024680040776729583740234375][3.23e-016]]
 211 [[64][Suse Linux 9 IA64, gcc-3.3.3] [17] [12.2252227365970611572265625][2.34e-024]]
 212 [[116][HP Tru64 Unix 5.1B \/ Alpha, Compaq C++ V7.1-006] [24] [20.3209821879863739013671875][4.75e-035]]
 213 ]
 214
 215 Finally note that the Lanczos approximation can be written as follows
 216 by removing a factor of exp(g) from the denominator, and then dividing
 217 all the coefficients by exp(g):
 218
 219 [equation lanczos7]
 220
 221 This form is more convenient for calculating lgamma, but for the gamma
 222 function the division by /e/ turns a possibly exact quality into an
 223 inexact value: this reduces accuracy in the common case that
 224 the input is exact, and so isn't used for the gamma function.
 225
 226 [h4 References]
 227
 228 # [#godfrey]Paul Godfrey, [@http://my.fit.edu/~gabdo/gamma.txt "A note on the computation of the convergent
 229 Lanczos complex Gamma approximation"].
 230 # [#pugh]Glendon Ralph Pugh,
 231 [@http://bh0.physics.ubc.ca/People/matt/Doc/ThesesOthers/Phd/pugh.pdf
 232 "An Analysis of the Lanczos Gamma Approximation"],
 233 PhD Thesis November 2004.
 234 # Viktor T. Toth,
 235 [@http://www.rskey.org/gamma.htm "Calculators and the Gamma Function"].
 236 # Mathworld, [@http://mathworld.wolfram.com/LanczosApproximation.html
 237 The Lanczos Approximation].
 238
 239 [endsect][/section:lanczos The Lanczos Approximation]
 240
 241 [/
 242   Copyright 2006 John Maddock and Paul A. Bristow.
 243   Distributed under the Boost Software License, Version 1.0.
 244   (See accompanying file LICENSE_1_0.txt or copy at
 245   http://www.boost.org/LICENSE_1_0.txt).
 246 ]