[section:lanczos The Lanczos Approximation]

[h4 Motivation]

['Why base gamma and gamma-like functions on the Lanczos approximation?]

First of all, I should make clear that for the gamma function
over real numbers (as opposed to complex ones)
the Lanczos approximation (see [@http://en.wikipedia.org/wiki/Lanczos_approximation Wikipedia] or
[@http://mathworld.wolfram.com/LanczosApproximation.html Mathworld])
appears to offer no clear advantage over more traditional methods such as
[@http://en.wikipedia.org/wiki/Stirling_approximation Stirling's approximation].
__pugh carried out an extensive comparison of the various methods available
and discovered that they were all very similar in terms of complexity
and relative error. However, the Lanczos approximation does have a couple of
properties that make it worthy of further consideration:

* The approximation has an easy-to-compute truncation error that holds for
all /z > 0/. In practice that means we can use the same approximation for all
/z > 0/, and be certain that no matter how large or small /z/ is, the truncation
error will /at worst/ be bounded by some finite value.
* The approximation has a form that is particularly amenable to analytic
manipulation: in particular, ratios of gamma or gamma-like functions
are especially easy to compute without resorting to logarithms.

It is the combination of these two properties that makes the approximation
attractive: Stirling's approximation is highly accurate for large /z/, and
has some of the same analytic properties as the Lanczos approximation, but
can't easily be used across the whole range of /z/.

As the simplest example, consider the ratio of two gamma functions: one could
compute the result via lgamma:

    exp(lgamma(a) - lgamma(b));

However, even if lgamma is uniformly accurate to 0.5ulp, the worst case
relative error in the above can easily be shown to be:

    Erel > a * log(a)/2 + b * log(b)/2

For small /a/ and /b/ that's not a problem, but to put the relationship another
way: ['each time a and b increase in magnitude by a factor of 10, at least one
decimal digit of precision will be lost.]
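The digit loss predicted above is easy to demonstrate. The sketch below (the helper name `gamma_ratio_naive` is purely illustrative) uses `std::lgamma` from `<cmath>` and exploits the identity [Gamma](b+1)/[Gamma](b) = b to obtain an exact reference value to measure the error against:

```cpp
#include <cmath>

// Naive ratio Gamma(a) / Gamma(b) computed via lgamma: the two
// logarithms are large for big arguments, so their difference -
// and hence the final result - loses precision as a and b grow.
double gamma_ratio_naive(double a, double b)
{
    return std::exp(std::lgamma(a) - std::lgamma(b));
}

// For a = b + 1 the exact answer is Gamma(b+1)/Gamma(b) = b, so the
// relative error of gamma_ratio_naive(b + 1, b) against b measures
// exactly the digit loss described above.
```

For b = 10000 the bound above evaluates to roughly (a[sdot]log(a) + b[sdot]log(b))\/2 [asymp] 10[super 5] units of machine epsilon, i.e. around five of the ~16 available decimal digits can be lost, even though both lgamma calls were individually accurate.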

In contrast, by analytically combining like power
terms in a ratio of Lanczos approximations, these errors can be virtually eliminated
for small /a/ and /b/, and kept under control for very large (or very small,
for that matter) /a/ and /b/. Of course, computing large powers is itself a
notoriously hard problem, but even so, analytic combinations of Lanczos
approximations can make the difference between obtaining a valid result and
obtaining simply garbage. Refer to the implementation notes for the __beta function for
an example of this method in practice. The incomplete
[link math_toolkit.sf_gamma.igamma gamma_p gamma] and
[link math_toolkit.sf_beta.ibeta_function beta] functions
use similar analytic combinations of power terms to combine gamma and beta
functions divided by large powers into single (simpler) expressions.

[h4 The Approximation]

The Lanczos approximation to the gamma function is given by:

[equation lanczos0]

where S[sub g](z) is an infinite sum that is convergent for all /z > 0/,
and /g/ is an arbitrary parameter that controls the "shape" of the
terms in the sum, which is given by:

[equation lanczos0a]

with the individual coefficients defined in closed form by:

[equation lanczos0b]

However, evaluation of the sum in that form can lead to numerical instability
in the computation of the ratios of rising and falling factorials (effectively
we're multiplying by a series of numbers very close to 1, so roundoff errors
can accumulate quite rapidly).

The Lanczos approximation is therefore often written in partial fraction form
with the leading constants absorbed by the coefficients in the sum:

[equation lanczos1]

where:

[equation lanczos2]

Again, the parameter /g/ is an arbitrarily chosen constant, and /N/ is an arbitrarily chosen
number of terms to evaluate in the "Lanczos sum" part.

[note
Some authors
choose to define the sum from k=1 to N, and hence end up with N+1 coefficients.
This happens to confuse both the following discussion and the code (since C++
deals with half-open array ranges, rather than the closed range of the sum).
This convention is consistent with __godfrey, but not __pugh, so take care
when referring to the literature in this field.]
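As a concrete sketch of the partial fraction form, the function below evaluates lgamma using the widely published /g/ = 5, /N/ = 6 coefficient set. These are /not/ the coefficients this library uses - the sketch is illustrative only, and is accurate to roughly ten significant digits:

```cpp
#include <cmath>

// Illustrative only: the Lanczos approximation in partial fraction
// form, using the widely published g = 5, N = 6 coefficient set.
// ser below is the "Lanczos sum": c0 + sum of c[k] / (z + k + 1).
double lanczos_lgamma(double z)
{
    static const double c[6] = {
         76.18009172947146,      -86.50532032941677,
         24.01409824083091,       -1.231739572450155,
          0.1208650973866179e-2,  -0.5395239384953e-5 };
    double tmp = z + 5.5;                 // z + g + 1/2, with g = 5
    tmp -= (z + 0.5) * std::log(tmp);
    double ser = 1.000000000190015;       // c0
    for (int k = 0; k < 6; ++k)
        ser += c[k] / (z + k + 1);        // the partial fraction terms
    // 2.5066... is sqrt(2 * pi):
    return -tmp + std::log(2.5066282746310005 * ser / z);
}
```

For example, `std::exp(lanczos_lgamma(5.0))` recovers 4! = 24 to about ten significant digits.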

[h4 Computing the Coefficients]

The coefficients C0..CN-1 need to be computed from /N/ and /g/
at high precision, and then stored as part of the program.
Calculation of the coefficients is performed via the method of __godfrey;
let the constants be contained in a column vector P, then:

    P = D B C F

where B is an NxN matrix:

[equation lanczos4]

D is an NxN matrix:

[equation lanczos3]

C is an NxN matrix:

[equation lanczos5]

and F is an N element column vector:

[equation lanczos6]

Note that since the matrices B, D and C contain only integer terms and depend
only on /N/, their product should be computed first, and then multiplied
by /F/ as the last step.
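The suggested order of evaluation can be sketched as follows. The helpers here are generic placeholders - the real entries of B, C, D and F are defined by the equations above, and in practice require arbitrary-precision rather than `long long`\/`double` arithmetic - but the sketch shows the key point: the all-integer product D[sdot]B[sdot]C is formed exactly first, and the inexact vector F enters only in the final step.

```cpp
#include <cstddef>
#include <vector>

using IMat = std::vector<std::vector<long long>>;

// Exact product of two NxN integer matrices: no rounding occurs here.
IMat matmul(const IMat& A, const IMat& B)
{
    std::size_t n = A.size();
    IMat R(n, std::vector<long long>(n, 0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                R[i][j] += A[i][k] * B[k][j];
    return R;
}

// Final step: apply the exact integer matrix to the column vector F.
// Only here does any floating point rounding enter the calculation.
std::vector<double> matvec(const IMat& M, const std::vector<double>& f)
{
    std::size_t n = M.size();
    std::vector<double> p(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            p[i] += static_cast<double>(M[i][j]) * f[j];
    return p;
}

// P = (D * B * C) * F -- integer product first, F last:
//   std::vector<double> P = matvec(matmul(matmul(D, B), C), F);
```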

[h4 Choosing the Right Parameters]

The trick is to choose
/N/ and /g/ to give the desired level of accuracy: choosing a small value for
/g/ leads to a strictly convergent series, but one which converges only slowly.
Choosing a larger value of /g/ causes the terms in the series to be large
and\/or divergent for about the first /g-1/ terms, and to then suddenly converge
with a "crunch".

__pugh has determined the optimal
value of /g/ for each /N/ in the range /1 <= N <= 60/: unfortunately, in practice choosing
these values leads to cancellation errors in the Lanczos sum, as the largest
term in the (alternating) series is approximately 1000 times larger than the result.
These optimal values appear not to be useful in practice unless the evaluation
can be done with a number of guard digits /and/ the coefficients are stored
at higher precision than that desired in the result. These values are best
reserved for, say, computing to float precision with double precision arithmetic.

[table Optimal choices for N and g when computing with guard digits (source: Pugh)
[[Significand Size] [N] [g] [Max Error]]
[[24] [6] [5.581] [9.51e-12]]
[[53] [13] [13.144565] [9.2213e-23]]
]

The alternative described by __godfrey is to perform an exhaustive
search of the /N/ and /g/ parameter space to determine the optimal combination for
a given /p/ digit floating-point type. Repeating this work found a good
approximation for double precision arithmetic (close to the one __godfrey found),
but failed to find really
good approximations for 80 or 128-bit long doubles. Further, it was observed
that the approximations obtained tended to be optimised for the small values
of z (1 < z < 200) used to test the implementation against the factorials.
Computing ratios of gamma functions with large arguments was observed to
suffer from error resulting from the truncation of the Lanczos series.

__pugh identified all the locations where the theoretical error of the
approximation is at a minimum, but unfortunately has published only the largest
of these minima. However, he makes the observation that the minima
coincide closely with the locations where the first neglected term (a[sub N]) in the
Lanczos series S[sub g](z) changes sign. These locations are quite easy to
find, albeit with considerable computer time. These "sweet spots" need
only be computed once, tabulated, and then searched when required for an
approximation that delivers the required precision for some fixed precision
type.

Unfortunately, following this path failed to find a really good approximation
for 128-bit long doubles, and those found for 64 and 80-bit reals required an
excessive number of terms. There are two competing issues here: high precision
requires a large value of /g/, but avoiding cancellation errors in the evaluation
requires a small /g/.

At this point note that the Lanczos sum can be converted into rational form
(a ratio of two polynomials, obtained from the partial-fraction form using
polynomial arithmetic),
and doing so changes the coefficients so that /they are all positive/. That
means that the sum in rational form can be evaluated without cancellation
error, albeit with double the number of coefficients for a given /N/. Repeating
the search of the "sweet spots", this time evaluating the Lanczos sum in
rational form, and testing only those "sweet spots" whose theoretical error
is less than the machine epsilon for the type being tested, yielded good
approximations for all the types tested. The optimal values found were quite
close to the best cases reported by __pugh (just slightly larger /N/ and slightly
smaller /g/ for a given precision than __pugh reports), and even though converting
to rational form doubles the number of stored coefficients, it should be
noted that half of them are integers (and therefore require less storage space)
and the approximations require a smaller /N/ than would otherwise be needed,
so fewer floating point operations may be required overall.
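Evaluating the sum in rational form amounts to two Horner evaluations and one division. The coefficients below are hypothetical placeholders (the real ones come from the sweet-spot search described above); the point is that with all coefficients positive there is nothing to cancel for /z > 0/:

```cpp
// Lanczos sum in rational form: num(z) / denom(z), each polynomial
// evaluated by Horner's rule.  All coefficients are positive, so no
// cancellation occurs for z > 0.  The values below are hypothetical
// placeholders; num[i] and denom[i] are the coefficients of z^i.
double lanczos_sum_rational(double z)
{
    static const double num[3]   = { 2.0, 3.0, 5.0 };
    static const double denom[3] = { 2.0, 3.0, 2.0 }; // integers in practice
    double n = 0.0, d = 0.0;
    for (int i = 2; i >= 0; --i)   // Horner: highest power first
    {
        n = n * z + num[i];
        d = d * z + denom[i];
    }
    return n / d;
}
```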

The following table shows the optimal values for /N/ and /g/ when computing
at fixed precision. These should be taken as work in progress: there are no
values for 106-bit significand machines (Darwin long doubles & NTL quad_float),
and further optimisation of the values of /g/ may be possible.
The errors given in the table
are estimates of the error due to truncation of the Lanczos infinite series
to /N/ terms. They are calculated from the sum of the first five neglected
terms - and are known to be rather pessimistic estimates - although it is noticeable
that the best combinations of /N/ and /g/ occur when the estimated truncation error
almost exactly matches the machine epsilon for the type in question.

[table Optimum values for N and g when computing at fixed precision
[[Significand Size][Platform/Compiler Used][N][g][Max Truncation Error]]
[[24][Win32, VC++ 7.1][6][1.428456135094165802001953125][9.41e-007]]
[[53][Win32, VC++ 7.1][13][6.024680040776729583740234375][3.23e-016]]
[[64][Suse Linux 9 IA64, gcc-3.3.3][17][12.2252227365970611572265625][2.34e-024]]
[[116][HP Tru64 Unix 5.1B \/ Alpha, Compaq C++ V7.1-006][24][20.3209821879863739013671875][4.75e-035]]
]

Finally, note that the Lanczos approximation can be written as follows
by removing a factor of exp(g) from the denominator, and then dividing
all the coefficients by exp(g):

[equation lanczos7]

This form is more convenient for calculating lgamma, but for the gamma
function the division by /e/ turns a possibly exact quantity into an
inexact value: this reduces accuracy in the common case that
the input is exact, and so isn't used for the gamma function.

[h4 References]

# [#godfrey]Paul Godfrey, [@http://my.fit.edu/~gabdo/gamma.txt "A note on the computation of the convergent
Lanczos complex Gamma approximation"].
# [#pugh]Glendon Ralph Pugh,
[@http://bh0.physics.ubc.ca/People/matt/Doc/ThesesOthers/Phd/pugh.pdf
"An Analysis of the Lanczos Gamma Approximation"],
PhD thesis, November 2004.
# Viktor T. Toth,
[@http://www.rskey.org/gamma.htm "Calculators and the Gamma Function"].
# Mathworld, [@http://mathworld.wolfram.com/LanczosApproximation.html
The Lanczos Approximation].

[endsect][/section:lanczos The Lanczos Approximation]

[/
  Copyright 2006 John Maddock and Paul A. Bristow.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]