[ceph.git] / ceph / src / boost / libs / math / doc / distributions / nc_chi_squared.qbk

[section:nc_chi_squared_dist Noncentral Chi-Squared Distribution]

``#include <boost/math/distributions/non_central_chi_squared.hpp>``

   namespace boost{ namespace math{

   template <class RealType = double,
             class ``__Policy``   = ``__policy_class`` >
   class non_central_chi_squared_distribution;

   typedef non_central_chi_squared_distribution<> non_central_chi_squared;

   template <class RealType, class ``__Policy``>
   class non_central_chi_squared_distribution
   {
   public:
      typedef RealType  value_type;
      typedef Policy    policy_type;

      // Constructor:
      non_central_chi_squared_distribution(RealType v, RealType lambda);

      // Accessor to degrees of freedom parameter v:
      RealType degrees_of_freedom()const;

      // Accessor to non centrality parameter lambda:
      RealType non_centrality()const;

      // Parameter finders:
      static RealType find_degrees_of_freedom(RealType lambda, RealType x, RealType p);
      template <class A, class B, class C>
      static RealType find_degrees_of_freedom(const complemented3_type<A,B,C>& c);

      static RealType find_non_centrality(RealType v, RealType x, RealType p);
      template <class A, class B, class C>
      static RealType find_non_centrality(const complemented3_type<A,B,C>& c);
   };

   }} // namespaces

The noncentral chi-squared distribution is a generalization of the
__chi_squared_distrib. If X[sub i] are [nu] independent, normally
distributed random variables with means [mu][sub i] and variances
[sigma][sub i][super 2], then the random variable

[equation nc_chi_squ_ref1]

is distributed according to the noncentral chi-squared distribution.

The noncentral chi-squared distribution has two parameters:
[nu] which specifies the number of degrees of freedom
(i.e. the number of X[sub i]), and [lambda] which is related to the
mean of the random variables X[sub i] by:

[equation nc_chi_squ_ref2]

(Note that some references define [lambda] as one half of the above sum).

This leads to a PDF of:

[equation nc_chi_squ_ref3]

where ['f(x;k)] is the central chi-squared distribution PDF, and
['I[sub v](x)] is a modified Bessel function of the first kind.

The following graph illustrates how the distribution changes
for different values of [lambda]:

[graph nccs_pdf]

[h4 Member Functions]

      non_central_chi_squared_distribution(RealType v, RealType lambda);

Constructs a Chi-Squared distribution with /v/ degrees of freedom
and non-centrality parameter /lambda/.

Requires v > 0 and lambda >= 0, otherwise calls __domain_error.

      RealType degrees_of_freedom()const;

Returns the parameter /v/ from which this object was constructed.

      RealType non_centrality()const;

Returns the parameter /lambda/ from which this object was constructed.


   static RealType find_degrees_of_freedom(RealType lambda, RealType x, RealType p);

This function returns the number of degrees of freedom /v/ such that:
`cdf(non_central_chi_squared<RealType, Policy>(v, lambda), x) == p`

   template <class A, class B, class C>
   static RealType find_degrees_of_freedom(const complemented3_type<A,B,C>& c);

When called with argument `boost::math::complement(lambda, x, q)`
this function returns the number of degrees of freedom /v/ such that:

`cdf(complement(non_central_chi_squared<RealType, Policy>(v, lambda), x)) == q`.

   static RealType find_non_centrality(RealType v, RealType x, RealType p);

This function returns the non centrality parameter /lambda/ such that:

`cdf(non_central_chi_squared<RealType, Policy>(v, lambda), x) == p`

   template <class A, class B, class C>
   static RealType find_non_centrality(const complemented3_type<A,B,C>& c);

When called with argument `boost::math::complement(v, x, q)`
this function returns the non centrality parameter /lambda/ such that:

`cdf(complement(non_central_chi_squared<RealType, Policy>(v, lambda), x)) == q`.

[h4 Non-member Accessors]

All the [link math_toolkit.dist_ref.nmp usual non-member accessor functions]
that are generic to all distributions are supported: __usual_accessors.

The domain of the random variable is \[0, +[infin]\].

[h4 Examples]

There is a
[link math_toolkit.stat_tut.weg.nccs_eg worked example]
for the noncentral chi-squared distribution.

[h4 Accuracy]

The following table shows the peak errors
(in units of [@http://en.wikipedia.org/wiki/Machine_epsilon epsilon])
found on various platforms with various floating point types.
The failures in the comparison to the [@http://www.r-project.org/ R Math library],
seem to be mostly in the corner cases when the probablity would be very small.
Unless otherwise specified any floating-point type that is narrower
than the one shown will have __zero_error.

[table_non_central_chi_squared_CDF]

[table_non_central_chi_squared_CDF_complement]

Error rates for the quantile
functions are broadly similar.  Special mention should go to
the `mode` function: there is no closed form for this function,
so it is evaluated numerically by finding the maxima of the PDF:
in principal this can not produce an accuracy greater than the
square root of the machine epsilon.

[h4 Tests]

There are two sets of test data used to verify this implementation:
firstly we can compare with published data, for example with
Table 6 of "Self-Validating Computations of Probabilities for
Selected Central and Noncentral Univariate Probability Functions",
Morgan C. Wang and William J. Kennedy,
Journal of the American Statistical Association,
Vol. 89, No. 427. (Sep., 1994), pp. 878-887.
Secondly, we have tables of test data, computed with this
implementation and using interval arithmetic - this data should
be accurate to at least 50 decimal digits - and is the used for
our accuracy tests.

[h4 Implementation]

The CDF and its complement are evaluated as follows:

First we determine which of the two values (the CDF or its
complement) is likely to be the smaller: for this we can use the
relation due to Temme (see "Asymptotic and Numerical Aspects of the
Noncentral Chi-Square Distribution", N. M. Temme, Computers Math. Applic.
Vol 25, No. 5, 55-63, 1993) that:

F([nu],[lambda];[nu]+[lambda]) [asymp] 0.5

and so compute the CDF when the random variable is less than
[nu]+[lambda], and its complement when the random variable is
greater than [nu]+[lambda].  If necessary the computed result
is then subtracted from 1 to give the desired result (the CDF or its
complement).

For small values of the non centrality parameter, the CDF is computed
using the method of Ding (see "Algorithm AS 275: Computing the Non-Central
#2 Distribution Function", Cherng G. Ding, Applied Statistics, Vol. 41,
No. 2. (1992), pp. 478-482).  This uses the following series representation:

[equation nc_chi_squ_ref4]

which requires just one call to __gamma_p_derivative with the subsequent
terms being computed by recursion as shown above.

For larger values of the non-centrality parameter, Ding's method can take
an unreasonable number of terms before convergence is achieved.  Furthermore,
the largest term is not the first term, so in extreme cases the first term may
be zero, leading to a zero result, even though the true value may be non-zero.

Therefore, when the non-centrality parameter is greater than 200, the method due
to Krishnamoorthy (see "Computing discrete mixtures of continuous distributions:
noncentral chisquare, noncentral t and the distribution of the
square of the sample multiple correlation coefficient",
Denise Benton and K. Krishnamoorthy, Computational Statistics &
Data Analysis, 43, (2003), 249-267) is used.

This method uses the well known sum:

[equation nc_chi_squ_ref5]

Where P[sub a](x) is the incomplete gamma function.

The method starts at the [lambda]th term, which is where the Poisson weighting
function achieves its maximum value, although this is not necessarily
the largest overall term.  Subsequent terms are calculated via the normal
recurrence relations for the incomplete gamma function, and iteration proceeds
both forwards and backwards until sufficient precision has been achieved.  It
should be noted that recurrence in the forwards direction of P[sub a](x) is
numerically unstable.  However, since we always start /after/ the largest
term in the series, numeric instability is introduced more slowly than the
series converges.

Computation of the complement of the CDF uses an extension of Krishnamoorthy's
method, given that:

[equation nc_chi_squ_ref6]

we can again start at the [lambda]'th term and proceed in both directions from
there until the required precision is achieved.  This time it is backwards
recursion on the incomplete gamma function Q[sub a](x) which is unstable.
However, as long as we start well /before/ the largest term, this is not an
issue in practice.

The PDF is computed directly using the relation:

[equation nc_chi_squ_ref3]

Where ['f(x; v)] is the PDF of the central __chi_squared_distrib and
['I[sub v](x)] is a modified Bessel function, see __cyl_bessel_i.
For small values of the
non-centrality parameter the relation in terms of __cyl_bessel_i
is used.  However, this method fails for large values of the
non-centrality parameter, so in that case the infinite sum is
evaluated using the method of Benton and Krishnamoorthy, and
the usual recurrence relations for successive terms.

The quantile functions are computed by numeric inversion of the CDF.
An improve starting quess is from
Thomas Luu,
[@http://discovery.ucl.ac.uk/1482128/, Fast and accurate parallel computation of quantile functions for random number generation, Doctorial Thesis, 2016].

There is no [@http://en.wikipedia.org/wiki/Closed_form closed form]
for the mode of the noncentral chi-squared
distribution: it is computed numerically by finding the maximum
of the PDF.  Likewise, the median is computed numerically via
the quantile.

The remaining non-member functions use the following formulas:

[equation nc_chi_squ_ref7]

Some analytic properties of noncentral distributions
(particularly unimodality, and monotonicity of their modes)
are surveyed and summarized by:

Andrea van Aubel & Wolfgang Gawronski, Applied Mathematics and Computation, 141 (2003) 3-12.

[endsect] [/section:nc_chi_squared_dist]

[/ nc_chi_squared.qbk
  Copyright 2008 John Maddock and Paul A. Bristow.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]
Commit	Line	Data
7c673cae FG	1	[section:nc_chi_squared_dist Noncentral Chi-Squared Distribution]
	2
	3	``#include <boost/math/distributions/non_central_chi_squared.hpp>``
	4
	5	namespace boost{ namespace math{
	6
	7	template <class RealType = double,
	8	class ``__Policy`` = ``__policy_class`` >
	9	class non_central_chi_squared_distribution;
	10
	11	typedef non_central_chi_squared_distribution<> non_central_chi_squared;
	12
	13	template <class RealType, class ``__Policy``>
	14	class non_central_chi_squared_distribution
	15	{
	16	public:
	17	typedef RealType value_type;
	18	typedef Policy policy_type;
	19
	20	// Constructor:
	21	non_central_chi_squared_distribution(RealType v, RealType lambda);
	22
	23	// Accessor to degrees of freedom parameter v:
	24	RealType degrees_of_freedom()const;
	25
	26	// Accessor to non centrality parameter lambda:
	27	RealType non_centrality()const;
	28
	29	// Parameter finders:
	30	static RealType find_degrees_of_freedom(RealType lambda, RealType x, RealType p);
	31	template <class A, class B, class C>
	32	static RealType find_degrees_of_freedom(const complemented3_type<A,B,C>& c);
	33
	34	static RealType find_non_centrality(RealType v, RealType x, RealType p);
	35	template <class A, class B, class C>
	36	static RealType find_non_centrality(const complemented3_type<A,B,C>& c);
	37	};
	38
	39	}} // namespaces
	40
	41	The noncentral chi-squared distribution is a generalization of the
	42	__chi_squared_distrib. If X[sub i] are [nu] independent, normally
	43	distributed random variables with means [mu][sub i] and variances
	44	[sigma][sub i][super 2], then the random variable
	45
	46	[equation nc_chi_squ_ref1]
	47
	48	is distributed according to the noncentral chi-squared distribution.
	49
	50	The noncentral chi-squared distribution has two parameters:
	51	[nu] which specifies the number of degrees of freedom
	52	(i.e. the number of X[sub i]), and [lambda] which is related to the
	53	mean of the random variables X[sub i] by:
	54
	55	[equation nc_chi_squ_ref2]
	56
	57	(Note that some references define [lambda] as one half of the above sum).
	58
	59	This leads to a PDF of:
	60
	61	[equation nc_chi_squ_ref3]
	62
	63	where ['f(x;k)] is the central chi-squared distribution PDF, and
	64	['I[sub v](x)] is a modified Bessel function of the first kind.
65
66	The following graph illustrates how the distribution changes
67	for different values of [lambda]:
68
69	[graph nccs_pdf]
70
71	[h4 Member Functions]
72
73	non_central_chi_squared_distribution(RealType v, RealType lambda);
74
75	Constructs a Chi-Squared distribution with /v/ degrees of freedom
76	and non-centrality parameter /lambda/.
77
78	Requires v > 0 and lambda >= 0, otherwise calls __domain_error.
79
80	RealType degrees_of_freedom()const;
81
82	Returns the parameter /v/ from which this object was constructed.
83
84	RealType non_centrality()const;
85
86	Returns the parameter /lambda/ from which this object was constructed.
87
88
89	static RealType find_degrees_of_freedom(RealType lambda, RealType x, RealType p);
90
91	This function returns the number of degrees of freedom /v/ such that:
92	`cdf(non_central_chi_squared<RealType, Policy>(v, lambda), x) == p`
93
94	template <class A, class B, class C>
95	static RealType find_degrees_of_freedom(const complemented3_type<A,B,C>& c);
96
97	When called with argument `boost::math::complement(lambda, x, q)`
98	this function returns the number of degrees of freedom /v/ such that:
99
100	`cdf(complement(non_central_chi_squared<RealType, Policy>(v, lambda), x)) == q`.
101
102	static RealType find_non_centrality(RealType v, RealType x, RealType p);
103
104	This function returns the non centrality parameter /lambda/ such that:
105
106	`cdf(non_central_chi_squared<RealType, Policy>(v, lambda), x) == p`
107
108	template <class A, class B, class C>
109	static RealType find_non_centrality(const complemented3_type<A,B,C>& c);
110
111	When called with argument `boost::math::complement(v, x, q)`
112	this function returns the non centrality parameter /lambda/ such that:
113
114	`cdf(complement(non_central_chi_squared<RealType, Policy>(v, lambda), x)) == q`.
115
116	[h4 Non-member Accessors]
117
118	All the [link math_toolkit.dist_ref.nmp usual non-member accessor functions]
119	that are generic to all distributions are supported: __usual_accessors.
120
121	The domain of the random variable is \[0, +[infin]\].
122
123	[h4 Examples]
124
125	There is a
126	[link math_toolkit.stat_tut.weg.nccs_eg worked example]
127	for the noncentral chi-squared distribution.
128
129	[h4 Accuracy]
130
131	The following table shows the peak errors
132	(in units of [@http://en.wikipedia.org/wiki/Machine_epsilon epsilon])
133	found on various platforms with various floating point types.
134	The failures in the comparison to the [@http://www.r-project.org/ R Math library],
135	seem to be mostly in the corner cases when the probablity would be very small.
136	Unless otherwise specified any floating-point type that is narrower
137	than the one shown will have __zero_error.
138
139	[table_non_central_chi_squared_CDF]
140
141	[table_non_central_chi_squared_CDF_complement]
142
143	Error rates for the quantile
144	functions are broadly similar. Special mention should go to
145	the `mode` function: there is no closed form for this function,
146	so it is evaluated numerically by finding the maxima of the PDF:
147	in principal this can not produce an accuracy greater than the
148	square root of the machine epsilon.
149
150	[h4 Tests]
151
152	There are two sets of test data used to verify this implementation:
153	firstly we can compare with published data, for example with
154	Table 6 of "Self-Validating Computations of Probabilities for
155	Selected Central and Noncentral Univariate Probability Functions",
156	Morgan C. Wang and William J. Kennedy,
157	Journal of the American Statistical Association,
158	Vol. 89, No. 427. (Sep., 1994), pp. 878-887.
159	Secondly, we have tables of test data, computed with this
160	implementation and using interval arithmetic - this data should
161	be accurate to at least 50 decimal digits - and is the used for
162	our accuracy tests.
163
164	[h4 Implementation]
165
166	The CDF and its complement are evaluated as follows:
167
168	First we determine which of the two values (the CDF or its
169	complement) is likely to be the smaller: for this we can use the
170	relation due to Temme (see "Asymptotic and Numerical Aspects of the
171	Noncentral Chi-Square Distribution", N. M. Temme, Computers Math. Applic.
172	Vol 25, No. 5, 55-63, 1993) that:
173
174	F([nu],[lambda];[nu]+[lambda]) [asymp] 0.5
175
176	and so compute the CDF when the random variable is less than
177	[nu]+[lambda], and its complement when the random variable is
178	greater than [nu]+[lambda]. If necessary the computed result
179	is then subtracted from 1 to give the desired result (the CDF or its
180	complement).
181
182	For small values of the non centrality parameter, the CDF is computed
183	using the method of Ding (see "Algorithm AS 275: Computing the Non-Central
184	#2 Distribution Function", Cherng G. Ding, Applied Statistics, Vol. 41,
185	No. 2. (1992), pp. 478-482). This uses the following series representation:
186
187	[equation nc_chi_squ_ref4]
188
189	which requires just one call to __gamma_p_derivative with the subsequent
190	terms being computed by recursion as shown above.
191
192	For larger values of the non-centrality parameter, Ding's method can take
193	an unreasonable number of terms before convergence is achieved. Furthermore,
194	the largest term is not the first term, so in extreme cases the first term may
195	be zero, leading to a zero result, even though the true value may be non-zero.
196
197	Therefore, when the non-centrality parameter is greater than 200, the method due
198	to Krishnamoorthy (see "Computing discrete mixtures of continuous distributions:
199	noncentral chisquare, noncentral t and the distribution of the
200	square of the sample multiple correlation coefficient",
201	Denise Benton and K. Krishnamoorthy, Computational Statistics &
202	Data Analysis, 43, (2003), 249-267) is used.
203
204	This method uses the well known sum:
205
206	[equation nc_chi_squ_ref5]
207
208	Where P[sub a](x) is the incomplete gamma function.
209
210	The method starts at the [lambda]th term, which is where the Poisson weighting
211	function achieves its maximum value, although this is not necessarily
212	the largest overall term. Subsequent terms are calculated via the normal
213	recurrence relations for the incomplete gamma function, and iteration proceeds
214	both forwards and backwards until sufficient precision has been achieved. It
215	should be noted that recurrence in the forwards direction of P[sub a](x) is
216	numerically unstable. However, since we always start /after/ the largest
217	term in the series, numeric instability is introduced more slowly than the
218	series converges.
219
220	Computation of the complement of the CDF uses an extension of Krishnamoorthy's
221	method, given that:
222
223	[equation nc_chi_squ_ref6]
224
225	we can again start at the [lambda]'th term and proceed in both directions from
226	there until the required precision is achieved. This time it is backwards
227	recursion on the incomplete gamma function Q[sub a](x) which is unstable.
228	However, as long as we start well /before/ the largest term, this is not an
229	issue in practice.
230
231	The PDF is computed directly using the relation:
232
233	[equation nc_chi_squ_ref3]
234
235	Where ['f(x; v)] is the PDF of the central __chi_squared_distrib and
236	['I[sub v](x)] is a modified Bessel function, see __cyl_bessel_i.
237	For small values of the
238	non-centrality parameter the relation in terms of __cyl_bessel_i
239	is used. However, this method fails for large values of the
240	non-centrality parameter, so in that case the infinite sum is
241	evaluated using the method of Benton and Krishnamoorthy, and
242	the usual recurrence relations for successive terms.
243
244	The quantile functions are computed by numeric inversion of the CDF.
245	An improve starting quess is from
246	Thomas Luu,
247	[@http://discovery.ucl.ac.uk/1482128/, Fast and accurate parallel computation of quantile functions for random number generation, Doctorial Thesis, 2016].
248
249	There is no [@http://en.wikipedia.org/wiki/Closed_form closed form]
250	for the mode of the noncentral chi-squared
251	distribution: it is computed numerically by finding the maximum
252	of the PDF. Likewise, the median is computed numerically via
253	the quantile.
254
255	The remaining non-member functions use the following formulas:
256
257	[equation nc_chi_squ_ref7]
258
259	Some analytic properties of noncentral distributions
260	(particularly unimodality, and monotonicity of their modes)
261	are surveyed and summarized by:
262
263	Andrea van Aubel & Wolfgang Gawronski, Applied Mathematics and Computation, 141 (2003) 3-12.
264
265	[endsect] [/section:nc_chi_squared_dist]
266
267	[/ nc_chi_squared.qbk
268	Copyright 2008 John Maddock and Paul A. Bristow.
269	Distributed under the Boost Software License, Version 1.0.
270	(See accompanying file LICENSE_1_0.txt or copy at
271	http://www.boost.org/LICENSE_1_0.txt).
272	]
273