]>
Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | [section:arcine_dist Arcsine Distribution] |
2 | ||
3 | [import ../../example/arcsine_example.cpp] [/ for arcsine snips below] | |
4 | ||
5 | ||
6 | ``#include <boost/math/distributions/arcsine.hpp>`` | |
7 | ||
8 | namespace boost{ namespace math{ | |
9 | ||
10 | template <class RealType = double, | |
11 | class ``__Policy`` = ``__policy_class`` > | |
12 | class arcsine_distribution; | |
13 | ||
14 | typedef arcsine_distribution<double> arcsine; // double precision standard arcsine distribution [0,1]. | |
15 | ||
16 | template <class RealType, class ``__Policy``> | |
17 | class arcsine_distribution | |
18 | { | |
19 | public: | |
20 | typedef RealType value_type; | |
21 | typedef Policy policy_type; | |
22 | ||
23 | // Constructor from two range parameters, x_min and x_max: | |
24 | arcsine_distribution(RealType x_min, RealType x_max); | |
25 | ||
26 | // Range Parameter accessors: | |
27 | RealType x_min() const; | |
28 | RealType x_max() const; | |
29 | }; | |
30 | }} // namespaces | |
31 | ||
32 | The class type `arcsine_distribution` represents an | |
33 | [@http://en.wikipedia.org/wiki/arcsine_distribution arcsine] | |
34 | [@http://en.wikipedia.org/wiki/Probability_distribution probability distribution function]. | |
35 | The arcsine distribution is named because its CDF uses the inverse sin[super -1] or arcsine. | |
36 | ||
37 | This is implemented as a generalized version with support from ['x_min] to ['x_max] | |
38 | providing the 'standard arcsine distribution' as default with ['x_min = 0] and ['x_max = 1]. | |
39 | (A few make other choices for 'standard'). | |
40 | ||
41 | The arcsine distribution is generalized to include any bounded support ['a <= x <= b] by | |
42 | [@http://reference.wolfram.com/language/ref/ArcSinDistribution.html Wolfram] and | |
43 | [@http://en.wikipedia.org/wiki/arcsine_distribution Wikipedia], | |
44 | but also using ['location] and ['scale] parameters by | |
45 | [@http://www.math.uah.edu/stat/index.html Virtual Laboratories in Probability and Statistics] | |
46 | [@http://www.math.uah.edu/stat/special/Arcsine.html Arcsine distribution]. | |
47 | The end-point version is simpler and more obvious, so we implement that. | |
48 | If desired, [@http://en.wikipedia.org/wiki/arcsine_distribution this] | |
49 | outlines how the __beta_distrib can be used to add a shape factor. | |
50 | ||
51 | The [@http://en.wikipedia.org/wiki/Probability_density_function probability density function PDF] | |
52 | for the [@http://en.wikipedia.org/wiki/arcsine_distribution arcsine distribution] | |
53 | defined on the interval \[['x_min, x_max]\] is given by: | |
54 | ||
55 | [figspace] [figspace] f(x; x_min, x_max) = 1 /([pi][sdot][sqrt]((x - x_min)[sdot](x_max - x_min)) | |
56 | ||
57 | For example, __WolframAlpha arcsine distribution, from input of | |
58 | ||
59 | N[PDF[arcsinedistribution[0, 1], 0.5], 50] | |
60 | ||
61 | computes the PDF value | |
62 | ||
63 | 0.63661977236758134307553505349005744813783858296183 | |
64 | ||
65 | The Probability Density Functions (PDF) of generalized arcsine distributions are symmetric U-shaped curves, | |
66 | centered on ['(x_max - x_min)/2], | |
67 | highest (infinite) near the two extrema, and quite flat over the central region. | |
68 | ||
69 | If random variate ['x] is ['x_min] or ['x_max], then the PDF is infinity. | |
70 | If random variate ['x] is ['x_min] then the CDF is zero. | |
71 | If random variate ['x] is ['x_max] then the CDF is unity. | |
72 | ||
73 | The 'Standard' (0, 1) arcsine distribution is shown in blue | |
74 | and some generalized examples with other ['x] ranges. | |
75 | ||
76 | [graph arcsine_pdf] | |
77 | ||
78 | The Cumulative Distribution Function CDF is defined as | |
79 | ||
80 | [figspace] [figspace] F(x) = 2[sdot]arcsin([sqrt]((x-x_min)/(x_max - x))) / [pi] | |
81 | ||
82 | [graph arcsine_cdf] | |
83 | ||
84 | [h5 Constructor] | |
85 | ||
86 | arcsine_distribution(RealType x_min, RealType x_max); | |
87 | ||
88 | constructs an arcsine distribution with range parameters ['x_min] and ['x_max]. | |
89 | ||
90 | Requires ['x_min < x_max], otherwise __domain_error is called. | |
91 | ||
92 | For example: | |
93 | ||
94 | arcsine_distribution<> myarcsine(-2, 4); | |
95 | ||
96 | constructs an arcsine distribution with ['x_min = -2] and ['x_max = 4]. | |
97 | ||
98 | Default values of ['x_min = 0] and ['x_max = 1] and a ` typedef arcsine_distribution<double> arcsine;` mean that | |
99 | ||
100 | arcsine as; | |
101 | ||
102 | constructs a 'Standard 01' arcsine distribution. | |
103 | ||
104 | [h5 Parameter Accessors] | |
105 | ||
106 | RealType x_min() const; | |
107 | RealType x_max() const; | |
108 | ||
109 | Return the parameter ['x_min] or ['x_max] from which this distribution was constructed. | |
110 | ||
111 | So, for example: | |
112 | ||
113 | [arcsine_snip_8] | |
114 | ||
115 | [h4 Non-member Accessor Functions] | |
116 | ||
117 | All the [link math_toolkit.dist_ref.nmp usual non-member accessor functions] | |
118 | that are generic to all distributions are supported: __usual_accessors. | |
119 | ||
120 | The formulae for calculating these are shown in the table below, and at | |
121 | [@http://mathworld.wolfram.com/arcsineDistribution.html Wolfram Mathworld]. | |
122 | ||
123 | [note There are always [*two] values for the [*mode], at ['x_min] and at ['x_max], default 0 and 1, | |
124 | so instead we raise the exception __domain_error. | |
125 | At these extrema, the PDFs are infinite, and the CDFs zero or unity.] | |
126 | ||
127 | [h4 Applications] | |
128 | ||
129 | The arcsine distribution is useful to describe | |
130 | [@http://en.wikipedia.org/wiki/Random_walk Random walks], (including drunken walks) | |
131 | [@http://en.wikipedia.org/wiki/Brownian_motion Brownian motion], | |
132 | [@http://en.wikipedia.org/wiki/Wiener_process Weiner processes], | |
133 | [@http://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trials], | |
134 | and their appplication to solve stock market and other | |
135 | [@http://en.wikipedia.org/wiki/Gambler%27s_ruin ruinous gambling games]. | |
136 | ||
137 | The random variate ['x] is constrained to ['x_min] and ['x_max], (for our 'standard' distribution, 0 and 1), | |
138 | and is usually some fraction. For any other ['x_min] and ['x_max] a fraction can be obtained from ['x] using | |
139 | ||
140 | [sixemspace] fraction = (x - x_min) / (x_max - x_min) | |
141 | ||
142 | The simplest example is tossing heads and tails with a fair coin and modelling the risk of losing, or winning. | |
143 | Walkers (molecules, drunks...) moving left or right of a centre line are another common example. | |
144 | ||
145 | The random variate ['x] is the fraction of time spent on the 'winning' side. | |
146 | If half the time is spent on the 'winning' side (and so the other half on the 'losing' side) then ['x = 1/2]. | |
147 | ||
148 | For large numbers of tosses, this is modelled by the (standard \[0,1\]) arcsine distribution, | |
149 | and the PDF can be calculated thus: | |
150 | ||
151 | [arcsine_snip_2] | |
152 | ||
153 | From the plot of PDF, it is clear that ['x] = [frac12] is the [*minimum] of the curve, | |
154 | so this is the [*least likely] scenario. | |
155 | (This is highly counter-intuitive, considering that fair tosses must [*eventually] become equal. | |
156 | It turns out that ['eventually] is not just very long, but [*infinite]!). | |
157 | ||
158 | The [*most likely] scenarios are towards the extrema where ['x] = 0 or ['x] = 1. | |
159 | ||
160 | If fraction of time on the left is a [frac14], | |
161 | it is only slightly more likely because the curve is quite flat bottomed. | |
162 | ||
163 | [arcsine_snip_3] | |
164 | ||
165 | If we consider fair coin-tossing games being played for 100 days | |
166 | (hypothetically continuously to be 'at-limit') | |
167 | the person winning after day 5 will not change in fraction 0.144 of the cases. | |
168 | ||
169 | We can easily compute this setting ['x] = 5./100 = 0.05 | |
170 | ||
171 | [arcsine_snip_4] | |
172 | ||
173 | Similarly, we can compute from a fraction of 0.05 /2 = 0.025 | |
174 | (halved because we are considering both winners and losers) | |
175 | corresponding to 1 - 0.025 or 97.5% of the gamblers, (walkers, particles...) on the [*same side] of the origin | |
176 | ||
177 | [arcsine_snip_5] | |
178 | ||
179 | (use of the complement gives a bit more clarity, | |
180 | and avoids potential loss of accuracy when ['x] is close to unity, see __why_complements). | |
181 | ||
182 | [arcsine_snip_6] | |
183 | ||
184 | or we can reverse the calculation by assuming a fraction of time on one side, say fraction 0.2, | |
185 | ||
186 | [arcsine_snip_7] | |
187 | ||
188 | [*Summary]: Every time we toss, the odds are equal, | |
189 | so on average we have the same change of winning and losing. | |
190 | ||
191 | But this is [*not true] for an an individual game where one will be [*mostly in a bad or good patch]. | |
192 | ||
193 | This is quite counter-intuitive to most people, but the mathematics is clear, | |
194 | and gamblers continue to provide proof. | |
195 | ||
196 | [*Moral]: if you in a losing patch, leave the game. | |
197 | (Because the odds to recover to a good patch are poor). | |
198 | ||
199 | [*Corollary]: Quit while you are ahead? | |
200 | ||
201 | A working example is at [@../../example/arcsine_example.cpp arcsine_example.cpp] | |
202 | including sample output . | |
203 | ||
204 | [h4 Related distributions] | |
205 | ||
206 | The arcsine distribution with ['x_min = 0] and ['x_max = 1] is special case of the | |
207 | __beta_distrib with [alpha] = 1/2 and [beta] = 1/2. | |
208 | ||
209 | [h4 Accuracy] | |
210 | ||
211 | This distribution is implemented using sqrt, sine, cos and arc sine and cos trigonometric functions | |
212 | which are normally accurate to a few __epsilon. | |
213 | But all values suffer from [@http://en.wikipedia.org/wiki/Loss_of_significance loss of significance or cancellation error] | |
214 | for values of ['x] close to ['x_max]. | |
215 | For example, for a standard [0, 1] arcsine distribution ['as], the pdf is symmetric about random variate ['x = 0.5] | |
216 | so that one would expect `pdf(as, 0.01) == pdf(as, 0.99)`. But as ['x] nears unity, there is increasing | |
217 | [@http://en.wikipedia.org/wiki/Loss_of_significance loss of significance]. | |
218 | To counteract this, the complement versions of CDF and quantile | |
219 | are implemented with alternative expressions using ['cos[super -1]] instead of ['sin[super -1]]. | |
220 | Users should see __why_complements for guidance on when to avoid loss of accuracy by using complements. | |
221 | ||
222 | [h4 Testing] | |
223 | The results were tested against a few accurate spot values computed by __WolframAlpha, for example: | |
224 | ||
225 | N[PDF[arcsinedistribution[0, 1], 0.5], 50] | |
226 | 0.63661977236758134307553505349005744813783858296183 | |
227 | ||
228 | [h4 Implementation] | |
229 | ||
230 | In the following table ['a] and ['b] are the parameters ['x_min][space] and ['x_max], | |
231 | ['x] is the random variable, ['p] is the probability and its complement ['q = 1-p]. | |
232 | ||
233 | [table | |
234 | [[Function][Implementation Notes]] | |
235 | [[support] [x [isin] \[a, b\], default x [isin] \[0, 1\] ]] | |
236 | [[pdf] [f(x; a, b) = 1/([pi][sdot][sqrt](x - a)[sdot](b - x))]] | |
237 | [[cdf] [F(x) = 2/[pi][sdot]sin[super-1]([sqrt](x - a) / (b - a) ) ]] | |
238 | [[cdf of complement] [2/([pi][sdot]cos[super-1]([sqrt](x - a) / (b - a)))]] | |
239 | [[quantile] [-a[sdot]sin[super 2]([frac12][pi][sdot]p) + a + b[sdot]sin[super 2]([frac12][pi][sdot]p)]] | |
240 | [[quantile from the complement] [-a[sdot]cos[super 2]([frac12][pi][sdot]p) + a + b[sdot]cos[super 2]([frac12][pi][sdot]q)]] | |
241 | [[mean] [[frac12](a+b)]] | |
242 | [[median] [[frac12](a+b)]] | |
243 | [[mode] [ x [isin] \[a, b\], so raises domain_error (returning NaN).]] | |
244 | [[variance] [(b - a)[super 2] / 8]] | |
245 | [[skewness] [0]] | |
246 | [[kurtosis excess] [ -3/2 ]] | |
247 | [[kurtosis] [kurtosis_excess + 3]] | |
248 | ] | |
249 | ||
250 | The quantile was calculated using an expression obtained by using __WolframAlpha | |
251 | to invert the formula for the CDF thus | |
252 | ||
253 | solve [p - 2/pi sin^-1(sqrt((x-a)/(b-a))) = 0, x] | |
254 | ||
255 | which was interpreted as | |
256 | ||
257 | Solve[p - (2 ArcSin[Sqrt[(-a + x)/(-a + b)]])/Pi == 0, x, MaxExtraConditions -> Automatic] | |
258 | ||
259 | and produced the resulting expression | |
260 | ||
261 | x = -a sin^2((pi p)/2)+a+b sin^2((pi p)/2) | |
262 | ||
263 | Thanks to Wolfram for providing this facility. | |
264 | ||
265 | [h4 References] | |
266 | ||
267 | * [@http://en.wikipedia.org/wiki/arcsine_distribution Wikipedia arcsine distribution] | |
268 | * [@http://en.wikipedia.org/wiki/Beta_distribution Wikipedia Beta distribution] | |
269 | * [@http://mathworld.wolfram.com/BetaDistribution.html Wolfram MathWorld] | |
270 | * [@http://www.wolframalpha.com/ Wolfram Alpha] | |
271 | ||
272 | [h4 Sources] | |
273 | ||
274 | *[@http://estebanmoro.org/2009/04/the-probability-of-going-through-a-bad-patch The probability of going through a bad patch] Esteban Moro's Blog. | |
275 | *[@http://www.gotohaggstrom.com/What%20do%20schmucks%20and%20the%20arc%20sine%20law%20have%20in%20common.pdf What soschumcks and the arc sine have in common] Peter Haggstrom. | |
276 | *[@http://www.math.uah.edu/stat/special/Arcsine.html arcsine distribution]. | |
277 | *[@http://reference.wolfram.com/language/ref/ArcSinDistribution.html Wolfram reference arcsine examples]. | |
278 | *[@http://www.math.harvard.edu/library/sternberg/slides/1180908.pdf Shlomo Sternberg slides]. | |
279 | ||
280 | ||
281 | [endsect] [/section:arcsine_dist arcsine] | |
282 | ||
283 | [/ arcsine.qbk | |
284 | Copyright 2014 John Maddock and Paul A. Bristow. | |
285 | Distributed under the Boost Software License, Version 1.0. | |
286 | (See accompanying file LICENSE_1_0.txt or copy at | |
287 | http://www.boost.org/LICENSE_1_0.txt). | |
288 | ] |