]> git.proxmox.com Git - ceph.git/blame - ceph/src/boost/libs/math/doc/distributions/binomial_example.qbk
bump version to 12.2.2-pve1
[ceph.git] / ceph / src / boost / libs / math / doc / distributions / binomial_example.qbk
CommitLineData
7c673cae
FG
1[section:binom_eg Binomial Distribution Examples]
2
3See also the reference documentation for the __binomial_distrib.
4
5[section:binomial_coinflip_example Binomial Coin-Flipping Example]
6
7[import ../../example/binomial_coinflip_example.cpp]
8[binomial_coinflip_example1]
9
10See [@../../example/binomial_coinflip_example.cpp binomial_coinflip_example.cpp]
11for full source code, the program output looks like this:
12
13[binomial_coinflip_example_output]
14
15[endsect] [/section:binomial_coinflip_example Binomial coinflip example]
16
17[section:binomial_quiz_example Binomial Quiz Example]
18
19[import ../../example/binomial_quiz_example.cpp]
20[binomial_quiz_example1]
21[binomial_quiz_example2]
22[discrete_quantile_real]
23
24See [@../../example/binomial_quiz_example.cpp binomial_quiz_example.cpp]
25for full source code and output.
26
27[endsect] [/section:binomial_coinflip_quiz Binomial Coin-Flipping example]
28
29[section:binom_conf Calculating Confidence Limits on the Frequency of Occurrence for a Binomial Distribution]
30
31Imagine you have a process that follows a binomial distribution: for each
32trial conducted, an event either occurs or does it does not, referred
33to as "successes" and "failures". If, by experiment, you want to measure the
34frequency with which successes occur, the best estimate is given simply
35by /k/ \/ /N/, for /k/ successes out of /N/ trials. However our confidence in that
36estimate will be shaped by how many trials were conducted, and how many successes
37were observed. The static member functions
38`binomial_distribution<>::find_lower_bound_on_p` and
39`binomial_distribution<>::find_upper_bound_on_p` allow you to calculate
40the confidence intervals for your estimate of the occurrence frequency.
41
42The sample program [@../../example/binomial_confidence_limits.cpp
43binomial_confidence_limits.cpp] illustrates their use. It begins by defining
44a procedure that will print a table of confidence limits for various degrees
45of certainty:
46
47 #include <iostream>
48 #include <iomanip>
49 #include <boost/math/distributions/binomial.hpp>
50
51 void confidence_limits_on_frequency(unsigned trials, unsigned successes)
52 {
53 //
54 // trials = Total number of trials.
55 // successes = Total number of observed successes.
56 //
57 // Calculate confidence limits for an observed
58 // frequency of occurrence that follows a binomial
59 // distribution.
60 //
61 using namespace std;
62 using namespace boost::math;
63
64 // Print out general info:
65 cout <<
66 "___________________________________________\n"
67 "2-Sided Confidence Limits For Success Ratio\n"
68 "___________________________________________\n\n";
69 cout << setprecision(7);
70 cout << setw(40) << left << "Number of Observations" << "= " << trials << "\n";
71 cout << setw(40) << left << "Number of successes" << "= " << successes << "\n";
72 cout << setw(40) << left << "Sample frequency of occurrence" << "= " << double(successes) / trials << "\n";
73
74The procedure now defines a table of significance levels: these are the
75probabilities that the true occurrence frequency lies outside the calculated
76interval:
77
78 double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };
79
80Some pretty printing of the table header follows:
81
82 cout << "\n\n"
83 "_______________________________________________________________________\n"
84 "Confidence Lower CP Upper CP Lower JP Upper JP\n"
85 " Value (%) Limit Limit Limit Limit\n"
86 "_______________________________________________________________________\n";
87
88
89And now for the important part - the intervals themselves - for each
90value of /alpha/, we call `find_lower_bound_on_p` and
91`find_lower_upper_on_p` to obtain lower and upper bounds
92respectively. Note that since we are calculating a two-sided interval,
93we must divide the value of alpha in two.
94
95Please note that calculating two separate /single sided bounds/, each with risk
96level [alpha][space]is not the same thing as calculating a two sided interval.
97Had we calculate two single-sided intervals each with a risk
98that the true value is outside the interval of [alpha], then:
99
100* The risk that it is less than the lower bound is [alpha].
101
102and
103
104* The risk that it is greater than the upper bound is also [alpha].
105
106So the risk it is outside *upper or lower bound*, is *twice* alpha, and the
107probability that it is inside the bounds is therefore not nearly as high as
108one might have thought. This is why [alpha]/2 must be used in
109the calculations below.
110
111In contrast, had we been calculating a
112single-sided interval, for example: ['"Calculate a lower bound so that we are P%
113sure that the true occurrence frequency is greater than some value"]
114then we would *not* have divided by two.
115
116Finally note that `binomial_distribution` provides a choice of two
117methods for the calculation, we print out the results from both
118methods in this example:
119
120 for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
121 {
122 // Confidence value:
123 cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
124 // Calculate Clopper Pearson bounds:
125 double l = binomial_distribution<>::find_lower_bound_on_p(
126 trials, successes, alpha[i]/2);
127 double u = binomial_distribution<>::find_upper_bound_on_p(
128 trials, successes, alpha[i]/2);
129 // Print Clopper Pearson Limits:
130 cout << fixed << setprecision(5) << setw(15) << right << l;
131 cout << fixed << setprecision(5) << setw(15) << right << u;
132 // Calculate Jeffreys Prior Bounds:
133 l = binomial_distribution<>::find_lower_bound_on_p(
134 trials, successes, alpha[i]/2,
135 binomial_distribution<>::jeffreys_prior_interval);
136 u = binomial_distribution<>::find_upper_bound_on_p(
137 trials, successes, alpha[i]/2,
138 binomial_distribution<>::jeffreys_prior_interval);
139 // Print Jeffreys Prior Limits:
140 cout << fixed << setprecision(5) << setw(15) << right << l;
141 cout << fixed << setprecision(5) << setw(15) << right << u << std::endl;
142 }
143 cout << endl;
144 }
145
146And that's all there is to it. Let's see some sample output for a 2 in 10
147success ratio, first for 20 trials:
148
149[pre'''___________________________________________
1502-Sided Confidence Limits For Success Ratio
151___________________________________________
152
153Number of Observations = 20
154Number of successes = 4
155Sample frequency of occurrence = 0.2
156
157
158_______________________________________________________________________
159Confidence Lower CP Upper CP Lower JP Upper JP
160 Value (%) Limit Limit Limit Limit
161_______________________________________________________________________
162 50.000 0.12840 0.29588 0.14974 0.26916
163 75.000 0.09775 0.34633 0.11653 0.31861
164 90.000 0.07135 0.40103 0.08734 0.37274
165 95.000 0.05733 0.43661 0.07152 0.40823
166 99.000 0.03576 0.50661 0.04655 0.47859
167 99.900 0.01905 0.58632 0.02634 0.55960
168 99.990 0.01042 0.64997 0.01530 0.62495
169 99.999 0.00577 0.70216 0.00901 0.67897
170''']
171
172As you can see, even at the 95% confidence level the bounds are
173really quite wide (this example is chosen to be easily compared to the one
174in the __handbook
175[@http://www.itl.nist.gov/div898/handbook/prc/section2/prc241.htm
176here]). Note also that the Clopper-Pearson calculation method (CP above)
177produces quite noticeably more pessimistic estimates than the Jeffreys Prior
178method (JP above).
179
180
181Compare that with the program output for
1822000 trials:
183
184[pre'''___________________________________________
1852-Sided Confidence Limits For Success Ratio
186___________________________________________
187
188Number of Observations = 2000
189Number of successes = 400
190Sample frequency of occurrence = 0.2000000
191
192
193_______________________________________________________________________
194Confidence Lower CP Upper CP Lower JP Upper JP
195 Value (%) Limit Limit Limit Limit
196_______________________________________________________________________
197 50.000 0.19382 0.20638 0.19406 0.20613
198 75.000 0.18965 0.21072 0.18990 0.21047
199 90.000 0.18537 0.21528 0.18561 0.21503
200 95.000 0.18267 0.21821 0.18291 0.21796
201 99.000 0.17745 0.22400 0.17769 0.22374
202 99.900 0.17150 0.23079 0.17173 0.23053
203 99.990 0.16658 0.23657 0.16681 0.23631
204 99.999 0.16233 0.24169 0.16256 0.24143
205''']
206
207Now even when the confidence level is very high, the limits are really
208quite close to the experimentally calculated value of 0.2. Furthermore
209the difference between the two calculation methods is now really quite small.
210
211[endsect]
212
213[section:binom_size_eg Estimating Sample Sizes for a Binomial Distribution.]
214
215Imagine you have a critical component that you know will fail in 1 in
216N "uses" (for some suitable definition of "use"). You may want to schedule
217routine replacement of the component so that its chance of failure between
218routine replacements is less than P%. If the failures follow a binomial
219distribution (each time the component is "used" it either fails or does not)
220then the static member function `binomial_distibution<>::find_maximum_number_of_trials`
221can be used to estimate the maximum number of "uses" of that component for some
222acceptable risk level /alpha/.
223
224The example program
225[@../../example/binomial_sample_sizes.cpp binomial_sample_sizes.cpp]
226demonstrates its usage. It centres on a routine that prints out
227a table of maximum sample sizes for various probability thresholds:
228
229 void find_max_sample_size(
230 double p, // success ratio.
231 unsigned successes) // Total number of observed successes permitted.
232 {
233
234The routine then declares a table of probability thresholds: these are the
235maximum acceptable probability that /successes/ or fewer events will be
236observed. In our example, /successes/ will be always zero, since we want
237no component failures, but in other situations non-zero values may well
238make sense.
239
240 double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };
241
242Much of the rest of the program is pretty-printing, the important part
243is in the calculation of maximum number of permitted trials for each
244value of alpha:
245
246 for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
247 {
248 // Confidence value:
249 cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
250 // calculate trials:
251 double t = binomial::find_maximum_number_of_trials(
252 successes, p, alpha[i]);
253 t = floor(t);
254 // Print Trials:
255 cout << fixed << setprecision(5) << setw(15) << right << t << endl;
256 }
257
258Note that since we're
259calculating the maximum number of trials permitted, we'll err on the safe
260side and take the floor of the result. Had we been calculating the
261/minimum/ number of trials required to observe a certain number of /successes/
262using `find_minimum_number_of_trials` we would have taken the ceiling instead.
263
264We'll finish off by looking at some sample output, firstly for
265a 1 in 1000 chance of component failure with each use:
266
267[pre
268'''________________________
269Maximum Number of Trials
270________________________
271
272Success ratio = 0.001
273Maximum Number of "successes" permitted = 0
274
275
276____________________________
277Confidence Max Number
278 Value (%) Of Trials
279____________________________
280 50.000 692
281 75.000 287
282 90.000 105
283 95.000 51
284 99.000 10
285 99.900 0
286 99.990 0
287 99.999 0'''
288]
289
290So 51 "uses" of the component would yield a 95% chance that no
291component failures would be observed.
292
293Compare that with a 1 in 1 million chance of component failure:
294
295[pre'''
296________________________
297Maximum Number of Trials
298________________________
299
300Success ratio = 0.0000010
301Maximum Number of "successes" permitted = 0
302
303
304____________________________
305Confidence Max Number
306 Value (%) Of Trials
307____________________________
308 50.000 693146
309 75.000 287681
310 90.000 105360
311 95.000 51293
312 99.000 10050
313 99.900 1000
314 99.990 100
315 99.999 10'''
316]
317
318In this case, even 1000 uses of the component would still yield a
319less than 1 in 1000 chance of observing a component failure
320(i.e. a 99.9% chance of no failure).
321
322[endsect] [/section:binom_size_eg Estimating Sample Sizes for a Binomial Distribution.]
323
324[endsect][/section:binom_eg Binomial Distribution]
325
326[/
327 Copyright 2006 John Maddock and Paul A. Bristow.
328 Distributed under the Boost Software License, Version 1.0.
329 (See accompanying file LICENSE_1_0.txt or copy at
330 http://www.boost.org/LICENSE_1_0.txt).
331]
332