// Copyright Paul A. Bristow 2007, 2009, 2010
// Copyright John Maddock 2006

// Use, modification and distribution are subject to the
// Boost Software License, Version 1.0.
// (See accompanying file LICENSE_1_0.txt
// or copy at http://www.boost.org/LICENSE_1_0.txt)

// binomial_examples_quiz.cpp

// Simple example of computing probabilities and quantiles for a binomial random variable
// representing the correct guesses on a multiple-choice test.

// source http://www.stat.wvu.edu/SRS/Modules/Binomial/test.html
//[binomial_quiz_example1

A multiple choice test has four possible answers to each of 16 questions.
A student guesses the answer to each question,
so the probability of getting a correct answer on any given question is
one in four, a quarter, 1/4, 25% or fraction 0.25.
The conditions of the binomial experiment are assumed to be met:
n = 16 questions constitute the trials;
each question results in one of two possible outcomes (correct or incorrect);
the probability of being correct is 0.25 and is constant if no knowledge about the subject is assumed;
the questions are answered independently if the student's answer to a question
in no way influences his/her answer to another question.

First, we need to be able to use the binomial distribution constructor
(and some std input/output, of course).
#include <boost/math/distributions/binomial.hpp>
  using boost::math::binomial;

#include <iostream> // Needed for std::cout, std::endl and the stream manipulators.
  using std::cout; using std::endl;
  using std::ios; using std::flush; using std::left; using std::right; using std::fixed;
#include <iomanip> // Needed for std::setw and std::setprecision.
  using std::setw; using std::setprecision;
//][/binomial_quiz_example1]

  cout << "Binomial distribution example - guessing in a quiz." << endl;
//[binomial_quiz_example2

The number of correct answers, X, is distributed as a binomial random variable
with binomial distribution parameters: questions n and success fraction probability p.
So we construct a binomial distribution:
  int questions = 16; // All the questions in the quiz.
  int answers = 4; // Possible answers to each question.
  double success_fraction = 1. / answers; // If a random guess, p = 1/4 = 0.25.
  binomial quiz(questions, success_fraction);
and display the distribution parameters we used thus:

  cout << "In a quiz with " << quiz.trials()
    << " questions and with a probability of guessing right of "
    << quiz.success_fraction() * 100 << " %"
    << " or 1 in " << static_cast<int>(1. / quiz.success_fraction()) << endl;
Show a few probabilities of just guessing:

  cout << "Probability of getting none right is " << pdf(quiz, 0) << endl; // 0.010023
  cout << "Probability of getting exactly one right is " << pdf(quiz, 1) << endl;
  cout << "Probability of getting exactly two right is " << pdf(quiz, 2) << endl;
  int pass_score = 11; // Score needed to pass.
  cout << "Probability of getting exactly " << pass_score << " answers right by chance is "
    << pdf(quiz, pass_score) << endl;
  cout << "Probability of getting all " << questions << " answers right by chance is "
    << pdf(quiz, questions) << endl;
Probability of getting none right is 0.0100226
Probability of getting exactly one right is 0.0534538
Probability of getting exactly two right is 0.133635
Probability of getting exactly 11 answers right by chance is 0.000247132
Probability of getting all 16 answers right by chance is 2.32831e-010

These don't give any encouragement to guessers!
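As a cross-check on the values above, the binomial probability mass can be computed by hand from P(X = k) = C(n, k) p^k (1 - p)^(n - k). The following is only a sketch: `binomial_pmf` is a hypothetical helper written for illustration, not part of Boost.Math, and it assumes 0 < p < 1.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of Boost.Math): binomial probability mass
// P(X = k) = C(n, k) * p^k * (1 - p)^(n - k), evaluated in log space.
// Assumes 0 < p < 1 and 0 <= k <= n.
double binomial_pmf(int n, int k, double p)
{
  assert(n >= 0 && k >= 0 && k <= n);
  assert(p > 0.0 && p < 1.0);
  // log C(n, k) via the log-gamma function, since lgamma(m + 1) == log(m!).
  const double log_choose =
    std::lgamma(n + 1.0) - std::lgamma(k + 1.0) - std::lgamma(n - k + 1.0);
  return std::exp(log_choose + k * std::log(p) + (n - k) * std::log1p(-p));
}
```

Calling `binomial_pmf(16, 0, 0.25)` should agree with `pdf(quiz, 0)` to the digits printed above; working in log space is a design choice that avoids overflow in the binomial coefficient for large n.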
We can tabulate the 'getting exactly right' ( == ) probabilities thus:

  cout << "\n" "Guessed Probability" << right << endl;
  for (int successes = 0; successes <= questions; successes++)
  {
    double probability = pdf(quiz, successes);
    cout << setw(2) << successes << " " << probability << endl;
  }
Then we can add the probabilities of some 'exactly right' like this:

  cout << "Probability of getting none or one right is " << pdf(quiz, 0) + pdf(quiz, 1) << endl;

Probability of getting none or one right is 0.0634764

But if more than a couple of scores are involved, it is more convenient (and may be more accurate)
to use the Cumulative Distribution Function (cdf) instead:

  cout << "Probability of getting none or one right is " << cdf(quiz, 1) << endl;

Probability of getting none or one right is 0.0634764
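To see why the two results above agree, note that the cdf at k is simply the sum of the exact probabilities for 0, 1, ..., k. A minimal sketch follows; `binomial_cdf` is a hypothetical helper, not the Boost.Math implementation (which is not shown in this file), and it assumes 0 < p < 1.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not Boost.Math): P(X <= k) for a binomial(n, p) variable,
// accumulated term by term. Assumes 0 < p < 1 and 0 <= k <= n.
double binomial_cdf(int n, int k, double p)
{
  assert(n >= 0 && k >= 0 && k <= n);
  assert(p > 0.0 && p < 1.0);
  double term = std::pow(1.0 - p, n); // P(X = 0)
  double sum = term;
  for (int i = 0; i < k; ++i)
  {
    // P(X = i + 1) = P(X = i) * (n - i) / (i + 1) * p / (1 - p)
    term *= static_cast<double>(n - i) / (i + 1) * p / (1.0 - p);
    sum += term;
  }
  return sum;
}
```

Here `binomial_cdf(16, 1, 0.25)` adds exactly the two terms that `pdf(quiz, 0) + pdf(quiz, 1)` adds, which is why both lines print 0.0634764.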
Since the cdf is inclusive, we can get the probability of getting up to 10 right ( <= ):

  cout << "Probability of getting <= 10 right (to fail) is " << cdf(quiz, 10) << endl;

Probability of getting <= 10 right (to fail) is 0.999715

To get the probability of getting 11 or more right (to pass),
it is tempting to use ``1 - cdf(quiz, 10)`` to get the probability of > 10:

  cout << "Probability of getting > 10 right (to pass) is " << 1 - cdf(quiz, 10) << endl;

Probability of getting > 10 right (to pass) is 0.000285239

But this should be resisted in favor of using the __complements function (see __why_complements).

  cout << "Probability of getting > 10 right (to pass) is " << cdf(complement(quiz, 10)) << endl;

Probability of getting > 10 right (to pass) is 0.000285239
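The reason for preferring the complement can be illustrated by summing the upper tail directly instead of subtracting from one. This is a sketch under stated assumptions: `binomial_upper_tail` is a hypothetical helper (not how Boost.Math computes its complement), and it assumes 0 < p < 1.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not Boost.Math): P(X > k) for a binomial(n, p) variable,
// summed directly over k + 1 .. n, starting from the smallest term P(X = n).
// Assumes 0 < p < 1 and 0 <= k <= n.
double binomial_upper_tail(int n, int k, double p)
{
  assert(n >= 0 && k >= 0 && k <= n);
  assert(p > 0.0 && p < 1.0);
  double term = std::pow(p, n); // P(X = n)
  double sum = 0.0;
  for (int i = n; i > k; --i)
  {
    sum += term;
    // P(X = i - 1) = P(X = i) * i / (n - i + 1) * (1 - p) / p
    term *= static_cast<double>(i) / (n - i + 1) * (1.0 - p) / p;
  }
  return sum;
}
```

For this quiz both routes print the same six digits, but with a success fraction such as 0.001 the tail P(X > 10) is of the order of 1e-30, far finer than the resolution of a double near 1 (about 1e-16), so `1 - cdf` would lose every significant digit while the direct sum keeps them.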
And we can check that these two, <= 10 and > 10, add up to unity.

  BOOST_ASSERT((cdf(quiz, 10) + cdf(complement(quiz, 10))) == 1.);

If we want a < rather than a <= test, because the CDF is inclusive, we must subtract one from the score.

  cout << "Probability of getting less than " << pass_score
    << " (< " << pass_score << ") answers right by guessing is "
    << cdf(quiz, pass_score -1) << endl;

Probability of getting less than 11 (< 11) answers right by guessing is 0.999715
and similarly to get a >= rather than a > test
we also need to subtract one from the score (and can again check the sum is unity).
This is because if the cdf is /inclusive/,
then its complement must be /exclusive/ otherwise there would be one possible
outcome counted twice!

  cout << "Probability of getting at least " << pass_score
    << "(>= " << pass_score << ") answers right by guessing is "
    << cdf(complement(quiz, pass_score-1))
    << ", only 1 in " << 1/cdf(complement(quiz, pass_score-1)) << endl;

  BOOST_ASSERT((cdf(quiz, pass_score -1) + cdf(complement(quiz, pass_score-1))) == 1);

Probability of getting at least 11(>= 11) answers right by guessing is 0.000285239, only 1 in 3505.83
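The four comparisons ( <=, <, >, >= ) can be made explicit by writing each one as a sum over the outcomes it counts. The sketch below uses hypothetical helpers (`pmf_table`, `sum_range`, `at_most`, `less_than`, `more_than`, `at_least`), none of which are Boost.Math names, and assumes 0 < p < 1.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical helper (not Boost.Math): table of P(X = k) for k = 0..n.
std::vector<double> pmf_table(int n, double p)
{
  assert(n >= 0 && p > 0.0 && p < 1.0);
  std::vector<double> t(n + 1);
  t[0] = std::pow(1.0 - p, n);
  for (int k = 0; k < n; ++k)
    t[k + 1] = t[k] * static_cast<double>(n - k) / (k + 1) * p / (1.0 - p);
  return t;
}

// Sum of P(X = k) over the closed range first <= k <= last (empty if first > last).
double sum_range(const std::vector<double>& t, int first, int last)
{
  assert(0 <= first && last < static_cast<int>(t.size()));
  if (first > last) return 0.0;
  return std::accumulate(t.begin() + first, t.begin() + last + 1, 0.0);
}

// Each query names exactly which scores it includes.
double at_most(const std::vector<double>& t, int s)   // X <= s, like cdf(quiz, s)
{ return sum_range(t, 0, s); }
double less_than(const std::vector<double>& t, int s) // X < s, like cdf(quiz, s - 1)
{ return sum_range(t, 0, s - 1); }
double more_than(const std::vector<double>& t, int s) // X > s, like cdf(complement(quiz, s))
{ return sum_range(t, s + 1, static_cast<int>(t.size()) - 1); }
double at_least(const std::vector<double>& t, int s)  // X >= s, like cdf(complement(quiz, s - 1))
{ return sum_range(t, s, static_cast<int>(t.size()) - 1); }
```

With `t = pmf_table(16, 0.25)`, `less_than(t, 11)` and `at_most(t, 10)` cover the same scores 0..10, and `at_least(t, 11)` and `more_than(t, 10)` cover the same scores 11..16, so no outcome is counted twice.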
Finally we can tabulate some probabilities:

  cout << "\n" "At most (<=)""\n""Guessed OK Probability" << right << endl;
  for (int score = 0; score <= questions; score++)
  {
    cout << setw(2) << score << " " << setprecision(10)
      << cdf(quiz, score) << endl;
  }

Guessed OK Probability

  // Note: cdf(complement(quiz, score)) is P(X > score), strictly more than score.
  cout << "\n" "At least (>)""\n""Guessed OK Probability" << right << endl;
  for (int score = 0; score <= questions; score++)
  {
    cout << setw(2) << score << " " << setprecision(10)
      << cdf(complement(quiz, score)) << endl;
  }

Guessed OK Probability
We now consider the probabilities of *ranges* of correct guesses.

First, calculate the probability of getting a range of guesses right,
by adding the exact probabilities of each from low ... high.

  int low = 3; // Getting at least 3 right.
  int high = 5; // Getting at most 5 right.
  double sum = 0.;
  for (int i = low; i <= high; i++)
  {
    sum += pdf(quiz, i); // Accumulate the exact probability of each score.
  }
  cout << "Probability of getting between "
    << low << " and " << high << " answers right by guessing is "
    << sum << endl; // 0.61323

Probability of getting between 3 and 5 answers right by guessing is 0.6132
Or, usually better, we can use the difference of cdfs instead:

  cout << "Probability of getting between " << low << " and " << high << " answers right by guessing is "
    << cdf(quiz, high) - cdf(quiz, low - 1) << endl; // 0.61323

Probability of getting between 3 and 5 answers right by guessing is 0.6132
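The difference-of-cdfs idea can be sketched without Boost by tabulating the cumulative sums once and subtracting. Note the `low - 1`: because the cdf is inclusive, subtracting the cdf at `low` itself would wrongly drop the score `low` from the range. `cdf_table` and `between` are hypothetical helpers for illustration, assuming 0 < p < 1.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not Boost.Math): table of P(X <= k) for k = 0..n.
std::vector<double> cdf_table(int n, double p)
{
  assert(n >= 0 && p > 0.0 && p < 1.0);
  std::vector<double> c(n + 1);
  double term = std::pow(1.0 - p, n); // P(X = 0)
  double running = 0.0;
  for (int k = 0; k <= n; ++k)
  {
    running += term;
    c[k] = running;
    // Advance to P(X = k + 1); the factor is zero once k == n.
    term *= static_cast<double>(n - k) / (k + 1) * p / (1.0 - p);
  }
  return c;
}

// P(low <= X <= high) as cdf(high) - cdf(low - 1); cdf(-1) is taken as zero.
double between(const std::vector<double>& c, int low, int high)
{
  assert(0 <= low && low <= high && high < static_cast<int>(c.size()));
  return c[high] - (low > 0 ? c[low - 1] : 0.0);
}
```

With `c = cdf_table(16, 0.25)`, `between(c, 3, 5)` reproduces the 0.6132 printed above, and a single lookup pair replaces the loop over individual scores.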
And we can also try a few more combinations of high and low choices:

  low = 1; high = 6;
  cout << "Probability of getting between " << low << " and " << high << " answers right by guessing is "
    << cdf(quiz, high) - cdf(quiz, low - 1) << endl; // 1 <= x <= 6 P = 0.91042
  low = 1; high = 8;
  cout << "Probability of getting between " << low << " and " << high << " answers right by guessing is "
    << cdf(quiz, high) - cdf(quiz, low - 1) << endl; // 1 <= x <= 8 P = 0.9825
  low = 4; high = 4;
  cout << "Probability of getting between " << low << " and " << high << " answers right by guessing is "
    << cdf(quiz, high) - cdf(quiz, low - 1) << endl; // 4 <= x <= 4 P = 0.22520

Probability of getting between 1 and 6 answers right by guessing is 0.9104
Probability of getting between 1 and 8 answers right by guessing is 0.9825
Probability of getting between 4 and 4 answers right by guessing is 0.2252
[h4 Using Binomial distribution moments]
Using moments of the distribution, we can say more about the spread of results from guessing.

  cout << "By guessing, on average, one can expect to get " << mean(quiz) << " correct answers." << endl;
  cout << "Standard deviation is " << standard_deviation(quiz) << endl;
  cout << "So about 2/3 will lie within 1 standard deviation and get between "
    << ceil(mean(quiz) - standard_deviation(quiz)) << " and "
    << floor(mean(quiz) + standard_deviation(quiz)) << " correct." << endl;
  cout << "Mode (the most frequent) is " << mode(quiz) << endl;
  cout << "Skewness is " << skewness(quiz) << endl;

By guessing, on average, one can expect to get 4 correct answers.
Standard deviation is 1.732
So about 2/3 will lie within 1 standard deviation and get between 3 and 5 correct.
Mode (the most frequent) is 4
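These moments follow from the standard closed-form expressions for a binomial(n, p) variable: mean np, variance np(1 - p), skewness (1 - 2p) / sqrt(np(1 - p)), and mode floor((n + 1)p). The sketch below evaluates them directly; `binomial_moments` and `moments` are hypothetical names, and whether Boost.Math uses exactly these expressions internally is an assumption not confirmed by this file.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical summary type (not a Boost.Math type).
struct binomial_moments
{
  double mean;
  double standard_deviation;
  double skewness;
  int mode;
};

// Textbook closed forms for a binomial(n, p) variable. Assumes 0 < p < 1.
binomial_moments moments(int n, double p)
{
  assert(n > 0 && p > 0.0 && p < 1.0);
  binomial_moments m;
  m.mean = n * p;
  m.standard_deviation = std::sqrt(n * p * (1.0 - p));
  m.skewness = (1.0 - 2.0 * p) / m.standard_deviation;
  // When (n + 1) * p is an integer, (n + 1) * p - 1 is equally likely;
  // this sketch reports only the larger of the two modes.
  m.mode = static_cast<int>(std::floor((n + 1) * p));
  return m;
}
```

For the quiz, `moments(16, 0.25)` gives a mean of 4, a variance of 3 (so a standard deviation of about 1.732) and a mode of floor(4.25) = 4, matching the output above.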
The quantiles (percentiles or percentage points) for a few probability levels:

  cout << "Quartiles " << quantile(quiz, 0.25) << " to "
    << quantile(complement(quiz, 0.25)) << endl; // Quartiles
  cout << "1 standard deviation " << quantile(quiz, 0.33) << " to "
    << quantile(quiz, 0.67) << endl; // 1 sd
  cout << "Deciles " << quantile(quiz, 0.1) << " to "
    << quantile(complement(quiz, 0.1)) << endl; // Deciles
  cout << "5 to 95% " << quantile(quiz, 0.05) << " to "
    << quantile(complement(quiz, 0.05)) << endl; // 5 to 95%
  cout << "2.5 to 97.5% " << quantile(quiz, 0.025) << " to "
    << quantile(complement(quiz, 0.025)) << endl; // 2.5 to 97.5%
  cout << "2 to 98% " << quantile(quiz, 0.02) << " to "
    << quantile(complement(quiz, 0.02)) << endl; // 2 to 98%

  cout << "If guessing then percentiles 1 to 99% will get " << quantile(quiz, 0.01)
    << " to " << quantile(complement(quiz, 0.01)) << " right." << endl;

Notice that these output integral values because the default policy is `integer_round_outwards`.

1 standard deviation 2 to 5

//] [/binomial_quiz_example2]
//[discrete_quantile_real

Quantile values are controlled by the discrete quantile policy chosen (see __understand_dis_quant).
The default is `integer_round_outwards`,
so the lower quantile is rounded down, and the upper quantile is rounded up.
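One plausible reading of 'rounding outwards' can be sketched by scanning the cdf: for a lower quantile take the largest score whose cdf does not exceed the probability (or 0 if there is none), and for an upper quantile take the smallest score whose cdf reaches it. This is only an illustration of the idea, not the Boost.Math implementation (which this file does not show); `quantile_round_down` and `quantile_round_up` are hypothetical names and 0 < p < 1 is assumed.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: largest k with P(X <= k) <= prob, or 0 if no such k exists.
int quantile_round_down(int n, double p, double prob)
{
  assert(n >= 0 && p > 0.0 && p < 1.0 && prob >= 0.0 && prob <= 1.0);
  double term = std::pow(1.0 - p, n); // P(X = 0)
  double cdf = term;                  // P(X <= 0)
  int result = 0;
  for (int k = 0; k <= n; ++k)
  {
    if (cdf > prob) break; // cdf at k already exceeds prob, so stop at the previous k.
    result = k;
    if (k < n)
    {
      term *= (n - k) / (k + 1.0) * p / (1.0 - p); // P(X = k + 1)
      cdf += term;                                 // P(X <= k + 1)
    }
  }
  return result;
}

// Hypothetical helper: smallest k with P(X <= k) >= prob (at most n).
int quantile_round_up(int n, double p, double prob)
{
  assert(n >= 0 && p > 0.0 && p < 1.0 && prob >= 0.0 && prob <= 1.0);
  double term = std::pow(1.0 - p, n); // P(X = 0)
  double cdf = term;                  // P(X <= 0)
  int k = 0;
  while (k < n && cdf < prob)
  {
    term *= (n - k) / (k + 1.0) * p / (1.0 - p); // P(X = k + 1)
    cdf += term;
    ++k;
  }
  return k;
}
```

Under this reading, `quantile_round_down(16, 0.25, 0.33)` and `quantile_round_up(16, 0.25, 0.67)` give 2 and 5, agreeing with the '1 standard deviation 2 to 5' line of the recorded output.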
But we might believe that the real values tell us a little more - see __math_discrete.

We could control the policy for *all* distributions by placing

  #define BOOST_MATH_DISCRETE_QUANTILE_POLICY real

at the head of the program; this would make the policy apply
to this *one, and only this*, translation unit.

Or we can create a (typedef for a) policy that has discrete quantiles real
(here avoiding any 'using namespace ...' statements):
  using boost::math::policies::policy;
  using boost::math::policies::discrete_quantile;
  using boost::math::policies::real;
  using boost::math::policies::integer_round_outwards; // Default.
  typedef boost::math::policies::policy<discrete_quantile<real> > real_quantile_policy;

Add a custom binomial distribution called ``real_quantile_binomial`` that uses ``real_quantile_policy``

  using boost::math::binomial_distribution;
  typedef binomial_distribution<double, real_quantile_policy> real_quantile_binomial;

Construct an object of this custom distribution:

  real_quantile_binomial quiz_real(questions, success_fraction);
And use this to show some quantiles - that now have real rather than integer values.

  cout << "Quartiles " << quantile(quiz_real, 0.25) << " to "
    << quantile(complement(quiz_real, 0.25)) << endl; // Upper quartile 4.6212
  cout << "1 standard deviation " << quantile(quiz_real, 0.33) << " to "
    << quantile(quiz_real, 0.67) << endl; // 1 sd 2.6654 4.194
  cout << "Deciles " << quantile(quiz_real, 0.1) << " to "
    << quantile(complement(quiz_real, 0.1)) << endl; // Deciles 1.3487 5.7583
  cout << "5 to 95% " << quantile(quiz_real, 0.05) << " to "
    << quantile(complement(quiz_real, 0.05)) << endl; // 5 to 95% 0.83739 6.4559
  cout << "2.5 to 97.5% " << quantile(quiz_real, 0.025) << " to "
    << quantile(complement(quiz_real, 0.025)) << endl; // 2.5 to 97.5% 0.42806 7.0688
  cout << "2 to 98% " << quantile(quiz_real, 0.02) << " to "
    << quantile(complement(quiz_real, 0.02)) << endl; // 2 to 98% 0.31311 7.252

  cout << "If guessing, then percentiles 1 to 99% will get " << quantile(quiz_real, 0.01)
    << " to " << quantile(complement(quiz_real, 0.01)) << " right." << endl;
1 standard deviation 2.665 to 4.194
Deciles 1.349 to 5.758
5 to 95% 0.8374 to 6.456
2.5 to 97.5% 0.4281 to 7.069
2 to 98% 0.3131 to 7.252
If guessing, then percentiles 1 to 99% will get 0 to 7.788 right.

//] [/discrete_quantile_real]
  catch(const std::exception& e)
  { // Always useful to include try & catch blocks because
    // default policies are to throw exceptions on arguments that cause
    // errors like underflow, overflow.
    // Lacking try & catch blocks, the program will abort without a message below,
    // which may give some helpful clues as to the cause of the exception.
    std::cout <<
      "\n""Message from thrown exception was:\n " << e.what() << std::endl;
  }
BAutorun "i:\boost-06-05-03-1300\libs\math\test\Math_test\debug\binomial_quiz_example.exe"
Binomial distribution example - guessing in a quiz.
In a quiz with 16 questions and with a probability of guessing right of 25 % or 1 in 4
Probability of getting none right is 0.0100226
Probability of getting exactly one right is 0.0534538
Probability of getting exactly two right is 0.133635
Probability of getting exactly 11 answers right by chance is 0.000247132
Probability of getting all 16 answers right by chance is 2.32831e-010
Probability of getting none or one right is 0.0634764
Probability of getting none or one right is 0.0634764
Probability of getting <= 10 right (to fail) is 0.999715
Probability of getting > 10 right (to pass) is 0.000285239
Probability of getting > 10 right (to pass) is 0.000285239
Probability of getting less than 11 (< 11) answers right by guessing is 0.999715
Probability of getting at least 11(>= 11) answers right by guessing is 0.000285239, only 1 in 3505.83
Guessed OK Probability
Guessed OK Probability
Probability of getting between 3 and 5 answers right by guessing is 0.6132
Probability of getting between 3 and 5 answers right by guessing is 0.6132
Probability of getting between 1 and 6 answers right by guessing is 0.9104
Probability of getting between 1 and 8 answers right by guessing is 0.9825
Probability of getting between 4 and 4 answers right by guessing is 0.2252
By guessing, on average, one can expect to get 4 correct answers.
Standard deviation is 1.732
So about 2/3 will lie within 1 standard deviation and get between 3 and 5 correct.
Mode (the most frequent) is 4
1 standard deviation 2 to 5
If guessing then percentiles 1 to 99% will get 0 to 8 right.
1 standard deviation 2.665 to 4.194
Deciles 1.349 to 5.758
5 to 95% 0.8374 to 6.456
2.5 to 97.5% 0.4281 to 7.069
2 to 98% 0.3131 to 7.252
If guessing, then percentiles 1 to 99% will get 0 to 7.788 right.