]>
Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | <html> |
2 | <head> | |
3 | <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII"> | |
4 | <title>Calculating Confidence Limits on the Frequency of Occurrence for a Binomial Distribution</title> | |
5 | <link rel="stylesheet" href="../../../../math.css" type="text/css"> | |
6 | <meta name="generator" content="DocBook XSL Stylesheets V1.77.1"> | |
7 | <link rel="home" href="../../../../index.html" title="Math Toolkit 2.5.1"> | |
8 | <link rel="up" href="../binom_eg.html" title="Binomial Distribution Examples"> | |
9 | <link rel="prev" href="binomial_quiz_example.html" title="Binomial Quiz Example"> | |
10 | <link rel="next" href="binom_size_eg.html" title="Estimating Sample Sizes for a Binomial Distribution."> | |
11 | </head> | |
12 | <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"> | |
13 | <table cellpadding="2" width="100%"><tr> | |
14 | <td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../../../boost.png"></td> | |
15 | <td align="center"><a href="../../../../../../../../index.html">Home</a></td> | |
16 | <td align="center"><a href="../../../../../../../../libs/libraries.htm">Libraries</a></td> | |
17 | <td align="center"><a href="http://www.boost.org/users/people.html">People</a></td> | |
18 | <td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td> | |
19 | <td align="center"><a href="../../../../../../../../more/index.htm">More</a></td> | |
20 | </tr></table> | |
21 | <hr> | |
22 | <div class="spirit-nav"> | |
23 | <a accesskey="p" href="binomial_quiz_example.html"><img src="../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../binom_eg.html"><img src="../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../index.html"><img src="../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="binom_size_eg.html"><img src="../../../../../../../../doc/src/images/next.png" alt="Next"></a> | |
24 | </div> | |
25 | <div class="section"> | |
26 | <div class="titlepage"><div><div><h5 class="title"> | |
27 | <a name="math_toolkit.stat_tut.weg.binom_eg.binom_conf"></a><a class="link" href="binom_conf.html" title="Calculating Confidence Limits on the Frequency of Occurrence for a Binomial Distribution">Calculating | |
28 | Confidence Limits on the Frequency of Occurrence for a Binomial Distribution</a> | |
29 | </h5></div></div></div> | |
30 | <p> | |
31 | Imagine you have a process that follows a binomial distribution: for | |
32 | each trial conducted, an event either occurs or does it does not, referred | |
33 | to as "successes" and "failures". If, by experiment, | |
34 | you want to measure the frequency with which successes occur, the best | |
35 | estimate is given simply by <span class="emphasis"><em>k</em></span> / <span class="emphasis"><em>N</em></span>, | |
36 | for <span class="emphasis"><em>k</em></span> successes out of <span class="emphasis"><em>N</em></span> trials. | |
37 | However our confidence in that estimate will be shaped by how many trials | |
38 | were conducted, and how many successes were observed. The static member | |
39 | functions <code class="computeroutput"><span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">find_lower_bound_on_p</span></code> | |
40 | and <code class="computeroutput"><span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">find_upper_bound_on_p</span></code> | |
41 | allow you to calculate the confidence intervals for your estimate of | |
42 | the occurrence frequency. | |
43 | </p> | |
44 | <p> | |
45 | The sample program <a href="../../../../../../example/binomial_confidence_limits.cpp" target="_top">binomial_confidence_limits.cpp</a> | |
46 | illustrates their use. It begins by defining a procedure that will print | |
47 | a table of confidence limits for various degrees of certainty: | |
48 | </p> | |
49 | <pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span> | |
50 | <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iomanip</span><span class="special">></span> | |
51 | <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">math</span><span class="special">/</span><span class="identifier">distributions</span><span class="special">/</span><span class="identifier">binomial</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span> | |
52 | ||
53 | <span class="keyword">void</span> <span class="identifier">confidence_limits_on_frequency</span><span class="special">(</span><span class="keyword">unsigned</span> <span class="identifier">trials</span><span class="special">,</span> <span class="keyword">unsigned</span> <span class="identifier">successes</span><span class="special">)</span> | |
54 | <span class="special">{</span> | |
55 | <span class="comment">//</span> | |
56 | <span class="comment">// trials = Total number of trials.</span> | |
57 | <span class="comment">// successes = Total number of observed successes.</span> | |
58 | <span class="comment">//</span> | |
59 | <span class="comment">// Calculate confidence limits for an observed</span> | |
60 | <span class="comment">// frequency of occurrence that follows a binomial</span> | |
61 | <span class="comment">// distribution.</span> | |
62 | <span class="comment">//</span> | |
63 | <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">std</span><span class="special">;</span> | |
64 | <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">;</span> | |
65 | ||
66 | <span class="comment">// Print out general info:</span> | |
67 | <span class="identifier">cout</span> <span class="special"><<</span> | |
68 | <span class="string">"___________________________________________\n"</span> | |
69 | <span class="string">"2-Sided Confidence Limits For Success Ratio\n"</span> | |
70 | <span class="string">"___________________________________________\n\n"</span><span class="special">;</span> | |
71 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">7</span><span class="special">);</span> | |
72 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">left</span> <span class="special"><<</span> <span class="string">"Number of Observations"</span> <span class="special"><<</span> <span class="string">"= "</span> <span class="special"><<</span> <span class="identifier">trials</span> <span class="special"><<</span> <span class="string">"\n"</span><span class="special">;</span> | |
73 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">left</span> <span class="special"><<</span> <span class="string">"Number of successes"</span> <span class="special"><<</span> <span class="string">"= "</span> <span class="special"><<</span> <span class="identifier">successes</span> <span class="special"><<</span> <span class="string">"\n"</span><span class="special">;</span> | |
74 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">left</span> <span class="special"><<</span> <span class="string">"Sample frequency of occurrence"</span> <span class="special"><<</span> <span class="string">"= "</span> <span class="special"><<</span> <span class="keyword">double</span><span class="special">(</span><span class="identifier">successes</span><span class="special">)</span> <span class="special">/</span> <span class="identifier">trials</span> <span class="special"><<</span> <span class="string">"\n"</span><span class="special">;</span> | |
75 | </pre> | |
76 | <p> | |
77 | The procedure now defines a table of significance levels: these are the | |
78 | probabilities that the true occurrence frequency lies outside the calculated | |
79 | interval: | |
80 | </p> | |
81 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">alpha</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">0.5</span><span class="special">,</span> <span class="number">0.25</span><span class="special">,</span> <span class="number">0.1</span><span class="special">,</span> <span class="number">0.05</span><span class="special">,</span> <span class="number">0.01</span><span class="special">,</span> <span class="number">0.001</span><span class="special">,</span> <span class="number">0.0001</span><span class="special">,</span> <span class="number">0.00001</span> <span class="special">};</span> | |
82 | </pre> | |
83 | <p> | |
84 | Some pretty printing of the table header follows: | |
85 | </p> | |
86 | <pre class="programlisting"><span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"\n\n"</span> | |
87 | <span class="string">"_______________________________________________________________________\n"</span> | |
88 | <span class="string">"Confidence Lower CP Upper CP Lower JP Upper JP\n"</span> | |
89 | <span class="string">" Value (%) Limit Limit Limit Limit\n"</span> | |
90 | <span class="string">"_______________________________________________________________________\n"</span><span class="special">;</span> | |
91 | </pre> | |
92 | <p> | |
93 | And now for the important part - the intervals themselves - for each | |
94 | value of <span class="emphasis"><em>alpha</em></span>, we call <code class="computeroutput"><span class="identifier">find_lower_bound_on_p</span></code> | |
95 | and <code class="computeroutput"><span class="identifier">find_lower_upper_on_p</span></code> | |
96 | to obtain lower and upper bounds respectively. Note that since we are | |
97 | calculating a two-sided interval, we must divide the value of alpha in | |
98 | two. | |
99 | </p> | |
100 | <p> | |
101 | Please note that calculating two separate <span class="emphasis"><em>single sided bounds</em></span>, | |
102 | each with risk level α  is not the same thing as calculating a two sided | |
103 | interval. Had we calculate two single-sided intervals each with a risk | |
104 | that the true value is outside the interval of α, then: | |
105 | </p> | |
106 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"> | |
107 | The risk that it is less than the lower bound is α. | |
108 | </li></ul></div> | |
109 | <p> | |
110 | and | |
111 | </p> | |
112 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"> | |
113 | The risk that it is greater than the upper bound is also α. | |
114 | </li></ul></div> | |
115 | <p> | |
116 | So the risk it is outside <span class="bold"><strong>upper or lower bound</strong></span>, | |
117 | is <span class="bold"><strong>twice</strong></span> alpha, and the probability | |
118 | that it is inside the bounds is therefore not nearly as high as one might | |
119 | have thought. This is why α/2 must be used in the calculations below. | |
120 | </p> | |
121 | <p> | |
122 | In contrast, had we been calculating a single-sided interval, for example: | |
123 | <span class="emphasis"><em>"Calculate a lower bound so that we are P% sure that the | |
124 | true occurrence frequency is greater than some value"</em></span> | |
125 | then we would <span class="bold"><strong>not</strong></span> have divided by two. | |
126 | </p> | |
127 | <p> | |
128 | Finally note that <code class="computeroutput"><span class="identifier">binomial_distribution</span></code> | |
129 | provides a choice of two methods for the calculation, we print out the | |
130 | results from both methods in this example: | |
131 | </p> | |
132 | <pre class="programlisting"> <span class="keyword">for</span><span class="special">(</span><span class="keyword">unsigned</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> <span class="identifier">i</span> <span class="special"><</span> <span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">alpha</span><span class="special">)/</span><span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">alpha</span><span class="special">[</span><span class="number">0</span><span class="special">]);</span> <span class="special">++</span><span class="identifier">i</span><span class="special">)</span> | |
133 | <span class="special">{</span> | |
134 | <span class="comment">// Confidence value:</span> | |
135 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">fixed</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">3</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">10</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">right</span> <span class="special"><<</span> <span class="number">100</span> <span class="special">*</span> <span class="special">(</span><span class="number">1</span><span class="special">-</span><span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]);</span> | |
136 | <span class="comment">// Calculate Clopper Pearson bounds:</span> | |
137 | <span class="keyword">double</span> <span class="identifier">l</span> <span class="special">=</span> <span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">find_lower_bound_on_p</span><span class="special">(</span> | |
138 | <span class="identifier">trials</span><span class="special">,</span> <span class="identifier">successes</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]/</span><span class="number">2</span><span class="special">);</span> | |
139 | <span class="keyword">double</span> <span class="identifier">u</span> <span class="special">=</span> <span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">find_upper_bound_on_p</span><span class="special">(</span> | |
140 | <span class="identifier">trials</span><span class="special">,</span> <span class="identifier">successes</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]/</span><span class="number">2</span><span class="special">);</span> | |
141 | <span class="comment">// Print Clopper Pearson Limits:</span> | |
142 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">fixed</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">5</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">15</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">right</span> <span class="special"><<</span> <span class="identifier">l</span><span class="special">;</span> | |
143 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">fixed</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">5</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">15</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">right</span> <span class="special"><<</span> <span class="identifier">u</span><span class="special">;</span> | |
144 | <span class="comment">// Calculate Jeffreys Prior Bounds:</span> | |
145 | <span class="identifier">l</span> <span class="special">=</span> <span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">find_lower_bound_on_p</span><span class="special">(</span> | |
146 | <span class="identifier">trials</span><span class="special">,</span> <span class="identifier">successes</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]/</span><span class="number">2</span><span class="special">,</span> | |
147 | <span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">jeffreys_prior_interval</span><span class="special">);</span> | |
148 | <span class="identifier">u</span> <span class="special">=</span> <span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">find_upper_bound_on_p</span><span class="special">(</span> | |
149 | <span class="identifier">trials</span><span class="special">,</span> <span class="identifier">successes</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]/</span><span class="number">2</span><span class="special">,</span> | |
150 | <span class="identifier">binomial_distribution</span><span class="special"><>::</span><span class="identifier">jeffreys_prior_interval</span><span class="special">);</span> | |
151 | <span class="comment">// Print Jeffreys Prior Limits:</span> | |
152 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">fixed</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">5</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">15</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">right</span> <span class="special"><<</span> <span class="identifier">l</span><span class="special">;</span> | |
153 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">fixed</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">5</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">15</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">right</span> <span class="special"><<</span> <span class="identifier">u</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span> | |
154 | <span class="special">}</span> | |
155 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
156 | <span class="special">}</span> | |
157 | </pre> | |
158 | <p> | |
159 | And that's all there is to it. Let's see some sample output for a 2 in | |
160 | 10 success ratio, first for 20 trials: | |
161 | </p> | |
162 | <pre class="programlisting">___________________________________________ | |
163 | 2-Sided Confidence Limits For Success Ratio | |
164 | ___________________________________________ | |
165 | ||
166 | Number of Observations = 20 | |
167 | Number of successes = 4 | |
168 | Sample frequency of occurrence = 0.2 | |
169 | ||
170 | ||
171 | _______________________________________________________________________ | |
172 | Confidence Lower CP Upper CP Lower JP Upper JP | |
173 | Value (%) Limit Limit Limit Limit | |
174 | _______________________________________________________________________ | |
175 | 50.000 0.12840 0.29588 0.14974 0.26916 | |
176 | 75.000 0.09775 0.34633 0.11653 0.31861 | |
177 | 90.000 0.07135 0.40103 0.08734 0.37274 | |
178 | 95.000 0.05733 0.43661 0.07152 0.40823 | |
179 | 99.000 0.03576 0.50661 0.04655 0.47859 | |
180 | 99.900 0.01905 0.58632 0.02634 0.55960 | |
181 | 99.990 0.01042 0.64997 0.01530 0.62495 | |
182 | 99.999 0.00577 0.70216 0.00901 0.67897 | |
183 | </pre> | |
184 | <p> | |
185 | As you can see, even at the 95% confidence level the bounds are really | |
186 | quite wide (this example is chosen to be easily compared to the one in | |
187 | the <a href="http://www.itl.nist.gov/div898/handbook/" target="_top">NIST/SEMATECH | |
188 | e-Handbook of Statistical Methods.</a> <a href="http://www.itl.nist.gov/div898/handbook/prc/section2/prc241.htm" target="_top">here</a>). | |
189 | Note also that the Clopper-Pearson calculation method (CP above) produces | |
190 | quite noticeably more pessimistic estimates than the Jeffreys Prior method | |
191 | (JP above). | |
192 | </p> | |
193 | <p> | |
194 | Compare that with the program output for 2000 trials: | |
195 | </p> | |
196 | <pre class="programlisting">___________________________________________ | |
197 | 2-Sided Confidence Limits For Success Ratio | |
198 | ___________________________________________ | |
199 | ||
200 | Number of Observations = 2000 | |
201 | Number of successes = 400 | |
202 | Sample frequency of occurrence = 0.2000000 | |
203 | ||
204 | ||
205 | _______________________________________________________________________ | |
206 | Confidence Lower CP Upper CP Lower JP Upper JP | |
207 | Value (%) Limit Limit Limit Limit | |
208 | _______________________________________________________________________ | |
209 | 50.000 0.19382 0.20638 0.19406 0.20613 | |
210 | 75.000 0.18965 0.21072 0.18990 0.21047 | |
211 | 90.000 0.18537 0.21528 0.18561 0.21503 | |
212 | 95.000 0.18267 0.21821 0.18291 0.21796 | |
213 | 99.000 0.17745 0.22400 0.17769 0.22374 | |
214 | 99.900 0.17150 0.23079 0.17173 0.23053 | |
215 | 99.990 0.16658 0.23657 0.16681 0.23631 | |
216 | 99.999 0.16233 0.24169 0.16256 0.24143 | |
217 | </pre> | |
218 | <p> | |
219 | Now even when the confidence level is very high, the limits are really | |
220 | quite close to the experimentally calculated value of 0.2. Furthermore | |
221 | the difference between the two calculation methods is now really quite | |
222 | small. | |
223 | </p> | |
224 | </div> | |
225 | <table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr> | |
226 | <td align="left"></td> | |
227 | <td align="right"><div class="copyright-footer">Copyright © 2006-2010, 2012-2014 Nikhar Agrawal, | |
228 | Anton Bikineev, Paul A. Bristow, Marco Guazzone, Christopher Kormanyos, Hubert | |
229 | Holin, Bruno Lalande, John Maddock, Jeremy Murphy, Johan Råde, Gautam Sewani, | |
230 | Benjamin Sobotta, Thijs van den Berg, Daryle Walker and Xiaogang Zhang<p> | |
231 | Distributed under the Boost Software License, Version 1.0. (See accompanying | |
232 | file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>) | |
233 | </p> | |
234 | </div></td> | |
235 | </tr></table> | |
236 | <hr> | |
237 | <div class="spirit-nav"> | |
238 | <a accesskey="p" href="binomial_quiz_example.html"><img src="../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../binom_eg.html"><img src="../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../index.html"><img src="../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="binom_size_eg.html"><img src="../../../../../../../../doc/src/images/next.png" alt="Next"></a> | |
239 | </div> | |
240 | </body> | |
241 | </html> |