Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | <html> |
2 | <head> | |
3 | <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII"> | |
4 | <title>Confidence Intervals on the Standard Deviation</title> | |
5 | <link rel="stylesheet" href="../../../../math.css" type="text/css"> | |
6 | <meta name="generator" content="DocBook XSL Stylesheets V1.77.1"> | |
7 | <link rel="home" href="../../../../index.html" title="Math Toolkit 2.5.1"> | |
8 | <link rel="up" href="../cs_eg.html" title="Chi Squared Distribution Examples"> | |
9 | <link rel="prev" href="../cs_eg.html" title="Chi Squared Distribution Examples"> | |
10 | <link rel="next" href="chi_sq_test.html" title="Chi-Square Test for the Standard Deviation"> | |
11 | </head> | |
12 | <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"> | |
13 | <table cellpadding="2" width="100%"><tr> | |
14 | <td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../../../boost.png"></td> | |
15 | <td align="center"><a href="../../../../../../../../index.html">Home</a></td> | |
16 | <td align="center"><a href="../../../../../../../../libs/libraries.htm">Libraries</a></td> | |
17 | <td align="center"><a href="http://www.boost.org/users/people.html">People</a></td> | |
18 | <td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td> | |
19 | <td align="center"><a href="../../../../../../../../more/index.htm">More</a></td> | |
20 | </tr></table> | |
21 | <hr> | |
22 | <div class="spirit-nav"> | |
23 | <a accesskey="p" href="../cs_eg.html"><img src="../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../cs_eg.html"><img src="../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../index.html"><img src="../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="chi_sq_test.html"><img src="../../../../../../../../doc/src/images/next.png" alt="Next"></a> | |
24 | </div> | |
25 | <div class="section"> | |
26 | <div class="titlepage"><div><div><h5 class="title"> | |
27 | <a name="math_toolkit.stat_tut.weg.cs_eg.chi_sq_intervals"></a><a class="link" href="chi_sq_intervals.html" title="Confidence Intervals on the Standard Deviation">Confidence | |
28 | Intervals on the Standard Deviation</a> | |
29 | </h5></div></div></div> | |
30 | <p> | |
31 | Once you have calculated the standard deviation for your data, a legitimate | |
32 | question to ask is "How reliable is the calculated standard deviation?". | |
33 | For this situation the Chi Squared distribution can be used to calculate | |
34 | confidence intervals for the standard deviation. | |
35 | </p> | |
36 | <p> | |
37 | The full example code & sample output is in <a href="../../../../../../example/chi_square_std_dev_test.cpp" target="_top">chi_square_std_dev_test.cpp</a>. | |
38 | </p> | |
39 | <p> | |
40 | We'll begin by defining the procedure that will calculate and print out | |
41 | the confidence intervals: | |
42 | </p> | |
43 | <pre class="programlisting"><span class="keyword">void</span> <span class="identifier">confidence_limits_on_std_deviation</span><span class="special">(</span> | |
44 | <span class="keyword">double</span> <span class="identifier">Sd</span><span class="special">,</span> <span class="comment">// Sample Standard Deviation</span> | |
45 | <span class="keyword">unsigned</span> <span class="identifier">N</span><span class="special">)</span> <span class="comment">// Sample size</span> | |
46 | <span class="special">{</span> | |
47 | </pre> | |
48 | <p> | |
49 | First we print out some general information: | |
50 | </p> | |
51 | <pre class="programlisting"><span class="identifier">cout</span> <span class="special"><<</span> | |
52 | <span class="string">"________________________________________________\n"</span> | |
53 | <span class="string">"2-Sided Confidence Limits For Standard Deviation\n"</span> | |
54 | <span class="string">"________________________________________________\n\n"</span><span class="special">;</span> | |
55 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">7</span><span class="special">);</span> | |
56 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">left</span> <span class="special"><<</span> <span class="string">"Number of Observations"</span> <span class="special"><<</span> <span class="string">"= "</span> <span class="special"><<</span> <span class="identifier">N</span> <span class="special"><<</span> <span class="string">"\n"</span><span class="special">;</span> | |
57 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">left</span> <span class="special"><<</span> <span class="string">"Standard Deviation"</span> <span class="special"><<</span> <span class="string">"= "</span> <span class="special"><<</span> <span class="identifier">Sd</span> <span class="special"><<</span> <span class="string">"\n"</span><span class="special">;</span> | |
58 | </pre> | |
59 | <p> | |
60 | and then define a table of significance levels for which we'll calculate | |
61 | intervals: | |
62 | </p> | |
63 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">alpha</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">0.5</span><span class="special">,</span> <span class="number">0.25</span><span class="special">,</span> <span class="number">0.1</span><span class="special">,</span> <span class="number">0.05</span><span class="special">,</span> <span class="number">0.01</span><span class="special">,</span> <span class="number">0.001</span><span class="special">,</span> <span class="number">0.0001</span><span class="special">,</span> <span class="number">0.00001</span> <span class="special">};</span> | |
64 | </pre> | |
65 | <p> | |
66 | The distribution we'll need to calculate the confidence intervals is | |
67 | a Chi Squared distribution, with N-1 degrees of freedom: | |
68 | </p> | |
69 | <pre class="programlisting"><span class="identifier">chi_squared</span> <span class="identifier">dist</span><span class="special">(</span><span class="identifier">N</span> <span class="special">-</span> <span class="number">1</span><span class="special">);</span> | |
70 | </pre> | |
71 | <p> | |
72 | For each value of alpha, the formula for the confidence interval is given | |
73 | by: | |
74 | </p> | |
75 | <p> | |
76 | <span class="inlinemediaobject"><img src="../../../../../equations/chi_squ_tut1.svg"></span> | |
77 | </p> | |
78 | <p> | |
79 | Where <span class="inlinemediaobject"><img src="../../../../../equations/chi_squ_tut2.svg"></span> is the upper critical value, and <span class="inlinemediaobject"><img src="../../../../../equations/chi_squ_tut3.svg"></span> is | |
80 | the lower critical value of the Chi Squared distribution. | |
81 | </p> | |
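<p>
For readers without the equation images, the interval can be written out as below. This is a reconstruction from the code in this section and the usual textbook form, not a copy of the original images: the symbol <code>s</code> (the sample standard deviation, <code>Sd</code> in the code) and the subscript notation are assumptions.
</p>

```latex
\sqrt{\frac{(N-1)\,s^{2}}{\chi^{2}_{(\alpha/2,\;N-1)}}}
\;\le\; \sigma \;\le\;
\sqrt{\frac{(N-1)\,s^{2}}{\chi^{2}_{(1-\alpha/2,\;N-1)}}}
```

<p>
Here the upper critical value (the first denominator) corresponds to <code>quantile(complement(dist, alpha / 2))</code> and the lower critical value (the second denominator) to <code>quantile(dist, alpha / 2)</code>. Dividing by the larger critical value gives the lower limit.
</p>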
82 | <p> | |
83 | In code we begin by printing out a table header: | |
84 | </p> | |
85 | <pre class="programlisting"><span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"\n\n"</span> | |
86 | <span class="string">"_____________________________________________\n"</span> | |
87 | <span class="string">"Confidence Lower Upper\n"</span> | |
88 | <span class="string">" Value (%) Limit Limit\n"</span> | |
89 | <span class="string">"_____________________________________________\n"</span><span class="special">;</span> | |
90 | </pre> | |
91 | <p> | |
92 | and then loop over the values of alpha and calculate the intervals for | |
93 | each: remember that the lower critical value is the same as the quantile, | |
94 | and the upper critical value is the same as the quantile from the complement | |
95 | of the probability: | |
96 | </p> | |
97 | <pre class="programlisting"><span class="keyword">for</span><span class="special">(</span><span class="keyword">unsigned</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> <span class="identifier">i</span> <span class="special"><</span> <span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">alpha</span><span class="special">)/</span><span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">alpha</span><span class="special">[</span><span class="number">0</span><span class="special">]);</span> <span class="special">++</span><span class="identifier">i</span><span class="special">)</span> | |
98 | <span class="special">{</span> | |
99 | <span class="comment">// Confidence value:</span> | |
100 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">fixed</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">3</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">10</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">right</span> <span class="special"><<</span> <span class="number">100</span> <span class="special">*</span> <span class="special">(</span><span class="number">1</span><span class="special">-</span><span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]);</span> | |
101 | <span class="comment">// Calculate limits:</span> | |
102 | <span class="keyword">double</span> <span class="identifier">lower_limit</span> <span class="special">=</span> <span class="identifier">sqrt</span><span class="special">((</span><span class="identifier">N</span> <span class="special">-</span> <span class="number">1</span><span class="special">)</span> <span class="special">*</span> <span class="identifier">Sd</span> <span class="special">*</span> <span class="identifier">Sd</span> <span class="special">/</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">dist</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">/</span> <span class="number">2</span><span class="special">)));</span> | |
103 | <span class="keyword">double</span> <span class="identifier">upper_limit</span> <span class="special">=</span> <span class="identifier">sqrt</span><span class="special">((</span><span class="identifier">N</span> <span class="special">-</span> <span class="number">1</span><span class="special">)</span> <span class="special">*</span> <span class="identifier">Sd</span> <span class="special">*</span> <span class="identifier">Sd</span> <span class="special">/</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">dist</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">/</span> <span class="number">2</span><span class="special">));</span> | |
104 | <span class="comment">// Print Limits:</span> | |
105 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">fixed</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">5</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">15</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">right</span> <span class="special"><<</span> <span class="identifier">lower_limit</span><span class="special">;</span> | |
106 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">fixed</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">5</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">15</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">right</span> <span class="special"><<</span> <span class="identifier">upper_limit</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
107 | <span class="special">}</span> | |
108 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
109 | </pre> | |
110 | <p> | |
111 | To see some example output we'll use the <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda3581.htm" target="_top">gear | |
112 | data</a> from the <a href="http://www.itl.nist.gov/div898/handbook/" target="_top">NIST/SEMATECH | |
113 | e-Handbook of Statistical Methods</a>. The data represents measurements | |
114 | of gear diameter from a manufacturing process. | |
115 | </p> | |
116 | <pre class="programlisting">________________________________________________ | |
117 | 2-Sided Confidence Limits For Standard Deviation | |
118 | ________________________________________________ | |
119 | ||
120 | Number of Observations = 100 | |
121 | Standard Deviation = 0.006278908 | |
122 | ||
123 | ||
124 | _____________________________________________ | |
125 | Confidence Lower Upper | |
126 | Value (%) Limit Limit | |
127 | _____________________________________________ | |
128 | 50.000 0.00601 0.00662 | |
129 | 75.000 0.00582 0.00685 | |
130 | 90.000 0.00563 0.00712 | |
131 | 95.000 0.00551 0.00729 | |
132 | 99.000 0.00530 0.00766 | |
133 | 99.900 0.00507 0.00812 | |
134 | 99.990 0.00489 0.00855 | |
135 | 99.999 0.00474 0.00895 | |
136 | </pre> | |
137 | <p> | |
138 | So at the 95% confidence level we conclude that the standard deviation | |
139 | is between 0.00551 and 0.00729. | |
140 | </p> | |
141 | <h5> | |
142 | <a name="math_toolkit.stat_tut.weg.cs_eg.chi_sq_intervals.h0"></a> | |
143 | <span class="phrase"><a name="math_toolkit.stat_tut.weg.cs_eg.chi_sq_intervals.confidence_intervals_as_a_functi"></a></span><a class="link" href="chi_sq_intervals.html#math_toolkit.stat_tut.weg.cs_eg.chi_sq_intervals.confidence_intervals_as_a_functi">Confidence | |
144 | intervals as a function of the number of observations</a> | |
145 | </h5> | |
146 | <p> | |
147 | Similarly, we can also list the confidence intervals for the standard | |
148 | deviation for the common confidence level of 95%, for increasing numbers | |
149 | of observations. | |
150 | </p> | |
151 | <p> | |
152 | The standard deviation used to compute these values is unity, so the | |
153 | limits listed are <span class="bold"><strong>multipliers</strong></span> for any | |
154 | particular standard deviation. For example, given a standard deviation | |
155 | of 0.0062789 as in the example above, for 100 observations the multiplier | |
156 | is 0.8780, giving the lower confidence limit of 0.8780 * 0.0062789 = 0.00551. | |
157 | </p> | |
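<p>
The scaling described above is plain multiplication, and can be sketched as follows. The function name <code>scale_limits</code> is hypothetical; the multipliers are assumed to be read from a table computed for unit standard deviation at the same sample size and confidence level.
</p>

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: converts multipliers computed for a unit standard
// deviation into confidence limits {lower, upper} for an observed one.
std::pair<double, double> scale_limits(
   double Sd,                 // Observed sample standard deviation
   double lower_multiplier,   // Tabulated lower limit for Sd = 1
   double upper_multiplier)   // Tabulated upper limit for Sd = 1
{
   assert(Sd >= 0 && lower_multiplier <= upper_multiplier);
   return std::make_pair(lower_multiplier * Sd, upper_multiplier * Sd);
}
```

<p>
For 100 observations, <code>scale_limits(0.0062789, 0.8780, 1.1617)</code> gives limits that agree with the 95% row of the earlier output to the precision shown there.
</p>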
158 | <pre class="programlisting">____________________________________________________ | |
159 | Confidence level (two-sided) = 0.0500000 | |
160 | Standard Deviation = 1.0000000 | |
161 | ________________________________________ | |
162 | Observations Lower Upper | |
163 | Limit Limit | |
164 | ________________________________________ | |
165 | 2 0.4461 31.9102 | |
166 | 3 0.5207 6.2847 | |
167 | 4 0.5665 3.7285 | |
168 | 5 0.5991 2.8736 | |
169 | 6 0.6242 2.4526 | |
170 | 7 0.6444 2.2021 | |
171 | 8 0.6612 2.0353 | |
172 | 9 0.6755 1.9158 | |
173 | 10 0.6878 1.8256 | |
174 | 15 0.7321 1.5771 | |
175 | 20 0.7605 1.4606 | |
176 | 30 0.7964 1.3443 | |
177 | 40 0.8192 1.2840 | |
178 | 50 0.8353 1.2461 | |
179 | 60 0.8476 1.2197 | |
180 | 100 0.8780 1.1617 | |
181 | 120 0.8875 1.1454 | |
182 | 1000 0.9580 1.0459 | |
183 | 10000 0.9863 1.0141 | |
184 | 50000 0.9938 1.0062 | |
185 | 100000 0.9956 1.0044 | |
186 | 1000000 0.9986 1.0014 | |
187 | </pre> | |
188 | <p> | |
189 | With just 2 observations the limits are from <span class="bold"><strong>0.446</strong></span> | |
190 | up to <span class="bold"><strong>31.9</strong></span>, so the standard deviation | |
191 | might be anywhere from about <span class="bold"><strong>half</strong></span> the observed value | |
192 | up to <span class="bold"><strong>30 times</strong></span> the observed value! | |
193 | </p> | |
194 | <p> | |
195 | Estimating a standard deviation with just a handful of values leaves | |
196 | a very great uncertainty, especially in the upper limit. Note | |
197 | how far the upper limit is skewed from the most likely standard deviation. | |
198 | </p> | |
199 | <p> | |
200 | Even for 10 observations, normally considered a reasonable number, the | |
201 | range is still from 0.69 to 1.8 (roughly 0.7 to 2), and is still | |
202 | highly skewed, with an upper limit nearly <span class="bold"><strong>twice</strong></span> | |
203 | the observed value. | |
204 | </p> | |
205 | <p> | |
206 | When we have 1000 observations, the estimate of the standard deviation | |
207 | is starting to look convincing, with a range from 0.96 to 1.05: now | |
208 | nearly symmetrical, but still about + or - 5%. | |
209 | </p> | |
210 | <p> | |
211 | Only when we have 10000 or more repeated observations can we start to | |
212 | be reasonably confident (provided we are sure that other factors like | |
213 | drift are not creeping in). | |
214 | </p> | |
215 | <p> | |
216 | For 10000 observations, the interval is 0.99 to 1.01: finally a really | |
217 | convincing + or - 1% confidence. | |
218 | </p> | |
219 | </div> | |
220 | <table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr> | |
221 | <td align="left"></td> | |
222 | <td align="right"><div class="copyright-footer">Copyright © 2006-2010, 2012-2014 Nikhar Agrawal, | |
223 | Anton Bikineev, Paul A. Bristow, Marco Guazzone, Christopher Kormanyos, Hubert | |
224 | Holin, Bruno Lalande, John Maddock, Jeremy Murphy, Johan Råde, Gautam Sewani, | |
225 | Benjamin Sobotta, Thijs van den Berg, Daryle Walker and Xiaogang Zhang<p> | |
226 | Distributed under the Boost Software License, Version 1.0. (See accompanying | |
227 | file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>) | |
228 | </p> | |
229 | </div></td> | |
230 | </tr></table> | |
231 | <hr> | |
232 | <div class="spirit-nav"> | |
233 | <a accesskey="p" href="../cs_eg.html"><img src="../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../cs_eg.html"><img src="../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../index.html"><img src="../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="chi_sq_test.html"><img src="../../../../../../../../doc/src/images/next.png" alt="Next"></a> | |
234 | </div> | |
235 | </body> | |
236 | </html> |