1 <html>
2 <head>
3 <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
4 <title>Confidence Intervals on the Standard Deviation</title>
5 <link rel="stylesheet" href="../../../../math.css" type="text/css">
6 <meta name="generator" content="DocBook XSL Stylesheets V1.77.1">
7 <link rel="home" href="../../../../index.html" title="Math Toolkit 2.5.1">
8 <link rel="up" href="../cs_eg.html" title="Chi Squared Distribution Examples">
9 <link rel="prev" href="../cs_eg.html" title="Chi Squared Distribution Examples">
10 <link rel="next" href="chi_sq_test.html" title="Chi-Square Test for the Standard Deviation">
11 </head>
12 <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
13 <table cellpadding="2" width="100%"><tr>
14 <td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../../../boost.png"></td>
15 <td align="center"><a href="../../../../../../../../index.html">Home</a></td>
16 <td align="center"><a href="../../../../../../../../libs/libraries.htm">Libraries</a></td>
17 <td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
18 <td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
19 <td align="center"><a href="../../../../../../../../more/index.htm">More</a></td>
20 </tr></table>
21 <hr>
22 <div class="spirit-nav">
23 <a accesskey="p" href="../cs_eg.html"><img src="../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../cs_eg.html"><img src="../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../index.html"><img src="../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="chi_sq_test.html"><img src="../../../../../../../../doc/src/images/next.png" alt="Next"></a>
24 </div>
25 <div class="section">
26 <div class="titlepage"><div><div><h5 class="title">
27 <a name="math_toolkit.stat_tut.weg.cs_eg.chi_sq_intervals"></a><a class="link" href="chi_sq_intervals.html" title="Confidence Intervals on the Standard Deviation">Confidence
28 Intervals on the Standard Deviation</a>
29 </h5></div></div></div>
30 <p>
31 Once you have calculated the standard deviation for your data, a legitimate
32 question to ask is "How reliable is the calculated standard deviation?"
33 For this situation the Chi Squared distribution can be used to calculate
34 confidence intervals for the standard deviation.
35 </p>
36 <p>
37 The full example code and sample output are in <a href="../../../../../../example/chi_square_std_dev_test.cpp" target="_top">chi_square_std_dev_test.cpp</a>.
38 </p>
39 <p>
40 We'll begin by defining the procedure that will calculate and print out
41 the confidence intervals:
42 </p>
43 <pre class="programlisting"><span class="keyword">void</span> <span class="identifier">confidence_limits_on_std_deviation</span><span class="special">(</span>
44 <span class="keyword">double</span> <span class="identifier">Sd</span><span class="special">,</span> <span class="comment">// Sample Standard Deviation</span>
45 <span class="keyword">unsigned</span> <span class="identifier">N</span><span class="special">)</span> <span class="comment">// Sample size</span>
46 <span class="special">{</span>
47 </pre>
48 <p>
49 First we print out some general information:
50 </p>
51 <pre class="programlisting"><span class="identifier">cout</span> <span class="special">&lt;&lt;</span>
52 <span class="string">"________________________________________________\n"</span>
53 <span class="string">"2-Sided Confidence Limits For Standard Deviation\n"</span>
54 <span class="string">"________________________________________________\n\n"</span><span class="special">;</span>
55 <span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">7</span><span class="special">);</span>
56 <span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">left</span> <span class="special">&lt;&lt;</span> <span class="string">"Number of Observations"</span> <span class="special">&lt;&lt;</span> <span class="string">"= "</span> <span class="special">&lt;&lt;</span> <span class="identifier">N</span> <span class="special">&lt;&lt;</span> <span class="string">"\n"</span><span class="special">;</span>
57 <span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">left</span> <span class="special">&lt;&lt;</span> <span class="string">"Standard Deviation"</span> <span class="special">&lt;&lt;</span> <span class="string">"= "</span> <span class="special">&lt;&lt;</span> <span class="identifier">Sd</span> <span class="special">&lt;&lt;</span> <span class="string">"\n"</span><span class="special">;</span>
58 </pre>
59 <p>
60 and then define a table of significance levels for which we'll calculate
61 intervals:
62 </p>
63 <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">alpha</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">0.5</span><span class="special">,</span> <span class="number">0.25</span><span class="special">,</span> <span class="number">0.1</span><span class="special">,</span> <span class="number">0.05</span><span class="special">,</span> <span class="number">0.01</span><span class="special">,</span> <span class="number">0.001</span><span class="special">,</span> <span class="number">0.0001</span><span class="special">,</span> <span class="number">0.00001</span> <span class="special">};</span>
64 </pre>
65 <p>
66 The distribution we'll need to calculate the confidence intervals is
67 a Chi Squared distribution, with N-1 degrees of freedom:
68 </p>
69 <pre class="programlisting"><span class="identifier">chi_squared</span> <span class="identifier">dist</span><span class="special">(</span><span class="identifier">N</span> <span class="special">-</span> <span class="number">1</span><span class="special">);</span>
70 </pre>
71 <p>
72 For each value of alpha, the formula for the confidence interval is given
73 by:
74 </p>
75 <p>
76 <span class="inlinemediaobject"><img src="../../../../../equations/chi_squ_tut1.svg"></span>
77 </p>
78 <p>
79 Where <span class="inlinemediaobject"><img src="../../../../../equations/chi_squ_tut2.svg"></span> is the upper critical value, and <span class="inlinemediaobject"><img src="../../../../../equations/chi_squ_tut3.svg"></span> is
80 the lower critical value of the Chi Squared distribution.
81 </p>
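<p>
For readers who cannot see the equation images, the interval can be written
out roughly as follows. This is a transcription inferred from the surrounding
text and code, not the original image: the symbol names are assumptions, with
<code>s</code> standing for the sample standard deviation (<code>Sd</code> in
the code), <code>N</code> for the sample size and sigma for the true standard
deviation.
</p>

```latex
\sqrt{\frac{(N-1)\,s^{2}}{\chi^{2}_{(\alpha/2,\;N-1)}}}
\;\le\; \sigma \;\le\;
\sqrt{\frac{(N-1)\,s^{2}}{\chi^{2}_{(1-\alpha/2,\;N-1)}}}
```

<p>
In this notation the subscript <code>(alpha/2, N-1)</code> marks the upper
critical value, which corresponds to <code>quantile(complement(dist, alpha / 2))</code>,
and the subscript <code>(1-alpha/2, N-1)</code> marks the lower critical value,
which corresponds to <code>quantile(dist, alpha / 2)</code>.
</p>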
82 <p>
83 In code we begin by printing out a table header:
84 </p>
85 <pre class="programlisting"><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="string">"\n\n"</span>
86 <span class="string">"_____________________________________________\n"</span>
87 <span class="string">"Confidence Lower Upper\n"</span>
88 <span class="string">" Value (%) Limit Limit\n"</span>
89 <span class="string">"_____________________________________________\n"</span><span class="special">;</span>
90 </pre>
91 <p>
92 and then loop over the values of alpha and calculate the intervals for
93 each. Remember that the lower critical value is the same as the quantile,
94 and the upper critical value is the same as the quantile from the complement
95 of the probability:
96 </p>
97 <pre class="programlisting"><span class="keyword">for</span><span class="special">(</span><span class="keyword">unsigned</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> <span class="identifier">i</span> <span class="special">&lt;</span> <span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">alpha</span><span class="special">)/</span><span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">alpha</span><span class="special">[</span><span class="number">0</span><span class="special">]);</span> <span class="special">++</span><span class="identifier">i</span><span class="special">)</span>
98 <span class="special">{</span>
99 <span class="comment">// Confidence value:</span>
100 <span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">fixed</span> <span class="special">&lt;&lt;</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">3</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">10</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">right</span> <span class="special">&lt;&lt;</span> <span class="number">100</span> <span class="special">*</span> <span class="special">(</span><span class="number">1</span><span class="special">-</span><span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]);</span>
101 <span class="comment">// Calculate limits:</span>
102 <span class="keyword">double</span> <span class="identifier">lower_limit</span> <span class="special">=</span> <span class="identifier">sqrt</span><span class="special">((</span><span class="identifier">N</span> <span class="special">-</span> <span class="number">1</span><span class="special">)</span> <span class="special">*</span> <span class="identifier">Sd</span> <span class="special">*</span> <span class="identifier">Sd</span> <span class="special">/</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">dist</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">/</span> <span class="number">2</span><span class="special">)));</span>
103 <span class="keyword">double</span> <span class="identifier">upper_limit</span> <span class="special">=</span> <span class="identifier">sqrt</span><span class="special">((</span><span class="identifier">N</span> <span class="special">-</span> <span class="number">1</span><span class="special">)</span> <span class="special">*</span> <span class="identifier">Sd</span> <span class="special">*</span> <span class="identifier">Sd</span> <span class="special">/</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">dist</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">/</span> <span class="number">2</span><span class="special">));</span>
104 <span class="comment">// Print Limits:</span>
105 <span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">fixed</span> <span class="special">&lt;&lt;</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">5</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">15</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">right</span> <span class="special">&lt;&lt;</span> <span class="identifier">lower_limit</span><span class="special">;</span>
106 <span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">fixed</span> <span class="special">&lt;&lt;</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">5</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">15</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">right</span> <span class="special">&lt;&lt;</span> <span class="identifier">upper_limit</span> <span class="special">&lt;&lt;</span> <span class="identifier">endl</span><span class="special">;</span>
107 <span class="special">}</span>
108 <span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">endl</span><span class="special">;</span>
109 </pre>
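<p>
To check the calculation without Boost, the same limits can be reproduced
with the standard library alone. The sketch below is not the library's
implementation: it swaps in a textbook evaluation of the regularized
incomplete gamma function and a plain bisection search for the quantile. The
names <code>gamma_p</code>, <code>chi_squared_cdf</code>,
<code>chi_squared_quantile</code> and <code>std_dev_confidence_limits</code>
are hypothetical helpers introduced here for illustration.
</p>

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Regularized lower incomplete gamma function P(a, x), for a > 0.
// Power series for x < a + 1, continued fraction (modified Lentz) otherwise.
double gamma_p(double a, double x)
{
    if (x <= 0.0) return 0.0;
    const double eps = 1e-15;
    const double prefactor = std::exp(-x + a * std::log(x) - std::lgamma(a));
    if (x < a + 1.0)
    {
        double term = 1.0 / a, sum = term;
        for (int n = 1; n < 100000; ++n)
        {
            term *= x / (a + n);
            sum += term;
            if (std::fabs(term) < std::fabs(sum) * eps) break;
        }
        return sum * prefactor;
    }
    // Continued fraction for Q(a, x) = 1 - P(a, x).
    const double tiny = 1e-300;
    double b = x + 1.0 - a, c = 1.0 / tiny, d = 1.0 / b, h = d;
    for (int i = 1; i < 100000; ++i)
    {
        const double an = -i * (i - a);
        b += 2.0;
        d = an * d + b; if (std::fabs(d) < tiny) d = tiny;
        c = b + an / c; if (std::fabs(c) < tiny) c = tiny;
        d = 1.0 / d;
        const double delta = d * c;
        h *= delta;
        if (std::fabs(delta - 1.0) < eps) break;
    }
    return 1.0 - prefactor * h;
}

// CDF of the Chi Squared distribution with df degrees of freedom.
double chi_squared_cdf(double df, double x)
{
    return gamma_p(df / 2.0, x / 2.0);
}

// Quantile by bisection: the x for which chi_squared_cdf(df, x) == p, 0 < p < 1.
double chi_squared_quantile(double df, double p)
{
    double lo = 0.0, hi = df > 1.0 ? df : 1.0;
    while (chi_squared_cdf(df, hi) < p) hi *= 2.0;   // grow the bracket
    for (int i = 0; i < 200; ++i)
    {
        const double mid = 0.5 * (lo + hi);
        if (chi_squared_cdf(df, mid) < p) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}

// Two-sided confidence limits (lower, upper) for the standard deviation.
std::pair<double, double> std_dev_confidence_limits(double sd, unsigned n, double alpha)
{
    assert(n > 1 && alpha > 0.0 && alpha < 1.0);
    const double df = n - 1.0;
    const double upper_critical = chi_squared_quantile(df, 1.0 - alpha / 2.0);
    const double lower_critical = chi_squared_quantile(df, alpha / 2.0);
    return { std::sqrt(df * sd * sd / upper_critical),
             std::sqrt(df * sd * sd / lower_critical) };
}
```

<p>
Calling <code>std_dev_confidence_limits(0.006278908, 100, 0.05)</code> should
give limits close to 0.00551 and 0.00729. In real code the Boost
<code>quantile</code> calls shown above are the better choice, since they are
far more carefully tested over extreme arguments than this sketch.
</p>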
110 <p>
111 To see some example output we'll use the <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda3581.htm" target="_top">gear
112 data</a> from the <a href="http://www.itl.nist.gov/div898/handbook/" target="_top">NIST/SEMATECH
113 e-Handbook of Statistical Methods</a>. The data represents measurements
114 of gear diameter from a manufacturing process.
115 </p>
116 <pre class="programlisting">________________________________________________
117 2-Sided Confidence Limits For Standard Deviation
118 ________________________________________________
119
120 Number of Observations = 100
121 Standard Deviation = 0.006278908
122
123
124 _____________________________________________
125 Confidence Lower Upper
126 Value (%) Limit Limit
127 _____________________________________________
128 50.000 0.00601 0.00662
129 75.000 0.00582 0.00685
130 90.000 0.00563 0.00712
131 95.000 0.00551 0.00729
132 99.000 0.00530 0.00766
133 99.900 0.00507 0.00812
134 99.990 0.00489 0.00855
135 99.999 0.00474 0.00895
136 </pre>
137 <p>
138 So at the 95% confidence level we conclude that the standard deviation
139 is between 0.00551 and 0.00729.
140 </p>
141 <h5>
142 <a name="math_toolkit.stat_tut.weg.cs_eg.chi_sq_intervals.h0"></a>
143 <span class="phrase"><a name="math_toolkit.stat_tut.weg.cs_eg.chi_sq_intervals.confidence_intervals_as_a_functi"></a></span><a class="link" href="chi_sq_intervals.html#math_toolkit.stat_tut.weg.cs_eg.chi_sq_intervals.confidence_intervals_as_a_functi">Confidence
144 intervals as a function of the number of observations</a>
145 </h5>
146 <p>
147 Similarly, we can also list the confidence intervals for the standard
148 deviation at the common confidence level of 95%, for increasing numbers
149 of observations.
150 </p>
151 <p>
152 The standard deviation used to compute these values is unity, so the
153 limits listed are <span class="bold"><strong>multipliers</strong></span> for any
154 particular standard deviation. For example, given a standard deviation
155 of 0.0062789 as in the example above, for 100 observations the multiplier
156 is 0.8780, giving the lower confidence limit of 0.8780 * 0.0062789 = 0.00551.
157 </p>
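<p>
As a small illustration of using the multipliers, the sketch below scales a
pair of tabulated limits by an observed standard deviation. The function name
<code>scale_limits</code> is a hypothetical helper, not part of the example
program.
</p>

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Scale limits computed for a unit standard deviation (the multipliers)
// by an observed sample standard deviation.
std::pair<double, double> scale_limits(double sd,
                                       double lower_multiplier,
                                       double upper_multiplier)
{
    assert(sd >= 0.0);
    return { lower_multiplier * sd, upper_multiplier * sd };
}
```

<p>
With the multipliers for 100 observations,
<code>scale_limits(0.0062789, 0.8780, 1.1617)</code> gives approximately
0.00551 and 0.00729, matching the 95% row of the first table.
</p>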
158 <pre class="programlisting">____________________________________________________
159 Confidence level (two-sided) = 0.0500000
160 Standard Deviation = 1.0000000
161 ________________________________________
162 Observations Lower Upper
163 Limit Limit
164 ________________________________________
165 2 0.4461 31.9102
166 3 0.5207 6.2847
167 4 0.5665 3.7285
168 5 0.5991 2.8736
169 6 0.6242 2.4526
170 7 0.6444 2.2021
171 8 0.6612 2.0353
172 9 0.6755 1.9158
173 10 0.6878 1.8256
174 15 0.7321 1.5771
175 20 0.7605 1.4606
176 30 0.7964 1.3443
177 40 0.8192 1.2840
178 50 0.8353 1.2461
179 60 0.8476 1.2197
180 100 0.8780 1.1617
181 120 0.8875 1.1454
182 1000 0.9580 1.0459
183 10000 0.9863 1.0141
184 50000 0.9938 1.0062
185 100000 0.9956 1.0044
186 1000000 0.9986 1.0014
187 </pre>
188 <p>
189 With just 2 observations the limits are from <span class="bold"><strong>0.446</strong></span>
190 up to <span class="bold"><strong>31.9</strong></span>, so the standard deviation
191 might be about <span class="bold"><strong>half</strong></span> the observed value
192 up to <span class="bold"><strong>30 times</strong></span> the observed value!
193 </p>
194 <p>
195 Estimating a standard deviation with just a handful of values leaves
196 very great uncertainty, especially in the upper limit. Note in particular
197 how far the upper limit is skewed from the most likely standard deviation.
198 </p>
199 <p>
200 Even for 10 observations, normally considered a reasonable number, the
201 range is still from 0.69 to 1.8 (roughly 0.7 to 2), and is still
202 highly skewed, with an upper limit nearly <span class="bold"><strong>twice</strong></span>
203 the median.
204 </p>
205 <p>
206 When we have 1000 observations, the estimate of the standard deviation
207 is starting to look convincing, with a range from 0.96 to 1.05: now
208 nearly symmetrical, but still about plus or minus 5%.
209 </p>
210 <p>
211 Only when we have 10000 or more repeated observations can we start to
212 be reasonably confident (provided we are sure that other factors like
213 drift are not creeping in).
214 </p>
215 <p>
216 For 10000 observations, the interval is 0.99 to 1.01, finally a really
217 convincing plus or minus 1% confidence.
218 </p>
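<p>
The plus-or-minus figures quoted for large samples can be cross-checked with
a common large-sample normal approximation, in which the multipliers are
roughly 1 -/+ z / sqrt(2 (N - 1)). This approximation is not part of the
original example, and the function name <code>approx_multipliers</code> is
hypothetical. The default <code>z = 1.959964</code> is the two-sided 95%
normal critical value.
</p>

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Approximate two-sided multipliers for the standard deviation when N is large.
// Only a first-order approximation: it is symmetric and so cannot reproduce
// the skew seen for small N in the table above.
std::pair<double, double> approx_multipliers(unsigned n, double z = 1.959964)
{
    assert(n > 1);
    const double half_width = z / std::sqrt(2.0 * (n - 1.0));
    return { 1.0 - half_width, 1.0 + half_width };
}
```

<p>
For 10000 observations this gives about 0.986 and 1.014, close to the
tabulated 0.9863 and 1.0141. For 1000 observations it gives about 0.956 and
1.044 against the tabulated 0.9580 and 1.0459, so the approximation is
already slightly off there and should not be used for small samples.
</p>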
219 </div>
220 <table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
221 <td align="left"></td>
222 <td align="right"><div class="copyright-footer">Copyright &#169; 2006-2010, 2012-2014 Nikhar Agrawal,
223 Anton Bikineev, Paul A. Bristow, Marco Guazzone, Christopher Kormanyos, Hubert
224 Holin, Bruno Lalande, John Maddock, Jeremy Murphy, Johan R&#229;de, Gautam Sewani,
225 Benjamin Sobotta, Thijs van den Berg, Daryle Walker and Xiaogang Zhang<p>
226 Distributed under the Boost Software License, Version 1.0. (See accompanying
227 file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
228 </p>
229 </div></td>
230 </tr></table>
231 <hr>
232 <div class="spirit-nav">
233 <a accesskey="p" href="../cs_eg.html"><img src="../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../cs_eg.html"><img src="../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../index.html"><img src="../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="chi_sq_test.html"><img src="../../../../../../../../doc/src/images/next.png" alt="Next"></a>
234 </div>
235 </body>
236 </html>