<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<title>Calculating confidence intervals on the mean with the Students-t distribution</title>
<link rel="stylesheet" href="../../../../math.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.77.1">
<link rel="home" href="../../../../index.html" title="Math Toolkit 2.5.1">
<link rel="up" href="../st_eg.html" title="Student's t Distribution Examples">
<link rel="prev" href="../st_eg.html" title="Student's t Distribution Examples">
<link rel="next" href="tut_mean_test.html" title='Testing a sample mean for difference from a "true" mean'>
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../../../boost.png"></td>
<td align="center"><a href="../../../../../../../../index.html">Home</a></td>
<td align="center"><a href="../../../../../../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../../../../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="../st_eg.html"><img src="../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../st_eg.html"><img src="../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../index.html"><img src="../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="tut_mean_test.html"><img src="../../../../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h5 class="title">
<a name="math_toolkit.stat_tut.weg.st_eg.tut_mean_intervals"></a><a class="link" href="tut_mean_intervals.html" title="Calculating confidence intervals on the mean with the Students-t distribution">Calculating
 confidence intervals on the mean with the Students-t distribution</a>
</h5></div></div></div>
<p>
 Suppose you have a sample mean and wish to know what confidence
 interval you can place on it. Colloquially: "I want an interval
 that I can be P% sure contains the true mean". (On a technical point,
 note that the interval either contains the true mean or it does not:
 the meaning of the confidence level is subtly different from this colloquialism.
 More background information can be found on the <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm" target="_top">NIST
 site</a>.)
 </p>
<p>
 The formula for the interval can be expressed as:
 </p>
<p>
 <span class="inlinemediaobject"><img src="../../../../../equations/dist_tutorial4.svg"></span>
 </p>
<p>
 where <span class="emphasis"><em>Y<sub>s</sub></em></span> is the sample mean, <span class="emphasis"><em>s</em></span>
 is the sample standard deviation, <span class="emphasis"><em>N</em></span> is the sample
 size, <span class="emphasis"><em>&#945;</em></span> is the desired significance level and <span class="emphasis"><em>t<sub>(&#945;/2,N-1)</sub></em></span>
 is the upper critical value of the Students-t distribution with <span class="emphasis"><em>N-1</em></span>
 degrees of freedom.
 </p>
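<p>
 For readers who cannot view the equation image, the interval can be written
 out in LaTeX. This is a transcription inferred from the symbol definitions
 above and from the code in this section, not a copy of the image itself:
 </p>

```latex
Y_s \pm t_{(\alpha/2,\,N-1)} \, \frac{s}{\sqrt{N}}
```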
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../../../../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top">
<p>
 The quantity &#945; is the maximum acceptable risk of falsely rejecting the
 null hypothesis. The smaller the value of &#945;, the greater the strength
 of the test.
 </p>
<p>
 The confidence level of the test is defined as 1 - &#945;, and is often expressed
 as a percentage. So, for example, a significance level of 0.05 is equivalent
 to a 95% confidence level. Refer to <a href="http://www.itl.nist.gov/div898/handbook/prc/section1/prc14.htm" target="_top">"What
 are confidence intervals?"</a> in the <a href="http://www.itl.nist.gov/div898/handbook/" target="_top">NIST/SEMATECH
 e-Handbook of Statistical Methods</a> for more information.
 </p>
</td></tr>
</table></div>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../../../../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
 The usual assumptions of <a href="http://en.wikipedia.org/wiki/Independent_and_identically-distributed_random_variables" target="_top">independent
 and identically distributed (i.i.d.)</a> variables and <a href="http://en.wikipedia.org/wiki/Normal_distribution" target="_top">normal
 distribution</a> of course apply here, as they do in other examples.
 </p></td></tr>
</table></div>
<p>
 From the formula, it should be clear that:
 </p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
 The width of the confidence interval decreases as the sample size
 increases.
 </li>
<li class="listitem">
 The width increases as the standard deviation increases.
 </li>
<li class="listitem">
 The width increases as the <span class="emphasis"><em>confidence level increases</em></span>
 (0.5 towards 0.99999 - stronger).
 </li>
<li class="listitem">
 The width increases as the <span class="emphasis"><em>significance level decreases</em></span>
 (0.5 towards 0.00000...01 - stronger).
 </li>
</ul></div>
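<p>
 The first two points can be checked numerically with a few lines of standard
 C++. This is only a minimal sketch: the helper name <span class="emphasis"><em>half_width</em></span>
 is hypothetical, and the critical value is passed in by hand rather than
 computed (1.972 is the 95% value for 195 observations quoted in the sample
 output later in this section).
 </p>

```cpp
#include <cmath>

// Hypothetical helper: half-width of the two-sided interval,
// t * s / sqrt(N), for a critical value t supplied by the caller.
double half_width(double t, double sd, unsigned n)
{
    return t * sd / std::sqrt(static_cast<double>(n));
}
```

<p>
 With the critical value held fixed, quadrupling the sample size halves the
 result and doubling the standard deviation doubles it. In practice the
 critical value itself also shrinks a little as the sample size grows, which
 narrows the interval further.
 </p>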
<p>
 The following example code is taken from the example program <a href="../../../../../../example/students_t_single_sample.cpp" target="_top">students_t_single_sample.cpp</a>.
 </p>
<p>
 We'll begin by defining a procedure to calculate intervals for various
 confidence levels; the procedure will print these out as a table:
 </p>
<pre class="programlisting"><span class="comment">// Needed includes:</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">math</span><span class="special">/</span><span class="identifier">distributions</span><span class="special">/</span><span class="identifier">students_t</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iomanip</span><span class="special">&gt;</span>
<span class="comment">// Bring everything into global namespace for ease of use:</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">std</span><span class="special">;</span>

<span class="keyword">void</span> <span class="identifier">confidence_limits_on_mean</span><span class="special">(</span>
  <span class="keyword">double</span> <span class="identifier">Sm</span><span class="special">,</span> <span class="comment">// Sm = Sample Mean.</span>
  <span class="keyword">double</span> <span class="identifier">Sd</span><span class="special">,</span> <span class="comment">// Sd = Sample Standard Deviation.</span>
  <span class="keyword">unsigned</span> <span class="identifier">Sn</span><span class="special">)</span> <span class="comment">// Sn = Sample Size.</span>
<span class="special">{</span>
  <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">std</span><span class="special">;</span>
  <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">;</span>

  <span class="comment">// Print out general info:</span>
  <span class="identifier">cout</span> <span class="special">&lt;&lt;</span>
     <span class="string">"__________________________________\n"</span>
     <span class="string">"2-Sided Confidence Limits For Mean\n"</span>
     <span class="string">"__________________________________\n\n"</span><span class="special">;</span>
  <span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">7</span><span class="special">);</span>
  <span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">left</span> <span class="special">&lt;&lt;</span> <span class="string">"Number of Observations"</span> <span class="special">&lt;&lt;</span> <span class="string">"= "</span> <span class="special">&lt;&lt;</span> <span class="identifier">Sn</span> <span class="special">&lt;&lt;</span> <span class="string">"\n"</span><span class="special">;</span>
  <span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">left</span> <span class="special">&lt;&lt;</span> <span class="string">"Mean"</span> <span class="special">&lt;&lt;</span> <span class="string">"= "</span> <span class="special">&lt;&lt;</span> <span class="identifier">Sm</span> <span class="special">&lt;&lt;</span> <span class="string">"\n"</span><span class="special">;</span>
  <span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">40</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">left</span> <span class="special">&lt;&lt;</span> <span class="string">"Standard Deviation"</span> <span class="special">&lt;&lt;</span> <span class="string">"= "</span> <span class="special">&lt;&lt;</span> <span class="identifier">Sd</span> <span class="special">&lt;&lt;</span> <span class="string">"\n"</span><span class="special">;</span>
</pre>
<p>
 We'll define a table of significance/risk levels for which we'll compute
 intervals:
 </p>
<pre class="programlisting"><span class="keyword">double</span> <span class="identifier">alpha</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">0.5</span><span class="special">,</span> <span class="number">0.25</span><span class="special">,</span> <span class="number">0.1</span><span class="special">,</span> <span class="number">0.05</span><span class="special">,</span> <span class="number">0.01</span><span class="special">,</span> <span class="number">0.001</span><span class="special">,</span> <span class="number">0.0001</span><span class="special">,</span> <span class="number">0.00001</span> <span class="special">};</span>
</pre>
<p>
 Note that these are the complements of the confidence/probability levels
 (0.5, 0.75, 0.9 .. 0.99999).
 </p>
<p>
 Next we'll declare the distribution object we'll need; note that the
 <span class="emphasis"><em>degrees of freedom</em></span> parameter is the sample size
 less one:
 </p>
<pre class="programlisting"><span class="identifier">students_t</span> <span class="identifier">dist</span><span class="special">(</span><span class="identifier">Sn</span> <span class="special">-</span> <span class="number">1</span><span class="special">);</span>
</pre>
<p>
 Most of what follows in the program is pretty printing, so let's focus
 on the calculation of the interval. First we need the t-statistic (strictly,
 the upper critical value of <span class="emphasis"><em>t</em></span>), computed
 using the <span class="emphasis"><em>quantile</em></span> function and our significance
 level. Note that since the significance levels are the complement of
 the probability, we have to wrap the arguments in a call to <span class="emphasis"><em>complement(...)</em></span>:
 </p>
<pre class="programlisting"><span class="keyword">double</span> <span class="identifier">T</span> <span class="special">=</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">dist</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">/</span> <span class="number">2</span><span class="special">));</span>
</pre>
<p>
 Note that alpha was divided by two, since we'll be calculating both the
 upper and lower bounds: had we been interested in a single sided interval
 then we would have omitted this step.
 </p>
<p>
 Now to complete the picture, we'll get the (one-sided) width of the interval
 from the t-statistic by multiplying by the standard deviation, and dividing
 by the square root of the sample size:
 </p>
<pre class="programlisting"><span class="keyword">double</span> <span class="identifier">w</span> <span class="special">=</span> <span class="identifier">T</span> <span class="special">*</span> <span class="identifier">Sd</span> <span class="special">/</span> <span class="identifier">sqrt</span><span class="special">(</span><span class="keyword">double</span><span class="special">(</span><span class="identifier">Sn</span><span class="special">));</span>
</pre>
<p>
 The two-sided interval is then the sample mean plus and minus this width.
 </p>
<p>
 Apart from some more pretty-printing, that completes the procedure.
 </p>
<p>
 Let's take a look at some sample output, first using the <a href="http://www.itl.nist.gov/div898/handbook/eda/section4/eda428.htm" target="_top">Heat
 flow data</a> from the NIST site. The data set was collected by Bob
 Zarr of NIST in January, 1990 from a heat flow meter calibration and
 stability analysis. The corresponding dataplot output for this test can
 be found in <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm" target="_top">section
 3.5.2</a> of the <a href="http://www.itl.nist.gov/div898/handbook/" target="_top">NIST/SEMATECH
 e-Handbook of Statistical Methods</a>.
 </p>
<pre class="programlisting">__________________________________
2-Sided Confidence Limits For Mean
__________________________________

Number of Observations                  = 195
Mean                                    = 9.26146
Standard Deviation                      = 0.02278881


___________________________________________________________________
Confidence       T           Interval          Lower          Upper
 Value (%)     Value          Width            Limit          Limit
___________________________________________________________________
    50.000     0.676       1.103e-003        9.26036        9.26256
    75.000     1.154       1.883e-003        9.25958        9.26334
    90.000     1.653       2.697e-003        9.25876        9.26416
    95.000     1.972       3.219e-003        9.25824        9.26468
    99.000     2.601       4.245e-003        9.25721        9.26571
    99.900     3.341       5.453e-003        9.25601        9.26691
    99.990     3.973       6.484e-003        9.25498        9.26794
    99.999     4.537       7.404e-003        9.25406        9.26886
</pre>
<p>
 As you can see, the large sample size (195) and small standard deviation
 (0.023) have combined to give very narrow intervals: even at the 99.999%
 level the limits are 9.25406 and 9.26886, so we can be very confident
 that the true mean is close to 9.26.
 </p>
<p>
 For comparison, the data for the next example are taken from <span class="emphasis"><em>P. K. Hou,
 O. W. Lau &amp; M. C. Wong, Analyst (1983) vol. 108, p 64, and from Statistics
 for Analytical Chemistry, 3rd ed. (1994), pp 54-55 J. C. Miller and J.
 N. Miller, Ellis Horwood ISBN 0 13 0309907.</em></span> The values result
 from the determination of mercury by cold-vapour atomic absorption.
 </p>
<pre class="programlisting">__________________________________
2-Sided Confidence Limits For Mean
__________________________________

Number of Observations                  = 3
Mean                                    = 37.8000000
Standard Deviation                      = 0.9643650


___________________________________________________________________
Confidence       T           Interval          Lower          Upper
 Value (%)     Value          Width            Limit          Limit
___________________________________________________________________
    50.000     0.816            0.455       37.34539       38.25461
    75.000     1.604            0.893       36.90717       38.69283
    90.000     2.920            1.626       36.17422       39.42578
    95.000     4.303            2.396       35.40438       40.19562
    99.000     9.925            5.526       32.27408       43.32592
    99.900    31.599           17.594       20.20639       55.39361
    99.990    99.992           55.673      -17.87346       93.47346
    99.999   316.225          176.067     -138.26683      213.86683
</pre>
<p>
 This time the fact that there are only three measurements leads to much
 wider intervals; indeed the intervals are so wide that it's hard to be very
 confident in the location of the mean.
 </p>
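<p>
 With only two degrees of freedom the Students-t quantile has a simple closed
 form, which makes it easy to see why the critical values in the table above
 grow so quickly. The sketch below is not how Boost.Math computes quantiles:
 it relies on the standard closed-form CDF for two degrees of freedom,
 F(t) = 1/2 + t / (2 sqrt(2 + t^2)), and the function name is hypothetical.
 </p>

```cpp
#include <cmath>

// Upper critical value of Students-t with exactly 2 degrees of freedom
// for a two-sided interval at significance level alpha (0 < alpha < 1).
// Inverting F(t) = 1/2 + t / (2 * sqrt(2 + t*t)) at p = 1 - alpha/2
// gives t = (2p - 1) / sqrt(2 p (1 - p)).
double two_sided_critical_df2(double alpha)
{
    double q = alpha / 2;  // upper-tail probability
    double p = 1 - q;      // cumulative probability
    return (2 * p - 1) / std::sqrt(2 * p * q);
}
```

<p>
 As alpha approaches zero this expression behaves like 1 / sqrt(alpha), so
 each tenfold reduction in alpha multiplies the critical value by about
 3.16. That matches the jump from 31.599 to 99.992 to 316.225 in the last
 three rows of the table.
 </p>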
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright &#169; 2006-2010, 2012-2014 Nikhar Agrawal,
 Anton Bikineev, Paul A. Bristow, Marco Guazzone, Christopher Kormanyos, Hubert
 Holin, Bruno Lalande, John Maddock, Jeremy Murphy, Johan R&#229;de, Gautam Sewani,
 Benjamin Sobotta, Thijs van den Berg, Daryle Walker and Xiaogang Zhang<p>
 Distributed under the Boost Software License, Version 1.0. (See accompanying
 file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
 </p>
</div></td>
</tr></table>
</body>
</html>