   1 <html>
   2 <head>
   3 <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
   4 <title>Estimating how large a sample size would have to become in order to give a significant Students-t test result with a single sample test</title>
   5 <link rel="stylesheet" href="../../../../math.css" type="text/css">
   6 <meta name="generator" content="DocBook XSL Stylesheets V1.77.1">
   7 <link rel="home" href="../../../../index.html" title="Math Toolkit 2.5.1">
   8 <link rel="up" href="../st_eg.html" title="Student's t Distribution Examples">
   9 <link rel="prev" href="tut_mean_test.html" title='Testing a sample mean for difference from a "true" mean'>
  10 <link rel="next" href="two_sample_students_t.html" title="Comparing the means of two samples with the Students-t test">
  11 </head>
  12 <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
  13 <table cellpadding="2" width="100%"><tr>
  14 <td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../../../boost.png"></td>
  15 <td align="center"><a href="../../../../../../../../index.html">Home</a></td>
  16 <td align="center"><a href="../../../../../../../../libs/libraries.htm">Libraries</a></td>
  17 <td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
  18 <td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
  19 <td align="center"><a href="../../../../../../../../more/index.htm">More</a></td>
  20 </tr></table>
  21 <hr>
  22 <div class="spirit-nav">
  23 <a accesskey="p" href="tut_mean_test.html"><img src="../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../st_eg.html"><img src="../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../index.html"><img src="../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="two_sample_students_t.html"><img src="../../../../../../../../doc/src/images/next.png" alt="Next"></a>
  24 </div>
  25 <div class="section">
  26 <div class="titlepage"><div><div><h5 class="title">
  27 <a name="math_toolkit.stat_tut.weg.st_eg.tut_mean_size"></a><a class="link" href="tut_mean_size.html" title="Estimating how large a sample size would have to become in order to give a significant Students-t test result with a single sample test">Estimating
  28           how large a sample size would have to become in order to give a significant
  29           Students-t test result with a single sample test</a>
  30 </h5></div></div></div>
<p>
            Imagine you have conducted a Students-t test on a single sample in order
            to check for systematic errors in your measurements, and that the
            result is borderline. At this point you might go off and collect more
            data, but it is prudent to first ask "How much more?". The parameter
            estimators of the <code class="computeroutput"><span class="identifier">students_t_distribution</span></code>
            class can provide this information.
          </p>
<p>
            This section is based on the example code in <a href="../../../../../../example/students_t_single_sample.cpp" target="_top">students_t_single_sample.cpp</a>.
            We begin by defining a procedure that prints a table of estimated
            sample sizes for various confidence levels:
          </p>
  44 <pre class="programlisting"><span class="comment">// Needed includes:</span>
  45 <span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">math</span><span class="special">/</span><span class="identifier">distributions</span><span class="special">/</span><span class="identifier">students_t</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
  46 <span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
  47 <span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iomanip</span><span class="special">&gt;</span>
  48 <span class="comment">// Bring everything into global namespace for ease of use:</span>
  49 <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">;</span>
  50 <span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">std</span><span class="special">;</span>
  51
  52 <span class="keyword">void</span> <span class="identifier">single_sample_find_df</span><span class="special">(</span>
  53    <span class="keyword">double</span> <span class="identifier">M</span><span class="special">,</span>          <span class="comment">// M = true mean.</span>
  54    <span class="keyword">double</span> <span class="identifier">Sm</span><span class="special">,</span>         <span class="comment">// Sm = Sample Mean.</span>
  55    <span class="keyword">double</span> <span class="identifier">Sd</span><span class="special">)</span>         <span class="comment">// Sd = Sample Standard Deviation.</span>
  56 <span class="special">{</span>
  57 </pre>
  58 <p>
  59             Next we define a table of significance levels:
  60           </p>
  61 <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">alpha</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">0.5</span><span class="special">,</span> <span class="number">0.25</span><span class="special">,</span> <span class="number">0.1</span><span class="special">,</span> <span class="number">0.05</span><span class="special">,</span> <span class="number">0.01</span><span class="special">,</span> <span class="number">0.001</span><span class="special">,</span> <span class="number">0.0001</span><span class="special">,</span> <span class="number">0.00001</span> <span class="special">};</span>
  62 </pre>
  63 <p>
  64             Printing out the table of sample sizes required for various confidence
  65             levels begins with the table header:
  66           </p>
  67 <pre class="programlisting"><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="string">"\n\n"</span>
  68         <span class="string">"_______________________________________________________________\n"</span>
  69         <span class="string">"Confidence       Estimated          Estimated\n"</span>
  70         <span class="string">" Value (%)      Sample Size        Sample Size\n"</span>
  71         <span class="string">"              (one sided test)    (two sided test)\n"</span>
  72         <span class="string">"_______________________________________________________________\n"</span><span class="special">;</span>
  73 </pre>
<p>
            And now the important part: the sample sizes required. Class <code class="computeroutput"><span class="identifier">students_t_distribution</span></code> has a static
            member function <code class="computeroutput"><span class="identifier">find_degrees_of_freedom</span></code>
            that estimates the number of degrees of freedom, and hence the sample
            size, needed for the test to give a significant result.
          </p>
<p>
            The first argument is the difference between the means that you wish
            to be able to detect; here it is the absolute value of the difference
            between the sample mean and the true mean.
          </p>
<p>
            Then come two probability values: alpha and beta. Alpha is the maximum
            acceptable risk of rejecting the null-hypothesis when it is in fact true
            (a type I error). Beta is the maximum acceptable risk of failing to reject
            the null-hypothesis when it is in fact false (a type II error). Note that
            for a two-sided test, alpha must be divided by 2.
          </p>
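<p>
            As a minimal sketch of the bookkeeping described above, the two small
            helpers below show how a significance level maps to the confidence value
            printed in the table, and to the alpha argument used for a one-sided
            or a two-sided test. The names <code class="computeroutput">confidence_percent</code>
            and <code class="computeroutput">alpha_argument</code> are hypothetical
            and are not part of Boost.Math.
          </p>

```cpp
#include <cassert>

// Hypothetical helper: confidence value (in percent) that corresponds
// to a significance level alpha, e.g. alpha = 0.05 gives 95%.
double confidence_percent(double alpha)
{
   assert(alpha > 0.0 && alpha < 1.0);
   return 100.0 * (1.0 - alpha);
}

// Hypothetical helper: the value to pass as the alpha argument.
// A two-sided test splits the risk between both tails, so alpha is halved.
double alpha_argument(double alpha, bool two_sided)
{
   assert(alpha > 0.0 && alpha < 1.0);
   return two_sided ? alpha / 2.0 : alpha;
}
```

<p>
            For example, a significance level of 0.05 corresponds to a confidence
            value of 95%, and is passed as 0.05 for a one-sided test or as 0.025
            for a two-sided test.
          </p>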
  92 <p>
  93             The final parameter of the function is the standard deviation of the
  94             sample.
  95           </p>
  96 <p>
  97             In this example, we assume that alpha and beta are the same, and call
  98             <code class="computeroutput"><span class="identifier">find_degrees_of_freedom</span></code>
  99             twice: once with alpha for a one-sided test, and once with alpha/2 for
 100             a two-sided test.
 101           </p>
 102 <pre class="programlisting">   <span class="keyword">for</span><span class="special">(</span><span class="keyword">unsigned</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> <span class="identifier">i</span> <span class="special">&lt;</span> <span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">alpha</span><span class="special">)/</span><span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">alpha</span><span class="special">[</span><span class="number">0</span><span class="special">]);</span> <span class="special">++</span><span class="identifier">i</span><span class="special">)</span>
 103    <span class="special">{</span>
 104       <span class="comment">// Confidence value:</span>
 105       <span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">fixed</span> <span class="special">&lt;&lt;</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">3</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">10</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">right</span> <span class="special">&lt;&lt;</span> <span class="number">100</span> <span class="special">*</span> <span class="special">(</span><span class="number">1</span><span class="special">-</span><span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]);</span>
 106       <span class="comment">// calculate df for single sided test:</span>
 107       <span class="keyword">double</span> <span class="identifier">df</span> <span class="special">=</span> <span class="identifier">students_t</span><span class="special">::</span><span class="identifier">find_degrees_of_freedom</span><span class="special">(</span>
 108          <span class="identifier">fabs</span><span class="special">(</span><span class="identifier">M</span> <span class="special">-</span> <span class="identifier">Sm</span><span class="special">),</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">],</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">],</span> <span class="identifier">Sd</span><span class="special">);</span>
 109       <span class="comment">// convert to sample size:</span>
 110       <span class="keyword">double</span> <span class="identifier">size</span> <span class="special">=</span> <span class="identifier">ceil</span><span class="special">(</span><span class="identifier">df</span><span class="special">)</span> <span class="special">+</span> <span class="number">1</span><span class="special">;</span>
 111       <span class="comment">// Print size:</span>
 112       <span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">fixed</span> <span class="special">&lt;&lt;</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">0</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">16</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">right</span> <span class="special">&lt;&lt;</span> <span class="identifier">size</span><span class="special">;</span>
 113       <span class="comment">// calculate df for two sided test:</span>
 114       <span class="identifier">df</span> <span class="special">=</span> <span class="identifier">students_t</span><span class="special">::</span><span class="identifier">find_degrees_of_freedom</span><span class="special">(</span>
 115          <span class="identifier">fabs</span><span class="special">(</span><span class="identifier">M</span> <span class="special">-</span> <span class="identifier">Sm</span><span class="special">),</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]/</span><span class="number">2</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">],</span> <span class="identifier">Sd</span><span class="special">);</span>
 116       <span class="comment">// convert to sample size:</span>
 117       <span class="identifier">size</span> <span class="special">=</span> <span class="identifier">ceil</span><span class="special">(</span><span class="identifier">df</span><span class="special">)</span> <span class="special">+</span> <span class="number">1</span><span class="special">;</span>
 118       <span class="comment">// Print size:</span>
 119       <span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">fixed</span> <span class="special">&lt;&lt;</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">0</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">16</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">right</span> <span class="special">&lt;&lt;</span> <span class="identifier">size</span> <span class="special">&lt;&lt;</span> <span class="identifier">endl</span><span class="special">;</span>
 120    <span class="special">}</span>
 121    <span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">endl</span><span class="special">;</span>
 122 <span class="special">}</span>
 123 </pre>
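<p>
            The "convert to sample size" step in the listing above deserves a
            word of explanation. A single-sample test on n measurements has n - 1
            degrees of freedom, so the estimated degrees of freedom are rounded up to
            a whole number and one is added. The sketch below isolates that step;
            the name <code class="computeroutput">sample_size_from_df</code> is
            hypothetical, and it assumes the value returned by
            <code class="computeroutput">find_degrees_of_freedom</code> need not be
            a whole number, as the use of <code class="computeroutput">ceil</code>
            in the listing suggests.
          </p>

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: convert an estimated number of degrees of freedom
// into a sample size for a single-sample test (n = df + 1).
// Rounding up keeps the estimate conservative, since a fraction of a
// measurement cannot be taken.
double sample_size_from_df(double df)
{
   assert(df > 0.0);
   return std::ceil(df) + 1.0;
}
```

<p>
            For instance, an estimate of 9.3 degrees of freedom would be reported
            as a sample size of 11.
          </p>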
<p>
            Let's now look at some sample output using data taken from <span class="emphasis"><em>P. K. Hou,
            O. W. Lau &amp; M. C. Wong, Analyst (1983) vol. 108, p 64, and from J. C. Miller
            and J. N. Miller, Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55,
            Ellis Horwood, ISBN 0 13 0309907.</em></span> The values result
            from the determination of mercury by cold-vapour atomic absorption.
          </p>
<p>
            Only three measurements were made, and the Students-t test in the previous
            section gave a borderline result, so this example shows how many measurements
            would need to be made:
          </p>
 136 <pre class="programlisting">_____________________________________________________________
 137 Estimated sample sizes required for various confidence levels
 138 _____________________________________________________________
 139
 140 True Mean                               =  38.90000
 141 Sample Mean                             =  37.80000
 142 Sample Standard Deviation               =  0.96437
 143
 144
 145 _______________________________________________________________
 146 Confidence       Estimated          Estimated
 147  Value (%)      Sample Size        Sample Size
 148               (one sided test)    (two sided test)
 149 _______________________________________________________________
 150     75.000               3               4
 151     90.000               7               9
 152     95.000              11              13
 153     99.000              20              22
 154     99.900              35              37
 155     99.990              50              53
 156     99.999              66              68
 157 </pre>
<p>
            So in this case, many more measurements would have had to be made: at the
            95% level, for example, the table gives 13 measurements in total for a
            two-sided test, compared with the three actually made.
          </p>
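<p>
            As a rough cross-check of the table that does not depend on Boost, the
            sketch below uses the classic large-sample normal approximation,
            n = ((z_alpha + z_beta) * Sd / difference)^2, rounded up. This is a
            swapped-in technique and is not how
            <code class="computeroutput">find_degrees_of_freedom</code> works: it
            replaces Students-t quantiles with normal quantiles, so it should be
            expected to give somewhat smaller sample sizes when the sample is small.
            All function names here are hypothetical.
          </p>

```cpp
#include <cassert>
#include <cmath>

// Upper-tail probability of the standard normal distribution: P(Z > z).
double normal_upper_tail(double z)
{
   return 0.5 * std::erfc(z / std::sqrt(2.0));
}

// Hypothetical helper: find z such that P(Z > z) == p, by bisection.
double normal_upper_quantile(double p)
{
   assert(p > 0.0 && p < 1.0);
   double lo = -40.0, hi = 40.0;
   for (int i = 0; i < 200; ++i)
   {
      double mid = 0.5 * (lo + hi);
      if (normal_upper_tail(mid) > p)
         lo = mid;   // tail probability still too large: move right
      else
         hi = mid;   // tail probability too small (or equal): move left
   }
   return 0.5 * (lo + hi);
}

// Normal-approximation estimate of the sample size needed to detect
// `difference` with risks alpha and beta, given standard deviation sd.
// For a two-sided test, pass alpha / 2 as the alpha argument.
double approx_sample_size(double difference, double alpha, double beta, double sd)
{
   assert(difference > 0.0 && sd > 0.0);
   double z = normal_upper_quantile(alpha) + normal_upper_quantile(beta);
   double root_n = z * sd / difference;
   return std::ceil(root_n * root_n);
}
```

<p>
            With the data used here (a difference of 1.1 and a standard deviation of
            0.96437) and alpha = beta = 0.05, <code class="computeroutput">approx_sample_size</code>
            gives 9 for the one-sided case. That is a little below the 11 shown in the
            table, which is the direction one would expect from an approximation that
            ignores the heavier tails of the Students-t distribution.
          </p>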
 162 </div>
 163 <table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
 164 <td align="left"></td>
 165 <td align="right"><div class="copyright-footer">Copyright &#169; 2006-2010, 2012-2014 Nikhar Agrawal,
 166       Anton Bikineev, Paul A. Bristow, Marco Guazzone, Christopher Kormanyos, Hubert
 167       Holin, Bruno Lalande, John Maddock, Jeremy Murphy, Johan R&#229;de, Gautam Sewani,
 168       Benjamin Sobotta, Thijs van den Berg, Daryle Walker and Xiaogang Zhang<p>
 169         Distributed under the Boost Software License, Version 1.0. (See accompanying
 170         file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
 171       </p>
 172 </div></td>
 173 </tr></table>
 174 <hr>
 175 <div class="spirit-nav">
 176 <a accesskey="p" href="tut_mean_test.html"><img src="../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../st_eg.html"><img src="../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../index.html"><img src="../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="two_sample_students_t.html"><img src="../../../../../../../../doc/src/images/next.png" alt="Next"></a>
 177 </div>
 178 </body>
 179 </html>