]>
Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | <html> |
2 | <head> | |
3 | <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII"> | |
4 | <title>Some Miscellaneous Examples of the Normal (Gaussian) Distribution</title> | |
5 | <link rel="stylesheet" href="../../../../math.css" type="text/css"> | |
6 | <meta name="generator" content="DocBook XSL Stylesheets V1.77.1"> | |
7 | <link rel="home" href="../../../../index.html" title="Math Toolkit 2.5.1"> | |
8 | <link rel="up" href="../normal_example.html" title="Normal Distribution Examples"> | |
9 | <link rel="prev" href="../normal_example.html" title="Normal Distribution Examples"> | |
10 | <link rel="next" href="../inverse_chi_squared_eg.html" title="Inverse Chi-Squared Distribution Bayes Example"> | |
11 | </head> | |
12 | <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"> | |
13 | <table cellpadding="2" width="100%"><tr> | |
14 | <td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../../../boost.png"></td> | |
15 | <td align="center"><a href="../../../../../../../../index.html">Home</a></td> | |
16 | <td align="center"><a href="../../../../../../../../libs/libraries.htm">Libraries</a></td> | |
17 | <td align="center"><a href="http://www.boost.org/users/people.html">People</a></td> | |
18 | <td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td> | |
19 | <td align="center"><a href="../../../../../../../../more/index.htm">More</a></td> | |
20 | </tr></table> | |
21 | <hr> | |
22 | <div class="spirit-nav"> | |
23 | <a accesskey="p" href="../normal_example.html"><img src="../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../normal_example.html"><img src="../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../index.html"><img src="../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="../inverse_chi_squared_eg.html"><img src="../../../../../../../../doc/src/images/next.png" alt="Next"></a> | |
24 | </div> | |
25 | <div class="section"> | |
26 | <div class="titlepage"><div><div><h5 class="title"> | |
27 | <a name="math_toolkit.stat_tut.weg.normal_example.normal_misc"></a><a class="link" href="normal_misc.html" title="Some Miscellaneous Examples of the Normal (Gaussian) Distribution">Some | |
28 | Miscellaneous Examples of the Normal (Gaussian) Distribution</a> | |
29 | </h5></div></div></div> | |
30 | <p> | |
31 | The sample program <a href="../../../../../../example/normal_misc_examples.cpp" target="_top">normal_misc_examples.cpp</a> | |
32 | illustrates their use. | |
33 | </p> | |
34 | <h5> | |
35 | <a name="math_toolkit.stat_tut.weg.normal_example.normal_misc.h0"></a> | |
36 | <span class="phrase"><a name="math_toolkit.stat_tut.weg.normal_example.normal_misc.traditional_tables"></a></span><a class="link" href="normal_misc.html#math_toolkit.stat_tut.weg.normal_example.normal_misc.traditional_tables">Traditional | |
37 | Tables</a> | |
38 | </h5> | |
39 | <p> | |
40 | First we need some includes to access the normal distribution (and some | |
41 | std output of course). | |
42 | </p> | |
43 | <pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">math</span><span class="special">/</span><span class="identifier">distributions</span><span class="special">/</span><span class="identifier">normal</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span> <span class="comment">// for normal_distribution</span> | |
44 | <span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">::</span><span class="identifier">normal</span><span class="special">;</span> <span class="comment">// typedef provides default type is double.</span> | |
45 | ||
46 | <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span> | |
47 | <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">;</span> <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span> <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">left</span><span class="special">;</span> <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">showpoint</span><span class="special">;</span> <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">noshowpoint</span><span class="special">;</span> | |
48 | <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iomanip</span><span class="special">></span> | |
49 | <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">setw</span><span class="special">;</span> <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">setprecision</span><span class="special">;</span> | |
50 | <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">limits</span><span class="special">></span> | |
51 | <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">numeric_limits</span><span class="special">;</span> | |
52 | ||
53 | <span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span> | |
54 | <span class="special">{</span> | |
55 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Example: Normal distribution, Miscellaneous Applications."</span><span class="special">;</span> | |
56 | ||
57 | <span class="keyword">try</span> | |
58 | <span class="special">{</span> | |
59 | <span class="special">{</span> <span class="comment">// Traditional tables and values.</span> | |
60 | </pre> | |
61 | <p> | |
62 | Let's start by printing some traditional tables. | |
63 | </p> | |
64 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">step</span> <span class="special">=</span> <span class="number">1.</span><span class="special">;</span> <span class="comment">// in z</span> | |
65 | <span class="keyword">double</span> <span class="identifier">range</span> <span class="special">=</span> <span class="number">4</span><span class="special">;</span> <span class="comment">// min and max z = -range to +range.</span> | |
66 | <span class="keyword">int</span> <span class="identifier">precision</span> <span class="special">=</span> <span class="number">17</span><span class="special">;</span> <span class="comment">// traditional tables are only computed to much lower precision.</span> | |
67 | <span class="comment">// but std::numeric_limits<double>::max_digits10; on new Standard Libraries gives</span> | |
68 | <span class="comment">// 17, the maximum number of digits that can possibly be significant.</span> | |
69 | <span class="comment">// std::numeric_limits<double>::digits10; == 15 is number of guaranteed digits,</span> | |
70 | <span class="comment">// the other two digits being 'noisy'.</span> | |
71 | ||
72 | <span class="comment">// Construct a standard normal distribution s</span> | |
73 | <span class="identifier">normal</span> <span class="identifier">s</span><span class="special">;</span> <span class="comment">// (default mean = zero, and standard deviation = unity)</span> | |
74 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Standard normal distribution, mean = "</span><span class="special"><<</span> <span class="identifier">s</span><span class="special">.</span><span class="identifier">mean</span><span class="special">()</span> | |
75 | <span class="special"><<</span> <span class="string">", standard deviation = "</span> <span class="special"><<</span> <span class="identifier">s</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
76 | </pre> | |
77 | <p> | |
78 | First the probability distribution function (pdf). | |
79 | </p> | |
80 | <pre class="programlisting"><span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Probability distribution function values"</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
81 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">" z "</span> <span class="string">" pdf "</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
82 | <span class="identifier">cout</span><span class="special">.</span><span class="identifier">precision</span><span class="special">(</span><span class="number">5</span><span class="special">);</span> | |
83 | <span class="keyword">for</span> <span class="special">(</span><span class="keyword">double</span> <span class="identifier">z</span> <span class="special">=</span> <span class="special">-</span><span class="identifier">range</span><span class="special">;</span> <span class="identifier">z</span> <span class="special"><</span> <span class="identifier">range</span> <span class="special">+</span> <span class="identifier">step</span><span class="special">;</span> <span class="identifier">z</span> <span class="special">+=</span> <span class="identifier">step</span><span class="special">)</span> | |
84 | <span class="special">{</span> | |
85 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">left</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">3</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">6</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">z</span> <span class="special"><<</span> <span class="string">" "</span> | |
86 | <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="identifier">precision</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">12</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">pdf</span><span class="special">(</span><span class="identifier">s</span><span class="special">,</span> <span class="identifier">z</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
87 | <span class="special">}</span> | |
88 | <span class="identifier">cout</span><span class="special">.</span><span class="identifier">precision</span><span class="special">(</span><span class="number">6</span><span class="special">);</span> <span class="comment">// default</span> | |
89 | </pre> | |
90 | <p> | |
91 | And the area under the normal curve from -∞ up to z, the cumulative distribution | |
92 | function (cdf). | |
93 | </p> | |
94 | <pre class="programlisting"><span class="comment">// For a standard normal distribution</span> | |
95 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Standard normal mean = "</span><span class="special"><<</span> <span class="identifier">s</span><span class="special">.</span><span class="identifier">mean</span><span class="special">()</span> | |
96 | <span class="special"><<</span> <span class="string">", standard deviation = "</span> <span class="special"><<</span> <span class="identifier">s</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
97 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Integral (area under the curve) from - infinity up to z "</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
98 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">" z "</span> <span class="string">" cdf "</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
99 | <span class="keyword">for</span> <span class="special">(</span><span class="keyword">double</span> <span class="identifier">z</span> <span class="special">=</span> <span class="special">-</span><span class="identifier">range</span><span class="special">;</span> <span class="identifier">z</span> <span class="special"><</span> <span class="identifier">range</span> <span class="special">+</span> <span class="identifier">step</span><span class="special">;</span> <span class="identifier">z</span> <span class="special">+=</span> <span class="identifier">step</span><span class="special">)</span> | |
100 | <span class="special">{</span> | |
101 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">left</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">3</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">6</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">z</span> <span class="special"><<</span> <span class="string">" "</span> | |
102 | <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="identifier">precision</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">12</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">s</span><span class="special">,</span> <span class="identifier">z</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
103 | <span class="special">}</span> | |
104 | <span class="identifier">cout</span><span class="special">.</span><span class="identifier">precision</span><span class="special">(</span><span class="number">6</span><span class="special">);</span> <span class="comment">// default</span> | |
105 | </pre> | |
106 | <p> | |
107 | And all this you can do with a nanoscopic amount of work compared to | |
108 | the team of <span class="bold"><strong>human computers</strong></span> toiling | |
109 | with Milton Abramovitz and Irene Stegen at the US National Bureau of | |
110 | Standards (now <a href="http://www.nist.gov" target="_top">NIST</a>). Starting | |
111 | in 1938, their "Handbook of Mathematical Functions with Formulas, | |
112 | Graphs and Mathematical Tables", was eventually published in 1964, | |
113 | and has been reprinted numerous times since. (A major replacement is | |
114 | planned at <a href="http://dlmf.nist.gov" target="_top">Digital Library of Mathematical | |
115 | Functions</a>). | |
116 | </p> | |
117 | <p> | |
118 | Pretty-printing a traditional 2-dimensional table is left as an exercise | |
119 | for the student, but why bother now that the Math Toolkit lets you write | |
120 | </p> | |
121 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">z</span> <span class="special">=</span> <span class="number">2.</span><span class="special">;</span> | |
122 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Area for z = "</span> <span class="special"><<</span> <span class="identifier">z</span> <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">s</span><span class="special">,</span> <span class="identifier">z</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> <span class="comment">// to get the area for z.</span> | |
123 | </pre> | |
124 | <p> | |
125 | Correspondingly, we can obtain the traditional 'critical' values for | |
126 | significance levels. For the 95% confidence level, the significance level | |
127 | usually called alpha, is 0.05 = 1 - 0.95 (for a one-sided test), so we | |
128 | can write | |
129 | </p> | |
130 | <pre class="programlisting"> <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"95% of area has a z below "</span> <span class="special"><<</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">s</span><span class="special">,</span> <span class="number">0.95</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
131 | <span class="comment">// 95% of area has a z below 1.64485</span> | |
132 | </pre> | |
133 | <p> | |
134 | and a two-sided test (a comparison between two levels, rather than a | |
135 | one-sided test) | |
136 | </p> | |
137 | <pre class="programlisting"> <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"95% of area has a z between "</span> <span class="special"><<</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">s</span><span class="special">,</span> <span class="number">0.975</span><span class="special">)</span> | |
138 | <span class="special"><<</span> <span class="string">" and "</span> <span class="special"><<</span> <span class="special">-</span><span class="identifier">quantile</span><span class="special">(</span><span class="identifier">s</span><span class="special">,</span> <span class="number">0.975</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
139 | <span class="comment">// 95% of area has a z between 1.95996 and -1.95996</span> | |
140 | </pre> | |
141 | <p> | |
142 | First, define a table of significance levels: these are the probabilities | |
143 | that the true occurrence frequency lies outside the calculated interval. | |
144 | </p> | |
145 | <p> | |
146 | It is convenient to have an alpha level for the probability that z lies | |
147 | outside just one standard deviation. This will not be some nice neat | |
148 | number like 0.05, but we can easily calculate it, | |
149 | </p> | |
150 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">alpha1</span> <span class="special">=</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">s</span><span class="special">,</span> <span class="special">-</span><span class="number">1</span><span class="special">)</span> <span class="special">*</span> <span class="number">2</span><span class="special">;</span> <span class="comment">// 0.3173105078629142</span> | |
151 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">17</span><span class="special">)</span> <span class="special"><<</span> <span class="string">"Significance level for z == 1 is "</span> <span class="special"><<</span> <span class="identifier">alpha1</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
152 | </pre> | |
153 | <p> | |
154 | and place in our array of favorite alpha values. | |
155 | </p> | |
156 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">alpha</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span><span class="number">0.3173105078629142</span><span class="special">,</span> <span class="comment">// z for 1 standard deviation.</span> | |
157 | <span class="number">0.20</span><span class="special">,</span> <span class="number">0.1</span><span class="special">,</span> <span class="number">0.05</span><span class="special">,</span> <span class="number">0.01</span><span class="special">,</span> <span class="number">0.001</span><span class="special">,</span> <span class="number">0.0001</span><span class="special">,</span> <span class="number">0.00001</span> <span class="special">};</span> | |
158 | </pre> | |
159 | <p> | |
160 | Confidence value as % is (1 - alpha) * 100 (so alpha 0.05 == 95% confidence) | |
161 | that the true occurrence frequency lies <span class="bold"><strong>inside</strong></span> | |
162 | the calculated interval. | |
163 | </p> | |
164 | <pre class="programlisting"><span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"level of significance (alpha)"</span> <span class="special"><<</span> <span class="identifier">setprecision</span><span class="special">(</span><span class="number">4</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
165 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"2-sided 1 -sided z(alpha) "</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
166 | <span class="keyword">for</span> <span class="special">(</span><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> <span class="identifier">i</span> <span class="special"><</span> <span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">alpha</span><span class="special">)/</span><span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">alpha</span><span class="special">[</span><span class="number">0</span><span class="special">]);</span> <span class="special">++</span><span class="identifier">i</span><span class="special">)</span> | |
167 | <span class="special">{</span> | |
168 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">15</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">15</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">/</span><span class="number">2</span> <span class="special"><<</span> <span class="identifier">setw</span><span class="special">(</span><span class="number">10</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">s</span><span class="special">,</span> <span class="identifier">alpha</span><span class="special">[</span><span class="identifier">i</span><span class="special">]/</span><span class="number">2</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
169 | <span class="comment">// Use quantile(complement(s, alpha[i]/2)) to avoid potential loss of accuracy from quantile(s, 1 - alpha[i]/2)</span> | |
170 | <span class="special">}</span> | |
171 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
172 | </pre> | |
173 | <p> | |
174 | Notice the distinction between one-sided (also called one-tailed) where | |
175 | we are using a > <span class="bold"><strong>or</strong></span> < test (and | |
176 | not both) and considering the area of the tail (integral) from z up to | |
177 | +∞, and a two-sided test where we are using two > <span class="bold"><strong>and</strong></span> | |
178 | < tests, and thus considering two tails, from -∞ up to z low and z high | |
179 | up to +∞. | |
180 | </p> | |
181 | <p> | |
182 | So the 2-sided values alpha[i] are calculated using alpha[i]/2. | |
183 | </p> | |
184 | <p> | |
185 | If we consider a simple example of alpha = 0.05, then for a two-sided | |
186 | test, the lower tail area from -∞ up to -1.96 is 0.025 (alpha/2) and the | |
187 | upper tail area from +z up to +1.96 is also 0.025 (alpha/2), and the | |
188 | area between -1.96 up to 12.96 is alpha = 0.95. and the sum of the two | |
189 | tails is 0.025 + 0.025 = 0.05, | |
190 | </p> | |
191 | <h5> | |
192 | <a name="math_toolkit.stat_tut.weg.normal_example.normal_misc.h1"></a> | |
193 | <span class="phrase"><a name="math_toolkit.stat_tut.weg.normal_example.normal_misc.standard_deviations_either_side_"></a></span><a class="link" href="normal_misc.html#math_toolkit.stat_tut.weg.normal_example.normal_misc.standard_deviations_either_side_">Standard | |
194 | deviations either side of the Mean</a> | |
195 | </h5> | |
196 | <p> | |
197 | Armed with the cumulative distribution function, we can easily calculate | |
198 | the easy to remember proportion of values that lie within 1, 2 and 3 | |
199 | standard deviations from the mean. | |
200 | </p> | |
201 | <pre class="programlisting"><span class="identifier">cout</span><span class="special">.</span><span class="identifier">precision</span><span class="special">(</span><span class="number">3</span><span class="special">);</span> | |
202 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">showpoint</span> <span class="special"><<</span> <span class="string">"cdf(s, s.standard_deviation()) = "</span> | |
203 | <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">s</span><span class="special">,</span> <span class="identifier">s</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">())</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> <span class="comment">// from -infinity to 1 sd</span> | |
204 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"cdf(complement(s, s.standard_deviation())) = "</span> | |
205 | <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">s</span><span class="special">,</span> <span class="identifier">s</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
206 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Fraction 1 standard deviation within either side of mean is "</span> | |
207 | <span class="special"><<</span> <span class="number">1</span> <span class="special">-</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">s</span><span class="special">,</span> <span class="identifier">s</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()))</span> <span class="special">*</span> <span class="number">2</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
208 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Fraction 2 standard deviations within either side of mean is "</span> | |
209 | <span class="special"><<</span> <span class="number">1</span> <span class="special">-</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">s</span><span class="special">,</span> <span class="number">2</span> <span class="special">*</span> <span class="identifier">s</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()))</span> <span class="special">*</span> <span class="number">2</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
210 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Fraction 3 standard deviations within either side of mean is "</span> | |
211 | <span class="special"><<</span> <span class="number">1</span> <span class="special">-</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">s</span><span class="special">,</span> <span class="number">3</span> <span class="special">*</span> <span class="identifier">s</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()))</span> <span class="special">*</span> <span class="number">2</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
212 | </pre> | |
213 | <p> | |
214 | To a useful precision, the 1, 2 & 3 percentages are 68, 95 and 99.7, | |
215 | and these are worth memorising as useful 'rules of thumb', as, for example, | |
216 | in <a href="http://en.wikipedia.org/wiki/Standard_deviation" target="_top">standard | |
217 | deviation</a>: | |
218 | </p> | |
219 | <pre class="programlisting">Fraction 1 standard deviation within either side of mean is 0.683 | |
220 | Fraction 2 standard deviations within either side of mean is 0.954 | |
221 | Fraction 3 standard deviations within either side of mean is 0.997 | |
222 | </pre> | |
223 | <p> | |
224 | We could of course get some really accurate values for these <a href="http://en.wikipedia.org/wiki/Confidence_interval" target="_top">confidence | |
225 | intervals</a> by using cout.precision(15); | |
226 | </p> | |
227 | <pre class="programlisting">Fraction 1 standard deviation within either side of mean is 0.682689492137086 | |
228 | Fraction 2 standard deviations within either side of mean is 0.954499736103642 | |
229 | Fraction 3 standard deviations within either side of mean is 0.997300203936740 | |
230 | </pre> | |
231 | <p> | |
232 | But before you get too excited about this impressive precision, don't | |
233 | forget that the <span class="bold"><strong>confidence intervals of the standard | |
234 | deviation</strong></span> are surprisingly wide, especially if you have estimated | |
235 | the standard deviation from only a few measurements. | |
236 | </p> | |
237 | <h5> | |
238 | <a name="math_toolkit.stat_tut.weg.normal_example.normal_misc.h2"></a> | |
239 | <span class="phrase"><a name="math_toolkit.stat_tut.weg.normal_example.normal_misc.some_simple_examples"></a></span><a class="link" href="normal_misc.html#math_toolkit.stat_tut.weg.normal_example.normal_misc.some_simple_examples">Some | |
240 | simple examples</a> | |
241 | </h5> | |
242 | <h5> | |
243 | <a name="math_toolkit.stat_tut.weg.normal_example.normal_misc.h3"></a> | |
244 | <span class="phrase"><a name="math_toolkit.stat_tut.weg.normal_example.normal_misc.life_of_light_bulbs"></a></span><a class="link" href="normal_misc.html#math_toolkit.stat_tut.weg.normal_example.normal_misc.life_of_light_bulbs">Life | |
245 | of light bulbs</a> | |
246 | </h5> | |
247 | <p> | |
248 | Examples from K. Krishnamoorthy, Handbook of Statistical Distributions | |
249 | with Applications, ISBN 1 58488 635 8, page 125... implemented using | |
250 | the Math Toolkit library. | |
251 | </p> | |
252 | <p> | |
253 | A few very simple examples are shown here: | |
254 | </p> | |
255 | <pre class="programlisting"><span class="comment">// K. Krishnamoorthy, Handbook of Statistical Distributions with Applications,</span> | |
256 | <span class="comment">// ISBN 1 58488 635 8, page 125, example 10.3.5</span> | |
257 | </pre> | |
258 | <p> | |
259 | Mean lifespan of 100 W bulbs is 1100 h with standard deviation of 100 | |
260 | h. Assuming, perhaps with little evidence and much faith, that the distribution | |
261 | is normal, we construct a normal distribution called <span class="emphasis"><em>bulbs</em></span> | |
262 | with these values: | |
263 | </p> | |
264 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">mean_life</span> <span class="special">=</span> <span class="number">1100.</span><span class="special">;</span> | |
265 | <span class="keyword">double</span> <span class="identifier">life_standard_deviation</span> <span class="special">=</span> <span class="number">100.</span><span class="special">;</span> | |
266 | <span class="identifier">normal</span> <span class="identifier">bulbs</span><span class="special">(</span><span class="identifier">mean_life</span><span class="special">,</span> <span class="identifier">life_standard_deviation</span><span class="special">);</span> | |
267 | <span class="keyword">double</span> <span class="identifier">expected_life</span> <span class="special">=</span> <span class="number">1000.</span><span class="special">;</span> | |
268 | </pre> | |
269 | <p> | |
270 | The we can use the Cumulative distribution function to predict fractions | |
271 | (or percentages, if * 100) that will last various lifetimes. | |
272 | </p> | |
273 | <pre class="programlisting"><span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Fraction of bulbs that will last at best (<=) "</span> <span class="comment">// P(X <= 1000)</span> | |
274 | <span class="special"><<</span> <span class="identifier">expected_life</span> <span class="special"><<</span> <span class="string">" is "</span><span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">bulbs</span><span class="special">,</span> <span class="identifier">expected_life</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
275 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Fraction of bulbs that will last at least (>) "</span> <span class="comment">// P(X > 1000)</span> | |
276 | <span class="special"><<</span> <span class="identifier">expected_life</span> <span class="special"><<</span> <span class="string">" is "</span><span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">bulbs</span><span class="special">,</span> <span class="identifier">expected_life</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
277 | <span class="keyword">double</span> <span class="identifier">min_life</span> <span class="special">=</span> <span class="number">900</span><span class="special">;</span> | |
278 | <span class="keyword">double</span> <span class="identifier">max_life</span> <span class="special">=</span> <span class="number">1200</span><span class="special">;</span> | |
279 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Fraction of bulbs that will last between "</span> | |
280 | <span class="special"><<</span> <span class="identifier">min_life</span> <span class="special"><<</span> <span class="string">" and "</span> <span class="special"><<</span> <span class="identifier">max_life</span> <span class="special"><<</span> <span class="string">" is "</span> | |
281 | <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">bulbs</span><span class="special">,</span> <span class="identifier">max_life</span><span class="special">)</span> <span class="comment">// P(X <= 1200)</span> | |
282 | <span class="special">-</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">bulbs</span><span class="special">,</span> <span class="identifier">min_life</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> <span class="comment">// P(X <= 900)</span> | |
283 | </pre> | |
284 | <div class="note"><table border="0" summary="Note"> | |
285 | <tr> | |
286 | <td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../../../../../../doc/src/images/note.png"></td> | |
287 | <th align="left">Note</th> | |
288 | </tr> | |
289 | <tr><td align="left" valign="top"><p> | |
290 | Real-life failures are often very ab-normal, with a significant number | |
291 | that 'dead-on-arrival' or suffer failure very early in their life: | |
292 | the lifetime of the survivors of 'early mortality' may be well described | |
293 | by the normal distribution. | |
294 | </p></td></tr> | |
295 | </table></div> | |
296 | <h5> | |
297 | <a name="math_toolkit.stat_tut.weg.normal_example.normal_misc.h4"></a> | |
298 | <span class="phrase"><a name="math_toolkit.stat_tut.weg.normal_example.normal_misc.how_many_onions"></a></span><a class="link" href="normal_misc.html#math_toolkit.stat_tut.weg.normal_example.normal_misc.how_many_onions">How | |
299 | many onions?</a> | |
300 | </h5> | |
301 | <p> | |
302 | Weekly demand for 5 lb sacks of onions at a store is normally distributed | |
303 | with mean 140 sacks and standard deviation 10. | |
304 | </p> | |
305 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">mean</span> <span class="special">=</span> <span class="number">140.</span><span class="special">;</span> <span class="comment">// sacks per week.</span> | |
306 | <span class="keyword">double</span> <span class="identifier">standard_deviation</span> <span class="special">=</span> <span class="number">10</span><span class="special">;</span> | |
307 | <span class="identifier">normal</span> <span class="identifier">sacks</span><span class="special">(</span><span class="identifier">mean</span><span class="special">,</span> <span class="identifier">standard_deviation</span><span class="special">);</span> | |
308 | ||
309 | <span class="keyword">double</span> <span class="identifier">stock</span> <span class="special">=</span> <span class="number">160.</span><span class="special">;</span> <span class="comment">// per week.</span> | |
310 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Percentage of weeks overstocked "</span> | |
311 | <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">sacks</span><span class="special">,</span> <span class="identifier">stock</span><span class="special">)</span> <span class="special">*</span> <span class="number">100.</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> <span class="comment">// P(X <=160)</span> | |
312 | <span class="comment">// Percentage of weeks overstocked 97.7</span> | |
313 | </pre> | |
314 | <p> | |
315 | So there will be lots of mouldy onions! So we should be able to say what | |
316 | stock level will meet demand 95% of the weeks. | |
317 | </p> | |
318 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">stock_95</span> <span class="special">=</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">sacks</span><span class="special">,</span> <span class="number">0.95</span><span class="special">);</span> | |
319 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Store should stock "</span> <span class="special"><<</span> <span class="keyword">int</span><span class="special">(</span><span class="identifier">stock_95</span><span class="special">)</span> <span class="special"><<</span> <span class="string">" sacks to meet 95% of demands."</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
320 | </pre> | |
321 | <p> | |
322 | And it is easy to estimate how to meet 80% of demand, and waste even | |
323 | less. | |
324 | </p> | |
325 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">stock_80</span> <span class="special">=</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">sacks</span><span class="special">,</span> <span class="number">0.80</span><span class="special">);</span> | |
326 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Store should stock "</span> <span class="special"><<</span> <span class="keyword">int</span><span class="special">(</span><span class="identifier">stock_80</span><span class="special">)</span> <span class="special"><<</span> <span class="string">" sacks to meet 8 out of 10 demands."</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
327 | </pre> | |
328 | <h5> | |
329 | <a name="math_toolkit.stat_tut.weg.normal_example.normal_misc.h5"></a> | |
330 | <span class="phrase"><a name="math_toolkit.stat_tut.weg.normal_example.normal_misc.packing_beef"></a></span><a class="link" href="normal_misc.html#math_toolkit.stat_tut.weg.normal_example.normal_misc.packing_beef">Packing | |
331 | beef</a> | |
332 | </h5> | |
333 | <p> | |
334 | A machine is set to pack 3 kg of ground beef per pack. Over a long period | |
335 | of time it is found that the average packed was 3 kg with a standard | |
336 | deviation of 0.1 kg. Assuming the packing is normally distributed, we | |
337 | can find the fraction (or %) of packages that weigh more than 3.1 kg. | |
338 | </p> | |
339 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">mean</span> <span class="special">=</span> <span class="number">3.</span><span class="special">;</span> <span class="comment">// kg</span> | |
340 | <span class="keyword">double</span> <span class="identifier">standard_deviation</span> <span class="special">=</span> <span class="number">0.1</span><span class="special">;</span> <span class="comment">// kg</span> | |
341 | <span class="identifier">normal</span> <span class="identifier">packs</span><span class="special">(</span><span class="identifier">mean</span><span class="special">,</span> <span class="identifier">standard_deviation</span><span class="special">);</span> | |
342 | ||
343 | <span class="keyword">double</span> <span class="identifier">max_weight</span> <span class="special">=</span> <span class="number">3.1</span><span class="special">;</span> <span class="comment">// kg</span> | |
344 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Percentage of packs > "</span> <span class="special"><<</span> <span class="identifier">max_weight</span> <span class="special"><<</span> <span class="string">" is "</span> | |
345 | <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">packs</span><span class="special">,</span> <span class="identifier">max_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> <span class="comment">// P(X > 3.1)</span> | |
346 | ||
347 | <span class="keyword">double</span> <span class="identifier">under_weight</span> <span class="special">=</span> <span class="number">2.9</span><span class="special">;</span> | |
348 | <span class="identifier">cout</span> <span class="special"><<</span><span class="string">"fraction of packs <= "</span> <span class="special"><<</span> <span class="identifier">under_weight</span> <span class="special"><<</span> <span class="string">" with a mean of "</span> <span class="special"><<</span> <span class="identifier">mean</span> | |
349 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">packs</span><span class="special">,</span> <span class="identifier">under_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
350 | <span class="comment">// fraction of packs <= 2.9 with a mean of 3 is 0.841345</span> | |
351 | <span class="comment">// This is 0.84 - more than the target 0.95</span> | |
352 | <span class="comment">// Want 95% to be over this weight, so what should we set the mean weight to be?</span> | |
353 | <span class="comment">// KK StatCalc says:</span> | |
354 | <span class="keyword">double</span> <span class="identifier">over_mean</span> <span class="special">=</span> <span class="number">3.0664</span><span class="special">;</span> | |
355 | <span class="identifier">normal</span> <span class="identifier">xpacks</span><span class="special">(</span><span class="identifier">over_mean</span><span class="special">,</span> <span class="identifier">standard_deviation</span><span class="special">);</span> | |
356 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"fraction of packs >= "</span> <span class="special"><<</span> <span class="identifier">under_weight</span> | |
357 | <span class="special"><<</span> <span class="string">" with a mean of "</span> <span class="special"><<</span> <span class="identifier">xpacks</span><span class="special">.</span><span class="identifier">mean</span><span class="special">()</span> | |
358 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">xpacks</span><span class="special">,</span> <span class="identifier">under_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
359 | <span class="comment">// fraction of packs >= 2.9 with a mean of 3.06449 is 0.950005</span> | |
360 | <span class="keyword">double</span> <span class="identifier">under_fraction</span> <span class="special">=</span> <span class="number">0.05</span><span class="special">;</span> <span class="comment">// so 95% are above the minimum weight mean - sd = 2.9</span> | |
361 | <span class="keyword">double</span> <span class="identifier">low_limit</span> <span class="special">=</span> <span class="identifier">standard_deviation</span><span class="special">;</span> | |
362 | <span class="keyword">double</span> <span class="identifier">offset</span> <span class="special">=</span> <span class="identifier">mean</span> <span class="special">-</span> <span class="identifier">low_limit</span> <span class="special">-</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">packs</span><span class="special">,</span> <span class="identifier">under_fraction</span><span class="special">);</span> | |
363 | <span class="keyword">double</span> <span class="identifier">nominal_mean</span> <span class="special">=</span> <span class="identifier">mean</span> <span class="special">+</span> <span class="identifier">offset</span><span class="special">;</span> | |
364 | ||
365 | <span class="identifier">normal</span> <span class="identifier">nominal_packs</span><span class="special">(</span><span class="identifier">nominal_mean</span><span class="special">,</span> <span class="identifier">standard_deviation</span><span class="special">);</span> | |
366 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Setting the packer to "</span> <span class="special"><<</span> <span class="identifier">nominal_mean</span> <span class="special"><<</span> <span class="string">" will mean that "</span> | |
367 | <span class="special"><<</span> <span class="string">"fraction of packs >= "</span> <span class="special"><<</span> <span class="identifier">under_weight</span> | |
368 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">nominal_packs</span><span class="special">,</span> <span class="identifier">under_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
369 | </pre> | |
370 | <p> | |
371 | Setting the packer to 3.06449 will mean that fraction of packs >= | |
372 | 2.9 is 0.95. | |
373 | </p> | |
374 | <p> | |
375 | Setting the packer to 3.13263 will mean that fraction of packs >= | |
376 | 2.9 is 0.99, but will more than double the mean loss from 0.0644 to 0.133. | |
377 | </p> | |
378 | <p> | |
379 | Alternatively, we could invest in a better (more precise) packer with | |
380 | a lower standard deviation. | |
381 | </p> | |
382 | <p> | |
383 | To estimate how much better (how much smaller standard deviation) it | |
384 | would have to be, we need to get the 5% quantile to be located at the | |
385 | under_weight limit, 2.9 | |
386 | </p> | |
387 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">p</span> <span class="special">=</span> <span class="number">0.05</span><span class="special">;</span> <span class="comment">// wanted p th quantile.</span> | |
388 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Quantile of "</span> <span class="special"><<</span> <span class="identifier">p</span> <span class="special"><<</span> <span class="string">" = "</span> <span class="special"><<</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">packs</span><span class="special">,</span> <span class="identifier">p</span><span class="special">)</span> | |
389 | <span class="special"><<</span> <span class="string">", mean = "</span> <span class="special"><<</span> <span class="identifier">packs</span><span class="special">.</span><span class="identifier">mean</span><span class="special">()</span> <span class="special"><<</span> <span class="string">", sd = "</span> <span class="special"><<</span> <span class="identifier">packs</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> <span class="comment">//</span> | |
390 | </pre> | |
391 | <p> | |
392 | Quantile of 0.05 = 2.83551, mean = 3, sd = 0.1 | |
393 | </p> | |
394 | <p> | |
395 | With the current packer (mean = 3, sd = 0.1), the 5% quantile is at 2.8551 | |
396 | kg, a little below our target of 2.9 kg. So we know that the standard | |
397 | deviation is going to have to be smaller. | |
398 | </p> | |
399 | <p> | |
400 | Let's start by guessing that it (now 0.1) needs to be halved, to a standard | |
401 | deviation of 0.05 | |
402 | </p> | |
403 | <pre class="programlisting"><span class="identifier">normal</span> <span class="identifier">pack05</span><span class="special">(</span><span class="identifier">mean</span><span class="special">,</span> <span class="number">0.05</span><span class="special">);</span> | |
404 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Quantile of "</span> <span class="special"><<</span> <span class="identifier">p</span> <span class="special"><<</span> <span class="string">" = "</span> <span class="special"><<</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">pack05</span><span class="special">,</span> <span class="identifier">p</span><span class="special">)</span> | |
405 | <span class="special"><<</span> <span class="string">", mean = "</span> <span class="special"><<</span> <span class="identifier">pack05</span><span class="special">.</span><span class="identifier">mean</span><span class="special">()</span> <span class="special"><<</span> <span class="string">", sd = "</span> <span class="special"><<</span> <span class="identifier">pack05</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
406 | ||
407 | <span class="identifier">cout</span> <span class="special"><<</span><span class="string">"Fraction of packs >= "</span> <span class="special"><<</span> <span class="identifier">under_weight</span> <span class="special"><<</span> <span class="string">" with a mean of "</span> <span class="special"><<</span> <span class="identifier">mean</span> | |
408 | <span class="special"><<</span> <span class="string">" and standard deviation of "</span> <span class="special"><<</span> <span class="identifier">pack05</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()</span> | |
409 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">pack05</span><span class="special">,</span> <span class="identifier">under_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
410 | <span class="comment">//</span> | |
411 | </pre> | |
412 | <p> | |
413 | Fraction of packs >= 2.9 with a mean of 3 and standard deviation of | |
414 | 0.05 is 0.9772 | |
415 | </p> | |
416 | <p> | |
417 | So 0.05 was quite a good guess, but we are a little over the 2.9 target, | |
418 | so the standard deviation could be a tiny bit more. So we could do some | |
419 | more guessing to get closer, say by increasing to 0.06 | |
420 | </p> | |
421 | <pre class="programlisting"><span class="identifier">normal</span> <span class="identifier">pack06</span><span class="special">(</span><span class="identifier">mean</span><span class="special">,</span> <span class="number">0.06</span><span class="special">);</span> | |
422 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Quantile of "</span> <span class="special"><<</span> <span class="identifier">p</span> <span class="special"><<</span> <span class="string">" = "</span> <span class="special"><<</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">pack06</span><span class="special">,</span> <span class="identifier">p</span><span class="special">)</span> | |
423 | <span class="special"><<</span> <span class="string">", mean = "</span> <span class="special"><<</span> <span class="identifier">pack06</span><span class="special">.</span><span class="identifier">mean</span><span class="special">()</span> <span class="special"><<</span> <span class="string">", sd = "</span> <span class="special"><<</span> <span class="identifier">pack06</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
424 | ||
425 | <span class="identifier">cout</span> <span class="special"><<</span><span class="string">"Fraction of packs >= "</span> <span class="special"><<</span> <span class="identifier">under_weight</span> <span class="special"><<</span> <span class="string">" with a mean of "</span> <span class="special"><<</span> <span class="identifier">mean</span> | |
426 | <span class="special"><<</span> <span class="string">" and standard deviation of "</span> <span class="special"><<</span> <span class="identifier">pack06</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()</span> | |
427 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">pack06</span><span class="special">,</span> <span class="identifier">under_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
428 | </pre> | |
429 | <p> | |
430 | Fraction of packs >= 2.9 with a mean of 3 and standard deviation of | |
431 | 0.06 is 0.9522 | |
432 | </p> | |
433 | <p> | |
434 | Now we are getting really close, but to do the job properly, we could | |
435 | use root finding method, for example the tools provided, and used elsewhere, | |
436 | in the Math Toolkit, see <a class="link" href="../../../roots/roots_noderiv.html" title="Root Finding Without Derivatives">root-finding | |
437 | without derivatives</a>. | |
438 | </p> | |
439 | <p> | |
440 | But in this normal distribution case, we could be even smarter and make | |
441 | a direct calculation. | |
442 | </p> | |
443 | <pre class="programlisting"><span class="identifier">normal</span> <span class="identifier">s</span><span class="special">;</span> <span class="comment">// For standard normal distribution,</span> | |
444 | <span class="keyword">double</span> <span class="identifier">sd</span> <span class="special">=</span> <span class="number">0.1</span><span class="special">;</span> | |
445 | <span class="keyword">double</span> <span class="identifier">x</span> <span class="special">=</span> <span class="number">2.9</span><span class="special">;</span> <span class="comment">// Our required limit.</span> | |
446 | <span class="comment">// then probability p = N((x - mean) / sd)</span> | |
447 | <span class="comment">// So if we want to find the standard deviation that would be required to meet this limit,</span> | |
448 | <span class="comment">// so that the p th quantile is located at x,</span> | |
449 | <span class="comment">// in this case the 0.95 (95%) quantile at 2.9 kg pack weight, when the mean is 3 kg.</span> | |
450 | ||
451 | <span class="keyword">double</span> <span class="identifier">prob</span> <span class="special">=</span> <span class="identifier">pdf</span><span class="special">(</span><span class="identifier">s</span><span class="special">,</span> <span class="special">(</span><span class="identifier">x</span> <span class="special">-</span> <span class="identifier">mean</span><span class="special">)</span> <span class="special">/</span> <span class="identifier">sd</span><span class="special">);</span> | |
452 | <span class="keyword">double</span> <span class="identifier">qp</span> <span class="special">=</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">s</span><span class="special">,</span> <span class="number">0.95</span><span class="special">);</span> | |
453 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"prob = "</span> <span class="special"><<</span> <span class="identifier">prob</span> <span class="special"><<</span> <span class="string">", quantile(p) "</span> <span class="special"><<</span> <span class="identifier">qp</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> <span class="comment">// p = 0.241971, quantile(p) 1.64485</span> | |
454 | <span class="comment">// Rearranging, we can directly calculate the required standard deviation:</span> | |
455 | <span class="keyword">double</span> <span class="identifier">sd95</span> <span class="special">=</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">abs</span><span class="special">((</span><span class="identifier">x</span> <span class="special">-</span> <span class="identifier">mean</span><span class="special">))</span> <span class="special">/</span> <span class="identifier">qp</span><span class="special">;</span> | |
456 | ||
457 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"If we want the "</span><span class="special"><<</span> <span class="identifier">p</span> <span class="special"><<</span> <span class="string">" th quantile to be located at "</span> | |
458 | <span class="special"><<</span> <span class="identifier">x</span> <span class="special"><<</span> <span class="string">", would need a standard deviation of "</span> <span class="special"><<</span> <span class="identifier">sd95</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
459 | ||
460 | <span class="identifier">normal</span> <span class="identifier">pack95</span><span class="special">(</span><span class="identifier">mean</span><span class="special">,</span> <span class="identifier">sd95</span><span class="special">);</span> <span class="comment">// Distribution of the 'ideal better' packer.</span> | |
461 | <span class="identifier">cout</span> <span class="special"><<</span><span class="string">"Fraction of packs >= "</span> <span class="special"><<</span> <span class="identifier">under_weight</span> <span class="special"><<</span> <span class="string">" with a mean of "</span> <span class="special"><<</span> <span class="identifier">mean</span> | |
462 | <span class="special"><<</span> <span class="string">" and standard deviation of "</span> <span class="special"><<</span> <span class="identifier">pack95</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()</span> | |
463 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">pack95</span><span class="special">,</span> <span class="identifier">under_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
464 | ||
465 | <span class="comment">// Fraction of packs >= 2.9 with a mean of 3 and standard deviation of 0.0608 is 0.95</span> | |
466 | </pre> | |
467 | <p> | |
468 | Notice that these two deceptively simple questions (do we over-fill or | |
469 | measure better) are actually very common. The weight of beef might be | |
470 | replaced by a measurement of more or less anything. But the calculations | |
471 | rely on the accuracy of the standard deviation - something that is almost | |
472 | always less good than we might wish, especially if based on a few measurements. | |
473 | </p> | |
474 | <h5> | |
475 | <a name="math_toolkit.stat_tut.weg.normal_example.normal_misc.h6"></a> | |
476 | <span class="phrase"><a name="math_toolkit.stat_tut.weg.normal_example.normal_misc.length_of_bolts"></a></span><a class="link" href="normal_misc.html#math_toolkit.stat_tut.weg.normal_example.normal_misc.length_of_bolts">Length | |
477 | of bolts</a> | |
478 | </h5> | |
479 | <p> | |
480 | A bolt is usable if between 3.9 and 4.1 long. From a large batch of bolts, | |
481 | a sample of 50 show a mean length of 3.95 with standard deviation 0.1. | |
482 | Assuming a normal distribution, what proportion is usable? The true sample | |
483 | mean is unknown, but we can use the sample mean and standard deviation | |
484 | to find approximate solutions. | |
485 | </p> | |
486 | <pre class="programlisting"> <span class="identifier">normal</span> <span class="identifier">bolts</span><span class="special">(</span><span class="number">3.95</span><span class="special">,</span> <span class="number">0.1</span><span class="special">);</span> | |
487 | <span class="keyword">double</span> <span class="identifier">top</span> <span class="special">=</span> <span class="number">4.1</span><span class="special">;</span> | |
488 | <span class="keyword">double</span> <span class="identifier">bottom</span> <span class="special">=</span> <span class="number">3.9</span><span class="special">;</span> | |
489 | ||
490 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Fraction long enough [ P(X <= "</span> <span class="special"><<</span> <span class="identifier">top</span> <span class="special"><<</span> <span class="string">") ] is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">bolts</span><span class="special">,</span> <span class="identifier">top</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
491 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Fraction too short [ P(X <= "</span> <span class="special"><<</span> <span class="identifier">bottom</span> <span class="special"><<</span> <span class="string">") ] is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">bolts</span><span class="special">,</span> <span class="identifier">bottom</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
492 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Fraction OK -between "</span> <span class="special"><<</span> <span class="identifier">bottom</span> <span class="special"><<</span> <span class="string">" and "</span> <span class="special"><<</span> <span class="identifier">top</span> | |
493 | <span class="special"><<</span> <span class="string">"[ P(X <= "</span> <span class="special"><<</span> <span class="identifier">top</span> <span class="special"><<</span> <span class="string">") - P(X<= "</span> <span class="special"><<</span> <span class="identifier">bottom</span> <span class="special"><<</span> <span class="string">" ) ] is "</span> | |
494 | <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">bolts</span><span class="special">,</span> <span class="identifier">top</span><span class="special">)</span> <span class="special">-</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">bolts</span><span class="special">,</span> <span class="identifier">bottom</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
495 | ||
496 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Fraction too long [ P(X > "</span> <span class="special"><<</span> <span class="identifier">top</span> <span class="special"><<</span> <span class="string">") ] is "</span> | |
497 | <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">bolts</span><span class="special">,</span> <span class="identifier">top</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
498 | ||
499 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"95% of bolts are shorter than "</span> <span class="special"><<</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">bolts</span><span class="special">,</span> <span class="number">0.95</span><span class="special">)</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
500 | </pre> | |
501 | </div> | |
502 | <table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr> | |
503 | <td align="left"></td> | |
504 | <td align="right"><div class="copyright-footer">Copyright © 2006-2010, 2012-2014 Nikhar Agrawal, | |
505 | Anton Bikineev, Paul A. Bristow, Marco Guazzone, Christopher Kormanyos, Hubert | |
506 | Holin, Bruno Lalande, John Maddock, Jeremy Murphy, Johan Råde, Gautam Sewani, | |
507 | Benjamin Sobotta, Thijs van den Berg, Daryle Walker and Xiaogang Zhang<p> | |
508 | Distributed under the Boost Software License, Version 1.0. (See accompanying | |
509 | file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>) | |
510 | </p> | |
511 | </div></td> | |
512 | </tr></table> | |
513 | <hr> | |
514 | <div class="spirit-nav"> | |
515 | <a accesskey="p" href="../normal_example.html"><img src="../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../normal_example.html"><img src="../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../index.html"><img src="../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="../inverse_chi_squared_eg.html"><img src="../../../../../../../../doc/src/images/next.png" alt="Next"></a> | |
516 | </div> | |
517 | </body> | |
518 | </html> |