]>
Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | <html> |
2 | <head> | |
3 | <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII"> | |
4 | <title>Find mean and standard deviation example</title> | |
5 | <link rel="stylesheet" href="../../../../math.css" type="text/css"> | |
6 | <meta name="generator" content="DocBook XSL Stylesheets V1.77.1"> | |
7 | <link rel="home" href="../../../../index.html" title="Math Toolkit 2.5.1"> | |
8 | <link rel="up" href="../find_eg.html" title="Find Location and Scale Examples"> | |
9 | <link rel="prev" href="find_scale_eg.html" title="Find Scale (Standard Deviation) Example"> | |
10 | <link rel="next" href="../nag_library.html" title="Comparison with C, R, FORTRAN-style Free Functions"> | |
11 | </head> | |
12 | <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"> | |
13 | <table cellpadding="2" width="100%"><tr> | |
14 | <td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../../../boost.png"></td> | |
15 | <td align="center"><a href="../../../../../../../../index.html">Home</a></td> | |
16 | <td align="center"><a href="../../../../../../../../libs/libraries.htm">Libraries</a></td> | |
17 | <td align="center"><a href="http://www.boost.org/users/people.html">People</a></td> | |
18 | <td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td> | |
19 | <td align="center"><a href="../../../../../../../../more/index.htm">More</a></td> | |
20 | </tr></table> | |
21 | <hr> | |
22 | <div class="spirit-nav"> | |
23 | <a accesskey="p" href="find_scale_eg.html"><img src="../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../find_eg.html"><img src="../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../index.html"><img src="../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="../nag_library.html"><img src="../../../../../../../../doc/src/images/next.png" alt="Next"></a> | |
24 | </div> | |
25 | <div class="section"> | |
26 | <div class="titlepage"><div><div><h5 class="title"> | |
27 | <a name="math_toolkit.stat_tut.weg.find_eg.find_mean_and_sd_eg"></a><a class="link" href="find_mean_and_sd_eg.html" title="Find mean and standard deviation example">Find | |
28 | mean and standard deviation example</a> | |
29 | </h5></div></div></div> | |
30 | <p> | |
31 | First we need some includes to access the normal distribution, the algorithms | |
32 | to find location and scale (and some std output of course). | |
33 | </p> | |
34 | <pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">math</span><span class="special">/</span><span class="identifier">distributions</span><span class="special">/</span><span class="identifier">normal</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span> <span class="comment">// for normal_distribution</span> | |
35 | <span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">::</span><span class="identifier">normal</span><span class="special">;</span> <span class="comment">// typedef provides default type is double.</span> | |
36 | <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">math</span><span class="special">/</span><span class="identifier">distributions</span><span class="special">/</span><span class="identifier">cauchy</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span> <span class="comment">// for cauchy_distribution</span> | |
37 | <span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">::</span><span class="identifier">cauchy</span><span class="special">;</span> <span class="comment">// typedef provides default type is double.</span> | |
38 | <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">math</span><span class="special">/</span><span class="identifier">distributions</span><span class="special">/</span><span class="identifier">find_location</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span> | |
39 | <span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">::</span><span class="identifier">find_location</span><span class="special">;</span> | |
40 | <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">math</span><span class="special">/</span><span class="identifier">distributions</span><span class="special">/</span><span class="identifier">find_scale</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span> | |
41 | <span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">::</span><span class="identifier">find_scale</span><span class="special">;</span> | |
42 | <span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">::</span><span class="identifier">complement</span><span class="special">;</span> | |
43 | <span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">::</span><span class="identifier">policies</span><span class="special">::</span><span class="identifier">policy</span><span class="special">;</span> | |
44 | ||
45 | <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span> | |
46 | <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">;</span> <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span> <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">left</span><span class="special">;</span> <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">showpoint</span><span class="special">;</span> <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">noshowpoint</span><span class="special">;</span> | |
47 | <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iomanip</span><span class="special">></span> | |
48 | <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">setw</span><span class="special">;</span> <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">setprecision</span><span class="special">;</span> | |
49 | <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">limits</span><span class="special">></span> | |
50 | <span class="keyword">using</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">numeric_limits</span><span class="special">;</span> | |
51 | <span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">stdexcept</span><span class="special">></span> | |
52 | </pre> | |
53 | <h5> | |
54 | <a name="math_toolkit.stat_tut.weg.find_eg.find_mean_and_sd_eg.h0"></a> | |
55 | <span class="phrase"><a name="math_toolkit.stat_tut.weg.find_eg.find_mean_and_sd_eg.using_find_location_and_find_sca"></a></span><a class="link" href="find_mean_and_sd_eg.html#math_toolkit.stat_tut.weg.find_eg.find_mean_and_sd_eg.using_find_location_and_find_sca">Using | |
56 | find_location and find_scale to meet dispensing and measurement specifications</a> | |
57 | </h5> | |
58 | <p> | |
59 | Consider an example from K Krishnamoorthy, Handbook of Statistical Distributions | |
60 | with Applications, ISBN 1-58488-635-8, (2006) p 126, example 10.3.7. | |
61 | </p> | |
62 | <p> | |
63 | "A machine is set to pack 3 kg of ground beef per pack. Over a long | |
64 | period of time it is found that the average packed was 3 kg with a standard | |
65 | deviation of 0.1 kg. Assume the packing is normally distributed." | |
66 | </p> | |
67 | <p> | |
68 | We start by constructing a normal distribution with the given parameters: | |
69 | </p> | |
70 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">mean</span> <span class="special">=</span> <span class="number">3.</span><span class="special">;</span> <span class="comment">// kg</span> | |
71 | <span class="keyword">double</span> <span class="identifier">standard_deviation</span> <span class="special">=</span> <span class="number">0.1</span><span class="special">;</span> <span class="comment">// kg</span> | |
72 | <span class="identifier">normal</span> <span class="identifier">packs</span><span class="special">(</span><span class="identifier">mean</span><span class="special">,</span> <span class="identifier">standard_deviation</span><span class="special">);</span> | |
73 | </pre> | |
74 | <p> | |
75 | We can then find the fraction (or %) of packages that weigh more than | |
76 | 3.1 kg. | |
77 | </p> | |
78 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">max_weight</span> <span class="special">=</span> <span class="number">3.1</span><span class="special">;</span> <span class="comment">// kg</span> | |
79 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Percentage of packs > "</span> <span class="special"><<</span> <span class="identifier">max_weight</span> <span class="special"><<</span> <span class="string">" is "</span> | |
80 | <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">packs</span><span class="special">,</span> <span class="identifier">max_weight</span><span class="special">))</span> <span class="special">*</span> <span class="number">100.</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> <span class="comment">// P(X > 3.1)</span> | |
81 | </pre> | |
82 | <p> | |
83 | We might want to ensure that 95% of packs are over a minimum weight specification, | |
84 | then we want the value of the mean such that P(X < 2.9) = 0.05. | |
85 | </p> | |
86 | <p> | |
87 | Using the mean of 3 kg, we can estimate the fraction of packs that fail | |
88 | to meet the specification of 2.9 kg. | |
89 | </p> | |
90 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">minimum_weight</span> <span class="special">=</span> <span class="number">2.9</span><span class="special">;</span> | |
91 | <span class="identifier">cout</span> <span class="special"><<</span><span class="string">"Fraction of packs <= "</span> <span class="special"><<</span> <span class="identifier">minimum_weight</span> <span class="special"><<</span> <span class="string">" with a mean of "</span> <span class="special"><<</span> <span class="identifier">mean</span> | |
92 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">packs</span><span class="special">,</span> <span class="identifier">minimum_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
93 | <span class="comment">// fraction of packs <= 2.9 with a mean of 3 is 0.841345</span> | |
94 | </pre> | |
95 | <p> | |
96 | This is 0.84 - more than the target fraction of 0.95. If we want 95% | |
97 | to be over the minimum weight, what should we set the mean weight to | |
98 | be? | |
99 | </p> | |
100 | <p> | |
101 | Using the KK StatCalc program supplied with the book and the method given | |
102 | on page 126 gives 3.06449. | |
103 | </p> | |
104 | <p> | |
105 | We can confirm this by constructing a new distribution which we call | |
106 | 'xpacks' with a safety margin mean of 3.06449 thus: | |
107 | </p> | |
108 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">over_mean</span> <span class="special">=</span> <span class="number">3.06449</span><span class="special">;</span> | |
109 | <span class="identifier">normal</span> <span class="identifier">xpacks</span><span class="special">(</span><span class="identifier">over_mean</span><span class="special">,</span> <span class="identifier">standard_deviation</span><span class="special">);</span> | |
110 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Fraction of packs >= "</span> <span class="special"><<</span> <span class="identifier">minimum_weight</span> | |
111 | <span class="special"><<</span> <span class="string">" with a mean of "</span> <span class="special"><<</span> <span class="identifier">xpacks</span><span class="special">.</span><span class="identifier">mean</span><span class="special">()</span> | |
112 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">xpacks</span><span class="special">,</span> <span class="identifier">minimum_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
113 | <span class="comment">// fraction of packs >= 2.9 with a mean of 3.06449 is 0.950005</span> | |
114 | </pre> | |
115 | <p> | |
116 | Using this Math Toolkit, we can calculate the required mean directly | |
117 | thus: | |
118 | </p> | |
119 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">under_fraction</span> <span class="special">=</span> <span class="number">0.05</span><span class="special">;</span> <span class="comment">// so 95% are above the minimum weight mean - sd = 2.9</span> | |
120 | <span class="keyword">double</span> <span class="identifier">low_limit</span> <span class="special">=</span> <span class="identifier">standard_deviation</span><span class="special">;</span> | |
121 | <span class="keyword">double</span> <span class="identifier">offset</span> <span class="special">=</span> <span class="identifier">mean</span> <span class="special">-</span> <span class="identifier">low_limit</span> <span class="special">-</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">packs</span><span class="special">,</span> <span class="identifier">under_fraction</span><span class="special">);</span> | |
122 | <span class="keyword">double</span> <span class="identifier">nominal_mean</span> <span class="special">=</span> <span class="identifier">mean</span> <span class="special">+</span> <span class="identifier">offset</span><span class="special">;</span> | |
123 | <span class="comment">// mean + (mean - low_limit - quantile(packs, under_fraction));</span> | |
124 | ||
125 | <span class="identifier">normal</span> <span class="identifier">nominal_packs</span><span class="special">(</span><span class="identifier">nominal_mean</span><span class="special">,</span> <span class="identifier">standard_deviation</span><span class="special">);</span> | |
126 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Setting the packer to "</span> <span class="special"><<</span> <span class="identifier">nominal_mean</span> <span class="special"><<</span> <span class="string">" will mean that "</span> | |
127 | <span class="special"><<</span> <span class="string">"fraction of packs >= "</span> <span class="special"><<</span> <span class="identifier">minimum_weight</span> | |
128 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">nominal_packs</span><span class="special">,</span> <span class="identifier">minimum_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
129 | <span class="comment">// Setting the packer to 3.06449 will mean that fraction of packs >= 2.9 is 0.95</span> | |
130 | </pre> | |
131 | <p> | |
132 | This calculation is generalized as the free function called <code class="computeroutput"><span class="identifier">find_location</span></code>, see <a class="link" href="../../../dist_ref/dist_algorithms.html" title="Distribution Algorithms">algorithms</a>. | |
133 | </p> | |
134 | <p> | |
135 | To use this we will need to | |
136 | </p> | |
137 | <pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">math</span><span class="special">/</span><span class="identifier">distributions</span><span class="special">/</span><span class="identifier">find_location</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span> | |
138 | <span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">::</span><span class="identifier">find_location</span><span class="special">;</span> | |
139 | </pre> | |
140 | <p> | |
141 | and then use find_location function to find safe_mean, & construct | |
142 | a new normal distribution called 'goodpacks'. | |
143 | </p> | |
144 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">safe_mean</span> <span class="special">=</span> <span class="identifier">find_location</span><span class="special"><</span><span class="identifier">normal</span><span class="special">>(</span><span class="identifier">minimum_weight</span><span class="special">,</span> <span class="identifier">under_fraction</span><span class="special">,</span> <span class="identifier">standard_deviation</span><span class="special">);</span> | |
145 | <span class="identifier">normal</span> <span class="identifier">good_packs</span><span class="special">(</span><span class="identifier">safe_mean</span><span class="special">,</span> <span class="identifier">standard_deviation</span><span class="special">);</span> | |
146 | </pre> | |
147 | <p> | |
148 | with the same confirmation as before: | |
149 | </p> | |
150 | <pre class="programlisting"><span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Setting the packer to "</span> <span class="special"><<</span> <span class="identifier">nominal_mean</span> <span class="special"><<</span> <span class="string">" will mean that "</span> | |
151 | <span class="special"><<</span> <span class="string">"fraction of packs >= "</span> <span class="special"><<</span> <span class="identifier">minimum_weight</span> | |
152 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">good_packs</span><span class="special">,</span> <span class="identifier">minimum_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
153 | <span class="comment">// Setting the packer to 3.06449 will mean that fraction of packs >= 2.9 is 0.95</span> | |
154 | </pre> | |
155 | <h5> | |
156 | <a name="math_toolkit.stat_tut.weg.find_eg.find_mean_and_sd_eg.h1"></a> | |
157 | <span class="phrase"><a name="math_toolkit.stat_tut.weg.find_eg.find_mean_and_sd_eg.using_cauchy_lorentz_instead_of_"></a></span><a class="link" href="find_mean_and_sd_eg.html#math_toolkit.stat_tut.weg.find_eg.find_mean_and_sd_eg.using_cauchy_lorentz_instead_of_">Using | |
158 | Cauchy-Lorentz instead of normal distribution</a> | |
159 | </h5> | |
160 | <p> | |
161 | After examining the weight distribution of a large number of packs, we | |
162 | might decide that, after all, the assumption of a normal distribution | |
163 | is not really justified. We might find that the fit is better to a <a class="link" href="../../../dist_ref/dists/cauchy_dist.html" title="Cauchy-Lorentz Distribution">Cauchy Distribution</a>. | |
164 | This distribution has wider 'wings', so that whereas most of the values | |
165 | are closer to the mean than the normal, there are also more values than | |
166 | 'normal' that lie further from the mean than the normal. | |
167 | </p> | |
168 | <p> | |
169 | This might happen because a larger than normal lump of meat is either | |
170 | included or excluded. | |
171 | </p> | |
172 | <p> | |
173 | We first create a <a class="link" href="../../../dist_ref/dists/cauchy_dist.html" title="Cauchy-Lorentz Distribution">Cauchy | |
174 | Distribution</a> with the original mean and standard deviation, and | |
175 | estimate the fraction that lie below our minimum weight specification. | |
176 | </p> | |
177 | <pre class="programlisting"><span class="identifier">cauchy</span> <span class="identifier">cpacks</span><span class="special">(</span><span class="identifier">mean</span><span class="special">,</span> <span class="identifier">standard_deviation</span><span class="special">);</span> | |
178 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Cauchy Setting the packer to "</span> <span class="special"><<</span> <span class="identifier">mean</span> <span class="special"><<</span> <span class="string">" will mean that "</span> | |
179 | <span class="special"><<</span> <span class="string">"fraction of packs >= "</span> <span class="special"><<</span> <span class="identifier">minimum_weight</span> | |
180 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">cpacks</span><span class="special">,</span> <span class="identifier">minimum_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
181 | <span class="comment">// Cauchy Setting the packer to 3 will mean that fraction of packs >= 2.9 is 0.75</span> | |
182 | </pre> | |
183 | <p> | |
184 | Note that far fewer of the packs meet the specification, only 75% instead | |
185 | of 95%. Now we can repeat the find_location, using the cauchy distribution | |
186 | as template parameter, in place of the normal used above. | |
187 | </p> | |
188 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">lc</span> <span class="special">=</span> <span class="identifier">find_location</span><span class="special"><</span><span class="identifier">cauchy</span><span class="special">>(</span><span class="identifier">minimum_weight</span><span class="special">,</span> <span class="identifier">under_fraction</span><span class="special">,</span> <span class="identifier">standard_deviation</span><span class="special">);</span> | |
189 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"find_location<cauchy>(minimum_weight, over fraction, standard_deviation); "</span> <span class="special"><<</span> <span class="identifier">lc</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
190 | <span class="comment">// find_location<cauchy>(minimum_weight, over fraction, packs.standard_deviation()); 3.53138</span> | |
191 | </pre> | |
192 | <p> | |
193 | Note that the safe_mean setting needs to be much higher, 3.53138 instead | |
194 | of 3.06449, so we will make rather less profit. | |
195 | </p> | |
196 | <p> | |
197 | And again confirm that the fraction meeting specification is as expected. | |
198 | </p> | |
199 | <pre class="programlisting"><span class="identifier">cauchy</span> <span class="identifier">goodcpacks</span><span class="special">(</span><span class="identifier">lc</span><span class="special">,</span> <span class="identifier">standard_deviation</span><span class="special">);</span> | |
200 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Cauchy Setting the packer to "</span> <span class="special"><<</span> <span class="identifier">lc</span> <span class="special"><<</span> <span class="string">" will mean that "</span> | |
201 | <span class="special"><<</span> <span class="string">"fraction of packs >= "</span> <span class="special"><<</span> <span class="identifier">minimum_weight</span> | |
202 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">goodcpacks</span><span class="special">,</span> <span class="identifier">minimum_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
203 | <span class="comment">// Cauchy Setting the packer to 3.53138 will mean that fraction of packs >= 2.9 is 0.95</span> | |
204 | </pre> | |
205 | <p> | |
206 | Finally we could estimate the effect of a much tighter specification, | |
207 | that 99% of packs met the specification. | |
208 | </p> | |
209 | <pre class="programlisting"><span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Cauchy Setting the packer to "</span> | |
210 | <span class="special"><<</span> <span class="identifier">find_location</span><span class="special"><</span><span class="identifier">cauchy</span><span class="special">>(</span><span class="identifier">minimum_weight</span><span class="special">,</span> <span class="number">0.99</span><span class="special">,</span> <span class="identifier">standard_deviation</span><span class="special">)</span> | |
211 | <span class="special"><<</span> <span class="string">" will mean that "</span> | |
212 | <span class="special"><<</span> <span class="string">"fraction of packs >= "</span> <span class="special"><<</span> <span class="identifier">minimum_weight</span> | |
213 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">goodcpacks</span><span class="special">,</span> <span class="identifier">minimum_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
214 | </pre> | |
215 | <p> | |
216 | Setting the packer to 3.13263 will mean that fraction of packs >= | |
217 | 2.9 is 0.99, but will more than double the mean loss from 0.0644 to 0.133 | |
218 | kg per pack. | |
219 | </p> | |
220 | <p> | |
221 | Of course, this calculation is not limited to packs of meat, it applies | |
222 | to dispensing anything, and it also applies to a 'virtual' material like | |
223 | any measurement. | |
224 | </p> | |
225 | <p> | |
226 | The only caveat is that the calculation assumes that the standard deviation | |
227 | (scale) is known with a reasonably low uncertainty, something that is | |
228 | not so easy to ensure in practice. And that the distribution is well | |
229 | defined, <a class="link" href="../../../dist_ref/dists/normal_dist.html" title="Normal (Gaussian) Distribution">Normal | |
230 | Distribution</a> or <a class="link" href="../../../dist_ref/dists/cauchy_dist.html" title="Cauchy-Lorentz Distribution">Cauchy | |
231 | Distribution</a>, or some other. | |
232 | </p> | |
233 | <p> | |
234 | If one is simply dispensing a very large number of packs, then it may | |
235 | be feasible to measure the weight of hundreds or thousands of packs. | |
236 | With a healthy 'degrees of freedom', the confidence intervals for the | |
237 | standard deviation are not too wide, typically about + and - 10% for | |
238 | hundreds of observations. | |
239 | </p> | |
240 | <p> | |
241 | For other applications, where it is more difficult or expensive to make | |
242 | many observations, the confidence intervals are depressingly wide. | |
243 | </p> | |
244 | <p> | |
245 | See <a class="link" href="../cs_eg/chi_sq_intervals.html" title="Confidence Intervals on the Standard Deviation">Confidence | |
246 | Intervals on the standard deviation</a> for a worked example <a href="../../../../../../example/chi_square_std_dev_test.cpp" target="_top">chi_square_std_dev_test.cpp</a> | |
247 | of estimating these intervals. | |
248 | </p> | |
249 | <h5> | |
250 | <a name="math_toolkit.stat_tut.weg.find_eg.find_mean_and_sd_eg.h2"></a> | |
251 | <span class="phrase"><a name="math_toolkit.stat_tut.weg.find_eg.find_mean_and_sd_eg.changing_the_scale_or_standard_d"></a></span><a class="link" href="find_mean_and_sd_eg.html#math_toolkit.stat_tut.weg.find_eg.find_mean_and_sd_eg.changing_the_scale_or_standard_d">Changing | |
252 | the scale or standard deviation</a> | |
253 | </h5> | |
254 | <p> | |
255 | Alternatively, we could invest in a better (more precise) packer (or | |
256 | measuring device) with a lower standard deviation, or scale. | |
257 | </p> | |
258 | <p> | |
259 | This might cost more, but would reduce the amount we have to 'give away' | |
260 | in order to meet the specification. | |
261 | </p> | |
262 | <p> | |
263 | To estimate how much better (how much smaller standard deviation) it | |
264 | would have to be, we need to get the 5% quantile to be located at the | |
265 | under_weight limit, 2.9 | |
266 | </p> | |
267 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">p</span> <span class="special">=</span> <span class="number">0.05</span><span class="special">;</span> <span class="comment">// wanted p th quantile.</span> | |
268 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Quantile of "</span> <span class="special"><<</span> <span class="identifier">p</span> <span class="special"><<</span> <span class="string">" = "</span> <span class="special"><<</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">packs</span><span class="special">,</span> <span class="identifier">p</span><span class="special">)</span> | |
269 | <span class="special"><<</span> <span class="string">", mean = "</span> <span class="special"><<</span> <span class="identifier">packs</span><span class="special">.</span><span class="identifier">mean</span><span class="special">()</span> <span class="special"><<</span> <span class="string">", sd = "</span> <span class="special"><<</span> <span class="identifier">packs</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
270 | </pre> | |
271 | <p> | |
272 | Quantile of 0.05 = 2.83551, mean = 3, sd = 0.1 | |
273 | </p> | |
274 | <p> | |
275 | With the current packer (mean = 3, sd = 0.1), the 5% quantile is at 2.8551 | |
276 | kg, a little below our target of 2.9 kg. So we know that the standard | |
277 | deviation is going to have to be smaller. | |
278 | </p> | |
279 | <p> | |
280 | Let's start by guessing that it (now 0.1) needs to be halved, to a standard | |
281 | deviation of 0.05 kg. | |
282 | </p> | |
283 | <pre class="programlisting"><span class="identifier">normal</span> <span class="identifier">pack05</span><span class="special">(</span><span class="identifier">mean</span><span class="special">,</span> <span class="number">0.05</span><span class="special">);</span> | |
284 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Quantile of "</span> <span class="special"><<</span> <span class="identifier">p</span> <span class="special"><<</span> <span class="string">" = "</span> <span class="special"><<</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">pack05</span><span class="special">,</span> <span class="identifier">p</span><span class="special">)</span> | |
285 | <span class="special"><<</span> <span class="string">", mean = "</span> <span class="special"><<</span> <span class="identifier">pack05</span><span class="special">.</span><span class="identifier">mean</span><span class="special">()</span> <span class="special"><<</span> <span class="string">", sd = "</span> <span class="special"><<</span> <span class="identifier">pack05</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
286 | <span class="comment">// Quantile of 0.05 = 2.91776, mean = 3, sd = 0.05</span> | |
287 | ||
288 | <span class="identifier">cout</span> <span class="special"><<</span><span class="string">"Fraction of packs >= "</span> <span class="special"><<</span> <span class="identifier">minimum_weight</span> <span class="special"><<</span> <span class="string">" with a mean of "</span> <span class="special"><<</span> <span class="identifier">mean</span> | |
289 | <span class="special"><<</span> <span class="string">" and standard deviation of "</span> <span class="special"><<</span> <span class="identifier">pack05</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()</span> | |
290 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">pack05</span><span class="special">,</span> <span class="identifier">minimum_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
291 | <span class="comment">// Fraction of packs >= 2.9 with a mean of 3 and standard deviation of 0.05 is 0.97725</span> | |
292 | </pre> | |
293 | <p> | |
294 | So 0.05 was quite a good guess, but we are a little over the 2.9 target, | |
295 | so the standard deviation could be a tiny bit more. So we could do some | |
296 | more guessing to get closer, say by increasing standard deviation to | |
297 | 0.06 kg, constructing another new distribution called pack06. | |
298 | </p> | |
299 | <pre class="programlisting"><span class="identifier">normal</span> <span class="identifier">pack06</span><span class="special">(</span><span class="identifier">mean</span><span class="special">,</span> <span class="number">0.06</span><span class="special">);</span> | |
300 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"Quantile of "</span> <span class="special"><<</span> <span class="identifier">p</span> <span class="special"><<</span> <span class="string">" = "</span> <span class="special"><<</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">pack06</span><span class="special">,</span> <span class="identifier">p</span><span class="special">)</span> | |
301 | <span class="special"><<</span> <span class="string">", mean = "</span> <span class="special"><<</span> <span class="identifier">pack06</span><span class="special">.</span><span class="identifier">mean</span><span class="special">()</span> <span class="special"><<</span> <span class="string">", sd = "</span> <span class="special"><<</span> <span class="identifier">pack06</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
302 | <span class="comment">// Quantile of 0.05 = 2.90131, mean = 3, sd = 0.06</span> | |
303 | ||
304 | <span class="identifier">cout</span> <span class="special"><<</span><span class="string">"Fraction of packs >= "</span> <span class="special"><<</span> <span class="identifier">minimum_weight</span> <span class="special"><<</span> <span class="string">" with a mean of "</span> <span class="special"><<</span> <span class="identifier">mean</span> | |
305 | <span class="special"><<</span> <span class="string">" and standard deviation of "</span> <span class="special"><<</span> <span class="identifier">pack06</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()</span> | |
306 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">pack06</span><span class="special">,</span> <span class="identifier">minimum_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
307 | <span class="comment">// Fraction of packs >= 2.9 with a mean of 3 and standard deviation of 0.06 is 0.95221</span> | |
308 | </pre> | |
309 | <p> | |
310 | Now we are getting really close, but to do the job properly, we might | |
311 | need to use root finding method, for example the tools provided, and | |
312 | used elsewhere, in the Math Toolkit, see <a class="link" href="../../../roots/roots_noderiv.html" title="Root Finding Without Derivatives">root-finding | |
313 | without derivatives</a> | |
314 | </p> | |
315 | <p> | |
316 | But in this (normal) distribution case, we can and should be even smarter | |
317 | and make a direct calculation. | |
318 | </p> | |
319 | <p> | |
320 | Our required limit is minimum_weight = 2.9 kg, often called the random | |
321 | variate z. For a standard normal distribution, then probability p = N((minimum_weight | |
322 | - mean) / sd). | |
323 | </p> | |
324 | <p> | |
325 | We want to find the standard deviation that would be required to meet | |
326 | this limit, so that the p th quantile is located at z (minimum_weight). | |
327 | In this case, the 0.05 (5%) quantile is at 2.9 kg pack weight, when the | |
328 | mean is 3 kg, ensuring that 0.95 (95%) of packs are above the minimum | |
329 | weight. | |
330 | </p> | |
331 | <p> | |
332 | Rearranging, we can directly calculate the required standard deviation: | |
333 | </p> | |
334 | <pre class="programlisting"><span class="identifier">normal</span> <span class="identifier">N01</span><span class="special">;</span> <span class="comment">// standard normal distribution with mean zero and unit standard deviation.</span> | |
335 | <span class="identifier">p</span> <span class="special">=</span> <span class="number">0.05</span><span class="special">;</span> | |
336 | <span class="keyword">double</span> <span class="identifier">qp</span> <span class="special">=</span> <span class="identifier">quantile</span><span class="special">(</span><span class="identifier">N01</span><span class="special">,</span> <span class="identifier">p</span><span class="special">);</span> | |
337 | <span class="keyword">double</span> <span class="identifier">sd95</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">minimum_weight</span> <span class="special">-</span> <span class="identifier">mean</span><span class="special">)</span> <span class="special">/</span> <span class="identifier">qp</span><span class="special">;</span> | |
338 | ||
339 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"For the "</span><span class="special"><<</span> <span class="identifier">p</span> <span class="special"><<</span> <span class="string">"th quantile to be located at "</span> | |
340 | <span class="special"><<</span> <span class="identifier">minimum_weight</span> <span class="special"><<</span> <span class="string">", would need a standard deviation of "</span> <span class="special"><<</span> <span class="identifier">sd95</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
341 | <span class="comment">// For the 0.05th quantile to be located at 2.9, would need a standard deviation of 0.0607957</span> | |
342 | </pre> | |
343 | <p> | |
344 | We can now construct a new (normal) distribution pack95 for the 'better' | |
345 | packer, and check that our distribution will meet the specification. | |
346 | </p> | |
347 | <pre class="programlisting"><span class="identifier">normal</span> <span class="identifier">pack95</span><span class="special">(</span><span class="identifier">mean</span><span class="special">,</span> <span class="identifier">sd95</span><span class="special">);</span> | |
348 | <span class="identifier">cout</span> <span class="special"><<</span><span class="string">"Fraction of packs >= "</span> <span class="special"><<</span> <span class="identifier">minimum_weight</span> <span class="special"><<</span> <span class="string">" with a mean of "</span> <span class="special"><<</span> <span class="identifier">mean</span> | |
349 | <span class="special"><<</span> <span class="string">" and standard deviation of "</span> <span class="special"><<</span> <span class="identifier">pack95</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()</span> | |
350 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">pack95</span><span class="special">,</span> <span class="identifier">minimum_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
351 | <span class="comment">// Fraction of packs >= 2.9 with a mean of 3 and standard deviation of 0.0607957 is 0.95</span> | |
352 | </pre> | |
353 | <p> | |
354 | This calculation is generalized in the free function find_scale, as shown | |
355 | below, giving the same standard deviation. | |
356 | </p> | |
357 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">ss</span> <span class="special">=</span> <span class="identifier">find_scale</span><span class="special"><</span><span class="identifier">normal</span><span class="special">>(</span><span class="identifier">minimum_weight</span><span class="special">,</span> <span class="identifier">under_fraction</span><span class="special">,</span> <span class="identifier">packs</span><span class="special">.</span><span class="identifier">mean</span><span class="special">());</span> | |
358 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"find_scale<normal>(minimum_weight, under_fraction, packs.mean()); "</span> <span class="special"><<</span> <span class="identifier">ss</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
359 | <span class="comment">// find_scale<normal>(minimum_weight, under_fraction, packs.mean()); 0.0607957</span> | |
360 | </pre> | |
361 | <p> | |
362 | If we had defined an over_fraction, or percentage that must pass specification | |
363 | </p> | |
364 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">over_fraction</span> <span class="special">=</span> <span class="number">0.95</span><span class="special">;</span> | |
365 | </pre> | |
366 | <p> | |
367 | And (wrongly) written | |
368 | </p> | |
369 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">sso</span> <span class="special">=</span> <span class="identifier">find_scale</span><span class="special"><</span><span class="identifier">normal</span><span class="special">>(</span><span class="identifier">minimum_weight</span><span class="special">,</span> <span class="identifier">over_fraction</span><span class="special">,</span> <span class="identifier">packs</span><span class="special">.</span><span class="identifier">mean</span><span class="special">());</span> | |
370 | </pre> | |
371 | <p> | |
372 | With the default policy, we would get a message like | |
373 | </p> | |
374 | <pre class="programlisting">Message from thrown exception was: | |
375 | Error in function boost::math::find_scale<Dist, Policy>(double, double, double, Policy): | |
376 | Computed scale (-0.060795683191176959) is <= 0! Was the complement intended? | |
377 | </pre> | |
378 | <p> | |
379 | But this would return a <span class="bold"><strong>negative</strong></span> standard | |
380 | deviation - obviously impossible. The probability should be 1 - over_fraction, | |
381 | not over_fraction, thus: | |
382 | </p> | |
383 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">ss1o</span> <span class="special">=</span> <span class="identifier">find_scale</span><span class="special"><</span><span class="identifier">normal</span><span class="special">>(</span><span class="identifier">minimum_weight</span><span class="special">,</span> <span class="number">1</span> <span class="special">-</span> <span class="identifier">over_fraction</span><span class="special">,</span> <span class="identifier">packs</span><span class="special">.</span><span class="identifier">mean</span><span class="special">());</span> | |
384 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"find_scale<normal>(minimum_weight, under_fraction, packs.mean()); "</span> <span class="special"><<</span> <span class="identifier">ss1o</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
385 | <span class="comment">// find_scale<normal>(minimum_weight, under_fraction, packs.mean()); 0.0607957</span> | |
386 | </pre> | |
387 | <p> | |
388 | But notice that using '1 - over_fraction' - will lead to a loss of accuracy, | |
389 | especially if over_fraction was close to unity. (See <a class="link" href="../../overview/complements.html#why_complements">why | |
390 | complements?</a>). In this (very common) case, we should instead use | |
391 | the <a class="link" href="../../overview/complements.html" title="Complements are supported too - and when to use them">complements</a>, | |
392 | giving the most accurate result. | |
393 | </p> | |
394 | <pre class="programlisting"><span class="keyword">double</span> <span class="identifier">ssc</span> <span class="special">=</span> <span class="identifier">find_scale</span><span class="special"><</span><span class="identifier">normal</span><span class="special">>(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">minimum_weight</span><span class="special">,</span> <span class="identifier">over_fraction</span><span class="special">,</span> <span class="identifier">packs</span><span class="special">.</span><span class="identifier">mean</span><span class="special">()));</span> | |
395 | <span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"find_scale<normal>(complement(minimum_weight, over_fraction, packs.mean())); "</span> <span class="special"><<</span> <span class="identifier">ssc</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
396 | <span class="comment">// find_scale<normal>(complement(minimum_weight, over_fraction, packs.mean())); 0.0607957</span> | |
397 | </pre> | |
398 | <p> | |
399 | Note that our guess of 0.06 was close to the accurate value of 0.060795683191176959. | |
400 | </p> | |
401 | <p> | |
402 | We can again confirm our prediction thus: | |
403 | </p> | |
404 | <pre class="programlisting"><span class="identifier">normal</span> <span class="identifier">pack95c</span><span class="special">(</span><span class="identifier">mean</span><span class="special">,</span> <span class="identifier">ssc</span><span class="special">);</span> | |
405 | <span class="identifier">cout</span> <span class="special"><<</span><span class="string">"Fraction of packs >= "</span> <span class="special"><<</span> <span class="identifier">minimum_weight</span> <span class="special"><<</span> <span class="string">" with a mean of "</span> <span class="special"><<</span> <span class="identifier">mean</span> | |
406 | <span class="special"><<</span> <span class="string">" and standard deviation of "</span> <span class="special"><<</span> <span class="identifier">pack95c</span><span class="special">.</span><span class="identifier">standard_deviation</span><span class="special">()</span> | |
407 | <span class="special"><<</span> <span class="string">" is "</span> <span class="special"><<</span> <span class="identifier">cdf</span><span class="special">(</span><span class="identifier">complement</span><span class="special">(</span><span class="identifier">pack95c</span><span class="special">,</span> <span class="identifier">minimum_weight</span><span class="special">))</span> <span class="special"><<</span> <span class="identifier">endl</span><span class="special">;</span> | |
408 | <span class="comment">// Fraction of packs >= 2.9 with a mean of 3 and standard deviation of 0.0607957 is 0.95</span> | |
409 | </pre> | |
410 | <p> | |
411 | Notice that these two deceptively simple questions: | |
412 | </p> | |
413 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"> | |
414 | Do we over-fill to make sure we meet a minimum specification (or | |
415 | under-fill to avoid an overdose)? | |
416 | </li></ul></div> | |
417 | <p> | |
418 | and/or | |
419 | </p> | |
420 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"> | |
421 | Do we measure better? | |
422 | </li></ul></div> | |
423 | <p> | |
424 | are actually extremely common. | |
425 | </p> | |
426 | <p> | |
427 | The weight of beef might be replaced by a measurement of more or less | |
428 | anything, from drug tablet content, Apollo landing rocket firing, X-ray | |
429 | treatment doses... | |
430 | </p> | |
431 | <p> | |
432 | The scale can be variation in dispensing or uncertainty in measurement. | |
433 | </p> | |
434 | <p> | |
435 | See <a href="../../../../../../example/find_mean_and_sd_normal.cpp" target="_top">find_mean_and_sd_normal.cpp</a> | |
436 | for full source code & appended program output. | |
437 | </p> | |
438 | </div> | |
439 | <table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr> | |
440 | <td align="left"></td> | |
441 | <td align="right"><div class="copyright-footer">Copyright © 2006-2010, 2012-2014 Nikhar Agrawal, | |
442 | Anton Bikineev, Paul A. Bristow, Marco Guazzone, Christopher Kormanyos, Hubert | |
443 | Holin, Bruno Lalande, John Maddock, Jeremy Murphy, Johan Råde, Gautam Sewani, | |
444 | Benjamin Sobotta, Thijs van den Berg, Daryle Walker and Xiaogang Zhang<p> | |
445 | Distributed under the Boost Software License, Version 1.0. (See accompanying | |
446 | file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>) | |
447 | </p> | |
448 | </div></td> | |
449 | </tr></table> | |
450 | <hr> | |
451 | <div class="spirit-nav"> | |
452 | <a accesskey="p" href="find_scale_eg.html"><img src="../../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../find_eg.html"><img src="../../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../../../index.html"><img src="../../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="../nag_library.html"><img src="../../../../../../../../doc/src/images/next.png" alt="Next"></a> | |
453 | </div> | |
454 | </body> | |
455 | </html> |