]>
Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | |
2 | [section:cs_eg Chi Squared Distribution Examples] | |
3 | ||
4 | [section:chi_sq_intervals Confidence Intervals on the Standard Deviation] | |
5 | ||
6 | Once you have calculated the standard deviation for your data, a legitimate | |
7 | question to ask is "How reliable is the calculated standard deviation?". | |
8 | For this situation the Chi Squared distribution can be used to calculate | |
9 | confidence intervals for the standard deviation. | |
10 | ||
11 | The full example code & sample output is in | |
12 | [@../../example/chi_square_std_dev_test.cpp chi_square_std_dev_test.cpp]. | |
13 | ||
14 | We'll begin by defining the procedure that will calculate and print out the | |
15 | confidence intervals: | |
16 | ||
17 | void confidence_limits_on_std_deviation( | |
18 | double Sd, // Sample Standard Deviation | |
19 | unsigned N) // Sample size | |
20 | { | |
21 | ||
22 | We'll begin by printing out some general information: | |
23 | ||
24 | cout << | |
25 | "________________________________________________\n" | |
26 | "2-Sided Confidence Limits For Standard Deviation\n" | |
27 | "________________________________________________\n\n"; | |
28 | cout << setprecision(7); | |
29 | cout << setw(40) << left << "Number of Observations" << "= " << N << "\n"; | |
30 | cout << setw(40) << left << "Standard Deviation" << "= " << Sd << "\n"; | |
31 | ||
32 | and then define a table of significance levels for which we'll calculate | |
33 | intervals: | |
34 | ||
35 | double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 }; | |
36 | ||
37 | The distribution we'll need to calculate the confidence intervals is a | |
38 | Chi Squared distribution, with N-1 degrees of freedom: | |
39 | ||
40 | chi_squared dist(N - 1); | |
41 | ||
42 | For each value of alpha, the formula for the confidence interval is given by: | |
43 | ||
44 | [equation chi_squ_tut1] | |
45 | ||
46 | Where [equation chi_squ_tut2] is the upper critical value, and | |
47 | [equation chi_squ_tut3] is the lower critical value of the | |
48 | Chi Squared distribution. | |
49 | ||
50 | In code we begin by printing out a table header: | |
51 | ||
52 | cout << "\n\n" | |
53 | "_____________________________________________\n" | |
54 | "Confidence Lower Upper\n" | |
55 | " Value (%) Limit Limit\n" | |
56 | "_____________________________________________\n"; | |
57 | ||
58 | and then loop over the values of alpha and calculate the intervals | |
59 | for each: remember that the lower critical value is the same as the | |
60 | quantile, and the upper critical value is the same as the quantile | |
61 | from the complement of the probability: | |
62 | ||
63 | for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i) | |
64 | { | |
65 | // Confidence value: | |
66 | cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]); | |
67 | // Calculate limits: | |
68 | double lower_limit = sqrt((N - 1) * Sd * Sd / quantile(complement(dist, alpha[i] / 2))); | |
69 | double upper_limit = sqrt((N - 1) * Sd * Sd / quantile(dist, alpha[i] / 2)); | |
70 | // Print Limits: | |
71 | cout << fixed << setprecision(5) << setw(15) << right << lower_limit; | |
72 | cout << fixed << setprecision(5) << setw(15) << right << upper_limit << endl; | |
73 | } | |
74 | cout << endl; | |
75 | ||
76 | To see some example output we'll use the | |
77 | [@http://www.itl.nist.gov/div898/handbook/eda/section3/eda3581.htm | |
78 | gear data] from the __handbook. | |
79 | The data represents measurements of gear diameter from a manufacturing | |
80 | process. | |
81 | ||
82 | [pre''' | |
83 | ________________________________________________ | |
84 | 2-Sided Confidence Limits For Standard Deviation | |
85 | ________________________________________________ | |
86 | ||
87 | Number of Observations = 100 | |
88 | Standard Deviation = 0.006278908 | |
89 | ||
90 | ||
91 | _____________________________________________ | |
92 | Confidence Lower Upper | |
93 | Value (%) Limit Limit | |
94 | _____________________________________________ | |
95 | 50.000 0.00601 0.00662 | |
96 | 75.000 0.00582 0.00685 | |
97 | 90.000 0.00563 0.00712 | |
98 | 95.000 0.00551 0.00729 | |
99 | 99.000 0.00530 0.00766 | |
100 | 99.900 0.00507 0.00812 | |
101 | 99.990 0.00489 0.00855 | |
102 | 99.999 0.00474 0.00895 | |
103 | '''] | |
104 | ||
105 | So at the 95% confidence level we conclude that the standard deviation | |
106 | is between 0.00551 and 0.00729. | |
107 | ||
108 | [h4 Confidence intervals as a function of the number of observations] | |
109 | ||
110 | Similarly, we can also list the confidence intervals for the standard deviation | |
111 | for the common confidence levels 95%, for increasing numbers of observations. | |
112 | ||
113 | The standard deviation used to compute these values is unity, | |
114 | so the limits listed are *multipliers* for any particular standard deviation. | |
115 | For example, given a standard deviation of 0.0062789 as in the example | |
116 | above; for 100 observations the multiplier is 0.8780 | |
117 | giving the lower confidence limit of 0.8780 * 0.006728 = 0.00551. | |
118 | ||
119 | [pre''' | |
120 | ____________________________________________________ | |
121 | Confidence level (two-sided) = 0.0500000 | |
122 | Standard Deviation = 1.0000000 | |
123 | ________________________________________ | |
124 | Observations Lower Upper | |
125 | Limit Limit | |
126 | ________________________________________ | |
127 | 2 0.4461 31.9102 | |
128 | 3 0.5207 6.2847 | |
129 | 4 0.5665 3.7285 | |
130 | 5 0.5991 2.8736 | |
131 | 6 0.6242 2.4526 | |
132 | 7 0.6444 2.2021 | |
133 | 8 0.6612 2.0353 | |
134 | 9 0.6755 1.9158 | |
135 | 10 0.6878 1.8256 | |
136 | 15 0.7321 1.5771 | |
137 | 20 0.7605 1.4606 | |
138 | 30 0.7964 1.3443 | |
139 | 40 0.8192 1.2840 | |
140 | 50 0.8353 1.2461 | |
141 | 60 0.8476 1.2197 | |
142 | 100 0.8780 1.1617 | |
143 | 120 0.8875 1.1454 | |
144 | 1000 0.9580 1.0459 | |
145 | 10000 0.9863 1.0141 | |
146 | 50000 0.9938 1.0062 | |
147 | 100000 0.9956 1.0044 | |
148 | 1000000 0.9986 1.0014 | |
149 | '''] | |
150 | ||
151 | With just 2 observations the limits are from *0.445* up to to *31.9*, | |
152 | so the standard deviation might be about *half* | |
153 | the observed value up to [*30 times] the observed value! | |
154 | ||
155 | Estimating a standard deviation with just a handful of values leaves a very great uncertainty, | |
156 | especially the upper limit. | |
157 | Note especially how far the upper limit is skewed from the most likely standard deviation. | |
158 | ||
159 | Even for 10 observations, normally considered a reasonable number, | |
160 | the range is still from 0.69 to 1.8, about a range of 0.7 to 2, | |
161 | and is still highly skewed with an upper limit *twice* the median. | |
162 | ||
163 | When we have 1000 observations, the estimate of the standard deviation is starting to look convincing, | |
164 | with a range from 0.95 to 1.05 - now near symmetrical, but still about + or - 5%. | |
165 | ||
166 | Only when we have 10000 or more repeated observations can we start to be reasonably confident | |
167 | (provided we are sure that other factors like drift are not creeping in). | |
168 | ||
169 | For 10000 observations, the interval is 0.99 to 1.1 - finally a really convincing + or -1% confidence. | |
170 | ||
171 | [endsect] [/section:chi_sq_intervals Confidence Intervals on the Standard Deviation] | |
172 | ||
173 | [section:chi_sq_test Chi-Square Test for the Standard Deviation] | |
174 | ||
175 | We use this test to determine whether the standard deviation of a sample | |
176 | differs from a specified value. Typically this occurs in process change | |
177 | situations where we wish to compare the standard deviation of a new | |
178 | process to an established one. | |
179 | ||
180 | The code for this example is contained in | |
181 | [@../../example/chi_square_std_dev_test.cpp chi_square_std_dev_test.cpp], and | |
182 | we'll begin by defining the procedure that will print out the test | |
183 | statistics: | |
184 | ||
185 | void chi_squared_test( | |
186 | double Sd, // Sample std deviation | |
187 | double D, // True std deviation | |
188 | unsigned N, // Sample size | |
189 | double alpha) // Significance level | |
190 | { | |
191 | ||
192 | The procedure begins by printing a summary of the input data: | |
193 | ||
194 | using namespace std; | |
195 | using namespace boost::math; | |
196 | ||
197 | // Print header: | |
198 | cout << | |
199 | "______________________________________________\n" | |
200 | "Chi Squared test for sample standard deviation\n" | |
201 | "______________________________________________\n\n"; | |
202 | cout << setprecision(5); | |
203 | cout << setw(55) << left << "Number of Observations" << "= " << N << "\n"; | |
204 | cout << setw(55) << left << "Sample Standard Deviation" << "= " << Sd << "\n"; | |
205 | cout << setw(55) << left << "Expected True Standard Deviation" << "= " << D << "\n\n"; | |
206 | ||
207 | The test statistic (T) is simply the ratio of the sample and "true" standard | |
208 | deviations squared, multiplied by the number of degrees of freedom (the | |
209 | sample size less one): | |
210 | ||
211 | double t_stat = (N - 1) * (Sd / D) * (Sd / D); | |
212 | cout << setw(55) << left << "Test Statistic" << "= " << t_stat << "\n"; | |
213 | ||
214 | The distribution we need to use, is a Chi Squared distribution with N-1 | |
215 | degrees of freedom: | |
216 | ||
217 | chi_squared dist(N - 1); | |
218 | ||
219 | The various hypothesis that can be tested are summarised in the following table: | |
220 | ||
221 | [table | |
222 | [[Hypothesis][Test]] | |
223 | [[The null-hypothesis: there is no difference in standard deviation from the specified value] | |
224 | [ Reject if T < [chi][super 2][sub (1-alpha/2; N-1)] or T > [chi][super 2][sub (alpha/2; N-1)] ]] | |
225 | [[The alternative hypothesis: there is a difference in standard deviation from the specified value] | |
226 | [ Reject if [chi][super 2][sub (1-alpha/2; N-1)] >= T >= [chi][super 2][sub (alpha/2; N-1)] ]] | |
227 | [[The alternative hypothesis: the standard deviation is less than the specified value] | |
228 | [ Reject if [chi][super 2][sub (1-alpha; N-1)] <= T ]] | |
229 | [[The alternative hypothesis: the standard deviation is greater than the specified value] | |
230 | [ Reject if [chi][super 2][sub (alpha; N-1)] >= T ]] | |
231 | ] | |
232 | ||
233 | Where [chi][super 2][sub (alpha; N-1)] is the upper critical value of the | |
234 | Chi Squared distribution, and [chi][super 2][sub (1-alpha; N-1)] is the | |
235 | lower critical value. | |
236 | ||
237 | Recall that the lower critical value is the same | |
238 | as the quantile, and the upper critical value is the same as the quantile | |
239 | from the complement of the probability, that gives us the following code | |
240 | to calculate the critical values: | |
241 | ||
242 | double ucv = quantile(complement(dist, alpha)); | |
243 | double ucv2 = quantile(complement(dist, alpha / 2)); | |
244 | double lcv = quantile(dist, alpha); | |
245 | double lcv2 = quantile(dist, alpha / 2); | |
246 | cout << setw(55) << left << "Upper Critical Value at alpha: " << "= " | |
247 | << setprecision(3) << scientific << ucv << "\n"; | |
248 | cout << setw(55) << left << "Upper Critical Value at alpha/2: " << "= " | |
249 | << setprecision(3) << scientific << ucv2 << "\n"; | |
250 | cout << setw(55) << left << "Lower Critical Value at alpha: " << "= " | |
251 | << setprecision(3) << scientific << lcv << "\n"; | |
252 | cout << setw(55) << left << "Lower Critical Value at alpha/2: " << "= " | |
253 | << setprecision(3) << scientific << lcv2 << "\n\n"; | |
254 | ||
255 | Now that we have the critical values, we can compare these to our test | |
256 | statistic, and print out the result of each hypothesis and test: | |
257 | ||
258 | cout << setw(55) << left << | |
259 | "Results for Alternative Hypothesis and alpha" << "= " | |
260 | << setprecision(4) << fixed << alpha << "\n\n"; | |
261 | cout << "Alternative Hypothesis Conclusion\n"; | |
262 | ||
263 | cout << "Standard Deviation != " << setprecision(3) << fixed << D << " "; | |
264 | if((ucv2 < t_stat) || (lcv2 > t_stat)) | |
265 | cout << "ACCEPTED\n"; | |
266 | else | |
267 | cout << "REJECTED\n"; | |
268 | ||
269 | cout << "Standard Deviation < " << setprecision(3) << fixed << D << " "; | |
270 | if(lcv > t_stat) | |
271 | cout << "ACCEPTED\n"; | |
272 | else | |
273 | cout << "REJECTED\n"; | |
274 | ||
275 | cout << "Standard Deviation > " << setprecision(3) << fixed << D << " "; | |
276 | if(ucv < t_stat) | |
277 | cout << "ACCEPTED\n"; | |
278 | else | |
279 | cout << "REJECTED\n"; | |
280 | cout << endl << endl; | |
281 | ||
282 | To see some example output we'll use the | |
283 | [@http://www.itl.nist.gov/div898/handbook/eda/section3/eda3581.htm | |
284 | gear data] from the __handbook. | |
285 | The data represents measurements of gear diameter from a manufacturing | |
286 | process. The program output is deliberately designed to mirror | |
287 | the DATAPLOT output shown in the | |
288 | [@http://www.itl.nist.gov/div898/handbook/eda/section3/eda358.htm | |
289 | NIST Handbook Example]. | |
290 | ||
291 | [pre''' | |
292 | ______________________________________________ | |
293 | Chi Squared test for sample standard deviation | |
294 | ______________________________________________ | |
295 | ||
296 | Number of Observations = 100 | |
297 | Sample Standard Deviation = 0.00628 | |
298 | Expected True Standard Deviation = 0.10000 | |
299 | ||
300 | Test Statistic = 0.39030 | |
301 | CDF of test statistic: = 1.438e-099 | |
302 | Upper Critical Value at alpha: = 1.232e+002 | |
303 | Upper Critical Value at alpha/2: = 1.284e+002 | |
304 | Lower Critical Value at alpha: = 7.705e+001 | |
305 | Lower Critical Value at alpha/2: = 7.336e+001 | |
306 | ||
307 | Results for Alternative Hypothesis and alpha = 0.0500 | |
308 | ||
309 | Alternative Hypothesis Conclusion''' | |
310 | Standard Deviation != 0.100 ACCEPTED | |
311 | Standard Deviation < 0.100 ACCEPTED | |
312 | Standard Deviation > 0.100 REJECTED | |
313 | ] | |
314 | ||
315 | In this case we are testing whether the sample standard deviation is 0.1, | |
316 | and the null-hypothesis is rejected, so we conclude that the standard | |
317 | deviation ['is not] 0.1. | |
318 | ||
319 | For an alternative example, consider the | |
320 | [@http://www.itl.nist.gov/div898/handbook/prc/section2/prc23.htm | |
321 | silicon wafer data] again from the __handbook. | |
322 | In this scenario a supplier of 100 ohm.cm silicon wafers claims | |
323 | that his fabrication process can produce wafers with sufficient | |
324 | consistency so that the standard deviation of resistivity for | |
325 | the lot does not exceed 10 ohm.cm. A sample of N = 10 wafers taken | |
326 | from the lot has a standard deviation of 13.97 ohm.cm, and the question | |
327 | we ask ourselves is "Is the suppliers claim correct?". | |
328 | ||
329 | The program output now looks like this: | |
330 | ||
331 | [pre''' | |
332 | ______________________________________________ | |
333 | Chi Squared test for sample standard deviation | |
334 | ______________________________________________ | |
335 | ||
336 | Number of Observations = 10 | |
337 | Sample Standard Deviation = 13.97000 | |
338 | Expected True Standard Deviation = 10.00000 | |
339 | ||
340 | Test Statistic = 17.56448 | |
341 | CDF of test statistic: = 9.594e-001 | |
342 | Upper Critical Value at alpha: = 1.692e+001 | |
343 | Upper Critical Value at alpha/2: = 1.902e+001 | |
344 | Lower Critical Value at alpha: = 3.325e+000 | |
345 | Lower Critical Value at alpha/2: = 2.700e+000 | |
346 | ||
347 | Results for Alternative Hypothesis and alpha = 0.0500 | |
348 | ||
349 | Alternative Hypothesis Conclusion''' | |
350 | Standard Deviation != 10.000 REJECTED | |
351 | Standard Deviation < 10.000 REJECTED | |
352 | Standard Deviation > 10.000 ACCEPTED | |
353 | ] | |
354 | ||
355 | In this case, our null-hypothesis is that the standard deviation of | |
356 | the sample is less than 10: this hypothesis is rejected in the analysis | |
357 | above, and so we reject the manufacturers claim. | |
358 | ||
359 | [endsect] [/section:chi_sq_test Chi-Square Test for the Standard Deviation] | |
360 | ||
361 | [section:chi_sq_size Estimating the Required Sample Sizes for a Chi-Square Test for the Standard Deviation] | |
362 | ||
363 | Suppose we conduct a Chi Squared test for standard deviation and the result | |
364 | is borderline, a legitimate question to ask is "How large would the sample size | |
365 | have to be in order to produce a definitive result?" | |
366 | ||
367 | The class template [link math_toolkit.dist_ref.dists.chi_squared_dist | |
368 | chi_squared_distribution] has a static method | |
369 | `find_degrees_of_freedom` that will calculate this value for | |
370 | some acceptable risk of type I failure /alpha/, type II failure | |
371 | /beta/, and difference from the standard deviation /diff/. Please | |
372 | note that the method used works on variance, and not standard deviation | |
373 | as is usual for the Chi Squared Test. | |
374 | ||
375 | The code for this example is located in | |
376 | [@../../example/chi_square_std_dev_test.cpp chi_square_std_dev_test.cpp]. | |
377 | ||
378 | We begin by defining a procedure to print out the sample sizes required | |
379 | for various risk levels: | |
380 | ||
381 | void chi_squared_sample_sized( | |
382 | double diff, // difference from variance to detect | |
383 | double variance) // true variance | |
384 | { | |
385 | ||
386 | The procedure begins by printing out the input data: | |
387 | ||
388 | using namespace std; | |
389 | using namespace boost::math; | |
390 | ||
391 | // Print out general info: | |
392 | cout << | |
393 | "_____________________________________________________________\n" | |
394 | "Estimated sample sizes required for various confidence levels\n" | |
395 | "_____________________________________________________________\n\n"; | |
396 | cout << setprecision(5); | |
397 | cout << setw(40) << left << "True Variance" << "= " << variance << "\n"; | |
398 | cout << setw(40) << left << "Difference to detect" << "= " << diff << "\n"; | |
399 | ||
400 | And defines a table of significance levels for which we'll calculate sample sizes: | |
401 | ||
402 | double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 }; | |
403 | ||
404 | For each value of alpha we can calculate two sample sizes: one where the | |
405 | sample variance is less than the true value by /diff/ and one | |
406 | where it is greater than the true value by /diff/. Thanks to the | |
407 | asymmetric nature of the Chi Squared distribution these two values will | |
408 | not be the same, the difference in their calculation differs only in the | |
409 | sign of /diff/ that's passed to `find_degrees_of_freedom`. Finally | |
410 | in this example we'll simply things, and let risk level /beta/ be the | |
411 | same as /alpha/: | |
412 | ||
413 | cout << "\n\n" | |
414 | "_______________________________________________________________\n" | |
415 | "Confidence Estimated Estimated\n" | |
416 | " Value (%) Sample Size Sample Size\n" | |
417 | " (lower one (upper one\n" | |
418 | " sided test) sided test)\n" | |
419 | "_______________________________________________________________\n"; | |
420 | // | |
421 | // Now print out the data for the table rows. | |
422 | // | |
423 | for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i) | |
424 | { | |
425 | // Confidence value: | |
426 | cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]); | |
427 | // calculate df for a lower single sided test: | |
428 | double df = chi_squared::find_degrees_of_freedom( | |
429 | -diff, alpha[i], alpha[i], variance); | |
430 | // convert to sample size: | |
431 | double size = ceil(df) + 1; | |
432 | // Print size: | |
433 | cout << fixed << setprecision(0) << setw(16) << right << size; | |
434 | // calculate df for an upper single sided test: | |
435 | df = chi_squared::find_degrees_of_freedom( | |
436 | diff, alpha[i], alpha[i], variance); | |
437 | // convert to sample size: | |
438 | size = ceil(df) + 1; | |
439 | // Print size: | |
440 | cout << fixed << setprecision(0) << setw(16) << right << size << endl; | |
441 | } | |
442 | cout << endl; | |
443 | ||
444 | For some example output, consider the | |
445 | [@http://www.itl.nist.gov/div898/handbook/prc/section2/prc23.htm | |
446 | silicon wafer data] from the __handbook. | |
447 | In this scenario a supplier of 100 ohm.cm silicon wafers claims | |
448 | that his fabrication process can produce wafers with sufficient | |
449 | consistency so that the standard deviation of resistivity for | |
450 | the lot does not exceed 10 ohm.cm. A sample of N = 10 wafers taken | |
451 | from the lot has a standard deviation of 13.97 ohm.cm, and the question | |
452 | we ask ourselves is "How large would our sample have to be to reliably | |
453 | detect this difference?". | |
454 | ||
455 | To use our procedure above, we have to convert the | |
456 | standard deviations to variance (square them), | |
457 | after which the program output looks like this: | |
458 | ||
459 | [pre''' | |
460 | _____________________________________________________________ | |
461 | Estimated sample sizes required for various confidence levels | |
462 | _____________________________________________________________ | |
463 | ||
464 | True Variance = 100.00000 | |
465 | Difference to detect = 95.16090 | |
466 | ||
467 | ||
468 | _______________________________________________________________ | |
469 | Confidence Estimated Estimated | |
470 | Value (%) Sample Size Sample Size | |
471 | (lower one (upper one | |
472 | sided test) sided test) | |
473 | _______________________________________________________________ | |
474 | 50.000 2 2 | |
475 | 75.000 2 10 | |
476 | 90.000 4 32 | |
477 | 95.000 5 51 | |
478 | 99.000 7 99 | |
479 | 99.900 11 174 | |
480 | 99.990 15 251 | |
481 | 99.999 20 330''' | |
482 | ] | |
483 | ||
484 | In this case we are interested in a upper single sided test. | |
485 | So for example, if the maximum acceptable risk of falsely rejecting | |
486 | the null-hypothesis is 0.05 (Type I error), and the maximum acceptable | |
487 | risk of failing to reject the null-hypothesis is also 0.05 | |
488 | (Type II error), we estimate that we would need a sample size of 51. | |
489 | ||
490 | [endsect] [/section:chi_sq_size Estimating the Required Sample Sizes for a Chi-Square Test for the Standard Deviation] | |
491 | ||
492 | [endsect] [/section:cs_eg Chi Squared Distribution] | |
493 | ||
494 | [/ | |
495 | Copyright 2006, 2013 John Maddock and Paul A. Bristow. | |
496 | Distributed under the Boost Software License, Version 1.0. | |
497 | (See accompanying file LICENSE_1_0.txt or copy at | |
498 | http://www.boost.org/LICENSE_1_0.txt). | |
499 | ] | |
500 |