2 [section:cs_eg Chi Squared Distribution Examples]
4 [section:chi_sq_intervals Confidence Intervals on the Standard Deviation]
6 Once you have calculated the standard deviation for your data, a legitimate
7 question to ask is "How reliable is the calculated standard deviation?".
8 For this situation the Chi Squared distribution can be used to calculate
9 confidence intervals for the standard deviation.
11 The full example code & sample output is in
12 [@../../example/chi_square_std_dev_test.cpp chi_square_std_dev_test.cpp].
14 We'll begin by defining the procedure that will calculate and print out the
17 void confidence_limits_on_std_deviation(
18 double Sd, // Sample Standard Deviation
19 unsigned N) // Sample size
22 We'll begin by printing out some general information:
25 "________________________________________________\n"
26 "2-Sided Confidence Limits For Standard Deviation\n"
27 "________________________________________________\n\n";
28 cout << setprecision(7);
29 cout << setw(40) << left << "Number of Observations" << "= " << N << "\n";
30 cout << setw(40) << left << "Standard Deviation" << "= " << Sd << "\n";
32 and then define a table of significance levels for which we'll calculate
35 double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };
37 The distribution we'll need to calculate the confidence intervals is a
38 Chi Squared distribution, with N-1 degrees of freedom:
40 chi_squared dist(N - 1);
42 For each value of alpha, the formula for the confidence interval is given by:
44 [equation chi_squ_tut1]
46 Where [equation chi_squ_tut2] is the upper critical value, and
47 [equation chi_squ_tut3] is the lower critical value of the
48 Chi Squared distribution.
50 In code we begin by printing out a table header:
53 "_____________________________________________\n"
54 "Confidence Lower Upper\n"
55 " Value (%) Limit Limit\n"
56 "_____________________________________________\n";
58 and then loop over the values of alpha and calculate the intervals
59 for each: remember that the lower critical value is the same as the
60 quantile, and the upper critical value is the same as the quantile
61 from the complement of the probability:
63 for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
66 cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
68 double lower_limit = sqrt((N - 1) * Sd * Sd / quantile(complement(dist, alpha[i] / 2)));
69 double upper_limit = sqrt((N - 1) * Sd * Sd / quantile(dist, alpha[i] / 2));
71 cout << fixed << setprecision(5) << setw(15) << right << lower_limit;
72 cout << fixed << setprecision(5) << setw(15) << right << upper_limit << endl;
76 To see some example output we'll use the
77 [@http://www.itl.nist.gov/div898/handbook/eda/section3/eda3581.htm
78 gear data] from the __handbook.
79 The data represents measurements of gear diameter from a manufacturing
83 ________________________________________________
84 2-Sided Confidence Limits For Standard Deviation
85 ________________________________________________
87 Number of Observations = 100
88 Standard Deviation = 0.006278908
91 _____________________________________________
92 Confidence Lower Upper
94 _____________________________________________
95 50.000 0.00601 0.00662
96 75.000 0.00582 0.00685
97 90.000 0.00563 0.00712
98 95.000 0.00551 0.00729
99 99.000 0.00530 0.00766
100 99.900 0.00507 0.00812
101 99.990 0.00489 0.00855
102 99.999 0.00474 0.00895
105 So at the 95% confidence level we conclude that the standard deviation
106 is between 0.00551 and 0.00729.
108 [h4 Confidence intervals as a function of the number of observations]
110 Similarly, we can also list the confidence intervals for the standard deviation
111 for the common confidence levels 95%, for increasing numbers of observations.
113 The standard deviation used to compute these values is unity,
114 so the limits listed are *multipliers* for any particular standard deviation.
115 For example, given a standard deviation of 0.0062789 as in the example
116 above; for 100 observations the multiplier is 0.8780
117 giving the lower confidence limit of 0.8780 * 0.006728 = 0.00551.
120 ____________________________________________________
121 Confidence level (two-sided) = 0.0500000
122 Standard Deviation = 1.0000000
123 ________________________________________
124 Observations Lower Upper
126 ________________________________________
148 1000000 0.9986 1.0014
151 With just 2 observations the limits are from *0.445* up to to *31.9*,
152 so the standard deviation might be about *half*
153 the observed value up to [*30 times] the observed value!
155 Estimating a standard deviation with just a handful of values leaves a very great uncertainty,
156 especially the upper limit.
157 Note especially how far the upper limit is skewed from the most likely standard deviation.
159 Even for 10 observations, normally considered a reasonable number,
160 the range is still from 0.69 to 1.8, about a range of 0.7 to 2,
161 and is still highly skewed with an upper limit *twice* the median.
163 When we have 1000 observations, the estimate of the standard deviation is starting to look convincing,
164 with a range from 0.95 to 1.05 - now near symmetrical, but still about + or - 5%.
166 Only when we have 10000 or more repeated observations can we start to be reasonably confident
167 (provided we are sure that other factors like drift are not creeping in).
169 For 10000 observations, the interval is 0.99 to 1.1 - finally a really convincing + or -1% confidence.
171 [endsect] [/section:chi_sq_intervals Confidence Intervals on the Standard Deviation]
173 [section:chi_sq_test Chi-Square Test for the Standard Deviation]
175 We use this test to determine whether the standard deviation of a sample
176 differs from a specified value. Typically this occurs in process change
177 situations where we wish to compare the standard deviation of a new
178 process to an established one.
180 The code for this example is contained in
181 [@../../example/chi_square_std_dev_test.cpp chi_square_std_dev_test.cpp], and
182 we'll begin by defining the procedure that will print out the test
185 void chi_squared_test(
186 double Sd, // Sample std deviation
187 double D, // True std deviation
188 unsigned N, // Sample size
189 double alpha) // Significance level
192 The procedure begins by printing a summary of the input data:
195 using namespace boost::math;
199 "______________________________________________\n"
200 "Chi Squared test for sample standard deviation\n"
201 "______________________________________________\n\n";
202 cout << setprecision(5);
203 cout << setw(55) << left << "Number of Observations" << "= " << N << "\n";
204 cout << setw(55) << left << "Sample Standard Deviation" << "= " << Sd << "\n";
205 cout << setw(55) << left << "Expected True Standard Deviation" << "= " << D << "\n\n";
207 The test statistic (T) is simply the ratio of the sample and "true" standard
208 deviations squared, multiplied by the number of degrees of freedom (the
209 sample size less one):
211 double t_stat = (N - 1) * (Sd / D) * (Sd / D);
212 cout << setw(55) << left << "Test Statistic" << "= " << t_stat << "\n";
214 The distribution we need to use, is a Chi Squared distribution with N-1
217 chi_squared dist(N - 1);
219 The various hypothesis that can be tested are summarised in the following table:
223 [[The null-hypothesis: there is no difference in standard deviation from the specified value]
224 [ Reject if T < [chi][super 2][sub (1-alpha/2; N-1)] or T > [chi][super 2][sub (alpha/2; N-1)] ]]
225 [[The alternative hypothesis: there is a difference in standard deviation from the specified value]
226 [ Reject if [chi][super 2][sub (1-alpha/2; N-1)] >= T >= [chi][super 2][sub (alpha/2; N-1)] ]]
227 [[The alternative hypothesis: the standard deviation is less than the specified value]
228 [ Reject if [chi][super 2][sub (1-alpha; N-1)] <= T ]]
229 [[The alternative hypothesis: the standard deviation is greater than the specified value]
230 [ Reject if [chi][super 2][sub (alpha; N-1)] >= T ]]
233 Where [chi][super 2][sub (alpha; N-1)] is the upper critical value of the
234 Chi Squared distribution, and [chi][super 2][sub (1-alpha; N-1)] is the
235 lower critical value.
237 Recall that the lower critical value is the same
238 as the quantile, and the upper critical value is the same as the quantile
239 from the complement of the probability, that gives us the following code
240 to calculate the critical values:
242 double ucv = quantile(complement(dist, alpha));
243 double ucv2 = quantile(complement(dist, alpha / 2));
244 double lcv = quantile(dist, alpha);
245 double lcv2 = quantile(dist, alpha / 2);
246 cout << setw(55) << left << "Upper Critical Value at alpha: " << "= "
247 << setprecision(3) << scientific << ucv << "\n";
248 cout << setw(55) << left << "Upper Critical Value at alpha/2: " << "= "
249 << setprecision(3) << scientific << ucv2 << "\n";
250 cout << setw(55) << left << "Lower Critical Value at alpha: " << "= "
251 << setprecision(3) << scientific << lcv << "\n";
252 cout << setw(55) << left << "Lower Critical Value at alpha/2: " << "= "
253 << setprecision(3) << scientific << lcv2 << "\n\n";
255 Now that we have the critical values, we can compare these to our test
256 statistic, and print out the result of each hypothesis and test:
258 cout << setw(55) << left <<
259 "Results for Alternative Hypothesis and alpha" << "= "
260 << setprecision(4) << fixed << alpha << "\n\n";
261 cout << "Alternative Hypothesis Conclusion\n";
263 cout << "Standard Deviation != " << setprecision(3) << fixed << D << " ";
264 if((ucv2 < t_stat) || (lcv2 > t_stat))
265 cout << "ACCEPTED\n";
267 cout << "REJECTED\n";
269 cout << "Standard Deviation < " << setprecision(3) << fixed << D << " ";
271 cout << "ACCEPTED\n";
273 cout << "REJECTED\n";
275 cout << "Standard Deviation > " << setprecision(3) << fixed << D << " ";
277 cout << "ACCEPTED\n";
279 cout << "REJECTED\n";
280 cout << endl << endl;
282 To see some example output we'll use the
283 [@http://www.itl.nist.gov/div898/handbook/eda/section3/eda3581.htm
284 gear data] from the __handbook.
285 The data represents measurements of gear diameter from a manufacturing
286 process. The program output is deliberately designed to mirror
287 the DATAPLOT output shown in the
288 [@http://www.itl.nist.gov/div898/handbook/eda/section3/eda358.htm
289 NIST Handbook Example].
292 ______________________________________________
293 Chi Squared test for sample standard deviation
294 ______________________________________________
296 Number of Observations = 100
297 Sample Standard Deviation = 0.00628
298 Expected True Standard Deviation = 0.10000
300 Test Statistic = 0.39030
301 CDF of test statistic: = 1.438e-099
302 Upper Critical Value at alpha: = 1.232e+002
303 Upper Critical Value at alpha/2: = 1.284e+002
304 Lower Critical Value at alpha: = 7.705e+001
305 Lower Critical Value at alpha/2: = 7.336e+001
307 Results for Alternative Hypothesis and alpha = 0.0500
309 Alternative Hypothesis Conclusion'''
310 Standard Deviation != 0.100 ACCEPTED
311 Standard Deviation < 0.100 ACCEPTED
312 Standard Deviation > 0.100 REJECTED
315 In this case we are testing whether the sample standard deviation is 0.1,
316 and the null-hypothesis is rejected, so we conclude that the standard
317 deviation ['is not] 0.1.
319 For an alternative example, consider the
320 [@http://www.itl.nist.gov/div898/handbook/prc/section2/prc23.htm
321 silicon wafer data] again from the __handbook.
322 In this scenario a supplier of 100 ohm.cm silicon wafers claims
323 that his fabrication process can produce wafers with sufficient
324 consistency so that the standard deviation of resistivity for
325 the lot does not exceed 10 ohm.cm. A sample of N = 10 wafers taken
326 from the lot has a standard deviation of 13.97 ohm.cm, and the question
327 we ask ourselves is "Is the suppliers claim correct?".
329 The program output now looks like this:
332 ______________________________________________
333 Chi Squared test for sample standard deviation
334 ______________________________________________
336 Number of Observations = 10
337 Sample Standard Deviation = 13.97000
338 Expected True Standard Deviation = 10.00000
340 Test Statistic = 17.56448
341 CDF of test statistic: = 9.594e-001
342 Upper Critical Value at alpha: = 1.692e+001
343 Upper Critical Value at alpha/2: = 1.902e+001
344 Lower Critical Value at alpha: = 3.325e+000
345 Lower Critical Value at alpha/2: = 2.700e+000
347 Results for Alternative Hypothesis and alpha = 0.0500
349 Alternative Hypothesis Conclusion'''
350 Standard Deviation != 10.000 REJECTED
351 Standard Deviation < 10.000 REJECTED
352 Standard Deviation > 10.000 ACCEPTED
355 In this case, our null-hypothesis is that the standard deviation of
356 the sample is less than 10: this hypothesis is rejected in the analysis
357 above, and so we reject the manufacturers claim.
359 [endsect] [/section:chi_sq_test Chi-Square Test for the Standard Deviation]
361 [section:chi_sq_size Estimating the Required Sample Sizes for a Chi-Square Test for the Standard Deviation]
363 Suppose we conduct a Chi Squared test for standard deviation and the result
364 is borderline, a legitimate question to ask is "How large would the sample size
365 have to be in order to produce a definitive result?"
367 The class template [link math_toolkit.dist_ref.dists.chi_squared_dist
368 chi_squared_distribution] has a static method
369 `find_degrees_of_freedom` that will calculate this value for
370 some acceptable risk of type I failure /alpha/, type II failure
371 /beta/, and difference from the standard deviation /diff/. Please
372 note that the method used works on variance, and not standard deviation
373 as is usual for the Chi Squared Test.
375 The code for this example is located in
376 [@../../example/chi_square_std_dev_test.cpp chi_square_std_dev_test.cpp].
378 We begin by defining a procedure to print out the sample sizes required
379 for various risk levels:
381 void chi_squared_sample_sized(
382 double diff, // difference from variance to detect
383 double variance) // true variance
386 The procedure begins by printing out the input data:
389 using namespace boost::math;
391 // Print out general info:
393 "_____________________________________________________________\n"
394 "Estimated sample sizes required for various confidence levels\n"
395 "_____________________________________________________________\n\n";
396 cout << setprecision(5);
397 cout << setw(40) << left << "True Variance" << "= " << variance << "\n";
398 cout << setw(40) << left << "Difference to detect" << "= " << diff << "\n";
400 And defines a table of significance levels for which we'll calculate sample sizes:
402 double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };
404 For each value of alpha we can calculate two sample sizes: one where the
405 sample variance is less than the true value by /diff/ and one
406 where it is greater than the true value by /diff/. Thanks to the
407 asymmetric nature of the Chi Squared distribution these two values will
408 not be the same, the difference in their calculation differs only in the
409 sign of /diff/ that's passed to `find_degrees_of_freedom`. Finally
410 in this example we'll simply things, and let risk level /beta/ be the
414 "_______________________________________________________________\n"
415 "Confidence Estimated Estimated\n"
416 " Value (%) Sample Size Sample Size\n"
417 " (lower one (upper one\n"
418 " sided test) sided test)\n"
419 "_______________________________________________________________\n";
421 // Now print out the data for the table rows.
423 for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
426 cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
427 // calculate df for a lower single sided test:
428 double df = chi_squared::find_degrees_of_freedom(
429 -diff, alpha[i], alpha[i], variance);
430 // convert to sample size:
431 double size = ceil(df) + 1;
433 cout << fixed << setprecision(0) << setw(16) << right << size;
434 // calculate df for an upper single sided test:
435 df = chi_squared::find_degrees_of_freedom(
436 diff, alpha[i], alpha[i], variance);
437 // convert to sample size:
440 cout << fixed << setprecision(0) << setw(16) << right << size << endl;
444 For some example output, consider the
445 [@http://www.itl.nist.gov/div898/handbook/prc/section2/prc23.htm
446 silicon wafer data] from the __handbook.
447 In this scenario a supplier of 100 ohm.cm silicon wafers claims
448 that his fabrication process can produce wafers with sufficient
449 consistency so that the standard deviation of resistivity for
450 the lot does not exceed 10 ohm.cm. A sample of N = 10 wafers taken
451 from the lot has a standard deviation of 13.97 ohm.cm, and the question
452 we ask ourselves is "How large would our sample have to be to reliably
453 detect this difference?".
455 To use our procedure above, we have to convert the
456 standard deviations to variance (square them),
457 after which the program output looks like this:
460 _____________________________________________________________
461 Estimated sample sizes required for various confidence levels
462 _____________________________________________________________
464 True Variance = 100.00000
465 Difference to detect = 95.16090
468 _______________________________________________________________
469 Confidence Estimated Estimated
470 Value (%) Sample Size Sample Size
471 (lower one (upper one
472 sided test) sided test)
473 _______________________________________________________________
484 In this case we are interested in a upper single sided test.
485 So for example, if the maximum acceptable risk of falsely rejecting
486 the null-hypothesis is 0.05 (Type I error), and the maximum acceptable
487 risk of failing to reject the null-hypothesis is also 0.05
488 (Type II error), we estimate that we would need a sample size of 51.
490 [endsect] [/section:chi_sq_size Estimating the Required Sample Sizes for a Chi-Square Test for the Standard Deviation]
492 [endsect] [/section:cs_eg Chi Squared Distribution]
495 Copyright 2006, 2013 John Maddock and Paul A. Bristow.
496 Distributed under the Boost Software License, Version 1.0.
497 (See accompanying file LICENSE_1_0.txt or copy at
498 http://www.boost.org/LICENSE_1_0.txt).