ceph/src/boost/libs/math/doc/policies/policy_tutorial.qbk

   1
   2 [section:pol_tutorial Policy Tutorial]
   3
   4 [section:what_is_a_policy So Just What is a Policy Anyway?]
   5
   6 A policy is a compile-time mechanism for customising the behaviour of a
   7 special function, or a statistical distribution.  With Policies you can
   8 control:
   9
  10 * What action to take when an error occurs.
  11 * What happens when you call a function that is mathematically undefined
  12 (for example, if you ask for the mean of a Cauchy distribution).
  13 * What happens when you ask for a quantile of a discrete distribution.
  14 * Whether the library is allowed to internally promote `float` to `double`
  15 and `double` to `long double` in order to improve precision.
  16 * What precision to use when calculating the result.
  17
  18 Some of these policies could arguably be runtime variables, but then we couldn't
  19 use compile-time dispatch internally to select the best evaluation method
  20 for the given policies.
  21
  22 For this reason a Policy is a /type/: in fact it's an instantiation of the
  23 class template `boost::math::policies::policy<>`.  This class is just a
  24 compile-time container of user-selected policies (sometimes called a type-list):
  25
  26    using namespace boost::math::policies;
  27    //
  28    // Define a policy that sets ::errno on domain errors, and does
  29    // not promote double to long double internally:
  30    //
  31    typedef policy<domain_error<errno_on_error>, promote_double<false> > mypolicy;
  32
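To see why a type, rather than a runtime variable, makes compile-time dispatch possible, here is a minimal sketch in standard C++.  It does not use this library: `sketch_policy`, the two tag types, `raise_domain_error` and `checked_sqrt` are hypothetical names invented for illustration.  The error-handling action is chosen by overload resolution on a type carried by the policy, so there is no runtime test of the policy at all.

```cpp
#include <cerrno>
#include <cmath>
#include <limits>
#include <stdexcept>

// Hypothetical action tags (illustrative names, not the library's own).
struct throw_on_error_tag {};
struct errno_on_error_tag {};

// A "policy" here is just a type that records the chosen action.
template <class DomainErrorAction = throw_on_error_tag>
struct sketch_policy
{
    typedef DomainErrorAction domain_error_action;
};

// One overload per action: the compiler selects one at compile time.
inline double raise_domain_error(const char* message, throw_on_error_tag)
{
    throw std::domain_error(message);
}

inline double raise_domain_error(const char*, errno_on_error_tag)
{
    errno = EDOM;
    return std::numeric_limits<double>::quiet_NaN();
}

// A toy "special function" that forwards errors to the policy's action.
template <class Policy>
double checked_sqrt(double x, const Policy&)
{
    if (x < 0)
    {
        return raise_domain_error("checked_sqrt: negative argument",
                                  typename Policy::domain_error_action());
    }
    return std::sqrt(x);
}
```

Calling `checked_sqrt(-1.0, sketch_policy<errno_on_error_tag>())` sets `errno` to `EDOM` and returns a NaN, while the same call with `sketch_policy<>()` throws `std::domain_error`.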
  33 [endsect] [/section:what_is_a_policy So Just What is a Policy Anyway?]
  34
  35 [section:policy_tut_defaults Policies Have Sensible Defaults]
  36
  37 Most of the time you can just ignore the policy framework.
  38
  39 [*['The defaults for the various policies are as follows;
  40 if these work for you then you can stop reading now!]]
  41
  42 [variablelist
  43 [[Domain Error][Throws a `std::domain_error` exception.]]
  44 [[Pole Error][Occurs when a function is evaluated at a pole: throws a `std::domain_error` exception.]]
  45 [[Overflow Error][Throws a `std::overflow_error` exception.]]
  46 [[Underflow][Ignores the underflow, and returns zero.]]
  47 [[Denormalised Result][Ignores the fact that the result is denormalised, and returns it.]]
  48 [[Rounding Error][Throws a `boost::math::rounding_error` exception.]]
  49 [[Internal Evaluation Error][Throws a `boost::math::evaluation_error` exception.]]
  50 [[Indeterminate Result Error][Returns a result that depends on the function where the error occurred.]]
  51 [[Promotion of float to double][Does occur by default - gives full float precision results.]]
  52 [[Promotion of double to long double][Does occur by default if long double offers
  53    more precision than double.]]
  54 [[Precision of Approximation Used][By default uses an approximation that
  55    will result in the lowest level of error for the type of the result.]]
  56 [[Behaviour of Discrete Quantiles]
  57    [
  58    The quantile function will by default return an integer result that has been
  59    /rounded outwards/.  That is to say lower quantiles (where the probability is
  60    less than 0.5) are rounded downward, and upper quantiles (where the probability
  61    is greater than 0.5) are rounded upwards.  This behaviour
  62    ensures that if an X% quantile is requested, then /at least/ the requested
  63    coverage will be present in the central region, and /no more than/
  64    the requested coverage will be present in the tails.
  65
  66 This behaviour can be changed so that the quantile functions are rounded
  67    differently, or even return a real-valued result using
  68    [link math_toolkit.pol_overview Policies].  It is strongly
  69    recommended that you read the tutorial
  70    [link math_toolkit.pol_tutorial.understand_dis_quant
  71    Understanding Quantiles of Discrete Distributions] before
  72    using the quantile function on a discrete distribution.  The
  73    [link math_toolkit.pol_ref.discrete_quant_ref reference docs]
  74    describe how to change the rounding policy
  75    for these distributions.
  76 ]]
  77 ]
  78
  79 What's more, if you define your own policy type, then it automatically
  80 inherits the defaults for any policies not explicitly set, so given:
  81
  82    using namespace boost::math::policies;
  83    //
  84    // Define a policy that sets ::errno on domain errors, and does
  85    // not promote double to long double internally:
  86    //
  87    typedef policy<domain_error<errno_on_error>, promote_double<false> > mypolicy;
  88
  89 then `mypolicy` defines a policy where only the domain error handling and
  90 `double`-promotion policies differ from the defaults.
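
The way unset options fall back to their defaults can be imitated in a few lines of standard C++.  The sketch below is an illustration only and is not how this library implements `policy<>`: every name in it (`sketch_policy`, `select_option`, `domain_error_opt`, `promote_double_opt` and the two predicates) is hypothetical.  Each setting is looked up in the list of supplied options, and a default is used when no option of that kind is present.

```cpp
#include <type_traits>

// Hypothetical option types; each wraps one user-selectable setting.
enum class error_action { throw_exception, errno_on_error };

template <error_action A>
struct domain_error_opt { static constexpr error_action value = A; };

template <bool B>
struct promote_double_opt { static constexpr bool value = B; };

// Predicates that recognise each kind of option.
template <class T>
struct is_domain_error_opt : std::false_type {};
template <error_action A>
struct is_domain_error_opt<domain_error_opt<A> > : std::true_type {};

template <class T>
struct is_promote_double_opt : std::false_type {};
template <bool B>
struct is_promote_double_opt<promote_double_opt<B> > : std::true_type {};

// Pick the first option accepted by Pred, or Default if there is none.
template <template <class> class Pred, class Default, class... Options>
struct select_option { typedef Default type; };

template <template <class> class Pred, class Default, class First, class... Rest>
struct select_option<Pred, Default, First, Rest...>
{
    typedef typename std::conditional<
        Pred<First>::value,
        First,
        typename select_option<Pred, Default, Rest...>::type>::type type;
};

// The policy is a type-list; anything not listed keeps its default.
template <class... Options>
struct sketch_policy
{
    typedef typename select_option<is_domain_error_opt,
        domain_error_opt<error_action::throw_exception>,
        Options...>::type domain_error_type;
    typedef typename select_option<is_promote_double_opt,
        promote_double_opt<true>,
        Options...>::type promote_double_type;
};

typedef sketch_policy<domain_error_opt<error_action::errno_on_error>,
                      promote_double_opt<false> > my_sketch_policy;
```

Here `sketch_policy<promote_double_opt<false> >::domain_error_type` is still the throwing default, and the order in which the options are listed does not matter.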
  91
  92 [endsect][/section:policy_tut_defaults Policies Have Sensible Defaults]
  93
  94 [section:policy_usage So How are Policies Used Anyway?]
  95
  96 The details follow later, but basically policies can be set in any of the following ways:
  97
  98 * By defining some macros that change the default behaviour: [*this is the
  99    recommended method for setting installation-wide policies].
 100 * By instantiating a distribution object with an explicit policy:
 101    this is mainly reserved for ad hoc policy changes.
 102 * By passing a policy to a special function as an optional final argument:
 103    this is mainly reserved for ad hoc policy changes.
 104 * By using some helper macros to define a set of functions or distributions
 105 in the current namespace that use a specific policy: [*this is the
 106 recommended method for setting policies on a project- or translation-unit-wide
 107 basis].
 108
 109 The following sections introduce these methods in more detail.
 110
 111 [endsect] [/section:policy_usage So How are Policies Used Anyway?]
 112
 113 [section:changing_policy_defaults Changing the Policy Defaults]
 114
 115 The default policies used by the library are changed by the usual
 116 configuration macro method.
 117
 118 For example, passing `-DBOOST_MATH_DOMAIN_ERROR_POLICY=errno_on_error` to
 119 your compiler will cause domain errors to set `::errno` and return a __NaN
 120 rather than the usual default behaviour of throwing a `std::domain_error`
 121 exception.
 122
 123 [tip For Microsoft Visual Studio, you can add definitions to the Project Property Page,
 124 C/C++, Preprocessor, Preprocessor Definitions, like:
 125
 126 ``BOOST_MATH_ASSERT_UNDEFINED_POLICY=0
 127 BOOST_MATH_OVERFLOW_ERROR_POLICY=errno_on_error``
 128
 129 This may be helpful to avoid complications with pre-compiled headers
 130 that may mean that the equivalent definitions in source code:
 131
 132 ``#define BOOST_MATH_ASSERT_UNDEFINED_POLICY false
 133 #define BOOST_MATH_OVERFLOW_ERROR_POLICY errno_on_error``
 134
 135 *may be ignored*.
 136
 137 The compiler command line shows:
 138
 139 ``/D "BOOST_MATH_ASSERT_UNDEFINED_POLICY=0"
 140 /D "BOOST_MATH_OVERFLOW_ERROR_POLICY=errno_on_error"``
 141 ] [/MSVC tip]
 142
 143 There is however a very important caveat to this:
 144
 145 [important
 146 [*['Default policies changed by setting configuration macros must be changed
 147 uniformly in every translation unit in the program.]]
 148
 149 Failure to follow this rule may result in violations of the "One
 150 Definition Rule (ODR)" and result in unpredictable program behaviour.]
 151
 152 That means there are only two safe ways to use these macros:
 153
 154 * Edit them in [@../../../../boost/math/tools/user.hpp boost/math/tools/user.hpp],
 155 so that the defaults are set on an installation-wide basis.
 156 Unfortunately this may not be convenient if
 157 you are using a pre-installed Boost distribution (on Linux for example).
 158 * Set the defines in your project's Makefile or build environment, so that they
 159 are set uniformly across all translation units.
 160
 161 What you should *not* do is:
 162
 163 * Set the defines in the source file using `#define` as doing so
 164 almost certainly will break your program, unless you're absolutely
 165 certain that the program is restricted to a single translation unit.
 166
 167 And, yes, you will find examples in our test programs where we break this
 168 rule: but only because we know there will only ever be a single
 169 translation unit: ['don't say that you weren't warned!]
 170
 171 [import ../../example/error_handling_example.cpp]
 172
 173 [error_handling_example]
 174
 175 [endsect] [/section:changing_policy_defaults Changing the Policy Defaults]
 176
 177 [section:ad_hoc_dist_policies Setting Policies for Distributions on an Ad Hoc Basis]
 178
 179 All of the statistical distributions in this library are class templates
 180 that accept two template parameters:
 181 real type (float, double ...) and policy (how to handle exceptional events),
 182 both with sensible defaults, for example:
 183
 184    namespace boost{ namespace math{
 185
 186    template <class RealType = double, class Policy = policies::policy<> >
 187    class fisher_f_distribution;
 188
 189    typedef fisher_f_distribution<> fisher_f;
 190
 191    }}
 192
 193 This policy gets used by all the accessor functions that accept
 194 a distribution as an argument, and forwarded to all the functions called
 195 by these.  So if you use the shorthand-typedef for the distribution, then you get
 196 `double` precision arithmetic and all the default policies.
 197
 198 However, say for example we wanted to evaluate the quantile
 199 of the binomial distribution at float precision, without internal
 200 promotion to double, and with the result rounded to the /nearest/
 201 integer, then here's how it can be done:
 202
 203 [import ../../example/policy_eg_3.cpp]
 204
 205 [policy_eg_3]
 206
 207 Which outputs:
 208
 209 [pre quantile is: 40]
 210
 211 [endsect][/section:ad_hoc_dist_policies Setting Policies for Distributions on an Ad Hoc Basis]
 212
 213 [section:ad_hoc_sf_policies Changing the Policy on an Ad Hoc Basis for the Special Functions]
 214
 215 All of the special functions in this library come in two overloaded forms,
 216 one with a final "policy" parameter, and one without.  For example:
 217
 218    namespace boost{ namespace math{
 219
 220    template <class RealType, class Policy>
 221    RealType tgamma(RealType, const Policy&);
 222
 223    template <class RealType>
 224    RealType tgamma(RealType);
 225
 226    }} // namespaces
 227
 228 Normally, the second version is just a forwarding wrapper to the first
 229 like this:
 230
 231    template <class RealType>
 232    inline RealType tgamma(RealType x)
 233    {
 234       return tgamma(x, policies::policy<>());
 235    }
 236
 237 So calling a special function with a specific policy
 238 is just a matter of defining the policy type to use
 239 and passing it as the final parameter.  For example,
 240 suppose we want `tgamma` to behave in a C-compatible
 241 fashion and set `::errno` when an error occurs, and never
 242 throw an exception:
 243
 244 [import ../../example/policy_eg_1.cpp]
 245
 246 [policy_eg_1]
 247
 248 which outputs:
 249
 250 [pre
 251 Result of tgamma(30000) is: 1.#INF
 252 errno = 34
 253 Result of tgamma(-10) is: 1.#QNAN
 254 errno = 33
 255 ]
 256
 257 Alternatively, for ad hoc use, we can use the `make_policy`
 258 helper function to create a policy for us: this usage is more
 259 verbose, so is probably only preferred when a policy is going
 260 to be used once only:
 261
 262 [import ../../example/policy_eg_2.cpp]
 263
 264 [policy_eg_2]
 265
 266 [endsect] [/section:ad_hoc_sf_policies Changing the Policy on an Ad Hoc Basis for the Special Functions]
 267
 268 [section:namespace_policies Setting Policies at Namespace or Translation Unit Scope]
 269
 270 Sometimes what you want to do is just change a set of policies within
 271 the current scope: *the one thing you should not do in this situation
 272 is use the configuration macros*, as this can lead to "One Definition
 273 Rule" violations.  Instead this library provides a pair of macros
 274 especially for this purpose.
 275
 276 Let's consider the special functions first: we can declare a set of
 277 forwarding functions that all use a specific policy using the
 278 macro BOOST_MATH_DECLARE_SPECIAL_FUNCTIONS(['Policy]).  This
 279 macro should be used either inside a unique namespace set aside for the
 280 purpose (for example, a C namespace for a C-style policy),
 281 or an unnamed namespace if you just want the functions
 282 visible in global scope for the current file only.
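
Conceptually, such a macro just stamps out forwarding functions that bind one policy, inside whatever namespace it is expanded in.  The following self-contained sketch shows the idea without using this library: `sketch_lib`, `asin_checked`, the two policy types and the macro `SKETCH_DECLARE_FUNCTIONS` are all hypothetical stand-ins, and the real macro declares many more functions than this.

```cpp
#include <cerrno>
#include <cmath>
#include <limits>
#include <stdexcept>

namespace sketch_lib {

// Hypothetical policy types: one throws, one reports through ::errno.
struct throw_policy {};
struct errno_policy {};

inline double report_domain_error(const throw_policy&)
{
    throw std::domain_error("asin_checked: argument outside [-1, 1]");
}

inline double report_domain_error(const errno_policy&)
{
    errno = EDOM;
    return std::numeric_limits<double>::quiet_NaN();
}

// The "library" function takes the policy as its final argument.
template <class Policy>
double asin_checked(double x, const Policy& pol)
{
    return (x >= -1 && x <= 1) ? std::asin(x) : report_domain_error(pol);
}

} // namespace sketch_lib

// Hypothetical macro: declares a forwarding function bound to Policy
// in the namespace where it is expanded.
#define SKETCH_DECLARE_FUNCTIONS(Policy) \
    inline double asin_checked(double x) \
    { return sketch_lib::asin_checked(x, Policy()); }

// A named namespace set aside for C-style error reporting.
namespace c_style {
SKETCH_DECLARE_FUNCTIONS(sketch_lib::errno_policy)
}

// An unnamed namespace: visible unqualified, but only in this file.
namespace {
SKETCH_DECLARE_FUNCTIONS(sketch_lib::throw_policy)
}
```

After this, `c_style::asin_checked(2.0)` sets `errno` and returns a NaN, while an unqualified `asin_checked(2.0)` in the same file throws `std::domain_error`.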
 283
 284 [import ../../example/policy_eg_4.cpp]
 285
 286 [policy_eg_4]
 287
 288 The same mechanism also works well at file scope: by using an unnamed
 289 namespace, we can ensure that these declarations don't conflict with any
 290 alternate policies present in other translation units:
 291
 292 [import ../../example/policy_eg_5.cpp]
 293
 294 [policy_eg_5]
 295
 296 Handling policies for the statistical distributions is very similar except that now
 297 the macro BOOST_MATH_DECLARE_DISTRIBUTIONS accepts two parameters: the
 298 floating point type to use, and the policy type to apply.  For example:
 299
 300    BOOST_MATH_DECLARE_DISTRIBUTIONS(double, mypolicy)
 301
 302 Results in a set of typedefs being defined like this:
 303
 304    typedef boost::math::normal_distribution<double, mypolicy> normal;
 305
 306 The name of each typedef is the same as the name of the distribution
 307 class template, but without the "_distribution" suffix.
 308
 309 [import ../../example/policy_eg_6.cpp]
 310
 311 [policy_eg_6]
 312
 313 [note
 314 There is an important limitation to note: you can *not use the macros
 315 BOOST_MATH_DECLARE_DISTRIBUTIONS and BOOST_MATH_DECLARE_SPECIAL_FUNCTIONS
 316 ['in the same namespace]*,  as doing so creates ambiguities between functions
 317 and distributions of the same name.
 318 ]
 319
 320 As before, the same mechanism also works well at file scope: by using an unnamed
 321 namespace, we can ensure that these declarations don't conflict with any
 322 alternate policies present in other translation units:
 323
 324 [import ../../example/policy_eg_7.cpp]
 325
 326 [policy_eg_7]
 327
 328 [endsect][/section:namespace_policies Setting Policies at Namespace or Translation Unit Scope]
 329
 330 [section:user_def_err_pol Calling User Defined Error Handlers]
 331
 332 [import ../../example/policy_eg_8.cpp]
 333
 334 [policy_eg_8]
 335
 336 [import ../../example/policy_eg_9.cpp]
 337
 338 [policy_eg_9]
 339
 340 [endsect] [/section:user_def_err_pol Calling User Defined Error Handlers]
 341
 342 [section:understand_dis_quant Understanding Quantiles of Discrete Distributions]
 343
 344 Discrete distributions present us with a problem when calculating the
 345 quantile: we are starting from a continuous real-valued variable - the
 346 probability - but the result (the value of the random variable)
 347 should really be discrete.
 348
 349 Consider for example a Binomial distribution, with a sample size of
 350 50, and a success fraction of 0.5.  There are a variety of ways
 351 we can plot a discrete distribution, but if we plot the PDF
 352 as a step-function then it looks something like this:
 353
 354 [$../graphs/binomial_pdf.png]
 355
 356 Now let's suppose that the user asks for the quantile that corresponds
 357 to a probability of 0.05.  If we zoom in on the CDF for that region, here's
 358 what we see:
 359
 360 [$../graphs/binomial_quantile_1.png]
 361
 362 As can be seen there is no value of the random variable that corresponds to
 363 a probability of exactly 0.05, so we're left with two choices as
 364 shown in the figure:
 365
 366 * We could round the result down to 18.
 367 * We could round the result up to 19.
 368
 369 In fact there's actually a third choice as well: we could "pretend" that the
 370 distribution was continuous and return a real valued result: in this case we
 371 would calculate a result of approximately 18.701 (this accurately
 372 reflects the fact that the result is nearer to 19 than 18).
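
The gap described above can be checked with a few lines of standard C++.  This is a minimal sketch rather than library code: `binomial_cdf` is a hypothetical helper that sums the probability mass function directly, which is adequate for a sample size of 50 but is not how the library evaluates the CDF.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: P(X <= k) for X ~ Binomial(n, p), obtained by
// summing the probability mass function term by term.
// Assumes 0 < p < 1 and n small enough that (1 - p)^n does not underflow.
inline double binomial_cdf(int n, double p, int k)
{
    assert(n >= 0 && p > 0.0 && p < 1.0);
    if (k < 0) return 0.0;
    if (k >= n) return 1.0;
    double term = std::pow(1.0 - p, n); // P(X = 0)
    double sum = term;
    for (int i = 1; i <= k; ++i)
    {
        // P(X = i) = P(X = i - 1) * (n - i + 1) / i * p / (1 - p)
        term *= (static_cast<double>(n - i + 1) / i) * (p / (1.0 - p));
        sum += term;
    }
    return sum;
}
```

For a sample size of 50 and a success fraction of 0.5, `binomial_cdf(50, 0.5, 18)` is below 0.05 and `binomial_cdf(50, 0.5, 19)` is above it, so no integer gives a probability of exactly 0.05.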
 373
 374 By using policies we can offer any of the above as options, but that
 375 still leaves the question: ['What is actually the right thing to do?]
 376
 377 And in particular: ['What policy should we use by default?]
 378
 379 In coming to an answer we should realise that:
 380
 381 * Calculating an integer result is often much faster than
 382 calculating a real-valued result: in fact in our tests it
 383 was up to 20 times faster.
 384 * Normally people calculate quantiles so that they can perform
 385 a test of some kind: ['"If the random variable is less than N
 386 then we can reject our null-hypothesis with 90% confidence."]
 387
 388 So there is a genuine benefit to calculating an integer result
 389 as well as it being "the right thing to do" from a philosophical
 390 point of view.  What's more if someone asks for a quantile at 0.05,
 391 then we can normally assume that they are asking for
 392 ['[*at least] 95% of the probability to the right of the value chosen,
 393 and [*no more than] 5% of the probability to the left of the value chosen.]
 394
 395 In the above binomial example we would therefore round the result down to 18.
 396
 397 The converse applies to upper-quantiles: If the probability is greater than
 398 0.5 we would want to round the quantile up, ['so that [*at least] the requested
 399 probability is to the left of the value returned, and [*no more than] 1 - the
 400 requested probability is to the right of the value returned.]
 401
 402 Likewise for two-sided intervals, we would round lower quantiles down,
 403 and upper quantiles up.  This ensures that we have ['at least the requested
 404 probability in the central region] and ['no more than 1 minus the requested
 405 probability in the tail areas.]
 406
 407 For example, taking our 50 sample binomial distribution with a success fraction
 408 of 0.5, if we wanted a two sided 90% confidence interval, then we would ask
 409 for the 0.05 and 0.95 quantiles with the results ['rounded outwards] so that
 410 ['at least 90% of the probability] is in the central area:
 411
 412 [$../graphs/binomial_pdf_3.png]
 413
 414 So far so good, but there is in fact a trap waiting for the unwary here:
 415
 416    quantile(binomial(50, 0.5), 0.05);
 417
 418 returns 18 as the result, which is what we would expect from the graph above,
 419 and indeed there is no x greater than 18 for which:
 420
 421    cdf(binomial(50, 0.5), x) <= 0.05;
 422
 423 However:
 424
 425    quantile(binomial(50, 0.5), 0.95);
 426
 427 returns 31, and indeed there is no x less than 31 for which:
 428
 429    cdf(binomial(50, 0.5), x) >= 0.95;
 430
 431 We might naively expect that for this symmetrical distribution the result
 432 would be 32 (since 32 = 50 - 18), but we need to remember that the cdf of
 433 the binomial is /inclusive/ of the random variable.  So while the left tail
 434 area /includes/ the quantile returned, the right tail area always excludes
 435 an upper quantile value: since that "belongs" to the central area.
 436
 437 Look at the graph above to see what's going on here: the lower quantile
 438 of 18 belongs to the left tail, so any value <= 18 is in the left tail.
 439 The upper quantile of 31 on the other hand belongs to the central area,
 440 so the tail area actually starts at 32, so any value > 31 is in the
 441 right tail.
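
The asymmetry can be reproduced without the library by walking the cumulative probabilities directly.  The two functions below are a hypothetical sketch (the names and the brute-force search are ours, and 0 < p < 1 is assumed): one finds the largest value whose CDF does not exceed the requested probability, the other the smallest value whose CDF reaches it.

```cpp
#include <cassert>
#include <cmath>

// Round down: largest k with P(X <= k) <= prob for X ~ Binomial(n, p).
// Returns -1 if even P(X <= 0) exceeds prob (a choice made for this sketch).
inline int round_down_quantile(int n, double p, double prob)
{
    assert(n >= 0 && p > 0.0 && p < 1.0);
    double term = std::pow(1.0 - p, n); // P(X = 0)
    double cdf = term;
    if (cdf > prob) return -1;
    int k = 0;
    while (k < n)
    {
        // Probability of the next value, k + 1.
        const double next =
            term * (static_cast<double>(n - k) / (k + 1)) * (p / (1.0 - p));
        if (cdf + next > prob) break; // including k + 1 would overshoot
        term = next;
        cdf += next;
        ++k;
    }
    return k;
}

// Round up: smallest k with P(X <= k) >= prob.
inline int round_up_quantile(int n, double p, double prob)
{
    assert(n >= 0 && p > 0.0 && p < 1.0);
    double term = std::pow(1.0 - p, n); // P(X = 0)
    double cdf = term;
    int k = 0;
    while (cdf < prob && k < n)
    {
        ++k;
        term *= (static_cast<double>(n - k + 1) / k) * (p / (1.0 - p));
        cdf += term;
    }
    return k;
}
```

With a sample size of 50 and a success fraction of 0.5, `round_down_quantile(50, 0.5, 0.05)` gives 18 and `round_up_quantile(50, 0.5, 0.95)` gives 31, not the 32 that symmetry alone would suggest.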
 442
 443 Therefore if U and L are the upper and lower quantiles respectively, then
 444 a random variable X is in the tail area - where we would reject the null
 445 hypothesis - if:
 446
 447    X <= L || X > U
 448
 449 And a variable X is inside the central region if:
 450
 451    L < X <= U
 452
 453 The moral here is to ['always be very careful with your comparisons
 454 when dealing with a discrete distribution], and if in doubt,
 455 ['base your comparisons on CDFs instead].
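
In code it helps to wrap these two comparisons in named functions, so that the inclusive and exclusive ends cannot be mixed up.  The sketch below uses hypothetical names (`two_sided_region`, `in_tail`, `in_central_region`).  Note also that the central-region condition is mathematical notation: written literally as a chained comparison in C++ it would compile, but it would compare a `bool` against the upper quantile, so it has to be split into two tests.

```cpp
// Hypothetical holder for the two quantiles of a two-sided interval,
// both obtained with the default (rounded outwards) behaviour.
struct two_sided_region
{
    int lower; // lower quantile: belongs to the left tail
    int upper; // upper quantile: belongs to the central region
};

// True when x falls in either tail, where the null hypothesis is rejected.
inline bool in_tail(const two_sided_region& r, int x)
{
    return x <= r.lower || x > r.upper;
}

// True when x falls in the central region.
inline bool in_central_region(const two_sided_region& r, int x)
{
    return r.lower < x && x <= r.upper;
}
```

For the binomial example above the region is `{18, 31}`: the values 18 and 32 are in the tails, while 19 and 31 are in the central region.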
 456
 457 [heading Other Rounding Policies are Available]
 458
 459 As you would expect from a section on policies,
 460 other rounding options are available:
 461
 462 [variablelist
 463
 464 [[integer_round_outwards]
 465    [This is the default policy as described above: lower quantiles
 466    are rounded down (probability < 0.5), and upper quantiles
 467    (probability > 0.5) are rounded up.
 468
 469    This gives /no more than/ the requested probability
 470    in the tails, and /at least/ the requested probability
 471    in the central area.]]
 472 [[integer_round_inwards]
 473    [This is the exact opposite of the default policy:
 474    lower quantiles
 475    are rounded up (probability < 0.5),
 476    and upper quantiles (probability > 0.5) are rounded down.
 477
 478    This gives /at least/ the requested probability
 479    in the tails, and /no more than/ the requested probability
 480    in the central area.]]
 481 [[integer_round_down][This policy will always round the result down
 482    no matter whether it is an upper or lower quantile.]]
 483 [[integer_round_up][This policy will always round the result up
 484    no matter whether it is an upper or lower quantile.]]
 485 [[integer_round_nearest][This policy will always round the result
 486    to the nearest integer
 487    no matter whether it is an upper or lower quantile.]]
 488 [[real][This policy will return a real valued result
 489    for the quantile of a discrete distribution: this is
 490    generally much slower than finding an integer result
 491    but does allow for more sophisticated rounding policies.]]
 492
 493 ]
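
As a way of keeping the integer options straight, the sketch below applies each rule to a real-valued quantile such as the 18.701 found earlier.  It is a conceptual illustration only: the enumeration and `round_quantile` are hypothetical names, the library does not compute integer quantiles by first finding a real-valued result, and the treatment of a probability of exactly 0.5 is an arbitrary choice made for this sketch.

```cpp
#include <cmath>

// Hypothetical enumeration mirroring the integer rounding options above.
enum class rounding { outwards, inwards, down, up, nearest };

// Apply one rounding rule to a real-valued quantile x that was
// requested at probability prob.
inline double round_quantile(double x, double prob, rounding how)
{
    // Sketch assumption: prob == 0.5 is treated as an upper quantile.
    const bool lower = prob < 0.5;
    switch (how)
    {
    case rounding::down:     return std::floor(x);
    case rounding::up:       return std::ceil(x);
    case rounding::nearest:  return std::floor(x + 0.5);
    case rounding::outwards: return lower ? std::floor(x) : std::ceil(x);
    case rounding::inwards:  return lower ? std::ceil(x) : std::floor(x);
    }
    return x; // not reached for valid enumerators
}
```

For the lower quantile at 0.05, rounding 18.701 outwards or down gives 18, while inwards, up and nearest all give 19.  For an upper quantile lying between 30 and 31, outwards gives 31 and inwards gives 30.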
 494
 495 [import ../../example/policy_eg_10.cpp]
 496
 497 [policy_eg_10]
 498
 499 [endsect]
 500
 501 [endsect] [/section:pol_tutorial Policy Tutorial]
 502
 503
 504 [/ math.qbk
 505   Copyright 2007, 2013 John Maddock and Paul A. Bristow.
 506   Distributed under the Boost Software License, Version 1.0.
 507   (See accompanying file LICENSE_1_0.txt or copy at
 508   http://www.boost.org/LICENSE_1_0.txt).
 509 ]
 510
 511