[/
 / Copyright (c) 2003 Boost.Test contributors
 /
 / Distributed under the Boost Software License, Version 1.0. (See accompanying
 / file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
 /]

[section:test_case_generation Data-driven test cases]

[h4 Why data-driven test cases?]
Some tests need to be repeated for a series of different input parameters. One way to achieve this is to
manually register a test case for each parameter. You can also invoke a test function with
all parameters manually from within your test case, like this:

``
void single_test( int i )
{
  __BOOST_TEST__( /* test assertion */ );
}

void combined_test()
{
  int params[] = { 1, 2, 3, 4, 5 };
  std::for_each( params, params+5, &single_test );
}
``

The approach above has several drawbacks:

* the logic for running the tests is inside the test itself: `single_test` in the above example is run from the test
  case `combined_test`, while its execution would be better handled by the __UTF__,
* in case of a fatal failure for one of the values in the `params` array above (say a failure in __BOOST_TEST_REQUIRE__),
  the test `combined_test` is aborted and the next test case in the test tree is executed, so the remaining values are never tested,
* in case of failure, the reporting is not accurate enough: the test has to be rerun in a debugging
  session by a human, or additional logic for reporting has to be implemented in the test itself.
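
The second drawback can be illustrated without the framework. The sketch below is plain C++ and does not use any __UTF__ facility; `single_check`, `checked_in_manual_loop` and `failing_indices_when_guarded` are hypothetical names, and an exception stands in for a fatal assertion. It compares a manual loop, which stops at the first fatal failure, with a per-value guarded execution, which is the behavior one would like the framework to provide.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical check standing in for a test body: fails fatally for the value 3.
inline void single_check(int i)
{
    if (i == 3)
        throw std::runtime_error("fatal failure");
}

// Manual loop: the first fatal failure aborts the loop, later values are never checked.
inline std::size_t checked_in_manual_loop(const std::vector<int>& params)
{
    std::size_t checked = 0;
    try {
        for (int p : params) {
            single_check(p);
            ++checked;
        }
    } catch (const std::exception&) {
        // the whole combined test stops here
    }
    return checked;
}

// Guarded execution: every value runs on its own and the failing indices are recorded.
inline std::vector<std::size_t> failing_indices_when_guarded(const std::vector<int>& params)
{
    std::vector<std::size_t> failing;
    for (std::size_t k = 0; k < params.size(); ++k) {
        try {
            single_check(params[k]);
        } catch (const std::exception&) {
            failing.push_back(k);
        }
    }
    return failing;
}
```

With the values `{1, 2, 3, 4, 5}`, the manual loop only validates the first two values, while the guarded variant visits all five and reports index 2 as the single failure.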

[h4 Parameter generation, scalability and composition]
In some circumstances, one would like to run a parametrized test over an /arbitrarily large/ set of values. Enumerating the
parameters by hand is not a solution that scales well, especially when these parameters can be described by a
function that generates these values. However, this solution also has limitations:

* *Generating functions*: suppose we have a function `func(float f)`, where `f` is any number in [0, 1]. We are not
  that interested in the exact value, but we would like to test `func`. Instead of writing down the values of `f`
  against which `func` is tested, what about choosing `f` randomly in [0, 1]? And instead of having
  only one value for `f`, what about running the test on arbitrarily many numbers? This small example shows
  that tests requiring parameters are more powerful when, instead of constant values written down in the test, a
  generating function is provided.

* *Scalability*: suppose we have a test case on `func1`, on which we test `N` values written as constants in the test
  file. What does the test ensure? We have the guarantee that `func1` is working on these `N` values. Yet in this
  setting `N` is necessarily finite and usually small. How would we extend or scale `N` easily? One solution is to
  be able to generate new values, and to be able to define a test on the *class* of possible inputs for `func1` on
  which the function should have a defined behavior. To some extent, the `N` constants written down in the test are just
  an excerpt of the possible inputs of `func1`, and working on the class of inputs gives more flexibility and power.

* *Composition*: suppose we already have test cases for two functions `func1` and `func2`, taking as argument the
  types `T1` and `T2` respectively. Now we would like to test a new function `func3` that takes as argument a type
  `T3` containing `T1` and `T2`, and calls `func1` and `func2` through a known algorithm. An example of such a
  function is:

``
// Returns the log of x
// Precondition: x strictly positive.
double fast_log(double x);

// Precondition: x != 1
double fast_inv(double x);

// The field types are assumed for illustration.
struct dummy {
  double field1;
  double field2;
};

double func3(dummy value)
{
  return 0.5 * (exp(fast_log(value.field1))/value.field1 + value.field2/fast_inv(value.field2));
}
``

In this example:

* `func3` inherits the preconditions of `fast_log` and `fast_inv`: it is defined in `(0, +infinity)` and in `[-C, +C] - {1}` for `field1` and `field2` respectively (`C`
  being an arbitrarily big constant),
* as defined above, `func3` should be close to 1 everywhere on its definition domain,
* we would like to reuse the properties of `fast_log` and `fast_inv` in the compound function `func3` and assert that `func3` is well defined over an arbitrarily large definition domain.

Parametrized tests on `func3` hardly tell us anything about the possible numerical properties or instabilities close to the point `{field1 = 0, field2 = 1}`.
Indeed, a parametrized test may check some points around (0,1), but it will fail to capture the *asymptotic behavior* of the function close to this point.

[h4 Data driven tests in the Boost.Test framework]
The facilities provided by the __UTF__ address the issues described above:

* the notion of *datasets* eases the description of the class of inputs for test cases. The datasets also implement several
  operations that enable their combination to create new, more complex datasets,
* two macros, __BOOST_DATA_TEST_CASE__ and __BOOST_DATA_TEST_CASE_F__, respectively without and with fixture support,
  are used for the declaration and registration of a test case over a collection of values (samples),
* each test case, associated with a unique value, is executed independently from the others. These tests are guarded in the same
  way regular test cases are, which makes the execution of the tests over each sample of a dataset isolated, robust and
  repeatable, and eases debugging,
* several dataset generating functions are provided by the __UTF__.

The remainder of this section covers the notions and features provided by the __UTF__ for data-driven test cases, in
particular:

# the notion of [link boost_test.tests_organization.test_cases.test_case_generation.datasets *dataset* and *sample*] is introduced,
# [link boost_test.tests_organization.test_cases.test_case_generation.datasets_auto_registration the declaration and registration]
  of the data-driven test cases are explained,
# the [link boost_test.tests_organization.test_cases.test_case_generation.operations /operations/] on datasets are detailed,
# and finally the built-in [link boost_test.tests_organization.test_cases.test_case_generation.generators dataset generators]
  are presented.

[/ ################################################################################################################################## ]
[section:datasets Datasets]

To define datasets properly, the notion of *sample* should be introduced first. A *sample* is defined as a /polymorphic tuple/.
The size of the tuple is by definition the *arity* of the sample itself.

A *dataset* is a /collection of samples/ that

* is forward iterable,
* can be queried for its `size`, which in turn can be infinite,
* has an arity, which is the arity of the samples it contains.

Hence the dataset implements the notion of /sequence/.

The descriptive power of the datasets in __UTF__ comes from

* the [link boost_test.tests_organization.test_cases.test_case_generation.datasets.dataset_interface interface] for creating custom datasets, which is quite simple,
* the [link boost_test.tests_organization.test_cases.test_case_generation.operations operations] they provide for combining different datasets,
* their interface with other types of collections (`stl` containers, `C` arrays),
* the available built-in [link boost_test.tests_organization.test_cases.test_case_generation.generators /dataset generators/].

[tip Only "monomorphic" datasets are supported, which means that all samples in a dataset have the same type and same arity
[footnote polymorphic datasets will be considered in the future. Their need is mainly driven by the replacement of the
[link boost_test.tests_organization.test_cases.test_organization_templates typed parametrized test cases] by the dataset-like API.]
]

As we will see in the next sections, datasets representing collections of different types may be combined together (e.g. /zip/ or /grid/).
These operations result in new datasets, in which the samples are of an augmented type.

[h4 Dataset interface]
The interface of a /dataset/ should implement the following functions/fields:

* `iterator begin()` where /iterator/ is a forward iterator,
* `boost::unit_test::data::size_t size() const` indicates the size of the dataset. The returned type is a dedicated
  class [classref boost::unit_test::data::size_t size_t] that can indicate an /infinite/ dataset size,
* an enum called `arity` indicating the arity of the samples returned by the dataset.

Once a dataset class `D` is declared, it should be registered to the framework by specializing the class ``boost::unit_test::data::monomorphic::is_dataset``
with the condition that ``boost::unit_test::data::monomorphic::is_dataset<D>::value`` evaluates to `true`.

The following example implements a custom dataset generating a Fibonacci sequence.

[bt_example dataset_example68..Example of custom dataset..run-fail]

[endsect] [/ datasets]

[/ ################################################################################################################################## ]
[/ Main code import for this section ]
[import ../snippet/dataset_1/test_file.cpp]

[/ ################################################################################################################################## ]
[section:datasets_auto_registration Declaring and registering test cases with datasets]
In order to declare and register a data-driven test case, the macros __BOOST_DATA_TEST_CASE__ or __BOOST_DATA_TEST_CASE_F__
should be used. The two macros are equivalent, except that `BOOST_DATA_TEST_CASE_F` supports fixtures.

Those macros are variadic and can be used in the following forms:

``
__BOOST_DATA_TEST_CASE__(test_case_name, dataset) { /* dataset of arity 1 */ }
BOOST_DATA_TEST_CASE(test_case_name, dataset, var1) { /* dataset of arity 1 */ }
BOOST_DATA_TEST_CASE(test_case_name, dataset, var1, ..., varN) { /* dataset of arity N */ }

__BOOST_DATA_TEST_CASE_F__(fixture, test_case_name, dataset) { /* dataset of arity 1 with fixture */ }
BOOST_DATA_TEST_CASE_F(fixture, test_case_name, dataset, var1) { /* dataset of arity 1 with fixture */ }
BOOST_DATA_TEST_CASE_F(fixture, test_case_name, dataset, var1, ..., varN) { /* dataset of arity N with fixture */ }
``

The first form of the macro is for datasets of arity 1. The value of the sample being executed by the test body is
available through the automatic variable `sample` (`xrange` is, as its name suggests, a range of values):

The second form is also for datasets of arity 1, but instead of the variable `sample`, the current sample is brought into `var1`:

The third form is an extension of the previous form for datasets of arity `N`. The sample being a polymorphic tuple, each
of the variables `var1`, ..., `varN` corresponds to the index 1, ..., `N` of the sample:

The next three forms of declaration, with `BOOST_DATA_TEST_CASE_F`, are equivalent to the previous ones, the difference being the support of
a fixture that is executed before the test body for each sample. The fixture should follow the expected interface as detailed
[link boost_test.tests_organization.fixtures.models here].

The arity of the dataset and the number of variables should be exactly the same, the first form being a shortcut for the
case of arity 1.

[tip A compile-time check is performed on the coherence of the arity of the dataset and the number of variables `var1`... `varN`.
     For compilers *without C++11* support, the maximal supported arity is controlled by the macro
     __BOOST_TEST_DATASET_MAX_ARITY__, which can be overridden /prior/ to including the __UTF__ headers.]

[caution The macros __BOOST_DATA_TEST_CASE__ and __BOOST_DATA_TEST_CASE_F__ are available only for compilers with support for *variadic macros*.]

[h4 Samples and test tree]
It should be emphasized that those macros do not declare a single test case (as __BOOST_AUTO_TEST_CASE__ would do) but declare and
register as many test cases as there are samples in the dataset given as argument. Each test case runs on exactly *one*
sample of the dataset.

More precisely, what

``__BOOST_DATA_TEST_CASE__(test_case_name, dataset)``

does is the following:

* it registers a *test suite* named "`test_case_name`",
* it registers as many test cases as there are samples in "`dataset`", each of them named after the index of the sample in the dataset, prefixed by `_` and
  starting at index `0` ("`_0`", "`_1`", ... "`_(N-1)`" where `N` is the size of the dataset).
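
The naming scheme can be written down in a few lines of plain C++. This is only a sketch of the rule stated above, not code from the __UTF__; the helper name `sample_test_paths` is hypothetical, and the `/` separator is the one used in the "`test_case_name/_3`" notation of this section.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: lists the test unit paths created for a dataset of size n,
// following the "<suite name>/_<sample index>" scheme described in the text.
inline std::vector<std::string> sample_test_paths(const std::string& test_case_name, std::size_t n)
{
    std::vector<std::string> paths;
    paths.reserve(n);
    for (std::size_t k = 0; k < n; ++k)
        paths.push_back(test_case_name + "/_" + std::to_string(k));
    return paths;
}
```

For a dataset of size 4, the helper yields the paths from `test_case_name/_0` to `test_case_name/_3`.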

This makes it easy to:

* identify which sample is failing (say "`test_case_name/_3`"),
* replay one or several samples (or the full dataset) from the command line using the [link boost_test.runtime_config.test_unit_filtering test filtering facility] provided by the __UTF__.

Exactly as regular test cases, each test case (associated with a specific sample) is executed within the test body in a /guarded manner/:

* the test executions are independent: if an error occurs for one sample, the execution of the remaining samples is not affected,
* in case of error, the [link boost_test.test_output.test_tools_support_for_logging.contexts context] within which the error occurred is reported in the [link boost_test.test_output log] along with
  the failing sample index. This context contains the sample for which the test failed, which eases debugging.

[endsect] [/ datasets_auto_registration]

[/ ################################################################################################################################## ]
[section:operations Operations on dataset]
As mentioned earlier, one of the major benefits of the __UTF__ datasets lies in the operations provided
for their combination.

For that purpose, three operators are provided:

* joins with `operator+`,
* zips with `operator^`,
* and grids or Cartesian products with `operator*`.

[tip All these operators are associative, which enables their combination without parentheses. However, the precedence rules of the
language on these operators still apply: `*` binds tighter than `+`, which binds tighter than `^`. ]

[section Joins]
A ['join], denoted `+`, is an operation on two datasets `dsa` and `dsb` of same arity and compatible types, resulting in the *concatenation* of these two datasets,
in the left to right order of the operands of `+`:

``
dsa       = (a_1, a_2, ... a_i)
dsb       = (b_1, b_2, ... b_j)
dsa + dsb = (a_1, a_2, ... a_i, b_1, b_2, ... b_j)
``

The following properties hold:

* the resulting dataset is of same arity as the operand datasets,
* the size of the returned dataset is the sum of the sizes of the joined datasets,
* the operation is associative, and it is possible to combine more than two datasets in one expression. The following joins are equivalent for any datasets `dsa`, `dsb` and `dsc`:

``
( dsa + dsb ) + dsc
== dsa + ( dsb + dsc )
``

[warning In the expression `dsa + dsb`, `dsa` and/or `dsb` can be of infinite size. The resulting dataset will have an infinite size as well. If `dsa` is infinite, the content of
 `dsb` will never be reached. ]

[bt_example dataset_example62..Example of join on datasets..run]

[endsect] [/ joins]

[section Zips]
A ['zip], denoted `^`, is an operation on two datasets `dsa` and `dsb` of same size, resulting in a dataset where the `k`-th sample of `dsa` is paired with the corresponding `k`-th sample of `dsb`.
Within each resulting sample, the values follow the left to right order of the operands of `^`.

``
dsa       = (a_1, a_2, ... a_i)
dsb       = (b_1, b_2, ... b_i)
dsa ^ dsb = ( (a_1, b_1), (a_2, b_2) ... (a_i, b_i) )
``

The following properties hold:

* the arity of the resulting dataset is the sum of the arities of the operand datasets,
* the size of the resulting dataset is equal to the size of the operand datasets (since they are supposed to be of the same size),
  except when the sizes of the operand datasets mismatch (see below),
* the operation is associative, and it is possible to combine more than two datasets in one expression:

``
( dsa ^ dsb ) ^ dsc
== dsa ^ ( dsb ^ dsc )
``

A particular handling is performed if `dsa` and `dsb` are of different sizes. The rule is as follows:

* if both zipped datasets have the same size, this is the size of the resulting dataset (this size can then be infinite),
* otherwise, if one of the datasets is of size 1 (singleton) or of infinite size, the resulting size is governed by the other dataset,
* otherwise an exception is thrown at runtime.
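
The size rule above can be modeled in plain C++. The following is a sketch of the three cases as they are written in this section, not the implementation used by the __UTF__: `ds_size`, `same_size`, `yields_to_other` and `zip_size` are hypothetical names, and `std::logic_error` stands in for whatever exception the framework actually throws. When both operands are "special" (one singleton, one infinite), the text does not say which one governs; checking the left operand first is a choice of this sketch.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for a dataset size that may be infinite.
struct ds_size {
    bool infinite;
    std::size_t value; // ignored when infinite is true
};

inline bool same_size(ds_size a, ds_size b)
{
    return (a.infinite && b.infinite) || (!a.infinite && !b.infinite && a.value == b.value);
}

// Singletons and infinite datasets adapt to the size of the other operand.
inline bool yields_to_other(ds_size s)
{
    return s.infinite || s.value == 1;
}

inline ds_size zip_size(ds_size a, ds_size b)
{
    if (same_size(a, b)) return a;        // case 1: identical sizes
    if (yields_to_other(a)) return b;     // case 2: left operand adapts
    if (yields_to_other(b)) return a;     // case 2: right operand adapts
    throw std::logic_error("zip: dataset sizes mismatch"); // case 3
}
```

For instance, zipping a singleton with a dataset of 5 samples gives 5 samples, while zipping datasets of sizes 2 and 3 is an error.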

[caution If the /zip/ operation is not supported for your compiler, the macro [macroref BOOST_TEST_NO_ZIP_COMPOSITION_AVAILABLE `BOOST_TEST_NO_ZIP_COMPOSITION_AVAILABLE`]
         will be automatically set by the __UTF__]

[bt_example dataset_example61..Example of zip on datasets..run]

[endsect] [/ zip operation on datasets]

[section Grid (Cartesian products)]
A ['grid], denoted `*`, is an operation on any two datasets `dsa` and `dsb` resulting in a dataset where each sample of `dsa` is paired with each sample of `dsb`
exactly once. Within each resulting sample, the values follow the left to right order of the operands of `*`. The samples of the rightmost dataset are iterated first.

``
dsa       = (a_1, a_2, ... a_i)
dsb       = (b_1, b_2, ... b_j)
dsa * dsb = ((a_1, b_1), (a_1, b_2) ... (a_1, b_j), (a_2, b_1), ... (a_2, b_j) ... (a_i, b_1), ... (a_i, b_j))
``

The grid hence is similar to the mathematical notion of Cartesian product [footnote if the sequence is viewed as a set].

The following properties hold:

* the arity of the resulting dataset is the sum of the arities of the operand datasets,
* the size of the resulting dataset is the product of the sizes of the datasets,
* the operation is associative, and it is possible to combine more than two datasets in one expression,
* as for /zip/, the datasets do not need to have the same type of samples.
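
The iteration order of a grid can be checked with a plain C++ model. The sketch below uses `std::vector` in place of datasets and a hypothetical function `grid_model`; it is not the `operator*` of the __UTF__, it only reproduces the ordering and size properties listed above for two datasets of arity 1.

```cpp
#include <utility>
#include <vector>

// Hypothetical model of a grid of two arity-1 datasets stored in vectors.
template <typename A, typename B>
std::vector<std::pair<A, B>> grid_model(const std::vector<A>& dsa, const std::vector<B>& dsb)
{
    std::vector<std::pair<A, B>> out;
    out.reserve(dsa.size() * dsb.size()); // size is the product of the sizes
    for (const A& a : dsa)       // leftmost dataset: slowest index
        for (const B& b : dsb)   // rightmost dataset: iterated first
            out.emplace_back(a, b);
    return out;
}
```

With `dsa = (1, 2)` and `dsb = ('x', 'y', 'z')`, the model produces 6 samples, starting with `(1, 'x')`, `(1, 'y')`, `(1, 'z')` before moving on to `(2, 'x')`.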

[caution If the /grid/ operation is not supported for your compiler, the macro [macroref BOOST_TEST_NO_GRID_COMPOSITION_AVAILABLE `BOOST_TEST_NO_GRID_COMPOSITION_AVAILABLE`]
         will be automatically set by the __UTF__]

In the following example, the random number generator is the second dataset. Its state is evaluated 6 times (3 times for the first `xrange`, the first dimension,
and twice for the second `xrange`, the second dimension, to which it is zipped). Note that the state of the random engine is
not copied between two successive evaluations of the first dimension.

[bt_example dataset_example64..Example of Cartesian product..run-fail]

[endsect] [/ grid]

[endsect] [/ operations on dataset]

[/ ################################################################################################################################## ]
[section:generators Datasets generators]
Several ['generators] for datasets are implemented in __UTF__:

* [link boost_test.tests_organization.test_cases.test_case_generation.generators.singletons Singletons]
* [link boost_test.tests_organization.test_cases.test_case_generation.generators.stl `forward iterable`] containers and
  [link boost_test.tests_organization.test_cases.test_case_generation.generators.c_arrays `C` array] like datasets
* [link boost_test.tests_organization.test_cases.test_case_generation.generators.ranges ranges] or sequences of values
* datasets made of [link boost_test.tests_organization.test_cases.test_case_generation.generators.random random numbers] following a particular distribution

The `stl` and `C-array` generators are merely a dataset view on an existing collection, while ranges and random number sequences
describe new datasets.

[/ ################################################################################################################################## ]
[h4:singletons Singletons]

A singleton is a dataset containing a unique value. The size and arity of such a dataset is 1. This value can be

* either consumed once,
* or repeated as many times as needed in a zip operation.

As mentioned in /zip/, a singleton can also be zipped with a dataset of infinite size; the size of the resulting dataset
then follows the rule given there.

The singleton is constructible through the function [funcref boost::unit_test::data::make].

[bt_example dataset_example65..Singleton..run]

[/ ################################################################################################################################## ]
[h4:c_arrays Datasets from C arrays]
This type of dataset does not contain the logic for generating the sequence of values, and is used as a wrapper on an existing
sequence contained in a `C` array. The arity is 1 and the size is the size of the array.

Such datasets are simply constructed from an overload of the [funcref boost::unit_test::data::make `make`] function.

[bt_example dataset_example66..Array..run]

[/ ################################################################################################################################## ]
[h4:stl Datasets from forward iterable containers]
As for `C` arrays, this type of dataset does not contain the logic for generating the sequence of values, and is used for iterating over an existing sequence.
The arity is 1 and the size is the same as the one of the container.

[tip The C++11 implementation enables the dataset generation from any container whose iterator implements the forward iterator concept.
     For C++03, the feature is enabled on most STL containers.]

[bt_example dataset_example67..Dataset from `std::vector` and `std::map`..run]

[/ ################################################################################################################################## ]
[h4:ranges Ranges]
A range is a dataset that implements a sequence of equally spaced values, defined by a /start/, an /end/ and a /step/.

It is possible to construct a range using the factory [funcref boost::unit_test::data::xrange], available in the overloads below:

``
#include <boost/test/data/test_case.hpp>
#include <boost/test/data/monomorphic.hpp>

namespace data = boost::unit_test::data;

auto range1 = data::xrange( (data::step = 0.5, data::end = 3 ) ); // Constructs with named values, starting at 0
auto range2 = data::xrange( begin, end ); // begin < end required
auto range5 = data::xrange( begin, end, step );  // begin < end required
auto range3 = data::xrange( end ); // begin=0, end cannot be <= 0, see above
auto range4 = data::xrange( end, (data::begin=1) ); // named value after end
``

[tip The named value parameters should be declared inside parentheses]

The details of the named value parameters are given in the table below.
[table:id_range_parameter_table Range parameters
  [
    [Name]
    [Default]
    [Description]
  ]
  [
    [`begin`]
    [0]
    [Beginning of the generated sequence. The `begin` value is included in the set of values returned
     by the generator.
    ]
  ]
  [
    [`end`]
    [+ infinity]
    [End of the generated sequence. The `end` value is not included in the set of values returned
     by the generator. If omitted, the generator has infinite size.
    ]
  ]
  [
    [`step`]
    [1]
    [Number indicating the step between two consecutive samples of the generated range.
     The default type is the same as the input type. This value should not be 0. It should be of the same
     sign as `end-begin`.
    ]
  ]
]

[bt_example dataset_example59..Declaring a test with a range..run-fail]
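
The inclusion rules for `begin` and `end` can be summarized by a plain C++ model. The sketch below is not `data::xrange`: `range_model` is a hypothetical function that only handles finite ranges with a strictly positive step, and it materializes the values in a `std::vector` instead of generating them lazily.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical model of a finite range with a positive step:
// begin is included, end is excluded.
template <typename T>
std::vector<T> range_model(T begin, T end, T step)
{
    if (!(step > T(0)))
        throw std::invalid_argument("step must be strictly positive in this sketch");
    std::vector<T> values;
    // Each value is recomputed from its index to avoid accumulating rounding errors.
    for (std::size_t k = 0; begin + static_cast<T>(k) * step < end; ++k)
        values.push_back(begin + static_cast<T>(k) * step);
    return values;
}
```

Under these assumptions, `range_model(0.0, 3.0, 0.5)` contains 6 values from 0 to 2.5, and `range_model(1, 10, 3)` contains 1, 4 and 7.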

[/ ################################################################################################################################## ]
[h4:random Random value dataset]

This type of dataset generates a sequence of random numbers following a given /distribution/. The /seed/ and the /engine/ may also be
specified.

[caution The random value generator is available only for C++11 capable compilers. If this feature is not supported for your compiler,
         the macro [macroref BOOST_TEST_NO_RANDOM_DATASET_AVAILABLE `BOOST_TEST_NO_RANDOM_DATASET_AVAILABLE`]
         will be automatically set by the __UTF__]

It is possible to construct a random sequence using the factory [funcref boost::unit_test::data::random], available in the overloads below:

``
auto rdgen = random(); // uniform distribution (real) on [0, 1)
auto rdgen = random(1, 17); // uniform distribution (integer) on [1, 17]
// Default random generator engine, Gaussian distribution (mean=5, sigma=2) and seed set to 100.
auto rdgen = random( (data::seed = 100UL,
                      data::distribution = std::normal_distribution<>(5.,2)) );
``

Since the generated datasets have infinite size, the sequence size should be narrowed by combining the dataset with another
one, for instance through a /zip/ operation.

[tip In order to reproduce a failure within a randomized parameter test case, the seed that generated the failure may be
     set explicitly, so that the same sequence of random values is generated again.]
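
The reproducibility argument of the tip relies on a property of the standard `<random>` engines: on a given implementation, the same seed yields the same sequence. The sketch below shows this with the standard library only. It does not use `data::random`; `first_samples` is a hypothetical helper, and `std::mt19937` is chosen here for illustration instead of `std::default_random_engine`.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: the first n values of a seeded uniform sequence on [0, 1).
inline std::vector<double> first_samples(unsigned long seed, std::size_t n)
{
    std::mt19937 engine(static_cast<std::mt19937::result_type>(seed));
    std::uniform_real_distribution<double> distribution(0.0, 1.0);
    std::vector<double> values;
    values.reserve(n); // the infinite sequence is narrowed to n samples
    for (std::size_t k = 0; k < n; ++k)
        values.push_back(distribution(engine));
    return values;
}
```

Calling `first_samples(100UL, 5)` twice returns two identical vectors, which is what makes a failing sample replayable once the seed is known.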

The details of the named value parameters are given in the table below.
[table:id_range_parameter_table Random generator parameters
  [
    [Name]
    [Default]
    [Description]
  ]
  [
    [`seed`]
    [(not set)]
    [Seed for the generation of the random sequence.]
  ]
  [
    [`distribution`]
    [Uniform]
    [Distribution instance for generating the random number sequences. For the uniform distribution, the upper bound is not
     included in the set of values returned by the generator for real values, and is included for integers. ]
  ]
  [
    [`engine`]
    [`std::default_random_engine`]
    [Random number generator engine.]
  ]
]

[bt_example dataset_example63..Declaring a test with a random sequence..run-fail]

[endsect] [/ Datasets generators]