]> git.proxmox.com Git - ceph.git/blob - ceph/src/boost/libs/icl/doc/projects.qbk
bump version to 12.2.2-pve1
[ceph.git] / ceph / src / boost / libs / icl / doc / projects.qbk
1 [/
2 Copyright (c) 2009-2009 Joachim Faulhaber
3
4 Distributed under the Boost Software License, Version 1.0.
5 (See accompanying file LICENSE_1_0.txt or copy at
6 http://www.boost.org/LICENSE_1_0.txt)
7 ]
8
9 [section Projects]
10
11 ['*Projects*] are examples on the usage of interval containers
12 that go beyond small toy snippets of code. The code presented
13 here addresses more serious applications that approach the
14 quality of real world programming. At the same time it aims to
15 guide the reader more deeply into various aspects of
16 the library. In order not to overburden the reader with implementation
17 details, the code in ['*projects*] tries to be ['*minimal*]. It has a focus on
18 the main aspects of the projects and is not intended to be complete
19 and mature like the library code itself. Cause it's minimal,
20 project code lives in `namespace mini`.
21
22 [/
23 [section Overview]
24
25 [table Overview over Icl Examples
26 [[level] [example] [classes] [features]]
27 [[basic] [[link boost_icl.examples.large_bitset Large Bitset]]
28 [__itv_map__][The most simple application of an interval map:
29 Counting the overlaps of added intervals.]]
30 ]
31
32 [endsect][/ Overview IN Projects]
33 ]
34
35 [section Large Bitset][/ IN Projects]
36
37 Bitsets are just sets. Sets of unsigned integrals,
38 to be more precise.
39 The prefix ['*bit*] usually only indicates,
40 that the representation of those sets is organized in a compressed
41 form that exploits the fact, that we can switch on an off single
42 bits in machine words. Bitsets are therefore known to be very small
43 and thus efficient.
44 The efficiency of bitsets is usually coupled to the
45 precondition that the range of values of elements
46 is relatively small, like [0..32) or [0..64), values that can
47 be typically represented in single or a small number of
48 machine words. If we wanted to represent a set containing
49 two values {1, 1000000}, we would be much better off using
50 other sets like e.g. an `std::set`.
51
52 Bitsets compress well, if elements spread over narrow ranges
53 only. Interval sets compress well, if many elements are clustered
54 over intervals. They can span large sets very efficiently then.
55 In project ['*Large Bitset*] we want to ['*combine the bit compression
56 and the interval compression*] to achieve a set implementation,
57 that is capable of spanning large chunks of contiguous elements
58 using intervals and also to represent more narrow ['nests] of varying
59 bit sequences using bitset compression.
60 As we will see, this can be achieved using only a small
61 amount of code because most of the properties we need
62 are provided by an __itv_map__ of `bitsets`:
63
64 ``
65 typedef interval_map<IntegralT, SomeBitSet<N>, partial_absorber,
66 std::less, inplace_bit_add, inplace_bit_and> IntervalBitmap;
67 ``
68
69 Such an `IntervalBitmap` represents `k*N` bits for every segment.
70 ``
71 [a, a+k)->'1111....1111' // N bits associated: Represents a total of k*N bits.
72 ``
73
74 For the interval `[a, a+k)` above all bits are set.
75 But we can also have individual ['nests] or ['clusters]
76 of bitsequences.
77
78 ``
79 [b, b+1)->'01001011...1'
80 [b+1,b+2)->'11010001...0'
81 . . .
82 ``
83
84 and we can span intervals of equal bit sequences that represent
85 periodic patterns.
86
87 ``
88 [c,d)->'010101....01' // Every second bit is set in range [c,d)
89 [d,e)->'001100..0011' // Every two bits alterate in range [d,e)
90 [e,f)->'bit-sequence' // 'bit-sequence' reoccurs every N bits in range [e,f)
91 ``
92
93 An `IntervalBitmap` can represent
94 `N*(2^M)` elements, if `M` is the number of bits of the
95 integral type `IntegralT`. Unlike bitsets, that usually
96 represent ['*unsigned*] integral numbers, large_bitset may
97 range over negative numbers as well.
98 There are fields where such large
99 bitsets implementations are needed. E.g. for the compact
100 representation of large file allocation tables.
101 What remains to be done for project *Large Bitset* is
102 to code a wrapper `class large_bitset` around `IntervalBitmap`
103 so that `large_bitset` looks and feels like a usual
104 set class.
105
106 [section Using large_bitset]
107
108 To quicken your appetite for a look at the implementation
109 here are a few use cases first. Within the examples that follow,
110 we will use `nat`[^['k]] for unsigned integrals
111 and `bits`[^['k]] for bitsets containing [^['k]] bits.
112
113 Let's start large.
114 In the first example . . .
115
116 [import ../example/large_bitset_/large_bitset.cpp]
117 [large_bitset_test_large_set_all]
118
119 . . . we are testing the limits. First we set all bits and
120 then we switch off the very last bit.
121
122 [large_bitset_test_large_erase_last]
123
124 Program output (/a little beautified/):
125 ``
126 ----- Test function test_large() -----------------------------------------------
127 We have just turned on the awesome amount of 18,446,744,073,709,551,616 bits ;-)
128 [ 0, 288230376151711744) -> 1111111111111111111111111111111111111111111111111111111111111111
129 ---- Let's swich off the very last bit -----------------------------------------
130 [ 0, 288230376151711743) -> 1111111111111111111111111111111111111111111111111111111111111111
131 [288230376151711743, 288230376151711744) -> 1111111111111111111111111111111111111111111111111111111111111110
132 ---- Venti is plenty ... let's do something small: A tall ----------------------
133 ``
134
135 More readable is a smaller version of `large_bitset`.
136 In function `test_small()` we apply a few more operations . . .
137
138 [large_bitset_test_small]
139
140 . . . producing this output:
141 ``
142 ----- Test function test_small() -----------
143 -- Switch on all bits in range [0,64] ------
144 [0,8)->11111111
145 [8,9)->10000000
146 --------------------------------------------
147 -- Turn off bits: 25,27,28 -----------------
148 [0,3)->11111111
149 [3,4)->10100111
150 [4,8)->11111111
151 [8,9)->10000000
152 --------------------------------------------
153 -- Flip bits in range [24,30) --------------
154 [0,3)->11111111
155 [3,4)->01011011
156 [4,8)->11111111
157 [8,9)->10000000
158 --------------------------------------------
159 -- Remove the first 10 bits ----------------
160 [1,2)->00111111
161 [2,3)->11111111
162 [3,4)->01011011
163 [4,8)->11111111
164 [8,9)->10000000
165 -- Remove even bits in range [0,72) --------
166 [1,2)->00010101
167 [2,3)->01010101
168 [3,4)->01010001
169 [4,8)->01010101
170 -- Set odd bits in range [0,72) --------
171 [0,9)->01010101
172 --------------------------------------------
173 ``
174
175 Finally, we present a little /picturesque/ example,
176 that demonstrates that `large_bitset` can also serve
177 as a self compressing bitmap, that we can 'paint' with.
178
179 [large_bitset_test_picturesque]
180
181 Note that we have two `large_bitsets` for the /outline/
182 and the /interior/. Both parts are compressed but we can
183 compose both by `operator +`, because the right /positions/
184 are provided. This is the program output:
185
186 ``
187 ----- Test function test_picturesque() -----
188 -------- empty face: 3 intervals -----
189 ********
190 * *
191 * *
192 * *
193 * *
194 ******
195 -------- compressed smile: 2 intervals -----
196 * *
197 ****
198 -------- staring bitset: 6 intervals -----
199 ********
200 * *
201 * * * *
202 * *
203 * **** *
204 ******
205 --------------------------------------------
206 ``
207
208 So, may be you are curious how this class template
209 is coded on top of __itv_map__ using only about 250 lines
210 of code. This is shown in the sections that follow.
211
212 [endsect][/ Usage of a large_bitset]
213
214 [section The interval_bitmap]
215
216 To begin, let's look at the basic data type again, that
217 will be providing the major functionality:
218
219 ``
220 typedef interval_map<DomainT, BitSetT, partial_absorber,
221 std::less, inplace_bit_add, inplace_bit_and> IntervalBitmap;
222 ``
223
224 `DomainT` is supposed to be an integral
225 type, the bitset type `BitSetT` will be a wrapper class around an
226 unsigned integral type. `BitSetT` has to implement bitwise operators
227 that will be called by the functors `inplace_bit_add<BitSetT>`
228 and `inplace_bit_and<BitSetT>`.
229 The type trait of interval_map is `partial_absorber`, which means
230 that it is /partial/ and that empty `BitSetTs` are not stored in
231 the map. This is desired and keeps the `interval_map` minimal, storing
232 only bitsets, that contain at least one bit switched on.
233 Functor template `inplace_bit_add` for parameter `Combine` indicates that we
234 do not expect `operator +=` as addition but the bitwise operator
235 `|=`. For template parameter `Section` which is instaniated by
236 `inplace_bit_and` we expect the bitwise `&=` operator.
237
238 [endsect][/ section The interval_bitmap]
239
240 [section A class implementation for the bitset type]
241
242 The code of the project is enclosed in a `namespace mini`.
243 The name indicates, that the implementation is a /minimal/
244 example implementation. The name of the bitset class will
245 be `bits` or `mini::bits` if qualified.
246
247 To be used as a codomain parameter of class template __itv_map__,
248 `mini::bits` has to implement all the functions that are required
249 for a codomain_type in general, which are the default constructor `bits()`
250 and an equality `operator==`. Moreover `mini::bits` has to implement operators
251 required by the instantiations for parameter `Combine` and `Section`
252 which are `inplace_bit_add` and `inplace_bit_and`. From functors
253 `inplace_bit_add` and `inplace_bit_and` there are inverse functors
254 `inplace_bit_subtract` and `inplace_bit_xor`. Those functors
255 use operators `|= &= ^=` and `~`. Finally if we want to apply
256 lexicographical and subset comparison on large_bitset, we also
257 need an `operator <`. All the operators that we need can be implemented
258 for `mini::bits` on a few lines:
259
260 [import ../example/large_bitset_/bits.hpp]
261 [mini_bits_class_bits]
262
263 Finally there is one important piece of meta information, we have
264 to provide: `mini::bits` has to be recognized as a `Set` by the
265 icl code. Otherwise we can not exploit the fact that a map of
266 sets is model of `Set` and the resulting large_bitset would not
267 behave like a set. So we have to say that `mini::bits` shall be sets:
268
269 [mini_bits_is_set]
270
271 This is done by adding a partial template specialization to
272 the type trait template `icl::is_set`.
273 For the extension of this type trait template and the result
274 values of inclusion_compare we need these `#includes` for the
275 implementation of `mini::bits`:
276
277 [mini_bits_includes]
278
279 [endsect][/ A class implementation for the bitset type]
280
281 [section Implementation of a large bitset]
282
283 Having finished our `mini::bits` implementation, we can start to
284 code the wrapper class that hides the efficient interval map of mini::bits
285 and exposes a simple and convenient set behavior to the world of users.
286
287 Let's start with the required `#include`s this time:
288
289 [import ../example/large_bitset_/large_bitset.hpp]
290 [large_bitset_includes]
291
292 Besides `boost/icl/interval_map.hpp` and `bits.hpp` the most important
293 include here is `boost/operators.hpp`. We use this library in order
294 to further minimize the code and to provide pretty extensive operator
295 functionality using very little code.
296
297 For a short and concise naming of the most important unsigned integer
298 types and the corresponding `mini::bits` we define this:
299
300 [large_bitset_natural_typedefs]
301
302 [section large_bitset: Public interface][/ ------------------------------------]
303
304 And now let's code `large_bitset`.
305
306 [large_bitset_class_template_header]
307
308 The first template parameter `DomainT` will be instantiated with
309 an integral type that defines the kind of numbers that can
310 be elements of the set. Since we want to go for a large set we
311 use `nat64` as default which is a 64 bit unsigned integer ranging
312 from `0` to `2^64-1`. As bitset parameter we also choose a 64-bit
313 default. Parameters `Combine` and `Interval` are necessary to
314 be passed to dependent type expressions. An allocator can be
315 chosen, if desired.
316
317 The nested list of private inheritance contains groups of template
318 instantiations from
319 [@http://www.boost.org/doc/libs/1_39_0/libs/utility/operators.htm Boost.Operator],
320 that provides derivable operators from more fundamental once.
321 Implementing the fundamental operators, we get the derivable ones
322 /for free/. Below is a short overview of what we get using Boost.Operator,
323 where __S stands for `large_bitset`, __i for it's `interval_type`
324 and __e for it's `domain_type` or `element_type`.
325
326 [table
327 [[Group ][fundamental] [derivable ]]
328 [[Equality, ordering ][`==`] [`!=` ]]
329 [[ ][`<` ] [`> <= >=` ]]
330 [[Set operators (__S x __S)][`+= |= -= &= ^=` ][`+ | - & ^` ]]
331 [[Set operators (__S x __e)][`+= |= -= &= ^=` ][`+ | - & ^` ]]
332 [[Set operators (__S x __i)][`+= |= -= &= ^=` ][`+ | - & ^` ]]
333 ]
334
335 There is a number of associated types
336
337 [large_bitset_associated_types]
338
339 most importantly the implementing `interval_bitmap_type` that is used
340 for the implementing container.
341
342 [large_bitmap_impl_map]
343
344 In order to use
345 [@http://www.boost.org/doc/libs/1_39_0/libs/utility/operators.htm Boost.Operator]
346 we have to implement the fundamental operators as class members. This can be
347 done quite schematically.
348
349 [large_bitset_operators]
350
351 As we can see, the seven most important operators that work on the
352 class type `large_bitset` can be directly implemented by propagating
353 the operation to the implementing `_map`
354 of type `interval_bitmap_type`. For the operators that work on segment and
355 element types, we use member functions `add`, `subtract`, `intersect` and
356 `flip`. As we will see only a small amount of adaper code is needed
357 to couple those functions with the functionality of the implementing
358 container.
359
360 Member functions `add`, `subtract`, `intersect` and `flip`,
361 that allow to combine ['*intervals*] to `large_bitsets` can
362 be uniformly implemented using a private function
363 `segment_apply` that applies /addition/, /subtraction/,
364 /intersection/ or /symmetric difference/, after having
365 translated the interval's borders into the right bitset
366 positions.
367
368 [large_bitset_fundamental_functions]
369
370 In the sample programs, that we will present to demonstrate
371 the capabilities of `large_bitset` we will need a few additional
372 functions specifically output functions in two different
373 flavors.
374
375 [large_bitset_demo_functions]
376
377 * The first one, `show_segments()` shows the container
378 content as it is implemented, in the compressed form.
379
380 * The second function `show_matrix` shows the complete
381 matrix of bits that is represented by the container.
382
383 [endsect][/ large_bitset: Public interface]
384 [section large_bitset: Private implementation] [/ -------------------------------------]
385
386 In order to implement operations like the addition of
387 an element say `42` to
388 the large bitset, we need to translate the /value/ to the
389 /position/ of the associated *bit* representing `42` in the
390 interval container of bitsets. As an example, suppose we
391 use a
392 ``
393 large_bitset<nat, mini::bits8> lbs;
394 ``
395 that carries small bitsets of 8 bits only.
396 The first four interval of `lbs` are assumed to
397 be associated with some bitsets. Now we want to add the interval
398 `[a,b]==[5,27]`. This will result in the following situation:
399 ``
400 [0,1)-> [1,2)-> [2,3)-> [3,4)->
401 [00101100][11001011][11101001][11100000]
402 + [111 11111111 11111111 1111] [5,27] as bitset
403 a b
404
405 => [0,1)-> [1,3)-> [3,4)->
406 [00101111][11111111][11110000]
407 ``
408 So we have to convert values 5 and 27 into a part that
409 points to the interval and a part that refers to the
410 position within the interval, which is done by a
411 /division/ and a /modulo/ computation. (In order to have a
412 consistent representation of the bitsequences across the containers,
413 within this project, bitsets are denoted with the
414 ['*least significant bit on the left!*])
415
416 ``
417 A = a/8 = 5/8 = 0 // refers to interval
418 B = b/8 = 27/8 = 3
419 R = a%8 = 5%8 = 5 // refers to the position in the associated bitset.
420 S = b%8 = 27%8 = 3
421 ``
422
423 All /division/ and /modulo operations/ needed here are always done
424 using a divisor `d` that is a power of `2`: `d = 2^x`. Therefore
425 division and modulo can be expressed by bitset operations.
426 The constants needed for those bitset computations are defined here:
427
428 [large_bitset_impl_constants]
429
430 Looking at the example again, we can see that we have to identify the
431 positions of the beginning and ending of the interval [5,27] that is
432 to insert, and then ['*subdivide*] that range of bitsets into three partitions.
433
434 # The bitset where the interval starts.
435 # the bitset where the interval ends
436 # The bitsets that are completely overlapped by the interval
437
438 ``
439 combine interval [5,27] to large_bitset lbs w.r.t. some operation o
440
441 [0,1)-> [1,2)-> [2,3)-> [3,4)->
442 [00101100][11001011][11101001][11100000]
443 o [111 11111111 11111111 1111]
444 a b
445 subdivide:
446 [first! ][mid_1] . . .[mid_n][ !last]
447 [00000111][1...1] . . .[1...1][11110000]
448 ``
449
450 After subdividing, we perform the operation `o` as follows:
451
452 # For the first bitset: Set all bits from ther starting bit (`!`)
453 to the end of the bitset to `1`. All other bits are `0`. Then
454 perform operation `o`: `_map o= ([0,1)->00000111)`
455
456 # For the last bitset: Set all bits from the beginning of the bitset
457 to the ending bit (`!`) to `1`. All other bits are `0`. Then
458 perform operation `o`: `_map o= ([3,4)->11110000)`
459
460 # For the range of bitsets in between the staring and ending one,
461 perform operation `o`: `_map o= ([1,3)->11111111)`
462
463 The algorithm, that has been outlined and illustrated by the
464 example, is implemented by the private member function
465 `segment_apply`. To make the combiner operation a variable
466 in this algorithm, we use a /pointer to member function type/
467
468 [large_bitset_segment_combiner]
469
470 as first function argument. We will pass member functions `combine_` here,
471 ``
472 combine_(first_of_interval, end_of_interval, some_bitset);
473 ``
474 that take the beginning and ending of an interval and a bitset and combine
475 them to the implementing `interval_bitmap_type _map`. Here are these functions:
476
477 [large_bitmap_combiners]
478
479 Finally we can code function `segment_apply`, that does the partitioning
480 and subsequent combining:
481
482 [large_bitset_segment_apply]
483
484 The functions that help filling bitsets to and from a given bit are
485 implemented here:
486 [large_bitset_bitset_filler]
487
488 This completes the implementation of class template `large_bitset`.
489 Using only a small amount of mostly schematic code,
490 we have been able to provide a pretty powerful, self compressing
491 and generally usable set type for all integral domain types.
492
493 [endsect][/ large_bitset: Private implementation]
494
495 [endsect][/ Implementation of a large bitset]
496
497 [endsect][/ Large Bitsets IN Projects]
498
499
500 [endsect][/ Projects]
501
502
503