ceph/src/boost/libs/geometry/doc/design_rationale.qbk

   1 [/==============================================================================
   2     Copyright (c) 2007-2012 Barend Gehrels, Amsterdam, the Netherlands.
   3     Copyright (c) 2008-2012 Bruno Lalande, Paris, France.
   4     Copyright (c) 2009-2012 Mateusz Loskot, London, UK., London, UK
   5
   6     Use, modification and distribution is subject to the Boost Software License,
   7     Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
   8     http://www.boost.org/LICENSE_1_0.txt)
   9 ===============================================================================/]
  10
  11 [/ note the source code in this QBK is the only not (yet) checked by a compiler]
  12
  13 [section:design Design Rationale]
  14
  15 Suppose you need a C++ program to calculate the distance between two points.
  16 You might define a struct:
  17
  18     struct mypoint
  19     {
  20         double x, y;
  21     };
  22
  23 and a function, containing the algorithm:
  24
  25     double distance(mypoint const& a, mypoint const& b)
  26     {
  27         double dx = a.x - b.x;
  28         double dy = a.y - b.y;
  29         return sqrt(dx * dx + dy * dy);
  30     }
  31
  32 Quite simple, and it is usable, but not generic. For a library it has to be
  33 designed way further. The design above can only be used for 2D points, for the
  34 struct [*mypoint] (and no other struct), in a Cartesian coordinate system. A
  35 generic library should be able to calculate the distance:
  36
  37 * for any point class or struct, not on just this [*mypoint] type
  38 * in more than two dimensions
  39 * for other coordinate systems, e.g. over the earth or on a sphere
  40 * between a point and a line or between other geometry combinations
  41 * in higher precision than ['double]
  42 * avoiding the square root: often we don't want to do that because it is a relatively expensive
  43   function, and for comparing distances it is not necessary
  44
  45 In this and following sections we will make the design step by step more generic.
  46
  47 [heading Using Templates]
  48
  49 The distance function can be changed into a template function. This is trivial and allows
  50 calculating the distance between other point types than just [*mypoint]. We add two template
  51 parameters, allowing input of two different point types.
  52
  53     template <typename P1, typename P2>
  54     double distance(P1 const& a, P2 const& b)
  55     {
  56         double dx = a.x - b.x;
  57         double dy = a.y - b.y;
  58         return std::sqrt(dx * dx + dy * dy);
  59     }
  60
  61 This template version is slightly better, but not much.
  62
  63 Consider a C++ class where member variables are protected...
  64 Such a class does not allow to access `x` and `y` members directly. So, this paragraph is short
  65 and we just move on.
  66
  67 [heading Using Traits]
  68
  69 We need to take a generic approach and allow any point type as input to the distance function.
  70 Instead of accessing `x` and `y` members, we will add a few levels of indirection, using a
  71 traits system. The function then becomes:
  72
  73     template <typename P1, typename P2>
  74     double distance(P1 const& a, P2 const& b)
  75     {
  76         double dx = get<0>(a) - get<0>(b);
  77         double dy = get<1>(a) - get<1>(b);
  78         return std::sqrt(dx * dx + dy * dy);
  79     }
  80
  81 This adapted distance function uses a generic get function, with dimension as a template parameter,
  82 to access the coordinates of a point. This get forwards to the traits system, defined as following:
  83
  84     namespace traits
  85     {
  86         template <typename P, int D>
  87         struct access {};
  88     }
  89
  90 which is then specialized for our [*mypoint] type, implementing a static method called `get`:
  91
  92     namespace traits
  93     {
  94         template <>
  95         struct access<mypoint, 0>
  96         {
  97             static double get(mypoint const& p)
  98             {
  99                 return p.x;
 100             }
 101         };
 102         // same for 1: p.y
 103         ...
 104     }
 105
 106 Calling `traits::access<mypoint, 0>::get(a)` now returns us our `x` coordinate. Nice, isn't it?
 107 It is too verbose for a function like this, used so often in the library. We can shorten the syntax
 108 by adding an extra free function:
 109
 110     template <int D, typename P>
 111     inline double get(P const& p)
 112     {
 113         return traits::access<P, D>::get(p);
 114     }
 115
 116 This enables us to call `get<0>(a)`, for any point having the traits::access specialization,
 117 as shown in the distance algorithm at the start of this paragraph. So we wanted to enable classes
 118 with methods like `x()`, and they are supported as long as there is a specialization of the access
 119 `struct` with a static `get` function returning `x()` for dimension 0, and similar for 1 and `y()`.
 120
 121 [heading Dimension Agnosticism]
 122
 123 Now we can calculate the distance between points in 2D, points of any structure or class.
 124 However, we wanted to have 3D as well. So we have to make it dimension agnostic. This complicates
 125 our distance function. We can use a `for` loop to walk through dimensions, but for loops have
 126 another performance than the straightforward coordinate addition which was there originally.
 127 However, we can make more usage of templates and make the distance algorithm as following,
 128 more complex but attractive for template fans:
 129
 130     template <typename P1, typename P2, int D>
 131     struct pythagoras
 132     {
 133         static double apply(P1 const& a, P2 const& b)
 134         {
 135             double d = get<D-1>(a) - get<D-1>(b);
 136             return d * d + pythagoras<P1, P2, D-1>::apply(a, b);
 137         }
 138     };
 139
 140     template <typename P1, typename P2 >
 141     struct pythagoras<P1, P2, 0>
 142     {
 143         static double apply(P1 const&, P2 const&)
 144         {
 145             return 0;
 146         }
 147     };
 148
 149 The distance function is calling that `pythagoras` structure, specifying the number of dimensions:
 150
 151     template <typename P1, typename P2>
 152     double distance(P1 const& a, P2 const& b)
 153     {
 154         BOOST_STATIC_ASSERT(( dimension<P1>::value == dimension<P2>::value ));
 155
 156         return sqrt(pythagoras<P1, P2, dimension<P1>::value>::apply(a, b));
 157     }
 158
 159 The dimension which is referred to is defined using another traits class:
 160
 161 ``
 162     namespace traits
 163     {
 164         template <typename P>
 165         struct dimension {};
 166     }
 167 ``
 168
 169 which has to be specialized again for the `struct mypoint`.
 170
 171 Because it only has to publish a value, we conveniently derive it from the
 172 __boost_mpl__ `class boost::mpl::int_`:
 173
 174 ``
 175 namespace traits
 176 {
 177     template <>
 178     struct dimension<mypoint> : boost::mpl::int_<2>
 179     {};
 180 }
 181 ``
 182
 183 Like the free get function, the library also contains a dimension meta-function.
 184
 185     template <typename P>
 186     struct dimension : traits::dimension<P>
 187     {};
 188
 189 Below is explained why the extra declaration is useful. Now we have agnosticism in the number of
 190 dimensions. Our more generic distance function now accepts points of three or more dimensions.
 191 The compile-time assertion will prevent point a having two dimension and point b having
 192 three dimensions.
 193
 194 [heading Coordinate Type]
 195
 196 We assumed double above. What if our points are in integer?
 197
 198 We can easily add a traits class, and we will do that. However, the distance between two integer
 199 coordinates can still be a fractionized value. Besides that, a design goal was to avoid square
 200 roots. We handle these cases below, in another paragraph. For the moment we keep returning double,
 201 but we allow integer coordinates for our point types. To define the coordinate type, we add
 202 another traits class, `coordinate_type`, which should be specialized by the library user:
 203
 204     namespace traits
 205     {
 206         template <typename P>
 207         struct coordinate_type{};
 208
 209         // specialization for our mypoint
 210         template <>
 211         struct coordinate_type<mypoint>
 212         {
 213             typedef double type;
 214         };
 215     }
 216
 217 Like the access function, where we had a free get function, we add a proxy here as well.
 218 A longer version is presented later on, the short function would look like this:
 219
 220     template <typename P>
 221     struct coordinate_type :    traits::coordinate_type<P> {};
 222
 223 We now can modify our distance algorithm again. Because it still returns double, we only
 224 modify the `pythagoras` computation class. It should return the coordinate type of its input.
 225 But, it has two input, possibly different, point types. They might also differ in their
 226 coordinate types. Not that that is very likely, but we’re designing a generic library and we
 227 should handle those strange cases. We have to choose one of the coordinate types and of course
 228 we select the one with the highest precision. This is not worked out here, it would be too long,
 229 and it is not related to geometry. We just assume that there is a meta-function `select_most_precise`
 230 selecting the best type.
 231
 232 So our computation class becomes:
 233
 234     template <typename P1, typename P2, int D>
 235     struct pythagoras
 236     {
 237         typedef typename select_most_precise
 238             <
 239                 typename coordinate_type<P1>::type,
 240                 typename coordinate_type<P2>::type
 241             >::type computation_type;
 242
 243         static computation_type apply(P1 const& a, P2 const& b)
 244         {
 245             computation_type d = get<D-1>(a) - get<D-1>(b);
 246             return d * d + pythagoras <P1, P2, D-1> ::apply(a, b);
 247         }
 248     };
 249
 250 [heading Different Geometries]
 251
 252 We have designed a dimension agnostic system supporting any point type of any
 253 coordinate type. There are still some tweaks but they will be worked out later.
 254 Now we will see how we calculate the distance between a point and a polygon, or
 255 between a point and a line-segment. These formulae are more complex, and the
 256 influence on design is even larger. We don’t want to add a function with another
 257 name:
 258
 259     template <typename P, typename S>
 260     double distance_point_segment(P const& p, S const& s)
 261
 262 We want to be generic, the distance function has to be called from code not
 263 knowing the type of geometry it handles, so it has to be named distance. We also
 264 cannot create an overload because that would be ambiguous, having the same
 265 template signature. There are two solutions:
 266
 267 * tag dispatching
 268 * SFINAE
 269
 270 We select tag dispatching because it fits into the traits system. The earlier
 271 versions (previews) of Boost.Geometry used SFINAE but we found it had several
 272 drawbacks for such a big design, so the switch to tag dispatching was made.
 273
 274 With tag dispatching the distance algorithm inspects the type of geometry of the
 275 input parameters. The distance function will be changed into this:
 276
 277     template <typename G1, typename G2>
 278     double distance(G1 const& g1, G2 const& g2)
 279     {
 280         return dispatch::distance
 281             <
 282                 typename tag<G1>::type,
 283                 typename tag<G2>::type,
 284                 G1, G2
 285             >::apply(g1, g2);
 286     }
 287
 288 It is referring to the tag meta-function and forwarding the call to the [*apply] method of
 289 a ['dispatch::distance] structure. The [*tag] meta-function is another traits class, and should
 290 be specialized for per point type, both shown here:
 291
 292     namespace traits
 293     {
 294         template <typename G>
 295         struct tag {};
 296
 297         // specialization
 298         template <>
 299         struct tag<mypoint>
 300         {
 301             typedef point_tag type;
 302         };
 303     }
 304
 305 Free meta-function, like coordinate_system and get:
 306
 307     template <typename G>
 308     struct tag : traits::tag<G> {};
 309
 310 [*Tags] (`point_tag`, `segment_tag`, etc) are empty structures with the purpose to specialize a
 311 dispatch struct. The dispatch struct for distance, and its specializations, are all defined
 312 in a separate namespace and look like the following:
 313
 314     namespace dispatch {
 315         template < typename Tag1, typename Tag2, typename G1, typename G2 >
 316         struct distance
 317         {};
 318
 319         template <typename P1, typename P2>
 320         struct distance < point_tag, point_tag, P1, P2 >
 321         {
 322             static double apply(P1 const& a, P2 const& b)
 323             {
 324                 // here we call pythagoras
 325                 // exactly like we did before
 326                 ...
 327             }
 328         };
 329
 330         template <typename P, typename S>
 331         struct distance
 332         <
 333             point_tag, segment_tag, P, S
 334         >
 335         {
 336             static double apply(P const& p, S const& s)
 337             {
 338                 // here we refer to another function
 339                 // implementing point-segment
 340                 // calculations in 2 or 3
 341                 // dimensions...
 342                 ...
 343             }
 344         };
 345
 346         // here we might have many more
 347         // specializations,
 348         // for point-polygon, box-circle, etc.
 349
 350     } // namespace
 351
 352 So yes, it is possible; the distance algorithm is generic now in the sense that it also
 353 supports different geometry types. One drawback: we have to define two dispatch specializations
 354 for point - segment and for segment - point separately. That will also be solved, in the paragraph
 355 reversibility below. The example below shows where we are now: different point types,
 356 geometry types, dimensions.
 357
 358     point a(1,1);
 359     point b(2,2);
 360     std::cout << distance(a,b) << std::endl;
 361     segment s1(0,0,5,3);
 362     std::cout << distance(a, s1) << std::endl;
 363     rgb red(255, 0, 0);
 364     rbc orange(255, 128, 0);
 365     std::cout << "color distance: " << distance(red, orange) << std::endl;
 366
 367 [heading Kernel Revisited]
 368
 369 We described above that we had a traits class `coordinate_type`, defined in namespace traits,
 370 and defined a separate `coordinate_type` class as well. This was actually not really necessary
 371 before, because the only difference was the namespace clause. But now that we have another
 372 geometry type, a segment in this case, it is essential. We can call the `coordinate_type`
 373 meta-function for any geometry type, point, segment, polygon, etc, implemented again by tag dispatching:
 374
 375     template <typename G>
 376     struct coordinate_type
 377     {
 378         typedef typename dispatch::coordinate_type
 379             <
 380                 typename tag<G>::type, G
 381             >::type type;
 382     };
 383
 384 Inside the dispatch namespace this meta-function is implemented twice: a generic version and
 385 one specialization for points. The specialization for points calls the traits class.
 386 The generic version calls the point specialization, as a sort of recursive meta-function definition:
 387
 388     namespace dispatch
 389     {
 390
 391         // Version for any geometry:
 392         template <typename GeometryTag, typename G>
 393         struct coordinate_type
 394         {
 395             typedef typename point_type
 396                 <
 397                     GeometryTag, G
 398                 >::type point_type;
 399
 400             // Call specialization on point-tag
 401             typedef typename coordinate_type < point_tag, point_type >::type type;
 402         };
 403
 404         // Specialization for point-type:
 405         template <typename P>
 406         struct coordinate_type<point_tag, P>
 407         {
 408             typedef typename
 409                 traits::coordinate_type<P>::type
 410                 type;
 411         };
 412     }
 413
 414 So it calls another meta-function point_type. This is not elaborated in here but realize that it
 415 is available for all geometry types, and typedefs the point type which makes up the geometry,
 416 calling it type.
 417
 418 The same applies for the meta-function dimension and for the upcoming meta-function coordinate system.
 419
 420 [heading Coordinate System]
 421
 422 Until here we assumed a Cartesian system. But we know that the Earth is not flat.
 423 Calculating a distance between two GPS-points with the system above would result in nonsense.
 424 So we again extend our design. We define for each point type a coordinate system type
 425 using the traits system again. Then we make the calculation dependant on that coordinate system.
 426
 427 Coordinate system is similar to coordinate type, a meta-function, calling a dispatch function
 428 to have it for any geometry-type, forwarding to its point specialization, and finally calling
 429 a traits class, defining a typedef type with a coordinate system. We don’t show that all here again.
 430 We only show the definition of a few coordinate systems:
 431
 432     struct cartesian {};
 433
 434     template<typename DegreeOrRadian>
 435     struct geographic
 436     {
 437         typedef DegreeOrRadian units;
 438     };
 439
 440 So Cartesian is simple, for geographic we can also select if its coordinates are stored in degrees
 441 or in radians.
 442
 443 The distance function will now change: it will select the computation method for the corresponding
 444 coordinate system and then call the dispatch struct for distance. We call the computation method
 445 specialized for coordinate systems a strategy. So the new version of the distance function is:
 446
 447     template <typename G1, typename G2>
 448     double distance(G1 const& g1, G2 const& g2)
 449     {
 450         typedef typename strategy_distance
 451             <
 452                 typename coordinate_system<G1>::type,
 453                 typename coordinate_system<G2>::type,
 454                 typename point_type<G1>::type,
 455                 typename point_type<G2>::type,
 456                 dimension<G1>::value
 457             >::type strategy;
 458
 459         return dispatch::distance
 460             <
 461                 typename tag<G1>::type,
 462                 typename tag<G2>::type,
 463                 G1, G2, strategy
 464             >::apply(g1, g2, strategy());
 465     }
 466
 467 The strategy_distance mentioned here is a struct with specializations for different coordinate
 468 systems.
 469
 470     template <typename T1, typename T2, typename P1, typename P2, int D>
 471     struct strategy_distance
 472     {
 473         typedef void type;
 474     };
 475
 476     template <typename P1, typename P2, int D>
 477     struct strategy_distance<cartesian, cartesian, P1, P2, D>
 478     {
 479         typedef pythagoras<P1, P2, D> type;
 480     };
 481
 482 So, here is our `pythagoras` again, now defined as a strategy. The distance dispatch function just
 483 calls its apply method.
 484
 485 So this is an important step: for spherical or geographical coordinate systems, another
 486 strategy (computation method) can be implemented. For spherical coordinate systems
 487 have the haversine formula. So the dispatching traits struct is specialized like this
 488
 489     template <typename P1, typename P2, int D = 2>
 490     struct strategy_distance<spherical, spherical, P1, P2, D>
 491     {
 492         typedef haversine<P1, P2> type;
 493     };
 494
 495     // struct haversine with apply function
 496     // is omitted here
 497
 498 For geography, we have some alternatives for distance calculation. There is the Andoyer method,
 499 fast and precise, and there is the Vincenty method, slower and more precise, and there are some
 500 less precise approaches as well.
 501
 502 Per coordinate system, one strategy is defined as the default strategy. To be able to use
 503 another strategy as well, we modify our design again and add an overload for the distance
 504 algorithm, taking a strategy object as a third parameter.
 505
 506 This new overload distance function also has the advantage that the strategy can be constructed
 507 outside the distance function. Because it was constructed inside above, it could not have
 508 construction parameters. But for Andoyer or Vincenty, or the haversine formula, it certainly
 509 makes sense to have a constructor taking the radius of the earth as a parameter.
 510
 511 So, the distance overloaded function is:
 512
 513     template <typename G1, typename G2, typename S>
 514     double distance(G1 const& g1, G2 const& g2, S const& strategy)
 515     {
 516         return dispatch::distance
 517             <
 518                 typename tag<G1>::type,
 519                 typename tag<G2>::type,
 520                 G1, G2, S
 521             >::apply(g1, g2, strategy);
 522     }
 523
 524 The strategy has to have a method apply taking two points as arguments (for points). It is not
 525 required that it is a static method. A strategy might define a constructor, where a configuration
 526 value is passed and stored as a member variable. In those cases a static method would be
 527 inconvenient. It can be implemented as a normal method (with the const qualifier).
 528
 529 We do not list all implementations here, Vincenty would cover half a page of mathematics,
 530 but you will understand the idea. We can call distance like this:
 531
 532     distance(c1, c2)
 533
 534 where `c1` and `c2` are Cartesian points, or like this:
 535
 536     distance(g1, g2)
 537
 538 where `g1` and `g2` are Geographic points, calling the default strategy for Geographic
 539 points (e.g. Andoyer), and like this:
 540
 541     distance(g1, g2, vincenty<G1, G2>(6275))
 542
 543 where a strategy is specified explicitly and constructed with a radius.
 544
 545 [/ 'TODO: What to do with this section? We have concepts already --mloskot]
 546 [heading Point Concept]
 547 The five traits classes mentioned in the previous sections form together the
 548 Point Concept. Any point type for which specializations are implemented in
 549 the traits namespace should be accepted a as valid type. So the Point Concept
 550 consists of:
 551
 552 * a specialization for `traits::tag`
 553 * a specialization for `traits::coordinate_system`
 554 * a specialization for `traits::coordinate_type`
 555 * a specialization for `traits::dimension`
 556 * a specialization for `traits::access`
 557
 558 The last one is a class, containing the method get and the (optional) method
 559 set, the first four are metafunctions, either defining type or declaring a
 560 value (conform MPL conventions).
 561
 562 So we now have agnosticism for the number of dimensions, agnosticism for
 563 coordinate systems; the design can handle any coordinate type, and it can
 564 handle different geometry types. Furthermore we can specify our own
 565 strategies, the code will not compile in case of two points with different
 566 dimensions (because of the assertion), and it will not compile for two points
 567 with different coordinate systems (because there is no specialization).
 568 A library can check if a point type fulfills the requirements imposed by the
 569 concepts. This is handled in the upcoming section Concept Checking.
 570
 571 [heading Return Type]
 572
 573 We promised that calling `std::sqrt` was not always necessary. So we define a distance result `struct`
 574 that contains the squared value and is convertible to a double value. This, however, only has to
 575 be done for `pythagoras`. The spherical distance functions do not take the square root so for them
 576 it is not necessary to avoid the expensive square root call; they can just return their distance.
 577
 578 So the distance result struct is dependant on strategy, therefore made a member type of
 579 the strategy. The result struct looks like this:
 580
 581     template<typename T = double>
 582     struct cartesian_distance
 583     {
 584         T sq;
 585         explicit cartesian_distance(T const& v) : sq (v) {}
 586
 587         inline operator T() const
 588         {
 589             return std::sqrt(sq);
 590         }
 591     };
 592
 593 It also has operators defined to compare itself to other results without taking the square root.
 594
 595 Each strategy should define its return type, within the strategy class, for example:
 596
 597     typedef cartesian_distance<T> return_type;
 598
 599 or:
 600
 601     typedef double return_type;
 602
 603 for cartesian (pythagoras) and spherical, respectively.
 604
 605 Again our distance function will be modified, as expected, to reflect the new return type.
 606 For the overload with a strategy it is not complex:
 607
 608     template < typename G1, typename G2, typename Strategy >
 609     typename Strategy::return_type distance( G1 const& G1 , G2 const& G2 , S const& strategy)
 610
 611 But for the one without strategy we have to select strategy, coordinate type, etc.
 612 It would be spacious to do it in one line so we add a separate meta-function:
 613
 614     template <typename G1, typename G2 = G1>
 615     struct distance_result
 616     {
 617         typedef typename point_type<G1>::type P1;
 618         typedef typename point_type<G2>::type P2;
 619         typedef typename strategy_distance
 620             <
 621                 typename cs_tag<P1>::type,
 622                 typename cs_tag<P2>::type,
 623                 P1, P2
 624             >::type S;
 625
 626         typedef typename S::return_type type;
 627     };
 628
 629 and modify our distance function:
 630
 631     template <typename G1, typename G2>
 632     inline typename distance_result<G1, G2>::type distance(G1 const& g1, G2 const& g2)
 633     {
 634         // ...
 635     }
 636
 637 Of course also the apply functions in the dispatch specializations will return a result like this.
 638 They have a strategy as a template parameter everywhere, making the less verbose version possible.
 639
 640 [heading Summary]
 641
 642 In this design rationale, __boost_geometry__ is step by step designed using tag dispatching,
 643 concepts, traits, and metaprogramming. We used the well-known distance function
 644 to show the design.
 645
 646 __boost_geometry__ is designed like described here, with
 647 some more techniques as automatically reversing template arguments, tag casting,
 648 and reusing implementation classes or dispatch classes as policies in other
 649 dispatch classes.
 650
 651 [endsect] [/ end of section Design Rationale]