ceph/src/boost/tools/build/doc/src/architecture.xml

   1 <?xml version="1.0" encoding="UTF-8"?>
   2 <!DOCTYPE appendix PUBLIC "-//Boost//DTD BoostBook XML V1.0//EN"
   3   "http://www.boost.org/tools/boostbook/dtd/boostbook.dtd">
   4
   5   <appendix id="bbv2.arch">
   6     <title>Boost.Build v2 architecture</title>
   7
   8   <sidebar>
   9     <para>
  10       This document is a work in progress. Do not expect much from it yet.
  11     </para>
  12   </sidebar>
  13
  14   <section id="bbv2.arch.overview">
  15     <title>Overview</title>
  16
  17     <!-- FIXME: the below does not mention engine at all, making rest of the
  18          text confusing. Things like 'kernel' and 'util' don't have to be
  19          mentioned at all. -->
  20     <para>
  21       The Boost.Build implementation is structured in four components:
  22     "kernel", "util", "build" and "tools". The first two are relatively
  23     uninteresting, so we will focus on the remaining pair. The "build" component
  24     provides the classes necessary to declare targets, determine which
  25     properties should be used to build them, and create the dependency graph. The
  26     "tools" component provides user-visible functionality. It mostly allows
  27     declaring specific kinds of main targets, as well as registering available
  28     tools, which are then used when creating the dependency graph.
  29     </para>
  30   </section>
  31
  32   <section id="bbv2.arch.build">
  33     <title>The build layer</title>
  34
  35     <para>
  36       The build layer has just four main parts -- metatargets (abstract
  37     targets), virtual targets, generators and properties.
  38
  39       <itemizedlist>
  40         <listitem><para>
  41           Metatargets (see the "targets.jam" module) represent all the
  42         user-defined entities that can be built. The "meta" prefix signifies
  43         that they do not need to correspond to exact files or even files at all
  44         -- they can produce a different set of files depending on the build
  45         request. Metatargets are created when Jamfiles are loaded. Each has a
  46         <code>generate</code> method which is given a property set and produces
  47         virtual targets for the passed properties.
  48         </para></listitem>
  49         <listitem><para>
  50           Virtual targets (see the "virtual-target.jam" module) correspond to
  51         actual atomic updatable entities -- most typically files.
  52         </para></listitem>
  53         <listitem><para>
  54           Properties are just (name, value) pairs, specified by the user and
  55         describing how targets should be built. Properties are stored using the
  56         <code>property-set</code> class.
  57         </para></listitem>
  58         <listitem><para>
  59           Generators are objects that encapsulate specific tools -- they can
  60         take a list of source virtual targets and produce new virtual targets
  61         from them.
  62         </para></listitem>
  63       </itemizedlist>
  64
  65     </para>
  66
  67     <para>
  68       The build process includes the following steps:
  69
  70       <orderedlist>
  71         <listitem><para>
  72           Top-level code calls the <code>generate</code> method of a metatarget
  73         with some properties.
  74         </para></listitem>
  75
  76         <listitem><para>
  77           The metatarget combines the requested properties with its requirements
  78         and passes the result, together with the list of sources, to the
  79         <code>generators.construct</code> function.
  80         </para></listitem>
  81
  82         <listitem><para>
  83           A generator appropriate for the build properties is selected and its
  84         <code>run</code> method is called. The method returns a list of virtual
  85         targets.
  86         </para></listitem>
  87
  88         <listitem><para>
  89           The virtual targets are returned to the top-level code, and for each instance,
  90           the <literal>actualize</literal> method is called to set up nodes and updating
  91           actions in the dependency graph kept inside the Boost.Build engine. This dependency
  92           graph is then updated, which runs the necessary commands.
  93         </para></listitem>
  94       </orderedlist>
  95     </para>
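
    <para>
      The four steps can be illustrated with a toy model. The following is a
    minimal sketch in Python, not Boost.Build code: the classes
    <code>Metatarget</code>, <code>Generator</code> and
    <code>VirtualTarget</code>, the free function <code>construct</code>, and
    the selection rule (the first generator whose required properties are all
    present wins) are assumptions made only for illustration.
    </para>

<programlisting>
# Toy model of the build layer; all names are illustrative, not Boost.Build APIs.

class VirtualTarget:
    def __init__(self, name, action, sources):
        self.name = name
        self.action = action
        self.sources = sources

    def actualize(self, graph):
        # Step 4: create a node and its updating action in the engine graph.
        graph[self.name] = (self.action, list(self.sources))
        return self.name


class Generator:
    def __init__(self, action, suffix, required):
        self.action = action
        self.suffix = suffix
        self.required = frozenset(required)

    def run(self, name, sources, properties):
        # Step 3: produce new virtual targets from the sources.
        return [VirtualTarget(name + self.suffix, self.action, sources)]


def construct(generators, name, sources, properties):
    # Assumption: pick the first generator whose required properties all match.
    for generator in generators:
        if generator.required.issubset(properties):
            return generator.run(name, sources, properties)
    raise LookupError("no generator matches " + repr(sorted(properties)))


class Metatarget:
    def __init__(self, name, sources, requirements, generators):
        self.name = name
        self.sources = sources
        self.requirements = frozenset(requirements)
        self.generators = generators

    def generate(self, requested):
        # Step 2: combine the requested properties with the requirements.
        properties = frozenset(requested) | self.requirements
        return construct(self.generators, self.name, self.sources, properties)


generators = [
    Generator("gcc.compile", ".o", [("toolset", "gcc")]),
    Generator("msvc.compile", ".obj", [("toolset", "msvc")]),
]
hello = Metatarget("hello", ["hello.cpp"], [("optimization", "off")], generators)

graph = {}
# Step 1: top-level code calls generate with some properties.
for target in hello.generate([("toolset", "gcc")]):
    target.actualize(graph)
print(graph)
</programlisting>

    <para>
      Under these assumptions, requesting the "gcc" toolset yields a single
    graph node "hello.o" whose updating action is "gcc.compile".
    </para>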
  96
  97     <section id="bbv2.arch.build.metatargets">
  98       <title>Metatargets</title>
  99
 100       <para>
 101         There are several classes derived from "abstract-target". The
 102       "main-target" class represents a top-level main target, the
 103       "project-target" class acts like a container holding multiple main
 104       targets, and the "basic-target" class is a base class for all further
 105       target types.
 106       </para>
 107
 108       <para>
 109         Since each main target can have several alternatives, all top-level
 110       target objects are actually containers, referring to "real" main target
 111       classes. The type of that container is "main-target". For example, given:
 112 <programlisting>
 113 alias a ;
 114 lib a : a.cpp : &lt;toolset&gt;gcc ;
 115 </programlisting>
 116       we would have one top-level "main-target" instance, containing one
 117       "alias-target" and one "lib-target" instance. "main-target"'s "generate"
 118       method decides which of the alternatives should be used, and calls
 119       "generate" on the corresponding instance.
 120       </para>
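
      <para>
        The text above does not spell out how the alternative is chosen. The
      following Python sketch models the container under one plausible rule,
      stated here as an assumption: an alternative is viable when all of its
      requirements appear in the build request, and the viable alternative
      with the most requirements wins. The class and method names are
      hypothetical.
      </para>

<programlisting>
# Toy model of a "main-target" container; names are illustrative only.

class Alternative:
    def __init__(self, kind, requirements):
        self.kind = kind
        self.requirements = frozenset(requirements)

    def generate(self, properties):
        return self.kind + " built with " + repr(sorted(properties))


class MainTarget:
    def __init__(self, name):
        self.name = name
        self.alternatives = []

    def add_alternative(self, alternative):
        self.alternatives.append(alternative)

    def select(self, properties):
        # Assumed rule: all requirements must be present in the request,
        # and the most specific viable alternative is preferred.
        viable = [a for a in self.alternatives
                  if a.requirements.issubset(properties)]
        if not viable:
            raise LookupError("no viable alternative for " + self.name)
        return max(viable, key=lambda a: len(a.requirements))

    def generate(self, properties):
        properties = frozenset(properties)
        return self.select(properties).generate(properties)


a = MainTarget("a")
a.add_alternative(Alternative("alias-target", []))                  # alias a ;
a.add_alternative(Alternative("lib-target", [("toolset", "gcc")]))  # lib a, gcc only

print(a.select(frozenset([("toolset", "gcc")])).kind)
print(a.select(frozenset([("toolset", "msvc")])).kind)
</programlisting>

      <para>
        With this rule, a request for the "gcc" toolset selects the
      "lib-target" alternative, and any other toolset falls back to the
      "alias-target" one.
      </para>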
 121
 122       <para>
 123         Each alternative is an instance of a class derived from "basic-target".
 124       "basic-target.generate" does several things that should always be done:
 125
 126         <itemizedlist>
 127           <listitem><para>
 128             Determines what properties should be used for building the target.
 129           This includes looking at requested properties, requirements, and usage
 130           requirements of all sources.
 131           </para></listitem>
 132
 133           <listitem><para>
 134             Builds all sources.
 135           </para></listitem>
 136
 137           <listitem><para>
 138             Computes usage requirements that should be passed back to targets
 139           depending on this one.
 140           </para></listitem>
 141         </itemizedlist>
 142
 143       For the real work of constructing a virtual target, a new method
 144       "construct" is called.
 145       </para>
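
      <para>
        The split between "generate" and "construct" resembles the template
      method pattern. The Python sketch below is a simplified illustration,
      not the Jam implementation: property handling is reduced to set union,
      sources are built with the requested properties only, and
      <code>TypedTarget.construct</code> merely stands in for the call into
      the generators module.
      </para>

<programlisting>
# Toy template-method model; names mirror the text but the code is illustrative.

class BasicTarget:
    def __init__(self, name, sources=(), requirements=(), usage_requirements=()):
        self.name = name
        self.sources = list(sources)
        self.requirements = frozenset(requirements)
        self.usage_requirements = frozenset(usage_requirements)

    def generate(self, requested):
        # 1. Start from the requested properties and the requirements.
        properties = frozenset(requested) | self.requirements
        built_sources = []
        for source in self.sources:
            if isinstance(source, BasicTarget):
                # 2. Build the source and apply its usage requirements.
                usage, targets = source.generate(requested)
                properties = properties | usage
                built_sources.extend(targets)
            else:
                built_sources.append(source)          # plain file name
        targets = self.construct(built_sources, properties)
        # 3. Pass our own usage requirements back to the dependent target.
        return self.usage_requirements, targets

    def construct(self, sources, properties):
        raise NotImplementedError


class TypedTarget(BasicTarget):
    def __init__(self, name, suffix, **kwargs):
        super().__init__(name, **kwargs)
        self.suffix = suffix

    def construct(self, sources, properties):
        # Stand-in for the generators module: describe the produced target.
        return [(self.name + self.suffix, tuple(sources), tuple(sorted(properties)))]


util = TypedTarget("util", ".a", sources=["util.cpp"],
                   usage_requirements=[("include", "util/include")])
app = TypedTarget("app", ".exe", sources=["app.cpp", util])

usage, targets = app.generate([("variant", "debug")])
print(targets[0][0])
print(targets[0][2])
</programlisting>

      <para>
        The usage requirement declared on "util" ends up in the properties
      used to construct "app", even though "app" never mentions it.
      </para>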
 146
 147       <para>
 148         The "construct" method can be implemented in any way by classes derived
 149       from "basic-target", but one specific derived class plays the central role
 150       -- "typed-target". That class holds the desired type of file to be
 151       produced, and its "construct" method uses the generators module to do the
 152       actual work.
 153       </para>
 154
 155       <para>
 156         This means that a specific metatarget subclass may avoid using
 157       generators altogether. However, this is deprecated and we are trying to
 158       eliminate all such subclasses at the moment.
 159       </para>
 160
 161       <para>
 162         Note that the <filename>build/targets.jam</filename> file contains a
 163       UML diagram which might help.
 164       </para>
 165     </section>
 166
 167     <section id="bbv2.arch.build.virtual">
 168       <title>Virtual targets</title>
 169
 170       <para>
 171         Virtual targets are atomic updatable entities. Each virtual
 172       target can be assigned an updating action -- an instance of the
 173       <code>action</code> class. The action class, in turn, contains a list of
 174       source targets, properties, and the name of the action that
 175       should be executed.
 176       </para>
 177
 178       <para>
 179         We try hard to never create equal instances of the
 180       <code>virtual-target</code> class. Code creating virtual targets passes
 181       them through the <code>virtual-target.register</code> function, which
 182       detects if a target with the same name, sources, and properties has
 183       already been created. In that case, the preexisting target is returned.
 184       </para>
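
      <para>
        The deduplication can be pictured as a lookup table keyed on the three
      attributes named above. This is a minimal Python sketch of the idea; the
      real <code>virtual-target.register</code> is written in Jam, and the
      class and variable names below are hypothetical.
      </para>

<programlisting>
# Toy registry; illustrates the idea behind virtual-target.register.

class VirtualTarget:
    def __init__(self, name, sources, properties):
        self.name = name
        self.sources = tuple(sources)
        self.properties = frozenset(properties)


_registry = {}

def register(target):
    # Key on everything that makes two targets "equal" in the text above.
    key = (target.name, target.sources, target.properties)
    # setdefault stores the target only if the key is new and always
    # returns the instance stored under the key.
    return _registry.setdefault(key, target)


first = register(VirtualTarget("a.o", ["a.cpp"], [("toolset", "gcc")]))
again = register(VirtualTarget("a.o", ["a.cpp"], [("toolset", "gcc")]))
other = register(VirtualTarget("a.o", ["a.cpp"], [("toolset", "msvc")]))

print(first is again)   # same name, sources and properties: reused
print(first is other)   # different properties: a distinct target
</programlisting>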
 185
 186       <!-- FIXME: the below 2 para are rubbish, must be totally rewritten. -->
 187       <para>
 188         When all virtual targets are produced, they are "actualized". This means
 189       that the real file names are computed, and the commands that should be run
 190       are generated. This is done by the <code>virtual-target.actualize</code>
 191       and <code>action.actualize</code> methods. The first is conceptually
 192       simple, while the second needs additional explanation. Commands in Boost.Build
 193       are generated in a two-stage process. First, a rule with an appropriate
 194       name (for example "gcc.compile") is called and is given a list of target
 195       names. The rule sets some variables, like "OPTIONS". After that, the
 196       command string is taken and variables are substituted, so a use of OPTIONS
 197       inside the command string gets transformed into actual compile options.
 198       </para>
 199
 200       <para>
 201         Boost.Build added a third stage to simplify things. It is now possible
 202       to automatically convert properties to appropriate variable assignments.
 203       For example, &lt;debug-symbols&gt;on would add "-g" to the OPTIONS
 204       variable, without requiring this logic to be added manually to gcc.compile.
 205       This functionality is part of the "toolset" module.
 206       </para>
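
      <para>
        The stages can be mimicked in a few lines of Python. This is only a
      sketch: the <code>FLAGS</code> table format, the helper names, and the
      variable names <code>TARGETS</code> and <code>SOURCES</code> are
      assumptions chosen for readability and do not reflect how the "toolset"
      module stores its data.
      </para>

<programlisting>
import re

# Toy table: (rule, variable, property, values to append to the variable).
FLAGS = [
    ("gcc.compile", "OPTIONS", ("debug-symbols", "on"), ["-g"]),
    ("gcc.compile", "OPTIONS", ("optimization", "speed"), ["-O3"]),
]

def variables_for(rule, properties):
    # Convert matching properties into variable assignments.
    variables = {}
    for flag_rule, variable, prop, values in FLAGS:
        if flag_rule == rule and prop in properties:
            variables.setdefault(variable, []).extend(values)
    return variables

def expand(command, variables):
    # Substitute each $(NAME) with the space-joined values of NAME;
    # unset variables expand to nothing.
    def substitute(match):
        return " ".join(variables.get(match.group(1), []))
    return re.sub(r"\$\((\w+)\)", substitute, command)

properties = {("debug-symbols", "on"), ("optimization", "speed")}
variables = variables_for("gcc.compile", properties)
variables["TARGETS"] = ["a.o"]
variables["SOURCES"] = ["a.cpp"]

command = expand("g++ $(OPTIONS) -c -o $(TARGETS) $(SOURCES)", variables)
print(command)
</programlisting>

      <para>
        The point of the table is that the "gcc.compile" rule itself never has
      to inspect the properties; adding a new flag is a matter of adding a
      row.
      </para>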
 207
 208       <para>
 209         Note that the <filename>build/virtual-target.jam</filename> file
 210       contains a UML diagram which might help.
 211       </para>
 212     </section>
 213
 214     <section id="bbv2.arch.build.properties">
 215       <title>Properties</title>
 216
 217       <para>
 218         Above, we noted that metatargets are built with a set of properties.
 219       That set is represented by the <code>property-set</code> class. An
 220       important point is that handling of property sets can get very expensive.
 221       For that reason, we make sure that for each set of (name, value) pairs
 222       only one <code>property-set</code> instance is created. The
 223       <code>property-set</code> uses extensive caching for all operations, so
 224       most work is avoided. The <code>property-set.create</code> function is the
 225       factory used to create instances of the <code>property-set</code> class.
 226       </para>
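
      <para>
        The combination of a factory function, unique instances and
      per-instance caches is a form of interning. The Python sketch below
      shows the general technique; it is not a transcription of the Jam
      class, and the <code>add</code> operation is a hypothetical example of
      a cached operation.
      </para>

<programlisting>
# Toy model of property-set interning; not the Jam implementation.

class PropertySet:
    _instances = {}

    def __init__(self, properties):
        self.properties = properties      # canonical form: sorted tuple of pairs
        self._cache = {}

    @classmethod
    def create(cls, properties):
        # Canonicalize so that order and duplicates do not matter.
        key = tuple(sorted(set(properties)))
        if key not in cls._instances:
            cls._instances[key] = cls(key)
        return cls._instances[key]

    def add(self, other):
        # Instances are unique, so the operand itself is a valid cache key.
        if other not in self._cache:
            self._cache[other] = PropertySet.create(
                self.properties + other.properties)
        return self._cache[other]


debug = PropertySet.create([("variant", "debug"), ("toolset", "gcc")])
same = PropertySet.create([("toolset", "gcc"), ("variant", "debug")])
print(debug is same)

extra = PropertySet.create([("link", "static")])
print(debug.add(extra) is debug.add(extra))
</programlisting>

      <para>
        Because equal sets are the same object, comparing two property sets
      is an identity check and results of earlier operations can be reused
      without recomputation.
      </para>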
 227     </section>
 228   </section>
 229
 230   <section id="bbv2.arch.tools">
 231     <title>The tools layer</title>
 232
 233     <para>Write me!</para>
 234   </section>
 235
 236   <section id="bbv2.arch.targets">
 237     <title>Targets</title>
 238
 239     <para>NOTE: THIS SECTION IS NOT EXPECTED TO BE READ!
 240       There are two user-visible kinds of targets in Boost.Build. First are
 241     "abstract" targets: they correspond to things declared by the user, e.g.
 242     projects and executable files. The primary thing about abstract targets is
 243     that it is possible to request them to be built with a particular set of
 244     properties. Each property combination may yield different built
 245     files, so abstract targets do not have a direct correspondence to built
 246     files.
 247     </para>
 248
 249     <para>
 250       File targets, on the other hand, are associated with concrete files.
 251     Dependency graphs for abstract targets with specific properties are
 252     constructed from file targets. Users have no way to create file targets but
 253     can specify rules for detecting source file types, as well as rules for
 254     transforming between file targets of different types. That information is
 255     used in constructing the final dependency graph, as described in the <link
 256     linkend="bbv2.arch.depends">next section</link>.
 257     <emphasis role="bold">Note:</emphasis> File targets are not the same entities
 258     as Jam targets; the latter are created from file targets at the latest
 259     possible moment.
 260     <emphasis role="bold">Note:</emphasis> "File target" is the originally
 261     proposed name for what we now call virtual targets. It is more
 262     understandable by users, but has one problem: virtual targets can
 263     potentially be "phony", and not correspond to any file.
 264     </para>
 265   </section>
 266
 267   <section id="bbv2.arch.depends">
 268     <title>Dependency scanning</title>
 269
 270     <para>
 271       Dependency scanning is the process of finding implicit dependencies, like
 272       "#include" statements in C++. The requirements for a correct dependency
 273       scanning mechanism are:
 274     </para>
 275
 276     <itemizedlist>
 277       <listitem><simpara>
 278         <link linkend="bbv2.arch.depends.different-scanning-algorithms">Support
 279       for different scanning algorithms</link>. C++ and XML have quite different
 280       syntax for includes and rules for looking up the included files.
 281       </simpara></listitem>
 282
 283       <listitem><simpara>
 284         <link linkend="bbv2.arch.depends.same-file-different-scanners">Ability
 285       to scan the same file several times</link>. For example, a single C++ file
 286       may be compiled using different include paths.
 287       </simpara></listitem>
 288
 289       <listitem><simpara>
 290         <link linkend="bbv2.arch.depends.dependencies-on-generated-files">Proper
 291       detection of dependencies on generated files.</link>
 292       </simpara></listitem>
 293
 294       <listitem><simpara>
 295         <link
 296       linkend="bbv2.arch.depends.dependencies-from-generated-files">Proper
 297       detection of dependencies from a generated file.</link>
 298       </simpara></listitem>
 299     </itemizedlist>
 300
 301     <section id="bbv2.arch.depends.different-scanning-algorithms">
 302       <title>Support for different scanning algorithms</title>
 303
 304       <para>
 305         Different scanning algorithms are encapsulated by objects called
 306       "scanners". Please see the "scanner" module documentation for more
 307       details.
 308       </para>
 309     </section>
 310
 311     <section id="bbv2.arch.depends.same-file-different-scanners">
 312       <title>Ability to scan the same file several times</title>
 313
 314       <para>
 315         As stated above, it is possible to compile a C++ file multiple times,
 316       using different include paths. Therefore, include dependencies for those
 317       compilations can be different. The problem is that the Boost.Build engine does
 318       not allow multiple scans of the same target. To solve that, we pass the
 319       scanner object when calling <literal>virtual-target.actualize</literal>
 320       and it creates different engine targets for different scanners.
 321       </para>
 322
 323       <para>
 324         For each engine target created with a specified scanner, a
 325       corresponding one is created without it. The updating action is
 326       associated with the scanner-less target, and the target with the scanner
 327       is made to depend on it. That way, if sources for that action are touched,
 328       all targets, with and without the scanner, are considered outdated.
 329       </para>
 330
 331       <para>
 332         Consider the following example: "a.cpp" prepared from "a.verbatim",
 333       compiled by two compilers using different include paths and copied into
 334       some install location. The dependency graph would look like:
 335       </para>
 336
 337 <programlisting>
 338 a.o (&lt;toolset&gt;gcc)        &lt;--(compile)-- a.cpp (scanner1) ----+
 339 a.o (&lt;toolset&gt;msvc)       &lt;--(compile)-- a.cpp (scanner2) ----|
 340 a.cpp (installed copy)    &lt;--(copy) ----------------------- a.cpp (no scanner)
 341                                                                  ^
 342                                                                  |
 343                        a.verbatim -------------------------------+
 344 </programlisting>
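
      <para>
        The same graph can be built programmatically. The following Python
      sketch is a toy model of the scheme, assuming a hypothetical
      <code>actualize</code> function that names scanner-specific engine
      targets by appending the scanner name; the actual naming used by
      Boost.Build is not shown here.
      </para>

<programlisting>
# Toy model of actualizing one virtual target under several scanners.

graph = {}   # engine target name: list of dependency names

def actualize(name, scanner=None, sources=()):
    # The scanner-less target carries the updating action and its sources.
    dependencies = graph.setdefault(name, [])
    for source in sources:
        if source not in dependencies:
            dependencies.append(source)
    if scanner is None:
        return name
    # Each scanner gets its own engine target, which depends on the plain one.
    scanned = name + " (" + scanner + ")"
    graph.setdefault(scanned, [name])
    return scanned

plain = actualize("a.cpp", sources=["a.verbatim"])
gcc_input = actualize("a.cpp", scanner="scanner1")
msvc_input = actualize("a.cpp", scanner="scanner2")
graph["a.o (gcc)"] = [gcc_input]
graph["a.o (msvc)"] = [msvc_input]
graph["a.cpp (installed copy)"] = [plain]

def outdated_by(changed):
    # Everything that transitively depends on the changed node.
    stale = set()
    frontier = [changed]
    while frontier:
        node = frontier.pop()
        for target, dependencies in graph.items():
            if node in dependencies and target not in stale:
                stale.add(target)
                frontier.append(target)
    return stale

print(sorted(outdated_by("a.verbatim")))
</programlisting>

      <para>
        Touching "a.verbatim" marks all six derived targets as outdated,
      including both scanner-specific copies of "a.cpp".
      </para>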
 345     </section>
 346
 347     <section id="bbv2.arch.depends.dependencies-on-generated-files">
 348       <title>Proper detection of dependencies on generated files</title>
 349
 350       <para>
 351         This requirement breaks down to the following ones.
 352       </para>
 353
 354       <orderedlist>
 355         <listitem><simpara>
 356           If, when compiling "a.cpp", there is an include of "a.h", the "dir"
 357         directory is on the include path, and a target called "a.h" will be
 358         generated in "dir", then Boost.Build should discover the include and create
 359         "a.h" before compiling "a.cpp".
 360         </simpara></listitem>
 361
 362         <listitem><simpara>
 363           Since Boost.Build almost always generates targets under the "bin"
 364         directory, this should be supported as well. I.e. in the scenario above,
 365         the Jamfile in "dir" might create a main target that generates "a.h". The
 366         file will be generated in the "dir/bin" directory, but we still have to
 367         recognize the dependency.
 368         </simpara></listitem>
 369       </orderedlist>
 370
 371       <para>
 372         The first requirement means that when determining what "a.h" means when
 373       found in "a.cpp", we have to iterate over all directories in include
 374       paths, checking for each one:
 375       </para>
 376
 377       <orderedlist>
 378         <listitem><simpara>
 379           If there is a file named "a.h" in that directory, or
 380         </simpara></listitem>
 381
 382         <listitem><simpara>
 383           If there is a target called "a.h", which will be generated in that
 384         directory.
 385         </simpara></listitem>
 386       </orderedlist>
 387
 388       <para>
 389         Classic Jam has built-in facilities for point (1) above, but that is not
 390       enough. It is hard to implement the right semantics without built-in
 391       support. For example, we could try to check if there exists a target
 392       called "a.h" somewhere in the dependency graph, and add a dependency to
 393       it. The problem is that without a file search in the include path, the
 394       semantics may be incorrect. For example, one can have an action that
 395       generates some "dummy" header, for systems which do not have a native one.
 396       Naturally, we do not want to depend on that generated header on platforms
 397       where a native one is included.
 398       </para>
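
      <para>
        The lookup described above, including the "dummy" header case, can be
      sketched in Python as follows. The function name, its parameters and
      the sample paths are hypothetical, and directory contents are passed in
      as sets instead of being read from disk.
      </para>

<programlisting>
# Toy include lookup over an ordered search path.

def resolve_include(header, search_path, existing_files, generated_targets):
    """Return (kind, path) for the first directory that provides the header."""
    for directory in search_path:
        path = directory + "/" + header
        if path in existing_files:
            return ("file", path)
        if path in generated_targets:
            return ("generated", path)
    return None

existing = {"d1/config.h"}                 # native header present on this system
generated = {"d2/a.h", "d3/config.h"}      # d3/config.h is a generated "dummy"
search_path = ["d1", "d2", "d3"]

print(resolve_include("a.h", search_path, existing, generated))
print(resolve_include("config.h", search_path, existing, generated))
print(resolve_include("missing.h", search_path, existing, generated))
</programlisting>

      <para>
        Because the search honours directory order, the native "config.h" in
      "d1" shadows the generated one in "d3", so no dependency on the dummy
      header is created. A lookup by target name alone could not make that
      distinction.
      </para>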
 399
 400       <para>
 401         There are two design choices for built-in support. Suppose we have files
 402       a.cpp and b.cpp, and each one includes header.h, generated by some action.
 403       The dependency graph created by classic Jam would look like:
 404
 405 <programlisting>
 406 a.cpp -----&gt; &lt;scanner1&gt;header.h  [search path: d1, d2, d3]
 407
 408                   &lt;d2&gt;header.h  --------&gt; header.y
 409                   [generated in d2]
 410
 411 b.cpp -----&gt; &lt;scanner2&gt;header.h  [search path: d1, d2, d4]
 412 </programlisting>
 413       </para>
 414
 415       <para>
 416         In this case, Jam thinks the header.h targets are all unrelated. The
 417       correct dependency graph might be:
 418
 419 <programlisting>
 420 a.cpp ----
 421           \
 422            &gt;----&gt;  &lt;d2&gt;header.h  --------&gt; header.y
 423           /       [generated in d2]
 424 b.cpp ----
 425 </programlisting>
 426
 427     or
 428
 429 <programlisting>
 430 a.cpp -----&gt; &lt;scanner1&gt;header.h  [search path: d1, d2, d3]
 431                           |
 432                        (includes)
 433                           V
 434                   &lt;d2&gt;header.h  --------&gt; header.y
 435                   [generated in d2]
 436                           ^
 437                       (includes)
 438                           |
 439 b.cpp -----&gt; &lt;scanner2&gt;header.h [ search path: d1, d2, d4]
 440 </programlisting>
 441       </para>
 442
 443       <para>
 444         The first alternative was used for some time. The problem however is:
 445       what include paths should be used when scanning header.h? The second
 446       alternative was suggested by Matt Armstrong. It has a similar effect: Any
 447       target depending on &lt;scanner1&gt;header.h will also depend on
 448       &lt;d2&gt;header.h. This way though we now have two different targets with
 449       two different scanners, so those targets can be scanned independently. The
 450       first alternative's problem is avoided, so the second alternative is
 451       implemented now.
 452       </para>
 453
 454       <para>
 455         The second sub-requirement is that targets generated under the "bin"
 456       directory are handled as well. Boost.Build implements a semi-automatic
 457       approach. When compiling C++ files the process is:
 458       </para>
 459
 460       <orderedlist>
 461         <listitem><simpara>
 462           The main target to which the compiled file belongs is found.
 463         </simpara></listitem>
 464
 465         <listitem><simpara>
 466           All other main targets that the found one depends on are found. These
 467         include: main targets used as sources as well as those specified as
 468         "dependency" properties.
 469         </simpara></listitem>
 470
 471         <listitem><simpara>
 472           All directories where files belonging to those main targets will be
 473         generated are added to the include path.
 474         </simpara></listitem>
 475       </orderedlist>
 476
 477       <para>
 478         After this is done, dependencies are found by the approach explained
 479       previously.
 480       </para>
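
      <para>
        As a rough illustration of the three steps, the Python sketch below
      computes the extra include directories for a source file. The tables
      <code>OWNER</code> and <code>MAIN_TARGETS</code>, their layout and the
      directory names are invented for the example.
      </para>

<programlisting>
# Toy model: which "bin" directories get added to the include path.

OWNER = {"app.cpp": "app"}      # compiled file: main target it belongs to

MAIN_TARGETS = {
    "app": {"sources": ["util"], "dependency": ["parser"], "output": "app/bin"},
    "util": {"sources": [], "dependency": [], "output": "util/bin"},
    "parser": {"sources": [], "dependency": [], "output": "parser/bin"},
    "unrelated": {"sources": [], "dependency": [], "output": "unrelated/bin"},
}

def extra_include_dirs(source_file):
    # Step 1: find the main target the compiled file belongs to.
    main_target = MAIN_TARGETS[OWNER[source_file]]
    # Step 2: main targets used as sources plus "dependency" properties.
    needed = []
    for name in main_target["sources"] + main_target["dependency"]:
        if name not in needed:
            needed.append(name)
    # Step 3: the directories where those targets generate their files.
    return [MAIN_TARGETS[name]["output"] for name in needed]

print(extra_include_dirs("app.cpp"))
</programlisting>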
 481
 482       <para>
 483         Note that if a target uses generated headers from another main target,
 484       that main target should be explicitly specified using the dependency
 485       property. It would be better to lift this requirement, but it does not
 486       seem to be causing any problems in practice.
 487       </para>
 488
 489       <para>
 490         For target types other than C++, adding of include paths must be
 491       implemented anew.
 492       </para>
 493     </section>
 494
 495     <section id="bbv2.arch.depends.dependencies-from-generated-files">
 496       <title>Proper detection of dependencies from generated files</title>
 497
 498       <para>
 499         Suppose file "a.cpp" includes "a.h" and both are generated by some
 500       action. Note that classic Jam has two stages. In the first stage the
 501       dependency graph is built and actions to be run are determined. In the
 502       second stage the actions are executed. Initially, neither file exists, so
 503       the include is not found. As a result, Jam might attempt to compile
 504       a.cpp before creating a.h, causing the compilation to fail.
 505       </para>
 506
 507       <para>
 508         The solution in Boost.Jam is to perform additional dependency scans
 509       after targets are updated. This breaks the separation between build stages in
 510       Jam (which some people consider a good thing), but I am not
 511       aware of any better solution.
 512       </para>
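
      <para>
        The failure and the fix can be reproduced with a small simulation.
      This Python sketch is not the Boost.Jam algorithm; it is a toy updater
      in which the includes of a generated file become visible only after
      the file has been created, and the <code>rescan</code> switch toggles
      the additional scan. All names are hypothetical.
      </para>

<programlisting>
# Toy updater showing why generated files are rescanned after being updated.

INCLUDES = {"a.cpp": ["a.h"], "a.h": []}    # known only once the file exists
DEPENDS = {"a.o": ["a.cpp"], "a.cpp": [], "a.h": []}

def update(target, files, log, rescan):
    if target in files:
        return
    for dependency in DEPENDS[target]:
        update(dependency, files, log, rescan)
    if target.endswith(".o"):
        missing = [header
                   for source in DEPENDS[target]
                   for header in INCLUDES[source]
                   if header not in files]
        if missing:
            log.append("compile " + target + ": missing " + ", ".join(missing))
            return
        log.append("compile " + target)
    else:
        log.append("generate " + target)
    files.add(target)
    if rescan and target in INCLUDES:
        # The file exists now, so it can be scanned; newly found includes
        # are updated before control returns to the dependent target.
        for header in INCLUDES[target]:
            update(header, files, log, rescan)

def run(rescan):
    log = []
    update("a.o", set(), log, rescan)
    return log

print(run(rescan=False))
print(run(rescan=True))
</programlisting>

      <para>
        Without the rescan the compile step is attempted while "a.h" is still
      missing; with it, "a.h" is generated right after "a.cpp" and the
      compilation succeeds.
      </para>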
 513
 514       <para>
 515         In order to understand the rest of this section, you should first read some
 516       details about Jam's dependency scanning, available at <ulink url=
 517       "http://public.perforce.com:8080/@md=d&amp;cd=//public/jam/src/&amp;ra=s&amp;c=kVu@//2614?ac=10">
 518       this link</ulink>.
 519       </para>
 520
 521       <para>
 522         Whenever a target is updated, Boost.Jam rescans it for includes.
 523       Consider this graph, created before any actions are run.
 524 <programlisting>
 525 A -------&gt; C ----&gt; C.pro
 526      /
 527 B --/         C-includes   ---&gt; D
 528 </programlisting>
 529       </para>
 530
 531       <para>
 532         Both A and B have a dependency on C and C-includes (the latter dependency
 533       is not shown). Say during building we have tried to create A, then tried
 534       to create C and successfully created C.
 535       </para>
 536
 537       <para>
 538         In that case, the set of includes in C might well have changed. We do
 539       not bother to detect precisely which includes were added or removed.
 540       Instead we create another internal node C-includes-2. Then we determine
 541       what actions should be run to update the target. In fact this means that
 542       we perform the first stage logic when already in the execution stage.
 543       </para>
 544
 545       <para>
 546         After actions for C-includes-2 are determined, we add C-includes-2 to
 547       the list of A's dependencies, and stage 2 proceeds as usual. Unfortunately,
 548       we cannot do the same with target B: since it has not been visited yet, the
 549       C target does not know that B depends on it. So, we add a flag to C marking it as
 550       rescanned. When visiting the B target, the flag is noticed and
 551       C-includes-2 is added to the list of B's dependencies as well.
 552       </para>
 553
 554       <para>
 555         Note also that internal nodes are sometimes updated too. Consider this
 556       dependency graph:
 557 <programlisting>
 558 a.o ---&gt; a.cpp
 559             a.cpp-includes --&gt;  a.h (scanned)
 560                                    a.h-includes ------&gt; a.h (generated)
 561                                                                  |
 562                                                                  |
 563             a.pro &lt;-------------------------------------------+
 564 </programlisting>
 565       </para>
 566
 567       <para>
 568         Here, our handling of generated headers comes into play. Say that a.h
 569       exists but is out of date with respect to "a.pro", then "a.h (generated)"
 570       and "a.h-includes" will be marked for updating, but "a.h (scanned)" will
 571       not. We have to rescan "a.h" after it has been created, but since "a.h
 572       (generated)" has no associated scanner, it is only possible to rescan
 573       "a.h" after "a.h-includes" target has been updated.
 574       </para>
 575
 576       <para>
 577         The above considerations led to the decision to rescan a target whenever
 578       it is updated, no matter if it is internal or not.
 579       </para>
 580
 581     </section>
 582   </section>
 583
 584   <warning>
 585     <para>
 586       The remainder of this document is not intended to be read at all. This
 587     will be rearranged in the future.
 588     </para>
 589   </warning>
 590
 591   <section>
 592     <title>File targets</title>
 593
 594     <para>
 595       As described above, file targets correspond to files that Boost.Build
 596     manages. Users may be concerned about file targets in three ways: when
 597     declaring file target types, when declaring transformations between types
 598     and when determining where a file target is to be placed. File targets can
 599     also be connected to actions that determine how the target is to be created.
 600     Both file targets and actions are implemented in the
 601     <literal>virtual-target</literal> module.
 602     </para>
 603
 604     <section>
 605       <title>Types</title>
 606
 607       <para>
 608         A file target can be given a type, which determines what transformations
 609       can be applied to the file. The <literal>type.register</literal> rule
 610       declares new types. A file type can also be assigned a scanner, which is
 611       then used to find implicit dependencies. See "<link
 612       linkend="bbv2.arch.depends">dependency scanning</link>".
 613       </para>
 614     </section>
 615
 616     <section>
 617       <title>Target paths</title>
 618
 619       <para>
 620         To distinguish targets built with different properties, they are put in
 621       different directories. Rules for determining target paths are given below:
 622       </para>
 623
 624       <orderedlist>
 625         <listitem><simpara>
 626           All targets are placed under a directory corresponding to the project
 627         where they are defined.
 628         </simpara></listitem>
 629
 630         <listitem><simpara>
 631           Each non-free, non-incidental property causes an additional element to
 632         be added to the target path. That element has the form
 633         <literal>&lt;feature-name&gt;-&lt;feature-value&gt;</literal> for
 634         ordinary features and <literal>&lt;feature-value&gt;</literal> for
 635         implicit ones. [TODO: Add note about composite features].
 636         </simpara></listitem>
 637
 638         <listitem><simpara>
 639           If the set of free, non-incidental properties is different from the
 640         set of free, non-incidental properties for the project in which the main
 641         target that uses the target is defined, a part of the form
 642         <literal>main_target-&lt;name&gt;</literal> is added to the target path.
 643         <emphasis role="bold">Note:</emphasis> It would be nice to completely
 644         track free features also, but this appears to be complex and not
 645         extremely needed.
 646         </simpara></listitem>
 647       </orderedlist>
 648
 649       <para>
 650         For example, we might have these paths:
 651 <programlisting>
 652 debug/optimization-off
 653 debug/main-target-a
 654 </programlisting>
 655       </para>
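
      <para>
        The second rule is mechanical enough to sketch in Python. The feature
      table below is a hypothetical stand-in for the feature registry, and the
      function covers only the property-derived elements, not the project
      directory or the main-target part.
      </para>

<programlisting>
# Toy computation of the property part of a target path.

# Hypothetical feature table: feature name mapped to its attributes.
FEATURES = {
    "variant": {"implicit"},
    "optimization": set(),
    "define": {"free"},
    "warnings": {"incidental"},
}

def path_elements(properties):
    elements = []
    for name, value in properties:
        attributes = FEATURES[name]
        if "free" in attributes or "incidental" in attributes:
            continue                       # contributes nothing to the path
        if "implicit" in attributes:
            elements.append(value)         # implicit feature: value only
        else:
            elements.append(name + "-" + value)
    return "/".join(elements)

print(path_elements([
    ("variant", "debug"),
    ("optimization", "off"),
    ("define", "NDEBUG"),
    ("warnings", "on"),
]))
</programlisting>

      <para>
        For this input the sketch prints "debug/optimization-off", matching
      the first example path above.
      </para>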
 656     </section>
 657   </section>
 658
 659   </appendix>
 660
 661 <!--
 662      Local Variables:
 663      mode: xml
 664      sgml-indent-data: t
 665      sgml-parent-document: ("userman.xml" "chapter")
 666      sgml-set-face: t
 667      End:
 668 -->