[article Boost.AutoIndex
    [quickbook 1.5]
    [copyright 2008, 2011 John Maddock]
    [license
        Distributed under the Boost Software License, Version 1.0.
        (See accompanying file LICENSE_1_0.txt or copy at
        [@http://www.boost.org/LICENSE_1_0.txt])
    ]
    [authors [Maddock, John]]
    [/last-revision $Date: 2008-11-04 17:11:53 +0000 (Tue, 04 Nov 2008) $]
]

[def __quickbook  [@http://www.boost.org/doc/tools/quickbook/index.html Quickbook]]
[def __boostbook [@http://www.boost.org/doc/html/boostbook.html BoostBook]]
[def __boostbook_docs [@http://www.boost.org/doc/libs/1_41_0/doc/html/boostbook.html BoostBook documentation]]
[def __quickbook_syntax [@http://www.boost.org/doc/libs/1_41_0/doc/html/quickbook/ref.html Quickbook Syntax Compendium]]
[def __docbook [@http://www.docbook.org/ DocBook]]
[def __docbook_params [@http://docbook.sourceforge.net/release/xsl/current/doc/ Docbook xsl:param format options]]
[def __DocObjMod [@http://en.wikipedia.org/wiki/Document_Object_Model Document Object Model (DOM)]]

[def __doxygen [@http://www.doxygen.org/ Doxygen]]
[def __pdf [@http://www.adobe.com/products/acrobat/adobepdf.html PDF]]

[template deg[]'''&#xB0;'''] [/ degree sign ]


[section:overview Overview]

AutoIndex is a tool for taking the grunt work out of indexing a
Boostbook\/Docbook document
(perhaps generated by your Quickbook file mylibrary.qbk,
and perhaps also using Doxygen autodoc)
that describes C\/C++ code.

Traditionally, in order to index a Docbook document you would
have to manually add a large amount of `<indexterm>` markup:
in fact one `<indexterm>` for each occurrence of each term to be
indexed.

Instead AutoIndex will automatically scan one or more C\/C++ header files
and extract all the ['function], ['class], ['macro] and ['typedef]
names that are defined by those headers, and then insert the
`<indexterm>`s into the Docbook XML document for you.

AutoIndex can also scan using a list of index terms
specified in a script file, for example index.idx.
These manually provided terms can optionally be regular expressions,
and may allow the user to find references to terms
that may not occur in the C++ header files.  Of course providing a manual
list of search terms to index is a tedious task
(especially handling plurals and variants),
and requires enough knowledge of the library
to guess what users may be seeking to know,
but at least the real 'grunt work' of
finding the term and listing the page number is automated.

AutoIndex creates index entries as follows:

For each occurrence of each search term, it creates two index entries:

# The search term as the ['primary index key] and
 the ['title of the section it appears in] as a subterm.

# The section title as the main index entry and the search term as the subentry.

Thus the user has two chances to find what they're
looking for, based upon either the section name
or the ['function], ['class], ['macro] or ['typedef] name.

[note This behaviour can be changed so that only one index entry is created
 (using the search term as the key and
 not using the section name except as a sub-entry of the search term).]

So for example in Boost.Math the class name `students_t_distribution` has a primary
entry that lists all sections the class name appears in:

[$../students_t_eg_1.png]

Then those sections also have primary entries, which list all the search terms those
sections contain:

[$../students_t_eg_2.png]

Of course these automated index entries may not be quite
what you're looking for: often you'll get a few spurious entries, a few missing entries,
and a few entries where the section name used as an index entry is less than ideal.
So AutoIndex provides some powerful regular expression based rules that allow you
to add, remove, constrain, or rewrite entries.  Normally just a few lines in
AutoIndex's script file are enough to tailor the output to match the author's
expectations (and thus hopefully the index user's expectations too!).

AutoIndex also supports multiple indexes (as does Docbook), and since it knows
which search terms are ['function], ['class], ['macro] or ['typedef] names, it
can add the necessary attributes to the XML so that you can have separate
indexes for each of these different types.  These specialised indexes only contain
entries for the ['function], ['class], ['macro] or ['typedef] names: ['section
names] are never used as primary index terms here, unlike the main "include everything"
index.

Finally, while the Docbook XSL stylesheets create nice indexes complete with page
numbers for PDF output, the HTML indexes look poorer by comparison, as these use
section titles in place of page numbers.  Because AutoIndex already uses section titles
as index entries, this leads to a lot of repetition, so as an alternative AutoIndex
can be instructed to construct the index itself.  This is faster than using
the XSL stylesheets, and each index entry is then a hyperlink to the
appropriate section:

[$../students_t_eg_3.png]

With internal index generation there is also a helpful navigation bar
at the start of each Index:

[$../students_t_eg_4.png]

You can also choose what kind of XML container wraps an internally generated index:
this defaults to `<section>...</section>` but you can use either command line options
or Boost.Build Jamfile features to select an alternative wrapper.  For example ['appendix]
or ['chapter] would be good choices, whatever fits best into the flow of the
document.  You can even set the container wrapper to type ['index] provided you turn
off index generation by the XSL stylesheets, for example by setting the following
build requirements in the Jamfile:

[pre
<format>html:<auto-index-internal>on       # Use internally generated indexes.
<auto-index-type>index                     # Use <index>...</index> as the XML wrapper.
<format>html:<xsl:param>generate.index=0   # Don't let the XSL stylesheets generate indexes.
]

[endsect] [/section:overview Overview]

[section:tut Getting Started and Tutorial]

[section:build Step 1: Build the AutoIndex tool]

[note This step is strictly optional, but very desirable to speed up build times.]

cd into `tools/auto_index/build` and invoke bjam as:

   bjam release

Optionally pass the name of the compiler toolset you want to use to bjam as well:

   bjam release gcc

This will build the tool and place a copy in the current directory (which is to say `tools/auto_index/build`).

Now open up your `user-config.jam` file and at the end of the file add the line:

[pre
using auto-index : ['full-path-to-boost-tree]/tools/auto_index/build/auto-index.exe ;
]

[note
This declaration must go towards the end of `user-config.jam`, or in any case after the Boostbook initialisation.

Also note that Windows users must use forward slashes in the paths in `user-config.jam`.]

[endsect] [/section:build Step 1: Build the AutoIndex tool]

[section:configure Step 2: Configure Boost.Build jamfile to use AutoIndex]

Assuming you have a Jamfile for building your documentation that looks
something like:

[pre
boostbook standalone
    :
        mylibrary
    :
        # build requirements go here:
    ;
]

Then add the line:

[pre using auto-index ; ]

to the start of the Jamfile, and then add whatever auto-index options
you want to the ['build requirements section], for example:

[pre
boostbook standalone
    :
        mylibrary
    :
        # Build requirements go here:

        # <auto-index>on (or off) turns indexing on (or off):
        <auto-index>on

        # Turns on (or off) auto-index-verbose for diagnostic info.
        # This is highly recommended until you have got all the many details correct!
        <auto-index-verbose>on

        # Choose the indexing method (separately for html and PDF) - see manual.
        # Choose indexing method for PDFs:
        <format>pdf:<auto-index-internal>off

        # Choose indexing method for html:
        <format>html:<auto-index-internal>on

        # Set the name of the script file to use (index.idx is popular):
        <auto-index-script>index.idx
        # Commands in the script file should all use RELATIVE PATHS
        # otherwise the script will not be portable to other machines.
        # Relative paths are normally taken as relative to the location
        # of the script file, but we can add a prefix to all
        # those relative paths using the <auto-index-prefix> feature.
        # The path specified by <auto-index-prefix> may be either relative or
        # absolute, for example the following will get us up to the boost root
        # directory for most Boost libraries:
        <auto-index-prefix>..\/..\/..

        # Tell Quickbook that it should enable indexing.
        <quickbook-define>enable_index

    ;
] [/pre]

[section:options Available Indexing Options]

The available options are:

[variablelist
[[<auto-index>off/on][Turns indexing of the document on.  Defaults to
"off", so be sure to set this if you want AutoIndex invoked!]]
[[<auto-index-internal>off/on][Chooses whether AutoIndex creates the index
itself (feature on), or whether it simply inserts the necessary DocBook
markup so that the DocBook XSL stylesheets can create the index.  Defaults to "off".]]
[[<auto-index-script>filename][Specifies the name of the script to load.]]
[[<auto-index-no-duplicates>off/on][When ['on] AutoIndex will only index a term
once in any given section, otherwise (the default) multiple index entries per
term may be created if the term occurs more than once in the section.]]
[[<auto-index-section-names>off/on][When ['on] AutoIndex will create two
index entries for each term found - one uses the term itself as the primary
index key, the other uses the enclosing section name.  When off the index
entry that uses the section title is not created.  Defaults to "on".]]
[[<auto-index-verbose>off/on][Defaults to "off".  When turned on AutoIndex
prints progress information - useful for debugging purposes during setup.]]
[[<auto-index-prefix>filename][Optionally specifies a directory to apply
as a prefix to all relative file paths in the script file.

You may wish to do this to reduce typing of pathnames, and\/or where the
paths can't be located relative to the script file location,
typically if the headers are in the Boost trunk,
but the script file is in Boost sandbox.

For Boost standard library layout,
[^<auto-index-prefix>..\/..\/..] will get you back up to the 'root' of the Boost tree,
so [^!scan-path boost\/mylibrary\/] is where your headers will be, and [^libs\/mylibrary] for other files.
Without a prefix all relative paths are relative to the location of the script file.
]]

[[<auto-index-type>element-name][Specifies the name of the XML element in which to enclose an internally generated index:
  defaults to ['section], but could equally be ['appendix] or ['chapter] or some other block level element that has a formal title.
   The actual list of available options depends upon the Quickbook document type.  The following table gives the available options,
   assuming that the index is placed at the top level, and not in some sub-section or other container:]]
]

[table
[[Document Type][Permitted Index Types]]
[[book][appendix index article chapter reference part]]
[[article][section appendix index sect1]]
[[chapter][section index sect1]]
[[library][The same as Chapter (section index sect1)]]
[[part][appendix index article chapter reference]]
[[appendix][section index sect1]]
[[preface][section index sect1]]
[[qandadiv][N/A: an index would have to be placed within a subsection of the document.]]
[[qandaset][N/A: an index would have to be placed within a subsection of the document.]]
[[reference][N/A: an index would have to be placed within a subsection of the document.]]
[[set][N/A: an index would have to be placed within a subsection of the document.]]
]

In large part then the choice of `<auto-index-type>element-name` depends on the
formatting you want to be applied to the index:

[table
[[XML Container Used for the Index][Formatting Applied by the XSL Stylesheets]]
[[appendix][Starts a new page.]]
[[article][Starts a new page.]]
[[chapter][Starts a new page.]]
[[index][Starts a new page only if it's contained within an article or book.]]
[[part][Starts a new page.]]
[[reference][Starts a new page.]]
[[sect1][Starts a new page as long as it's not the first section (but is controlled by the XSL parameters chunk.section.depth and/or chunk.first.sections).]]
[[section][Starts a new page as long as it's not the first section or nested within another section (but is controlled by the XSL parameters chunk.section.depth and/or chunk.first.sections).]]
]

In almost all cases the default (section) is the correct choice - the exception is when the index is to be placed
directly inside a /book/ or /part/, in which case you should probably use the same XML container for the index as
you use for whatever subdivisions are in the /book/ or /part/.  In any event placing a /section/ within a /book/ or
/part/ will result in invalid XML.

Finally, if you are using Quickbook to generate the documentation, then you may wish to add:

[pre <include>$boost-root/tools/auto_index/include]

to your project's requirements (replacing $boost-root with the path to the root of the Boost tree), so that
the file auto_index_helpers.qbk can be included in your quickbook source with simply:

[pre \[include auto_index_helpers.qbk\]]

[endsect] [/section:options Available Indexing Options]

[section:optional Making AutoIndex optional]

It is considerate to make the [*use of auto-index optional] in Boost.Build,
to allow users who do not have AutoIndex installed to still be able to build your documentation.

This is also very convenient while you are refining your documentation,
to allow you to decide to build indexes, or not:
building indexes can take a long time, and if you are just correcting typos,
you won't want to wait while you keep rebuilding the index!

One method of setting up optional AutoIndex support is to place all
AutoIndex configuration in the body of a bjam if statement:

[pre
  if --enable-index in \[ modules.peek : ARGV \]
  {
     ECHO "Building the my_library docs with automatic index generation enabled." ;

     using auto-index ;
     project : requirements
          <auto-index>on
          <auto-index-script>index.idx

          # ... other AutoIndex options here ...

          # And tell Quickbook that it should enable indexing.
          <quickbook-define>enable_index
     ;
  }
  else
  {
     ECHO "Building the my_library docs with automatic index generation disabled. To get an Index, try building with --enable-index." ;
  }
] [/pre]

You will also need to add a conditional statement at the end of your Quickbook file,
so that the index(es) is/are only added after the last section if indexing is enabled.

[pre
\[\? '''enable_index'''
\'\'\'
  <index/>
\'\'\'
\]
] [/pre]


To use this jamfile, you need to cd to your docs folder, for example:

 cd \boost-sandbox\guild\mylibrary\libs\mylibrary\doc

and then run `bjam` to build the docs without index, for example:

  bjam -a html > mylibrary_html.log

or with index(es):

  bjam -a html --enable-index > mylibrary_html_index.log

[endsect] [/section:optional Making AutoIndex optional]

[tip Always send the output to a log file.
It will contain a lot of output, but is invaluable for checking that all has gone right,
or else diagnosing what has gone wrong.
]  [/tip]

[tip A return code of 0 is not a reliable indication
that you have got what you really want -
inspecting the log file is the only certain way.
] [/tip]

[tip If you upgrade compiler version, for example MSVC from 9 to 10,
then you may need to rebuild AutoIndex
to avoid what Microsoft calls a 'side-by-side' error.
Also make sure that the AutoIndex executable you are using is the new one.
] [/tip]

[endsect] [/section:configure Step 2: Configure Boost.Build to use AutoIndex]

[section:add_indexes Step 3: Add indexes to your documentation]

To add a single "include everything" index to a BoostBook\/Docbook document
(perhaps generated using Quickbook, and perhaps also using a Doxygen reference section),
add `<index/>` at the location where you want the index to appear.
The index will be rendered as a separate section called "Index"
when the documentation is built.

To add multiple indexes, give each one a title and set its
`type` attribute to specify which terms will be included.  For example,
to place the ['function], ['class], ['macro] or ['typedef] names
indexed by ['AutoIndex] in separate indexes along with a main
"include everything" index as well, one could add:

[pre
<index type\="class_name">
<title>Class Index<\/title>
<\/index>

<index type\="typedef_name">
<title>Typedef Index<\/title>
<\/index>

<index type\="function_name">
<title>Function Index<\/title>
<\/index>

<index type\="macro_name">
<title>Macro Index<\/title>
<\/index>

<index\/>
]

[note Multiple indexes like this only work correctly if you tell the XSL stylesheets
to honor the "type" attribute on each index, as by default [*they do not do this].
You can turn the feature on by adding `<xsl:param>index.on.type=1` to your project's
requirements in the Jamfile.]

In Quickbook, you add the same markup but enclose it between two triple-tick \'\'\' escapes,
thus:

[pre   \'\'\'<index\/>\'\'\' ]

Or more easily via the helper file auto_index_helpers.qbk, so that given:

[pre \[include auto_index_helpers.qbk\]]

one can simply write:

[pre
\[named_index class_name Class Index\]
\[named_index function_name Function Index\]
\[named_index typedef_name Typedef Index\]
\[named_index macro_name Macro Index\]
\[index\]
]

[note AutoIndex knows nothing of the XML `xinclude` element, so if
you're writing raw Docbook XML then you may want to run this through an
XSL processor to flatten everything to one XML file before passing to
AutoIndex.  If you're using Boostbook or quickbook though, this all
happens for you anyway, and AutoIndex will index the whole document
including any sections included with `xinclude`.]

If you have turned on AutoIndex's internal index generation with

[pre
<auto-index-internal>on
]
(usually recommended for HTML output, but ['not] the default)
then you can also decide what kind of XML wrapper the generated index is placed in.
By default this is a `<section>...</section>` XML block (this replaces the original
`<index>...</index>` block).  However, depending upon the structure of the document
and whether or not you want the index on a separate page - or else on the front page after
the TOC - you may want to place the index inside a different type of XML block.  For example
if your document uses `<chapter>` top level content rather than `<section>`s then
it may be preferable to place the index in a `<chapter>` or `<appendix>` block.
You can also place the index inside an `<index>` block if you prefer, in which case the index
does not appear on a page of its own, but after the TOC in the HTML output.

You control the type of XML block used by setting the =<auto-index-type>element-name=
feature in the Jamfile, or via the `index-type=element-name` command line option to
AutoIndex itself.  For example, to place the index in an appendix, your Jamfile might
look like:

[pre
using quickbook ;
using auto-index ;

xml mylibrary : mylibrary.qbk ;
boostbook standalone
    :
        mylibrary
    :
        # auto-indexing is on:
        <auto-index>on

        # PDFs rely on the XSL stylesheets to generate the index:
        <format>pdf:<auto-index-internal>off

        # HTML output uses auto-index to generate the index:
        <format>html:<auto-index-internal>on

        # Name of script file to use:
        <auto-index-script>index.idx

        # Set the XML wrapper for HTML Indexes to "appendix":
        <format>html:<auto-index-type>appendix

        # Turn on multiple index support:
        <xsl:param>index.on.type=1
    ;
]


[endsect] [/section:add_indexes Step 3: Add indexes to your documentation]

[section:script Step 4: Create the .idx script file - to control what terms to index]

AutoIndex works by reading a script file that tells it what terms to index.

If your document contains largely text, and only a small amount of simple C++,
and/or if you are using Doxygen to provide a C++ Reference section
(that lists the C++ elements),
and/or if you are relying on the indexing provided from a Standalone Doxygen Index,
you may decide that an index of C++ names is not needed
and that you only want the text part indexed.

But if you want C++ classes, functions, typedefs and/or macros AutoIndexed,
the script file can also tell AutoIndex which C++ files to scan.

At its simplest, it will scan one or more headers for terms that
should be indexed in the documentation.  So for example to scan
"myheader.hpp" and "mydetailsheader.hpp" the script file would just contain:

   !scan myheader.hpp
   !scan mydetailsheader.hpp

Or, more likely in practice,
we can recursively scan through directories looking for all
the files to scan whose [*name matches a particular regular expression]:

[pre !scan-path "boost\/mylibrary" ".*\.hpp" true ]

Each argument is whitespace separated and can be optionally
enclosed in "double quotes" (recommended).

The final ['true] argument indicates
that subdirectories in `boost/mylibrary` should be searched
recursively in addition to that directory.

[caution The second ['file-name-regex] argument is a regular expression and not a filename GLOB!]

[caution The scan-path is modified by any setting of <auto-index-prefix>.
The examples here assume that this is [^<auto-index-prefix>..\/..\/..]
so that `boost/mylibrary` will be your header files,
`libs/mylibrary/doc` will contain your documentation files and
`libs/mylibrary/example` will contain your examples.
]

You could also scan any example (.cpp) files,
typically in folder `libs/mylibrary/example`.

[pre
# All example source files, assuming no sub-folders.
!scan-path "libs\/mylibrary\/example" ".*\.cpp"
] [/pre]

Often the ['scan] or ['scan-path] rules will bring in too many terms
to search for, so we need to be able to exclude terms as well:

   !exclude type

This excludes the term "type" from being indexed.

We can also add terms manually:

   foobar

will index occurrences of "foobar" and:

   foobar \<\w*(foo|bar)\w*\>

will index any whole word containing either "foo" or "bar" within it.
This is useful when you want to index a lot of similar or related
words under one entry, for example:

   reflex

will only index occurrences of "reflex" as a whole word, but:

   reflex \<reflex\w*\>

will index occurrences of "reflex", "reflexing" and
"reflexed" all under the same entry ['reflex].
You will very often need to use this to deal with plurals and other variants.

This inclusion rule can also restrict the term to
certain sections, and add an index category that
the term should belong to (so it only appears in certain
indexes).

Finally the script can add rewrite rules, that rename section names
that are automatically used as index entries.  For example we might
want to remove leading "A" or "The" prefixes from section titles
when AutoIndex uses them as an index entry:

   !rewrite-name "(?i)(?:A|The)\s+(.*)" "\1"
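
Putting these rules together, a small [^index.idx] might look like the sketch below.
It only combines the directives already shown in this step; the paths assume
[^<auto-index-prefix>..\/..\/..] and a hypothetical library called [^mylibrary],
and the excluded and added terms are purely illustrative.

   # Hypothetical index.idx for "mylibrary" (a sketch, adjust the names to your library).
   # Scan all headers, recursing into sub-directories:
   !scan-path "boost/mylibrary" ".*\.hpp" true
   # Scan the example sources (assumes no sub-folders):
   !scan-path "libs/mylibrary/example" ".*\.cpp"
   # Drop a term that is too generic to be useful:
   !exclude type
   # Index "reflex", "reflexing", "reflexed" and so on under one entry:
   reflex \<reflex\w*\>
   # Strip a leading "A" or "The" from section titles used as index entries:
   !rewrite-name "(?i)(?:A|The)\s+(.*)" "\1"

Note that every comment sits on a line of its own.  Building with
[^<auto-index-verbose>on] is a reasonable way to check that each rule has the intended effect.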

[endsect] [/section:script Step 4: Create the script file - to control what terms to index]

[section:entries Step 5: Add Manual Index Entries to Docbook XML - Optional]

If you add manual `<indexterm>` markup to your Docbook XML then these will be
passed through unchanged.  Please note however, that if you are using
AutoIndex's internal index generation then it only recognises
`<primary>`, `<secondary>` and `<tertiary>` elements within the `<indexterm>`.
`<see>` and `<seealso>` elements are not currently recognised
and AutoIndex will emit a warning if these are used.

Likewise none of the attributes which can be applied to these elements are used when
AutoIndex generates the index itself, with the exception of the `type` attribute.

For Quickbook users, there are some templates in auto_index_helpers.qbk that assist
in adding manual entries without having to escape to Docbook.

[endsect]  [/section:entries Step 5: Add Manual Index Entries to Docbook XML - Optional]

[section:pis Step 6: Using XML processing instructions to control what gets indexed.]

Sometimes you need to exclude certain sections of text from indexing.
You can achieve this with the following XML processing instructions:

[table
[[Instruction][Effect]]
[[`<?BoostAutoIndex IgnoreSection?>`]
   [Causes the whole of the current section to be excluded from indexing.
    By "section" we mean either a true "section" or any sibling XML element:
    "dedication", "toc", "lot", "glossary", "bibliography", "preface", "chapter",
      "reference", "part", "article", "appendix", "index", "setindex", "colophon",
      "sect1", "refentry", "simplesect", "section" or "partintro".]]
[[`<?BoostAutoIndex IgnoreBlock?>`]
   [Causes the whole of the current text block to be excluded from indexing.
    A text block may be any of the section/chapter elements listed above, or a
    paragraph, code listing, table etc.  The complete list is:
    "calloutlist", "glosslist", "bibliolist", "itemizedlist", "orderedlist",
      "segmentedlist", "simplelist", "variablelist", "caution", "important", "note",
      "tip", "warning", "literallayout", "programlisting", "programlistingco",
      "screen", "screenco", "screenshot", "synopsis", "cmdsynopsis", "funcsynopsis",
      "classsynopsis", "fieldsynopsis", "constructorsynopsis",
      "destructorsynopsis", "methodsynopsis", "formalpara", "para", "simpara",
      "address", "blockquote", "graphic", "graphicco", "mediaobject",
      "mediaobjectco", "informalequation", "informalexample", "informalfigure",
      "informaltable", "equation", "example", "figure", "table", "msgset", "procedure",
      "sidebar", "qandaset", "task", "productionset", "constraintdef", "anchor",
      "bridgehead", "remark", "highlights", "abstract", "authorblurb" or "epigraph".]]
]

For Quickbook users the file auto_index_helpers.qbk contains a helper template
that assists in inserting these processing instructions, for example:

[pre \[AutoIndex IgnoreSection\]]

will cause that section not to be indexed.

[endsect] [/section:pis Step 6: Using XML processing instructions to control what gets indexed.]

[section:build_docs Step 7: Build the Docs]

Using Boost.Build you build the docs with either:

   bjam release > mylibrary_html.log

to build the html docs, or:

   bjam pdf release > mylibrary_pdf.log

to build the pdf.

During the build process you should see AutoIndex emit a message in the log file
such as:

[pre Indexing 990 terms... ]

If you don't see that, or if it's indexing 0 terms, then something is wrong!

Likewise when index generation is complete, AutoIndex will emit another message:

[pre 38 Index entries were created.]

Again, if you see that 0 entries were created then something is wrong!

Examine the log file, and if the cause is not obvious,
make sure that you have [^<auto-index-verbose>on] and that
any needed
[^!debug regular-expression] directives are in your script file.

[endsect] [/section:build_docs Step 7: Build the Docs]

[section:refine Step 8: Iterate - to refine your index]

Creating a good index is an iterative process.  Often the first step is
just to add a header scanning rule to the script file and then generate
the documentation and see:

* What's missing.
* What's been included that shouldn't be.
* What's been included under a poor name.

Further rules can then be added to the script to handle these cases
and the next iteration examined, and so on.

[tip If you don't understand why a particular term is (or is not) present in the index,
try adding a ['!debug regular-expression]
directive to the [link boost_autoindex.script_ref script file].
] [/tip]
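
As a sketch, to investigate how the term "reflex" and its variants are being handled,
a line such as the following could be added to the script.  The regular expression is
only an illustration and is assumed to be matched against the terms of interest; what
AutoIndex reports for it is not described here, so treat the result as something to
look for in the log file rather than a fixed format:

   !debug reflex\w*

Combine this with [^<auto-index-verbose>on] and send the build output to a log file, as recommended earlier.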

[heading Restricting which Sections are indexed for a particular term]

You can restrict which sections are indexed for a particular term.
Assuming that the docbook document has the usual hierarchical names for section IDs
(as Quickbook generates, for example),
you can easily place a constraint on which sections are examined for a particular term.

For example, if you want to index occurrences of Lord Kelvin's name,
but only in the introduction section, you might then add:

  Kelvin "" ".*introduction.*"

to the script file,
assuming that the section ID of the intro is "some_library_or_chapter_name.introduction".

This would avoid an index entry every time 'Kelvin' is found,
something the user is unlikely to find helpful.
 719
 720 [endsect] [/section:refine Step 8: Iterate - to refine your index]
 721
 722 [endsect] [/section:tut Getting Started and Tutorial]
 723
 724
 725 [section:script_ref Script File (.idx) Reference]
 726
 727 The following elements can occur in a script:
 728
 729 [h4 Comments and blank lines]
 730
Blank lines, and lines consisting only of whitespace, are ignored, as are lines that [*start with a #].
 732
 733 [note You can't append \# comments onto the end of a line\!]
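
A minimal sketch of how comments sit in a script; the term names are hypothetical:

``
# Terms for the (hypothetical) mylib documentation.
# Each comment must occupy a line of its own.
myclass
myfunction
``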
 734
 735 [h4 Inclusion of Index terms]
 736
 737    term [regular-expression1 [regular-expression2 [category]]]
 738
 739 [variablelist
 740 [[term][
 741 ['Term to index.]
 742
 743 The index term will form a primary entry in the Index
 744 with the section title(s) containing the term as secondary entries, and
 745 also will be used as a secondary entry beneath each of the section
 746 titles that the index term occurs in.]
 747 ] [/term]
 748
 749 [[regular-expression1][
 750 ['Index term Searcher.]
 751
 752 An optional regular expression: each occurrence
 753 of the regular expression in the text of the document will result
 754 in one index term being emitted.
 755
If the regular expression is omitted (default) or is "", then the ['index term] itself
will be used as the search text, and only occurrences of whole words matching
['index term] will be indexed.
 759
 760 For example:
 761
 762 ``foobar``
 763
 764 will index occurrences of "foobar" in any section, but
 765
 766 ``foobar \<\w*(foo|bar)\w*\>``
 767
 768 will index any whole word containing either "foo" or "bar" within it.
 769 This is useful when you want to index a lot of similar or related words under one entry.
 770
 771 ``reflex``
 772
 773 will only index occurrences of "reflex" as a whole word, but:
 774
 775 ``reflex \<reflex\w*\>``
 776
 777 will index occurrences of "reflex", "reflexes", "reflexing" and "reflexed" ...
 778 all under the same entry reflex.
 779
 780 You will very often need to use this to deal with plurals and other variants.]
 781 ] [/regular-expression1]
 782
 783 [[regular-expression2]
 784 [['Section(s) Selector.]
 785
 786 A constraint that specifies which sections are
 787 indexed for ['term]: only if the ID of the section matches
 788 ['regular-expression2] exactly will that section be indexed
 789 for occurrences of ['term].
 790
 791 For example, to limit indexing to just [*one specific section] (but not sub-sections below):
 792
 793 ``myclass "" "mylib\.examples"``
 794
 795
 796 For example, to limit indexing to specific sections, [*and sub-sections below]:
 797
 798 ``myclass "" "mylib\.examples.*"``
 799
 800 will index occurrences of "myclass" as a whole word,
 801 but only in sections whose section ID [*begins] "mylib.examples", while
 802
 803 ``myclass "\<myclass\w*\>" "mylib\.examples.*"``
 804
will also index any whole word beginning with "myclass", such as "myclasses",
 806
 807 and:
 808
 809 ``myclass "" "(?!mylib\.introduction).*"``
 810
 811 will index occurrences of "myclass" in any section,
 812 except those whose section IDs begin "mylib.introduction".
 813
 814 Finally, two (or more) sections can be excluded by OR'ing them together:
 815
 816 ``myclass "" "(?!mylib\.introduction|mylib\.reference).*"``
 817
which excludes searching for this term in sections whose IDs start with either "mylib.introduction" or "mylib.reference".
 819
 820 If this third section selection field is omitted (the default)
 821 or is "", then [*all sections] are indexed for this term.
 822 ]
 823 ] [/regular-expression2]
 824
 825 [[category][
 826 ['Index Category Constraint.]
 827
Optionally a category to place occurrences of ['index term] in.
If you have multiple indexes then this is the name
assigned to the index's "type" attribute.
 831
 832 For example:
 833
 834   myclass "" "" class_name
 835
will index occurrences of ['myclass] and place them in the class index (the index whose "type" attribute is class_name), if there is one.
 837
 838 ]] [/category]
 839
 840 ]  [/variablelist]
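
As a sketch that combines all four fields (the names "myclass" and "mylib" are hypothetical, as in the examples above):

``
# Index "myclass" and any whole word beginning with it, but only in sections
# whose ID begins "mylib.examples", and file the entries under class_name.
myclass "\<myclass\w*\>" "mylib\.examples.*" class_name
``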
 841
 842 You can have an index term appear more than once in the script file:
 843
 844 * If they have different /category/ names then they are treated quite separately.
* Otherwise they are combined, so that the logical OR of the regular expressions provided is taken.
 846
Thus:

   myterm search_expression1 constraint_expression1 foo
   myterm search_expression2 constraint_expression2 bar

will be treated as different terms each with their own entries, while:

   myterm search_expression1 constraint_expression1 mycategory
   myterm search_expression2 constraint_expression2 mycategory

will be combined into a single term equivalent to:

   myterm (?:search_expression1|search_expression2) (?:constraint_expression1|constraint_expression2) mycategory
 860
 861 [h4 Source File Scanning]
 862
 863    !scan source-file-name
 864
Scans the C\/C++ source file ['source-file-name] for definitions of
['function]s, ['class]es, ['macro]s or ['typedef]s and makes each of
 867 these a term to be indexed.  Terms found are assigned to the index category
 868 "function_name", "class_name", "macro_name" or "typedef_name" depending
 869 on how they were seen in the source file.  These may then be included
 870 in a specialised index whose "type" attribute has the same category name.
 871
 872 [important
 873 When actually indexing a document, the scanner will not index just any old occurrence of the
 874 terms found in the source files.  Instead it searches for class definitions or function or
 875 typedef declarations.  This reduces the number of spurious matches placed in the index, but
 876 may also miss some legitimate terms:
 877 refer to the /define-scanner/ command for information on how to change this.
 878 ]
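
For example, a script might contain the following line. This is only a sketch: the header path is hypothetical.

``
# Hypothetical header: every function, class, macro and typedef it defines
# becomes a term to index.
!scan boost/mylib/mylib.hpp
``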
 879
 880 [h4 Directory and Source File Scanning]
 881
 882    !scan-path directory-name file-name-regex [recurse]
 883
 884 [variablelist
 885 [[directory-name][The directory to scan: this should be a path relative
 886 to the script file (or to the path specified with the prefix=path option on the command line)
 887 and should use all forward slashes in its file name.]]
 888
 889 [[file-name-regex][A regular expression: any file in the directory whose name
 890 matches the regular expression will be scanned for terms to index.]]
 891
 892 [[recurse][An optional boolean value - either "true" or "false" - that
 893 indicates whether to recurse into subdirectories.  This defaults to "false".]]
 894 ]
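
For example, assuming a hypothetical library whose headers live in boost/mylib relative to the script file,
the following sketch scans every .hpp file in that directory and its subdirectories:

``
# Hypothetical directory; the final "true" turns on recursion into subdirectories.
!scan-path "boost/mylib" ".*\.hpp" true
``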
 895
 896 [h4 Excluding Terms]
 897
 898    !exclude term-list
 899
Excludes all the terms in the whitespace-separated ['term-list] from being indexed.
This should be placed /after/ any ['!scan] or ['!scan-path] rules which may
result in the terms becoming included.  In other words, this removes terms from
the scanner's internal list of things to index.
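
A short sketch of the ordering, using a hypothetical directory and hypothetical term names:

``
# Scan first, so that the terms are added to the list of things to index ...
!scan-path "boost/mylib" ".*\.hpp" true
# ... then remove the ones that are too generic to make useful index entries.
!exclude type value_type
``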
 904
 905 [h4 Rewriting Section Names]
 906
 907 [pre !rewrite-id regular-expression new-name]
 908
 909 [variablelist
[[regular-expression][A regular expression: all section IDs that match
the expression exactly will have index entries ['new-name] instead of
their title(s).]]
 913
 914 [[new-name][The name that the section will appear under in the index.]]
 915 ]
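
For example, a sketch using a hypothetical section ID and replacement name:

``
# The section whose ID is exactly "mylib.ref.detail_synopsis" gets index
# entries named "Synopsis" instead of entries named after its title.
!rewrite-id "mylib\.ref\.detail_synopsis" "Synopsis"
``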
 916
 917    !rewrite-name regular-expression format-text
 918
 919 [variablelist
 920 [[regular-expression][A regular expression: all sections whose titles
 921 match the regular expression exactly, will have index entries composed
 922 of the regular expression match combined with the regex format string
 923 ['format-text].]]
 924 [[format-text][The Perl-style format string used to reformat the title.]]
 925 ]
 926
 927 For example:
 928
 929 [pre
 930 !rewrite-name "(?:A|An|The)\s+(.*)" "\1"
 931 ]
 932
 933 Will remove any leading "A", "An" or "The" from all index entries - thus preventing lots of
 934 entries under "The" etc!
 935
 936 [h4 Defining or Changing the File Scanners]
 937
 938    !define-scanner type file-search-expression xml-regex-formatter term-formatter id-filter filename-filter
 939
When a source file is scanned using the =!scan= or =!scan-path= rules, the file is searched using
a series of regular expressions to look for classes, functions, macros or typedefs that should be indexed.
A set of default regular expressions is provided for this (see below), but sometimes you may want to replace
the defaults, or add new scanners.  The arguments to this rule are:
 944
 945 [variablelist
[[type][The ['type] to which items found using this rule will be assigned. Index terms created from the
source file and then found in the XML will have the type attribute set to this value, and may then appear in a
specialized index with the same type attribute.]]
[[file-search-expression][A regular expression that is used to scan the source file for index terms. The result of
a match against this expression will be transformed by the next two arguments.]]
 951 [[xml-regex-formatter][A regular expression format string that extracts the salient information from whatever
 952 matched the ['file-search-expression] in the source file, and creates ['a new regular expression] that will
 953 be used to search the document being indexed for occurrences of this index term.]]
 954 [[term-formatter][A regular expression format string that extracts the salient information from whatever
 955 matched the ['file-search-expression] in the source file, and creates the index term that will appear in
 956 the index.]]
[[id-filter][Optional.  A regular expression that restricts the section IDs that are searched in the document being indexed:
only sections whose ID attribute matches this expression exactly will be considered for indexing terms found by this scanner.]]
 959 [[filename-filter][Optional.  A regular expression that restricts which files are scanned by this scanner: only files whose file name
 960 matches this expression exactly will be scanned for index terms to use.  Note that the filename matched against this may
 961 well be an absolute path, and contain either forward or backward slash path separators.]]
 962 ]
 963
 964 If, when the first file is scanned, there are no scanners whose ['type] is "class_name", "typedef_name", "macro_name" or
 965 "function_name", then the defaults are installed.  These are equivalent to:
 966
 967    !define-scanner class_name "^[[:space:]]*(template[[:space:]]*<[^;:{]+>[[:space:]]*)?(class|struct)[[:space:]]*(\<\w+\>([[:blank:]]*\([^)]*\))?[[:space:]]*)*(\<\w*\>)[[:space:]]*(<[^;:{]+>)?[[:space:]]*(\{|:[^;\{()]*\{)" "(?:class|struct)[^;{]+\<\5\>[^;{]+\{" \5
 968    !define-scanner typedef_name "typedef[^;{}#]+?(\w+)\s*;"  "typedef[^;]+\<\1\>\s*;" "\1"
 969    !define-scanner "macro_name" "^\s*#\s*define\s+(\w+)" "\<\1\>" "\1"
 970    !define-scanner "function_name" "\w++(?:\s*+<[^>]++>)?[\s&*]+?(\w+)\s*(?:BOOST_[[:upper:]_]+\s*)?\([^;{}]*\)\s*[;{]" "\\<\\w+\\>(?:\\s+<[^>]*>)*[\\s&*]+\\<\1\\>\\s*\\([^;{]*\\)" "\1"
 971
 972 Note that these defaults are not installed if you have provided your own versions with these ['type] names. In this case if
 973 you want the default scanners to be in effect as well as your own, you should include the above in your script file.
 974 It is also perfectly allowable to have multiple scanners with the same ['type], but with the other fields differing.
 975
Finally, note that the default scanners are quite strict
in what they will find. For example, the class
scanner will only create index entries for classes that have class definitions of the form:
 979
 980    class my_class : public base_classes
 981    {
 982       // etc
 983
in the documentation. Simple mentions of the class name will ['not] get indexed,
only the class synopsis if there is one.
If this isn't how you want things, then include the ['class_name] scanner definition
above in your script file, and change
the ['xml-regex-formatter] field to something more permissive, for example:
 989
 990    !define-scanner class_name "^[[:space:]]*(template[[:space:]]*<[^;:{]+>[[:space:]]*)?(class|struct)[[:space:]]*(\<\w+\>([[:blank:]]*\([^)]*\))?[[:space:]]*)*(\<\w*\>)[[:space:]]*(<[^;:{]+>)?[[:space:]]*(\{|:[^;\{()]*\{)" "\<\5\>" \5
 991
This will index ['any] occurrence in the documentation of the class names that the scanner finds in the source files.
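
As a further sketch, the optional ['id-filter] and ['filename-filter] fields can be used to narrow a scanner.
The type name config_macro, the section ID and the file name below are all hypothetical;
the first three expressions are copied from the default macro_name scanner above:

``
# Index macros defined in files whose full name ends in "config.hpp", and only
# where they are mentioned in sections whose ID begins "mylib.config".
!define-scanner config_macro "^\s*#\s*define\s+(\w+)" "\<\1\>" "\1" "mylib\.config.*" ".*config\.hpp"
``

Terms found this way may then appear in a specialized index whose type attribute is also config_macro.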
 993
 994 [h4 Debugging scanning]
 995
 996 If you see a term in the index, and you don't understand why it's there, add a ['debug] directive:
 997
 998 [pre
 999 !debug regular-expression
1000 ]
1001
1002 Now, whenever ['regular-expression] matches either the found index term,
1003 or the section title it appears in, or the ['type] field of a scanner, then
1004 some diagnostic information will be printed that will look something like:
1005
1006 [pre
1007 Debug term found, in block with ID: spirit.qi.reference.parser_concepts.parser
1008 Current section title is: Notation
1009 The main index entry will be : Notation
1010 The indexed term is: parser
1011 The search regex is: \[P\|p\]arser
1012 The section constraint is: .*qi.reference.parser_concepts.*
1013 The index type for this entry is: qi_index
1014 ]
1015
1016 This can produce a lot of output in your log file,
1017 but until you are satisfied with your file selection and scanning process,
1018 it is worth switching it on.
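
For instance, assuming you want to trace the hypothetical term myclass:

``
# Print diagnostics whenever "myclass" matches an index term, the title of
# the section it appears in, or the type field of a scanner.
!debug myclass
``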
1019
1020 [endsect] [/section:script_ref Script File Reference]
1021
1022 [section:workflow  Understanding The AutoIndex Workflow]
1023
# Load the script file (usually index.idx)
  and process it one line at a time,
  producing one or more index terms per (non-comment) line.
1027
1028 # Reading all lines builds a list of ['terms to index].
1029   Some of those may be terms defined (by you) directly in the script file,
1030   others may be terms found by scanning C++ header and source files
1031   that were specified by the ['!scan-path] directive.
1032
# Once the list of ['terms to index] is complete,
  AutoIndex loads the Docbook XML file.
1035   (If this comes from Quickbook\/Doxygen\/Boostbook\/Docbook then this is
1036   the complete documentation after conversion to Docbook format).
1037
1038 # AutoIndex builds an internal __DocObjMod of the Docbook XML.
1039   This internal representation then gets scanned for occurrences of the ['terms to index].
1040   This scanning works at the XML paragraph level
1041   (or equivalent sibling such as a table or code block)
1042   - so all the XML encoding within a paragraph gets flattened to plain text.[br]
1043   This flattening means the regular expressions used to search for ['terms to index]
1044   can find anything that is completely contained within a paragraph
1045   (or code block etc).
1046
# For each term found, an ['indexterm] Docbook element is inserted
  into the __DocObjMod (provided internal index generation is off).

# AutoIndex's internal index representation is also updated.
1051
1052 # Once the whole XML document has been indexed,
1053   then, if AutoIndex has been instructed to generate the index itself,
1054   it creates the necessary XML and inserts this into the __DocObjMod.
1055
1056 # Finally the whole __DocObjMod is written out as a new Docbook XML file,
1057   and normal processing of this continues via the XSL stylesheets (with xsltproc)
1058   to actually build the final human-readable docs.
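
The first two steps can be pictured with a small script.
This is only a sketch: the directory name is hypothetical, and the two explicit terms reuse examples from earlier in this document.

``
# Terms found by scanning headers.
!scan-path "boost/mylib" ".*\.hpp" true
# Terms defined directly in the script file.
reflex "\<reflex\w*\>"
Kelvin "" ".*introduction.*"
``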
1059
1060 [endsect] [/section:workflow  AutoIndex Workflow]
1061
1062
1063 [section:xml XML Handling]
1064
1065 AutoIndex is rather simplistic in its handling of XML:
1066
1067 * When indexing a document, all block content at the paragraph level gets collapsed into a single
1068 string for matching against the regular expressions representing each index term.  In other words,
1069 for the most part, you can assume that you're indexing plain text when writing regular expressions.
1070 * Named XML entities for &, ", ', < or > are converted to their corresponding characters before indexing
1071 a section of text.  However, decimal or hex escape sequences are not currently converted.
1072 * Index terms are assumed to be plain text (whether they originate from the script file
1073 or from scanning source files) and the characters &, ", < and > will be escaped to
1074 &amp; &quot; &lt; and &gt; respectively.
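
For example, because the named entities are converted before matching, a search expression can use the literal characters.
In this sketch the term name is hypothetical:

``
# In the Docbook XML the text reads "operator&lt;&lt;", but the regular
# expression is matched against "operator<<".
shift_operator "operator\s*<<"
``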
1075
1076 [endsect] [/section:xml XML Handling]
1077
1078 [section:qbk Quickbook Support]
1079
The file auto_index_helpers.qbk in ['boost-path]/tools/auto_index/include contains various Quickbook
templates to assist with AutoIndex support.  You would normally add the above path to your include
search path via an `<include>path` statement in your Jamfile, and then make the templates available
to your Quickbook source by adding the statement:

[pre \[include auto_index_helpers.qbk\]]

at the start of your Quickbook file.
1088
1089 The available templates are then:
1090
1091 [table
1092 [[Template][Description]]
1093 [[`[index]`][Creates a main index, with no "type" category set, which will be titled simply "Index".]]
1094 [[`[named_index type title]`][Creates an index with the type attribute set to "type" and the title will be "title".[br]
1095          For example to create an index containing only class names one would typically add `[named_index class_name Class Index]`
1096          to your Quickbook source.]]
[[`[AutoIndex Arg]`][Creates a Docbook processing instruction that will be handled by AutoIndex. Valid values for "Arg"
                     are either "IgnoreSection" or "IgnoreBlock".]]
1099 [[`[indexterm1 primary-key]`][Creates a manual index entry that will link to the current section, and have a single primary key "primary-key".
1100          Note that this index key will not have a "type" attribute set, and so will only appear in the main index.]]
[[`[indexterm2 primary-key secondary-key]`][Creates a manual index entry that will link to the current section, and has
         "primary-key" and "secondary-key" as the primary and secondary keys respectively.
         Note that this index key will not have a "type" attribute set, and so will only appear in the main index.]]
[[`[indexterm3 primary-key secondary-key tertiary-key]`][Creates a manual index entry that will link to the current section,
         and have primary, secondary and tertiary keys: "primary-key", "secondary-key" and "tertiary-key".
         Note that this index key will not have a "type" attribute set, and so will only appear in the main index.]]
1107
1108 [[`[typed_indexterm1 type primary-key]`][Creates a manual index entry that will link to the current section, and have a single primary key "primary-key".
1109          Note that this index key will have the  "type" attribute set to the "type" argument, and so may appear in named sub-indexes
1110          that also have their type attribute set.]]
[[`[typed_indexterm2 type primary-key secondary-key]`][Creates a manual index entry that will link to the current section, and has
         "primary-key" and "secondary-key" as the primary and secondary keys respectively.
         Note that this index key will have the "type" attribute set to the "type" argument, and so may appear in named sub-indexes
         that also have their type attribute set.]]
[[`[typed_indexterm3 type primary-key secondary-key tertiary-key]`][Creates a manual index entry that will link to the current section,
         and have primary, secondary and tertiary keys: "primary-key", "secondary-key" and "tertiary-key".
         Note that this index key will have the "type" attribute set to the "type" argument, and so may appear in named sub-indexes
         that also have their type attribute set.]]
1119 ]
1120
1121 [endsect]
1122
1123 [section:comm_ref Command Line Reference]
1124
1125 The following command line options are supported by AutoIndex:
1126
1127 [variablelist
1128 [[--in=infilename][Specifies the name of the XML input file to be indexed.]]
1129 [[--out=outfilename][Specifies the name of the new XML file to create.]]
1130 [[--scan=source-filename][Specifies that ['source-filename] should be scanned
1131 for terms to index.]]
1132 [[--script=script-filename][Specifies the name of the script file to process.]]
1133 [[--no-duplicates][If a term occurs more than once in the same section, then
1134 include only one index entry.]]
1135 [[--internal-index][Specifies that AutoIndex should generate the actual
1136 indexes rather than inserting `<indexterm>`s and leaving index generation
1137 to the XSL stylesheets.]]
1138 [[--no-section-names][Prevents AutoIndex from using section names as index entries.]]
1139 [[--prefix=pathname][Specifies a directory to apply as a prefix to all relative file paths in the script file.]]
1140 [[--index-type=element-name][Specifies the name of the XML element to enclose internally generated indexes in:
1141   defaults to ['section], but could equally be ['appendix] or ['chapter]
1142   or some other block level element that has a formal title.]]
1143 ]
1144
1145 [endsect]  [/section:comm_ref Command Line Reference]
1146
1147 [include ../include/auto_index_helpers.qbk]
1148
1149 [index]