[/==============================================================================
    Copyright (C) 2001-2011 Joel de Guzman
    Copyright (C) 2001-2011 Hartmut Kaiser

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]

[section:lexer_quickstart1 Quickstart 1 - A word counter using __lex__]

__lex__ is very modular, following the general design principle of the
__spirit__ libraries: you never pay for features you don't use. It is nicely
integrated with the other parts of __spirit__, but it can also be used
separately to build standalone lexical analyzers.
The first quickstart example describes a standalone application that counts
the characters, words, and lines in a file, very similar to what the well-known
Unix command `wc` does (for the full example code see
[@../../example/lex/word_count_functor.cpp word_count_functor.cpp]).

[import ../example/lex/word_count_functor.cpp]


[heading Prerequisites]

The only required `#include` specific to /Spirit.Lex/ follows. It is a wrapper
for all definitions needed to use /Spirit.Lex/ in a standalone fashion on top
of the __lexertl__ library. Additionally, we `#include` two Boost headers that
define `boost::bind()` and `boost::ref()`.

[wcf_includes]

To make the code below more readable we introduce the following namespaces.

[wcf_namespaces]


[heading Defining Tokens]

The most important step in creating a lexer using __lex__ is to define the
tokens to be recognized in the input sequence. This is normally done by
defining the regular expressions that describe the matching character
sequences and, optionally, their corresponding token ids. The defined tokens
must also be associated with an instance of a lexer object as provided by the
library. The following code snippet shows how this can be done using __lex__.

[wcf_token_definition]
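
To illustrate the idea of pairing regular expressions with token ids without
relying on __lex__ itself, here is a minimal sketch in plain C++11 based on
`std::regex`. The names `token_table`, `add`, `match`, and
`make_word_count_tokens` are hypothetical and are not part of __lex__; the ids
and patterns are assumptions chosen to resemble a word counter. The sketch
selects the longest match at a given position and lets earlier rules win ties,
which is the convention __flex__ uses.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token ids for this sketch; the numeric values are arbitrary.
enum token_ids { ID_WORD = 1000, ID_EOL, ID_CHAR };

// Hypothetical stand-in for a lexer object: an ordered list of
// (regular expression, token id) pairs.
struct token_table
{
    std::vector<std::pair<std::regex, int> > rules;

    // Associate a pattern and its token id with this table.
    void add(std::string const& pattern, int id)
    {
        rules.push_back(std::make_pair(std::regex(pattern), id));
    }

    // Find the rule with the longest match starting exactly at `pos`.
    // Earlier rules win ties. Returns the token id and stores the match
    // length in `length`, or returns -1 if no rule matches.
    int match(std::string const& input, std::size_t pos,
              std::size_t& length) const
    {
        assert(pos <= input.size());
        int best_id = -1;
        length = 0;
        std::string::const_iterator start =
            input.begin() + static_cast<std::string::difference_type>(pos);
        for (std::size_t i = 0; i < rules.size(); ++i) {
            std::smatch m;
            bool found = std::regex_search(start, input.end(), m,
                rules[i].first, std::regex_constants::match_continuous);
            if (found && static_cast<std::size_t>(m.length(0)) > length) {
                best_id = rules[i].second;
                length = static_cast<std::size_t>(m.length(0));
            }
        }
        return best_id;
    }
};

// Build the three rules of a word counter (patterns are assumptions).
token_table make_word_count_tokens()
{
    token_table t;
    t.add("[^ \t\n]+", ID_WORD);  // words: anything except ' ', '\t', '\n'
    t.add("\n", ID_EOL);          // newline characters
    t.add(".", ID_CHAR);          // any other single character
    return t;
}
```

With these rules, calling `match()` at position 0 of the input `hello world`
reports `ID_WORD` with a length of 5, while position 5 (the space) falls
through to `ID_CHAR` with a length of 1.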


[heading Doing the Useful Work]

In this setup we want the __lex__ library to invoke a given function after
each of the generated tokens is recognized. To do so, we need to implement a
functor that takes at least the generated token as an argument and returns a
boolean value, which allows it to stop the tokenization process. The default
token type used in this example carries a token value of type
__boost_iterator_range__`<BaseIterator>` pointing to the matched range in the
underlying input sequence.

[wcf_functor]
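
The callback protocol described above can also be sketched independently of
__lex__. The following plain C++ example is only an illustration under stated
assumptions: `simple_token`, `tokenize_sketch`, and `is_separator` are
hypothetical names, the tokenizer is hand-written rather than generated from
regular expressions, and stopping when the functor returns `false` is the
convention assumed for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical token ids for this sketch; the numeric values are arbitrary.
enum token_ids { ID_WORD = 1000, ID_EOL, ID_CHAR };

// Hypothetical minimal token: an id plus the matched range [first, last)
// in the underlying input sequence.
struct simple_token
{
    int id;
    char const* first;
    char const* last;

    std::size_t size() const { return static_cast<std::size_t>(last - first); }
};

// Functor invoked once per token. It updates the character, word, and line
// counters it refers to and returns true to continue tokenizing.
struct counter
{
    std::size_t& c;
    std::size_t& w;
    std::size_t& l;

    counter(std::size_t& c_, std::size_t& w_, std::size_t& l_)
      : c(c_), w(w_), l(l_) {}

    bool operator()(simple_token const& t) const
    {
        switch (t.id) {
        case ID_WORD: ++w; c += t.size(); break;  // a word
        case ID_EOL:  ++l; ++c; break;            // a newline character
        case ID_CHAR: ++c; break;                 // any other character
        }
        return true;                              // always continue
    }
};

inline bool is_separator(char ch)
{
    return ch == ' ' || ch == '\t' || ch == '\n';
}

// Hand-written stand-in for a tokenization loop: split [first, last) into
// words, newlines, and other characters, calling f for each token.
// Returns false as soon as f returns false, true otherwise.
template <typename F>
bool tokenize_sketch(char const* first, char const* last, F f)
{
    while (first != last) {
        simple_token t = { ID_CHAR, first, first + 1 };
        if (*first == '\n') {
            t.id = ID_EOL;
        } else if (!is_separator(*first)) {
            t.id = ID_WORD;
            while (t.last != last && !is_separator(*t.last))
                ++t.last;                         // extend to the end of the word
        }
        if (!f(t))
            return false;                         // the functor asked to stop
        first = t.last;
    }
    return true;
}
```

Passing `counter(c, w, l)` to `tokenize_sketch()` over the two-line input
`one two` / `three` (each line ending in a newline) leaves `c == 14`,
`w == 3`, and `l == 2`.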

All that is left is to write some boilerplate code that ties together the
pieces described so far. To simplify this example we call the `lex::tokenize()`
function implemented in __lex__ (for a more detailed description of this
function see here: __fixme__), even though we could instead have written a loop
that iterates over the lexer iterators [`first`, `last`).


[heading Pulling Everything Together]

[wcf_main]


[heading Comparing __lex__ with __flex__]

This example was deliberately chosen to be as similar as possible to the
equivalent __flex__ program (see below), which isn't too different from what
has to be written when using __lex__.

[note Interestingly enough, performance comparisons of lexical analyzers
      written using __lex__ with equivalent programs generated by
      __flex__ show that both have comparable execution speeds!
      Generally, thanks to the highly optimized __lexertl__ library and
      due to its carefully designed integration with __spirit__, the
      abstraction penalty to be paid for using __lex__ is negligible.
]

The remaining examples in this tutorial will use more sophisticated features
of __lex__, mainly to allow further simplification of the code to be written,
while maintaining the similarity with corresponding features of __flex__.
__lex__ has been designed to be as similar to __flex__ as possible. That
is why this documentation provides the corresponding __flex__ code for the
__lex__ examples shown almost everywhere. Accordingly, here is the __flex__
code corresponding to the example shown above.

[wcf_flex_version]

[endsect]