[/==============================================================================
    Copyright (C) 2001-2011 Joel de Guzman
    Copyright (C) 2001-2011 Hartmut Kaiser

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]
[section:lexer_api Lexer API]
The library provides a couple of free functions to make using the lexer a
snap. These functions have three forms. The first form, `tokenize`, simplifies
the use of a standalone lexer (without parsing). The second form,
`tokenize_and_parse`, combines a lexer step with parsing on the token level
(without a skipper). The third, `tokenize_and_phrase_parse`, works on the
token level as well, but additionally employs a skip parser. The latter two
versions can take attributes by reference that will hold the parsed values on
a successful parse.
    // forwards to <boost/spirit/home/lex/tokenize_and_parse.hpp>
    #include <boost/spirit/include/lex_tokenize_and_parse.hpp>

For variadic attributes:

    // forwards to <boost/spirit/home/lex/tokenize_and_parse_attr.hpp>
    #include <boost/spirit/include/lex_tokenize_and_parse_attr.hpp>
The variadic attributes version of the API allows one or more attributes to
be passed into the API functions. The functions taking two or more attributes
are usable only when the parser expression is a __qi_sequence__. In this case
each of the attributes passed has to match the corresponding part of the
sequence.
Also, see __include_structure__.
[table
    [[Name]]
    [[`boost::spirit::lex::tokenize`]]
    [[`boost::spirit::lex::tokenize_and_parse`]]
    [[`boost::spirit::lex::tokenize_and_phrase_parse`]]
    [[`boost::spirit::qi::skip_flag::postskip`]]
    [[`boost::spirit::qi::skip_flag::dont_postskip`]]
]
The `tokenize` function is one of the main lexer API functions. It simplifies
using a lexer to tokenize a given input sequence. Its main purpose is to use
the lexer to tokenize all of the input.
Both overloads take a pair of iterators spanning the underlying input stream
to scan, the lexer object (built from the token definitions), and an
(optional) functor called for each of the generated tokens. If no function
object `f` is given, the generated tokens are discarded.

The functions return `true` if the scanning of the input succeeded (the given
input sequence has been successfully matched by the given token definitions).
The argument `f` is expected to be a function (callable) object taking a
single argument of the token type and returning a `bool` indicating whether
the tokenization should continue. If it returns `false`, tokenization is
canceled and the function `tokenize` will return `false` as well.
The `initial_state` argument forces lexing to start with the given lexer
state. If this is omitted, lexing starts in the `"INITIAL"` state.
    template <typename Iterator, typename Lexer>
    inline bool
    tokenize(
        Iterator& first
      , Iterator last
      , Lexer const& lex
      , typename Lexer::char_type const* initial_state = 0);

    template <typename Iterator, typename Lexer, typename F>
    inline bool
    tokenize(
        Iterator& first
      , Iterator last
      , Lexer const& lex
      , F f
      , typename Lexer::char_type const* initial_state = 0);
The `tokenize_and_parse` function is one of the main lexer API functions. It
simplifies using a lexer as the underlying token source while parsing a given
input sequence.
The functions take a pair of iterators spanning the underlying input stream
to parse, the lexer object (built from the token definitions) and a parser
object (built from the parser grammar definition). Additionally they may take
the attributes for the parser step.

The function returns `true` if the parsing succeeded (the given input
sequence has been successfully matched by the given grammar).
    template <typename Iterator, typename Lexer, typename ParserExpr>
    inline bool
    tokenize_and_parse(
        Iterator& first
      , Iterator last
      , Lexer const& lex
      , ParserExpr const& expr);

    template <typename Iterator, typename Lexer, typename ParserExpr
      , typename Attr1, typename Attr2, ..., typename AttrN>
    inline bool
    tokenize_and_parse(
        Iterator& first
      , Iterator last
      , Lexer const& lex
      , ParserExpr const& expr
      , Attr1 const& attr1, Attr2 const& attr2, ..., AttrN const& attrN);
The functions `tokenize_and_phrase_parse` take a pair of iterators spanning
the underlying input stream to parse, the lexer object (built from the token
definitions) and a parser object (built from the parser grammar definition).
The additional `skipper` parameter will be used as the skip parser during
the parsing process. Additionally they may take the attributes for the parser
step.

The function returns `true` if the parsing succeeded (the given input
sequence has been successfully matched by the given grammar).
    template <typename Iterator, typename Lexer, typename ParserExpr
      , typename Skipper>
    inline bool
    tokenize_and_phrase_parse(
        Iterator& first
      , Iterator last
      , Lexer const& lex
      , ParserExpr const& expr
      , Skipper const& skipper
      , BOOST_SCOPED_ENUM(skip_flag) post_skip = skip_flag::postskip);

    template <typename Iterator, typename Lexer, typename ParserExpr
      , typename Skipper, typename Attr1, typename Attr2, ..., typename AttrN>
    inline bool
    tokenize_and_phrase_parse(
        Iterator& first
      , Iterator last
      , Lexer const& lex
      , ParserExpr const& expr
      , Skipper const& skipper
      , Attr1 const& attr1, Attr2 const& attr2, ..., AttrN const& attrN);

    template <typename Iterator, typename Lexer, typename ParserExpr
      , typename Skipper, typename Attr1, typename Attr2, ..., typename AttrN>
    inline bool
    tokenize_and_phrase_parse(
        Iterator& first
      , Iterator last
      , Lexer const& lex
      , ParserExpr const& expr
      , Skipper const& skipper
      , BOOST_SCOPED_ENUM(skip_flag) post_skip
      , Attr1 const& attr1, Attr2 const& attr2, ..., AttrN const& attrN);
The maximum number of supported arguments is limited by the preprocessor
constant `SPIRIT_ARGUMENTS_LIMIT`. This constant defaults to the value
defined by the preprocessor constant `PHOENIX_LIMIT` (which in turn defaults
to `10`).
[note The variadic functions taking two or more attributes internally combine
references to all passed attributes into a `fusion::vector` and forward this
as a single, combined attribute to the corresponding one-attribute function.]
The `tokenize_and_phrase_parse` functions not taking an explicit `skip_flag`
as one of their arguments invoke the passed skipper after a successful match
of the parser expression. This can be inhibited by using the other versions
of that function while passing `skip_flag::dont_postskip` to the
corresponding `post_skip` argument.
[heading Template parameters]

[table
    [[Parameter] [Description]]
    [[`Iterator`] [__fwditer__ pointing to the underlying input sequence to parse.]]
    [[`Lexer`] [A lexer (token definition) object.]]
    [[`F`] [A function object called for each generated token.]]
    [[`ParserExpr`] [An expression that can be converted to a Qi parser.]]
    [[`Skipper`] [Parser used to skip whitespace.]]
    [[`Attr1`, `Attr2`, ..., `AttrN`][One or more attributes.]]
]