2 / Copyright (c) 2007 Eric Niebler
4 / Distributed under the Boost Software License, Version 1.0. (See accompanying
5 / file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
12 generator metafunctions
14 metafunctions for tag type and arity
18 introspection with grammars and proto::matches
21 [/================================================================================]
22 [section:intermediate_form Intermediate Form:
23 Understanding and Introspecting Expressions]
24 [/================================================================================]
26 By now, you know a bit about how to build a front-end for your EDSL "compiler" -- you can define terminals and functions that generate expression templates. But we haven't said anything about the expression templates themselves. What do they look like? What can you do with them? In this section we'll see.
28 [/=========================]
29 [heading The [^expr<>] Type]
30 [/=========================]
32 All Proto expressions are an instantiation of a template called _expr_ (or a wrapper around such an instantiation). When we define a terminal as below, we are really initializing an instance of the _expr_ template.
34 // Define a placeholder type
39 // Define the Protofied placeholder terminal
40 proto::terminal< placeholder<0> >::type const _1 = {{}};
42 The actual type of `_1` looks like this:
44 proto::expr< proto::tag::terminal, proto::term< placeholder<0> >, 0 >
46 The _expr_ template is the most important type in Proto. Although you will
47 rarely need to deal with it directly, it's always there behind the scenes
48 holding your expression trees together. In fact, _expr_ /is/ the expression
49 tree -- branches, leaves and all.
51 The _expr_ template makes up the nodes in expression trees. The first template
52 parameter is the node type; in this case, `proto::tag::terminal`. That means
53 that `_1` is a leaf-node in the expression tree. The second template parameter
54 is a list of child types, or in the case of terminals, the terminal's value
55 type. Terminals will always have only one type in the type list. The last
56 parameter is the arity of the expression. Terminals have arity 0, unary
57 expressions have arity 1, etc.
59 The _expr_ struct is defined as follows:
61 template< typename Tag, typename Args, long Arity = Args::arity >
64 template< typename Tag, typename Args >
65 struct expr< Tag, Args, 1 >
67 typedef typename Args::child0 proto_child0;
72 The _expr_ struct does not define a constructor, or anything else that would
73 prevent static initialization. All _expr_ objects are initialized using
74 ['aggregate initialization], with curly braces. In our example, `_1` is
75 initialized with the initializer `{{}}`. The outer braces are the initializer
76 for the _expr_ struct, and the inner braces are for the member `_1.child0` which
77 is of type `placeholder<0>`. Note that we use braces to initialize `_1.child0`
78 because `placeholder<0>` is also an aggregate.
80 [/================================]
81 [heading Building Expression Trees]
82 [/================================]
84 The `_1` node is an instantiation of _expr_, and expressions containing
85 `_1` are also instantiations of _expr_. To use Proto effectively, you
86 won't have to bother yourself with the actual types that Proto generates.
87 These are details, but you're likely to encounter these types in compiler
88 error messages, so it's helpful to be familiar with them. The types look
91 // The type of the expression -_1
98 , proto::term< placeholder<0> >
104 negate_placeholder_type;
106 negate_placeholder_type x = -_1;
108 // The type of the expression _1 + 42
115 , proto::term< placeholder<0> >
120 , proto::term< int const & >
126 placeholder_plus_int_type;
128 placeholder_plus_int_type y = _1 + 42;
130 There are a few things to note about these types:
132 * Terminals have arity zero, unary expressions have arity one and binary
133 expressions have arity two.
134 * When one Proto expression is made a child node of another Proto expression,
135 it is held by reference, ['even if it is a temporary object]. This last
136 point becomes important later.
137 * Non-Proto expressions, such as the integer literal, are turned into Proto
138 expressions by wrapping them in new `expr<>` terminal objects. These new
139 wrappers are not themselves held by reference, but the object wrapped /is/.
140 Notice that the type of the Protofied `42` literal is `int const &` -- held
143 The types make it clear: everything in a Proto expression tree is held by
144 reference. That means that building an expression tree is exceptionally cheap.
145 It involves no copying at all.
147 [note An astute reader will notice that the object `y` defined above will be
148 left holding a dangling reference to a temporary int. In the sorts of
149 high-performance applications Proto addresses, it is typical to build and
150 evaluate an expression tree before any temporary objects go out of scope, so
151 this dangling reference situation often doesn't arise, but it is certainly
152 something to be aware of. Proto provides utilities for deep-copying expression
153 trees so they can be passed around as value types without concern for dangling
156 [/========================================================]
157 [section:left_right_child Accessing Parts of an Expression]
158 [/========================================================]
160 After assembling an expression into a tree, you'll naturally want to be
161 able to do the reverse, and access a node's children. You may even want
162 to be able to iterate over the children with algorithms from the
163 Boost.Fusion library. This section shows how.
165 [/==========================================]
166 [heading Getting Expression Tags and Arities]
167 [/==========================================]
169 Every node in an expression tree has both a /tag/ type that describes the node, and an /arity/ corresponding to the number of child nodes it has. You can use the _tag_of_ and _arity_of_ metafunctions to fetch them. Consider the following:
171 template<typename Expr>
172 void check_plus_node(Expr const &)
174 // Assert that the tag type is proto::tag::plus
175 BOOST_STATIC_ASSERT((
177 typename proto::tag_of<Expr>::type
182 // Assert that the arity is 2
183 BOOST_STATIC_ASSERT( proto::arity_of<Expr>::value == 2 );
186 // Create a binary plus node and use check_plus_node()
187 // to verify its tag type and arity:
188 check_plus_node( proto::lit(1) + 2 );
190 For a given type `Expr`, you could access the tag and arity directly as `Expr::proto_tag` and `Expr::proto_arity`, where `Expr::proto_arity` is an MPL Integral Constant.
192 [/==============================]
193 [heading Getting Terminal Values]
194 [/==============================]
196 There is no simpler expression than a terminal, and no more basic operation than extracting its value. As we've already seen, that is what _value_ is for.
198 proto::terminal< std::ostream & >::type cout_ = {std::cout};
200 // Get the value of the cout_ terminal:
201 std::ostream & sout = proto::value( cout_ );
203 // Assert that we got back what we put in:
204 assert( &sout == &std::cout );
206 To compute the return type of the _value_ function, you can use _result_of_value_. When the parameter to _result_of_value_ is a non-reference type, the result type of the metafunction is the type of the value as suitable for storage by value; that is, top-level reference and qualifiers are stripped from it. But when instantiated with a reference type, the result type has a reference /added/ to it, yielding a type suitable for storage by reference. If you want to know the actual type of the terminal's value including whether it is stored by value or reference, you can use `fusion::result_of::value_at<Expr, 0>::type`.
208 The following table summarizes the above paragraph.
210 [def _unless_ [footnote If `T` is a reference-to-function type, then the result type is simply `T`.]]
212 [table Accessing Value Types
213 [[Metafunction Invocation][When the Value Type Is ...][The Result Is ...]]
214 [[`proto::result_of::value<Expr>::type`][`T`][``typename boost::remove_const<
215 typename boost::remove_reference<T>::type
217 [[`proto::result_of::value<Expr &>::type`][`T`][``typename boost::add_reference<T>::type``]]
218 [[`proto::result_of::value<Expr const &>::type`][`T`][``typename boost::add_reference<
219 typename boost::add_const<T>::type
221 [[`fusion::result_of::value_at<Expr, 0>::type`][`T`][`T`]]
224 [/================================]
225 [heading Getting Child Expressions]
226 [/================================]
228 Each non-terminal node in an expression tree corresponds to an operator in an expression, and the children correspond to the operands, or arguments of the operator. To access them, you can use the _child_c_ function template, as demonstrated below:
230 proto::terminal<int>::type i = {42};
232 // Get the 0-th operand of an addition operation:
233 proto::terminal<int>::type &ri = proto::child_c<0>( i + 2 );
235 // Assert that we got back what we put in:
238 You can use the _result_of_child_c_ metafunction to get the type of the Nth child of an expression node. Usually you don't care to know whether a child is stored by value or by reference, so when you ask for the type of the Nth child of an expression `Expr` (where `Expr` is not a reference type), you get the child's type after references and cv-qualifiers have been stripped from it.
240 template<typename Expr>
241 void test_result_of_child_c(Expr const &expr)
243 typedef typename proto::result_of::child_c<Expr, 0>::type type;
245 // Since Expr is not a reference type,
246 // result_of::child_c<Expr, 0>::type is a
247 // non-cv qualified, non-reference type:
249 boost::is_same< type, proto::terminal<int>::type >
254 proto::terminal<int>::type i = {42};
255 test_result_of_child_c( i + 2 );
257 However, if you ask for the type of the Nth child of `Expr &` or `Expr const &`
258 (note the reference), the result type will be a reference, regardless of whether
259 the child is actually stored by reference or not. If you need to know exactly
260 how the child is stored in the node, whether by reference or by value, you can
261 use `fusion::result_of::value_at<Expr, N>::type`. The following table summarizes
262 the behavior of the _result_of_child_c_ metafunction.
264 [table Accessing Child Types
265 [[Metafunction Invocation][When the Child Is ...][The Result Is ...]]
266 [[`proto::result_of::child_c<Expr, N>::type`][`T`][``typename boost::remove_const<
267 typename boost::remove_reference<T>::type
269 [[`proto::result_of::child_c<Expr &, N>::type`][`T`][``typename boost::add_reference<T>::type``]]
270 [[`proto::result_of::child_c<Expr const &, N>::type`][`T`][``typename boost::add_reference<
271 typename boost::add_const<T>::type
273 [[`fusion::result_of::value_at<Expr, N>::type`][`T`][`T`]]
276 [/=======================]
277 [heading Common Shortcuts]
278 [/=======================]
280 Most operators in C++ are unary or binary, so accessing the only operand, or the left and right operands, are very common operations. For this reason, Proto provides the _child_, _left_, and _right_ functions. _child_ and _left_ are synonymous with `proto::child_c<0>()`, and _right_ is synonymous with `proto::child_c<1>()`.
282 There are also _result_of_child_, _result_of_left_, and _result_of_right_ metafunctions that merely forward to their _result_of_child_c_ counterparts.
286 [/===============================]
287 [section Deep-copying Expressions]
288 [/===============================]
290 When you build an expression template with Proto, all the intermediate child nodes are held /by reference/. The avoids needless copies, which is crucial if you want your EDSL to perform well at runtime. Naturally, there is a danger if the temporary objects go out of scope before you try to evaluate your expression template. This is especially a problem in C++0x with the new `decltype` and `auto` keywords. Consider:
292 // OOPS: "ex" is left holding dangling references
293 auto ex = proto::lit(1) + 2;
295 The problem can happen in today's C++ also if you use `BOOST_TYPEOF()` or `BOOST_AUTO()`, or if you try to pass an expression template outside the scope of its constituents.
297 In these cases, you want to deep-copy your expression template so that all intermediate nodes and the terminals are held /by value/. That way, you can safely assign the expression template to a local variable or return it from a function without worrying about dangling references. You can do this with _deep_copy_ as fo
300 // OK, "ex" has no dangling references
301 auto ex = proto::deep_copy( proto::lit(1) + 2 );
303 If you are using _typeof_, it would look like this:
305 // OK, use BOOST_AUTO() and proto::deep_copy() to
306 // store an expression template in a local variable
307 BOOST_AUTO( ex, proto::deep_copy( proto::lit(1) + 2 ) );
309 For the above code to work, you must include the [headerref boost/proto/proto_typeof.hpp] header, which also defines the _AUTO_ macro which automatically deep-copies its argument. With _AUTO_, the above code can be writen as:
311 // OK, BOOST_PROTO_AUTO() automatically deep-copies
313 BOOST_PROTO_AUTO( ex, proto::lit(1) + 2 );
315 When deep-copying an expression tree, all intermediate nodes and all terminals are stored by value. The only exception is terminals that are function references, which are left alone.
317 [note _deep_copy_ makes no exception for arrays, which it stores by value. That can potentially cause a large amount of data to be copied.]
321 [/============================]
322 [section Debugging Expressions]
323 [/============================]
325 Proto provides a utility for pretty-printing expression trees that comes in very handy when you're trying to debug your EDSL. It's called _display_expr_, and you pass it the expression to print and optionally, an `std::ostream` to which to send the output. Consider:
327 // Use display_expr() to pretty-print an expression tree
329 proto::lit("hello") + 42
332 The above code writes this to `std::cout`:
339 In order to call _display_expr_, all the terminals in the expression must be Streamable (that is, they can be written to a `std::ostream`). In addition, the tag types must all be Streamable as well. Here is an example that includes a custom terminal type and a custom tag:
341 // A custom tag type that is Streamable
344 friend std::ostream &operator<<(std::ostream &s, MyTag)
350 // Some other Streamable type
353 friend std::ostream &operator<<(std::ostream &s, MyTerminal)
355 return s << "MyTerminal";
361 // Display an expression tree that contains a custom
362 // tag and a user-defined type in a terminal
364 proto::make_expr<MyTag>(MyTerminal()) + 42
368 The above code prints the following:
379 [/=============================================================]
380 [section:tags_and_metafunctions Operator Tags and Metafunctions]
381 [/=============================================================]
383 The following table lists the overloadable C++ operators, the Proto tag types for each, and the name of the metafunctions for generating the corresponding Proto expression types. And as we'll see later, the metafunctions are also usable as grammars for matching such nodes, as well as pass-through transforms.
385 [table Operators, Tags and Metafunctions
388 [Proto Metafunction]]
391 [`proto::tag::unary_plus`]
392 [`proto::unary_plus<>`]]
395 [`proto::tag::negate`]
399 [`proto::tag::dereference`]
400 [`proto::dereference<>`]]
403 [`proto::tag::complement`]
404 [`proto::complement<>`]]
407 [`proto::tag::address_of`]
408 [`proto::address_of<>`]]
411 [`proto::tag::logical_not`]
412 [`proto::logical_not<>`]]
415 [`proto::tag::pre_inc`]
416 [`proto::pre_inc<>`]]
419 [`proto::tag::pre_dec`]
420 [`proto::pre_dec<>`]]
422 [[unary postfix `++`]
423 [`proto::tag::post_inc`]
424 [`proto::post_inc<>`]]
426 [[unary postfix `--`]
427 [`proto::tag::post_dec`]
428 [`proto::post_dec<>`]]
431 [`proto::tag::shift_left`]
432 [`proto::shift_left<>`]]
435 [`proto::tag::shift_right`]
436 [`proto::shift_right<>`]]
439 [`proto::tag::multiplies`]
440 [`proto::multiplies<>`]]
443 [`proto::tag::divides`]
444 [`proto::divides<>`]]
447 [`proto::tag::modulus`]
448 [`proto::modulus<>`]]
455 [`proto::tag::minus`]
463 [`proto::tag::greater`]
464 [`proto::greater<>`]]
467 [`proto::tag::less_equal`]
468 [`proto::less_equal<>`]]
471 [`proto::tag::greater_equal`]
472 [`proto::greater_equal<>`]]
475 [`proto::tag::equal_to`]
476 [`proto::equal_to<>`]]
479 [`proto::tag::not_equal_to`]
480 [`proto::not_equal_to<>`]]
483 [`proto::tag::logical_or`]
484 [`proto::logical_or<>`]]
487 [`proto::tag::logical_and`]
488 [`proto::logical_and<>`]]
491 [`proto::tag::bitwise_and`]
492 [`proto::bitwise_and<>`]]
495 [`proto::tag::bitwise_or`]
496 [`proto::bitwise_or<>`]]
499 [`proto::tag::bitwise_xor`]
500 [`proto::bitwise_xor<>`]]
503 [`proto::tag::comma`]
507 [`proto::tag::mem_ptr`]
508 [`proto::mem_ptr<>`]]
511 [`proto::tag::assign`]
515 [`proto::tag::shift_left_assign`]
516 [`proto::shift_left_assign<>`]]
519 [`proto::tag::shift_right_assign`]
520 [`proto::shift_right_assign<>`]]
523 [`proto::tag::multiplies_assign`]
524 [`proto::multiplies_assign<>`]]
527 [`proto::tag::divides_assign`]
528 [`proto::divides_assign<>`]]
531 [`proto::tag::modulus_assign`]
532 [`proto::modulus_assign<>`]]
535 [`proto::tag::plus_assign`]
536 [`proto::plus_assign<>`]]
539 [`proto::tag::minus_assign`]
540 [`proto::minus_assign<>`]]
543 [`proto::tag::bitwise_and_assign`]
544 [`proto::bitwise_and_assign<>`]]
547 [`proto::tag::bitwise_or_assign`]
548 [`proto::bitwise_or_assign<>`]]
551 [`proto::tag::bitwise_xor_assign`]
552 [`proto::bitwise_xor_assign<>`]]
555 [`proto::tag::subscript`]
556 [`proto::subscript<>`]]
559 [`proto::tag::if_else_`]
560 [`proto::if_else_<>`]]
562 [[n-ary function call]
563 [`proto::tag::function`]
564 [`proto::function<>`]]
569 [/======================================]
570 [section Expressions as Fusion Sequences]
571 [/======================================]
573 Boost.Fusion is a library of iterators, algorithms, containers and adaptors for manipulating heterogeneous sequences. In essence, a Proto expression is just a heterogeneous sequence of its child expressions, and so Proto expressions are valid Fusion random-access sequences. That means you can apply Fusion algorithms to them, transform them, apply Fusion filters and views to them, and access their elements using `fusion::at()`. The things Fusion can do to heterogeneous sequences are beyond the scope of this users' guide, but below is a simple example. It takes a lazy function invocation like `fun(1,2,3,4)` and uses Fusion to print the function arguments in order.
578 void operator()(T const &t) const
580 std::cout << t << std::endl;
585 proto::terminal<fun_t>::type const fun = {{}};
590 // pop_front() removes the "fun" child
591 fusion::pop_front(fun(1,2,3,4))
592 // Extract the ints from the terminal nodes
593 , proto::functional::value()
598 Recall from the Introduction that types in the `proto::functional` namespace
599 define function objects that correspond to Proto's free functions. So
600 `proto::functional::value()` creates a function object that is equivalent to
601 the `proto::value()` function. The above invocation of `fusion::for_each()`
602 displays the following:
611 Terminals are also valid Fusion sequences. They contain exactly one element: their value.
613 [/========================================]
614 [heading Flattening Proto Expression Tress]
615 [/========================================]
617 Imagine a slight variation of the above example where, instead of iterating over the arguments of a lazy function invocation, we would like to iterate over the terminals in an addition expression:
619 proto::terminal<int>::type const _1 = {1};
621 // ERROR: this doesn't work! Why?
625 , proto::functional::value()
630 The reason this doesn't work is because the expression `_1 + 2 + 3 + 4` does not describe a flat sequence of terminals --- it describes a binary tree. We can treat it as a flat sequence of terminals, however, using Proto's _flatten_ function. _flatten_ returns a view which makes a tree appear as a flat Fusion sequence. If the top-most node has a tag type `T`, then the elements of the flattened sequence are the child nodes that do /not/ have tag type `T`. This process is evaluated recursively. So the above can correctly be written as:
632 proto::terminal<int>::type const _1 = {1};
634 // OK, iterate over a flattened view
637 proto::flatten(_1 + 2 + 3 + 4)
638 , proto::functional::value()
643 The above invocation of `fusion::for_each()` displays the following:
654 [/============================================================================]
655 [section:expression_introspection Expression Introspection: Defining a Grammar]
656 [/============================================================================]
658 Expression trees can have a very rich and complicated structure. Often, you need to know some things about an expression's structure before you can process it. This section describes the tools Proto provides for peering inside an expression tree and discovering its structure. And as you'll see in later sections, all the really interesting things you can do with Proto begin right here.
660 [/===============================================]
661 [section:patterns Finding Patterns in Expressions]
662 [/===============================================]
664 Imagine your EDSL is a miniature I/O facility, with iostream operations that execute lazily. You might want expressions representing input operations to be processed by one function, and output operations to be processed by a different function. How would you do that?
666 The answer is to write patterns (a.k.a, /grammars/) that match the structure of input and output expressions. Proto provides utilities for defining the grammars, and the _matches_ template for checking whether a given expression type matches the grammar.
668 First, let's define some terminals we can use in our lazy I/O expressions:
670 proto::terminal< std::istream & >::type cin_ = { std::cin };
671 proto::terminal< std::ostream & >::type cout_ = { std::cout };
673 Now, we can use `cout_` instead of `std::cout`, and get I/O expression trees that we can execute later. To define grammars that match input and output expressions of the form `cin_ >> i` and `cout_ << 1` we do this:
676 : proto::shift_right< proto::terminal< std::istream & >, proto::_ >
680 : proto::shift_left< proto::terminal< std::ostream & >, proto::_ >
683 We've seen the template `proto::terminal<>` before, but here we're using it
684 without accessing the nested `::type`. When used like this, it is a very simple
685 grammar, as are `proto::shift_right<>` and `proto::shift_left<>`. The newcomer
686 here is `_` in the `proto` namespace. It is a wildcard that matches anything.
687 The `Input` struct is a grammar that matches any right-shift expression that
688 has a `std::istream` terminal as its left operand.
690 We can use these grammars together with the _matches_ template to query at
691 compile time whether a given I/O expression type is an input or output
692 operation. Consider the following:
694 template< typename Expr >
695 void input_output( Expr const & expr )
697 if( proto::matches< Expr, Input >::value )
699 std::cout << "Input!\n";
702 if( proto::matches< Expr, Output >::value )
704 std::cout << "Output!\n";
711 input_output( cout_ << 1 );
712 input_output( cin_ >> i );
717 This program prints the following:
724 If we wanted to break the `input_output()` function into two functions, one that handles input expressions and one for output expressions, we can use `boost::enable_if<>`, as follows:
726 template< typename Expr >
727 typename boost::enable_if< proto::matches< Expr, Input > >::type
728 input_output( Expr const & expr )
730 std::cout << "Input!\n";
733 template< typename Expr >
734 typename boost::enable_if< proto::matches< Expr, Output > >::type
735 input_output( Expr const & expr )
737 std::cout << "Output!\n";
740 This works as the previous version did. However, the following does not compile at all:
742 input_output( cout_ << 1 << 2 ); // oops!
744 What's wrong? The problem is that this expression does not match our grammar. The expression groups as if it were written like `(cout_ << 1) << 2`. It will not match the `Output` grammar, which expects the left operand to be a terminal, not another left-shift operation. We need to fix the grammar.
746 We notice that in order to verify an expression as input or output, we'll need to recurse down to the bottom-left-most leaf and check that it is a `std::istream` or `std::ostream`. When we get to the terminal, we must stop recursing. We can express this in our grammar using _or_. Here are the correct `Input` and `Output` grammars:
750 proto::shift_right< proto::terminal< std::istream & >, proto::_ >
751 , proto::shift_right< Input, proto::_ >
757 proto::shift_left< proto::terminal< std::ostream & >, proto::_ >
758 , proto::shift_left< Output, proto::_ >
762 This may look a little odd at first. We seem to be defining the `Input` and `Output` types in terms of themselves. This is perfectly OK, actually. At the point in the grammar that the `Input` and `Output` types are being used, they are /incomplete/, but by the time we actually evaluate the grammar with _matches_, the types will be complete. These are recursive grammars, and rightly so because they must match a recursive data structure!
764 Matching an expression such as `cout_ << 1 << 2` against the `Output` grammar procedes as follows:
766 # The first alternate of the _or_ is tried first. It will fail, because the
767 expression `cout_ << 1 << 2` does not match the grammar `proto::shift_left<
768 proto::terminal< std::ostream & >, proto::_ >`.
769 # Then the second alternate is tried next. We match the expression against
770 `proto::shift_left< Output, proto::_ >`. The expression is a left-shift, so we
771 next try to match the operands.
772 # The right operand `2` matches `proto::_` trivially.
773 # To see if the left operand `cout_ << 1` matches `Output`, we must recursively
774 evaluate the `Output` grammar. This time we succeed, because `cout_ << 1` will
775 match the first alternate of the _or_.
777 We're done -- the grammar matches successfully.
781 [/===========================================]
782 [section Fuzzy and Exact Matches of Terminals]
783 [/===========================================]
785 The terminals in an expression tree could be const or non-const references, or they might not be references at all. When writing grammars, you usually don't have to worry about it because _matches_ gives you a little wiggle room when matching terminals. A grammar such as `proto::terminal<int>` will match a terminal of type `int`, `int &`, or `int const &`.
787 You can explicitly specify that you want to match a reference type. If you do, the type must match exactly. For instance, a grammar such as `proto::terminal<int &>` will only match an `int &`. It will not match an `int` or an `int const &`.
789 The table below shows how Proto matches terminals. The simple rule is: if you want to match only reference types, you must specify the reference in your grammar. Otherwise, leave it off and Proto will ignore const and references.
791 [table proto::matches<> and Reference / CV-Qualification of Terminals
792 [[Terminal] [Grammar] [Matches?]]
795 [[T const &] [T] [yes]]
798 [[T const &] [T &] [no]]
799 [[T] [T const &] [no]]
800 [[T &] [T const &] [no]]
801 [[T const &] [T const &] [yes]]
804 This begs the question: What if you want to match an `int`, but not an `int &` or an `int const &`? For forcing exact matches, Proto provides the _exact_ template. For instance, `proto::terminal< proto::exact<int> >` would only match an `int` held by value.
806 Proto gives you extra wiggle room when matching array types. Array types match themselves or the pointer types they decay to. This is especially useful with character arrays. The type returned by `proto::as_expr("hello")` is `proto::terminal<char const[6]>::type`. That's a terminal containing a 6-element character array. Naturally, you can match this terminal with the grammar `proto::terminal<char const[6]>`, but the grammar `proto::terminal<char const *>` will match it as well, as the following code fragment illustrates.
809 : proto::terminal< char const * >
812 typedef proto::terminal< char const[6] >::type char_array;
814 BOOST_MPL_ASSERT(( proto::matches< char_array, CharString > ));
816 What if we only wanted `CharString` to match terminals of exactly the type `char const *`? You can use _exact_ here to turn off the fuzzy matching of terminals, as follows:
819 : proto::terminal< proto::exact< char const * > >
822 typedef proto::terminal<char const[6]>::type char_array;
823 typedef proto::terminal<char const *>::type char_string;
825 BOOST_MPL_ASSERT(( proto::matches< char_string, CharString > ));
826 BOOST_MPL_ASSERT_NOT(( proto::matches< char_array, CharString > ));
828 Now, `CharString` does not match array types, only character string pointers.
830 The inverse problem is a little trickier: what if you wanted to match all character arrays, but not character pointers? As mentioned above, the expression `as_expr("hello")` has the type `proto::terminal< char const[ 6 ] >::type`. If you wanted to match character arrays of arbitrary size, you could use `proto::N`, which is an array-size wildcard. The following grammar would match any string literal: `proto::terminal< char const[ proto::N ] >`.
832 Sometimes you need even more wiggle room when matching terminals. For example, maybe you're building a calculator EDSL and you want to allow any terminals that are convertible to `double`. For that, Proto provides the _convertible_to_ template. You can use it as: `proto::terminal< proto::convertible_to< double > >`.
834 There is one more way you can perform a fuzzy match on terminals. Consider the problem of trying to match a `std::complex<>` terminal. You can easily match a `std::complex<float>` or a `std::complex<double>`, but how would you match any instantiation of `std::complex<>`? You can use `proto::_` here to solve this problem. Here is the grammar to match any `std::complex<>` instantiation:
837 : proto::terminal< std::complex< proto::_ > >
840 When given a grammar like this, Proto will deconstruct the grammar and the terminal it is being matched against and see if it can match all the constituents.
844 [/====================================================]
845 [section:if_and_not [^if_<>], [^and_<>], and [^not_<>]]
846 [/====================================================]
848 We've already seen how to use expression generators like `proto::terminal<>` and `proto::shift_right<>` as grammars. We've also seen _or_, which we can use to express a set of alternate grammars. There are a few others of interest; in particular, _if_, _and_ and _not_.
850 The _not_ template is the simplest. It takes a grammar as a template parameter and logically negates it; `not_<Grammar>` will match any expression that `Grammar` does /not/ match.
852 The _if_ template is used together with a Proto transform that is evaluated against expression types to find matches. (Proto transforms will be described later.)
854 The _and_ template is like _or_, except that each argument of the _and_ must match in order for the _and_ to match. As an example, consider the definition of `CharString` above that uses _exact_. It could have been written without _exact_ as follows:
858 proto::terminal< proto::_ >
859 , proto::if_< boost::is_same< proto::_value, char const * >() >
863 This says that a `CharString` must be a terminal, /and/ its value type must be the same as `char const *`. Notice the template argument of _if_: `boost::is_same< proto::_value, char const * >()`. This is Proto transform that compares the value type of a terminal to `char const *`.
865 The _if_ template has a couple of variants. In addition to `if_<Condition>` you can also say `if_<Condition, ThenGrammar>` and `if_<Condition, ThenGrammar, ElseGrammar>`. These let you select one sub-grammar or another based on the `Condition`.
869 [/=======================================================]
870 [section:switch Improving Compile Times With [^switch_<>]]
871 [/=======================================================]
873 When your Proto grammar gets large, you'll start to run into some scalability problems with _or_, the construct you use to specify alternate sub-grammars. First, due to limitations in C++, _or_ can only accept up to a certain number of sub-grammars, controlled by the `BOOST_PROTO_MAX_LOGICAL_ARITY` macro. This macro defaults to eight, and you can set it higher, but doing so will aggravate another scalability problem: long compile times. With _or_, alternate sub-grammars are tried in order -- like a series of cascading `if`'s -- leading to lots of unnecessary template instantiations. What you would prefer instead is something like `switch` that avoids the expense of cascading `if`'s. That's the purpose of _switch_; although less convenient than _or_, it improves compile times for larger grammars and does not have an arbitrary fixed limit on the number of sub-grammars.
875 Let's illustrate how to use _switch_ by first writing a big grammar with _or_ and then translating it to an equivalent grammar using _switch_:
877 // Here is a big, inefficient grammar
881 , proto::terminal<double>
882 , proto::unary_plus<ABigGrammar>
883 , proto::negate<ABigGrammar>
884 , proto::complement<ABigGrammar>
885 , proto::plus<ABigGrammar, ABigGrammar>
886 , proto::minus<ABigGrammar, ABigGrammar>
888 proto::multiplies<ABigGrammar, ABigGrammar>
889 , proto::divides<ABigGrammar, ABigGrammar>
890 , proto::modulus<ABigGrammar, ABigGrammar>
895 The above might be the grammar to a more elaborate calculator EDSL. Notice that since there are more than eight sub-grammars, we had to chain the sub-grammars with a nested _or_ -- not very nice.
897 The idea behind _switch_ is to dispatch based on an expression's tag type to a sub-grammar that handles expressions of that type. To use _switch_, you define a struct with a nested `case_<>` template, specialized on tag types. The above grammar can be expressed using _switch_ as follows. It is described below.
899 // Redefine ABigGrammar more efficiently using proto::switch_<>
902 struct ABigGrammarCases
904 // The primary template matches nothing:
905 template<typename Tag>
911 // Terminal expressions are handled here
913 struct ABigGrammarCases::case_<proto::tag::terminal>
916 , proto::terminal<double>
920 // Non-terminals are handled similarly
922 struct ABigGrammarCases::case_<proto::tag::unary_plus>
923 : proto::unary_plus<ABigGrammar>
927 struct ABigGrammarCases::case_<proto::tag::negate>
928 : proto::negate<ABigGrammar>
932 struct ABigGrammarCases::case_<proto::tag::complement>
933 : proto::complement<ABigGrammar>
937 struct ABigGrammarCases::case_<proto::tag::plus>
938 : proto::plus<ABigGrammar, ABigGrammar>
942 struct ABigGrammarCases::case_<proto::tag::minus>
943 : proto::minus<ABigGrammar, ABigGrammar>
947 struct ABigGrammarCases::case_<proto::tag::multiplies>
948 : proto::multiplies<ABigGrammar, ABigGrammar>
952 struct ABigGrammarCases::case_<proto::tag::divides>
953 : proto::divides<ABigGrammar, ABigGrammar>
957 struct ABigGrammarCases::case_<proto::tag::modulus>
958 : proto::modulus<ABigGrammar, ABigGrammar>
961 // Define ABigGrammar in terms of ABigGrammarCases
962 // using proto::switch_<>
964 : proto::switch_<ABigGrammarCases>
967 Matching an expression type `E` against `proto::switch_<C>` is equivalent to matching it against `C::case_<E::proto_tag>`. By dispatching on the expression's tag type, we can jump to the sub-grammar that handles expressions of that type, skipping over all the other sub-grammars that couldn't possibly match. If there is no specialization of `case_<>` for a particular tag type, we select the primary template. In this case, the primary template inherits from `proto::not_<_>` which matches no expressions.
969 Notice the specialization that handles terminals:
971 // Terminal expressions are handled here
973 struct ABigGrammarCases::case_<proto::tag::terminal>
976 , proto::terminal<double>
980 The `proto::tag::terminal` type by itself isn't enough to select an appropriate sub-grammar, so we use _or_ to list the alternate sub-grammars that match terminals.
982 [note You might be tempted to define your `case_<>` specializations /in situ/ as follows:
985 struct ABigGrammarCases
987 template<typename Tag>
988 struct case_ : proto::not_<_> {};
990 // ERROR: not legal C++
992 struct case_<proto::tag::terminal>
997 Unfortunately, for arcane reasons, it is not legal to define an explicit nested specialization /in situ/ like this. It is, however, perfectly legal to define /partial/ specializations /in situ/, so you can add a extra dummy template parameter that has a default, as follows:
1000 struct ABigGrammarCases
1002 // Note extra "Dummy" template parameter here:
1003 template<typename Tag, int Dummy = 0>
1004 struct case_ : proto::not_<_> {};
1006 // OK: "Dummy" makes this a partial specialization
1007 // instead of an explicit specialization.
1009 struct case_<proto::tag::terminal, Dummy>
1014 You might find this cleaner than defining explicit `case_<>` specializations outside of their enclosing struct.
1019 [/==================================]
1020 [section Matching Vararg Expressions]
1021 [/==================================]
1023 Not all of C++'s overloadable operators are unary or binary. There is the oddball `operator()` -- the function call operator -- which can have any number of arguments. Likewise, with Proto you may define your own "operators" that could also take more that two arguments. As a result, there may be nodes in your Proto expression tree that have an arbitrary number of children (up to _MAX_ARITY_, which is configurable). How do you write a grammar to match such a node?
1025 For such cases, Proto provides the _vararg_ class template. Its template argument is a grammar, and the _vararg_ will match the grammar zero or more times. Consider a Proto lazy function called `fun()` that can take zero or more characters as arguments, as follows:
1028 struct FunTag : proto::terminal< fun_tag > {};
1029 FunTag::type const fun = {{}};
1037 Below is the grammar that matches all the allowable invocations of `fun()`:
1040 : proto::function< FunTag, proto::vararg< proto::terminal< char > > >
1043 The `FunCall` grammar uses _vararg_ to match zero or more character literals as arguments of the `fun()` function.
1045 As another example, can you guess what the following grammar matches?
1049 proto::terminal< proto::_ >
1050 , proto::nary_expr< proto::_, proto::vararg< Foo > >
1054 Here's a hint: the first template parameter to `proto::nary_expr<>` represents the node type, and any additional template parameters represent child nodes. The answer is that this is a degenerate grammar that matches every possible expression tree, from root to leaves.
1058 [/=============================]
1059 [section Defining EDSL Grammars]
1060 [/=============================]
1062 In this section we'll see how to use Proto to define a grammar for your EDSL and use it to validate expression templates, giving short, readable compile-time errors for invalid expressions.
1064 [tip You might think that this is a backwards way of doing things. ["If Proto let
1065 me select which operators to overload, my users wouldn't be able to create invalid
1066 expressions in the first place, and I wouldn't need a grammar at all!] That may be
1067 true, but there are reasons for preferring to do things this way.
1069 First, it lets you develop your EDSL rapidly -- all the operators are there for you
1070 already! -- and worry about invalid syntax later.
1072 Second, it might be the case that some operators are only allowed in certain
1073 contexts within your EDSL. This is easy to express with a grammar, and hard to do
1074 with straight operator overloading.
1076 Third, using an EDSL grammar to flag invalid expressions can often yield better
1077 errors than manually selecting the overloaded operators.
1079 Fourth, the grammar can be used for more than just validation. You can use your
1080 grammar to define ['tree transformations] that convert expression templates into
1081 other more useful objects.
1083 If none of the above convinces you, you actually /can/ use Proto to control which
1084 operators are overloaded within your domain. And to do it, you need to define a
1087 In a previous section, we used Proto to define an EDSL for a lazily evaluated calculator that allowed any combination of placeholders, floating-point literals, addition, subtraction, multiplication, division and grouping. If we were to write the grammar for this EDSL in [@http://en.wikipedia.org/wiki/Extended_Backus_Naur_Form EBNF], it might look like this:
1090 group ::= '(' expression ')'
1091 factor ::= double | '_1' | '_2' | group
1092 term ::= factor (('\*' factor) | ('/' factor))\*
1093 expression ::= term (('+' term) | ('-' term))*
1096 This captures the syntax, associativity and precedence rules of a calculator.
1097 Writing the grammar for our calculator EDSL using Proto is /even simpler/.
1098 Since we are using C++ as the host language, we are bound to the associativity
1099 and precedence rules for the C++ operators. Our grammar can assume them. Also,
1100 in C++ grouping is already handled for us with the use of parenthesis, so we
1101 don't have to code that into our grammar.
1103 Let's begin our grammar for forward-declaring it:
1105 struct CalculatorGrammar;
1107 It's an incomplete type at this point, but we'll still be able to use it to
1108 define the rules of our grammar. Let's define grammar rules for the terminals:
1111 : proto::terminal< proto::convertible_to< double > >
1115 : proto::terminal< placeholder<0> >
1119 : proto::terminal< placeholder<1> >
1123 : proto::or_< Double, Placeholder1, Placeholder2 >
1126 Now let's define the rules for addition, subtraction, multiplication and division.
1127 Here, we can ignore issues of associativity and precedence -- the C++ compiler will
1128 enforce that for us. We only must enforce that the arguments to the operators must
1129 themselves conform to the `CalculatorGrammar` that we forward-declared above.
1132 : proto::plus< CalculatorGrammar, CalculatorGrammar >
1136 : proto::minus< CalculatorGrammar, CalculatorGrammar >
1140 : proto::multiplies< CalculatorGrammar, CalculatorGrammar >
1144 : proto::divides< CalculatorGrammar, CalculatorGrammar >
1147 Now that we've defined all the parts of the grammar, we can define
1148 `CalculatorGrammar`:
1150 struct CalculatorGrammar
1160 That's it! Now we can use `CalculatorGrammar` to enforce that an expression
1161 template conforms to our grammar. We can use _matches_ and `BOOST_MPL_ASSERT()`
1162 to issue readable compile-time errors for invalid expressions, as below:
1164 template< typename Expr >
1165 void evaluate( Expr const & expr )
1167 BOOST_MPL_ASSERT(( proto::matches< Expr, CalculatorGrammar > ));