]> git.proxmox.com Git - ceph.git/blame - ceph/src/boost/libs/proto/doc/intermediate_form.qbk
bump version to 12.2.2-pve1
[ceph.git] / ceph / src / boost / libs / proto / doc / intermediate_form.qbk
CommitLineData
7c673cae
FG
1[/
2 / Copyright (c) 2007 Eric Niebler
3 /
4 / Distributed under the Boost Software License, Version 1.0. (See accompanying
5 / file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
6 /]
7
8
9[/talk about:
10
11 tag types
12 generator metafunctions
13 accessing child nodes
14 metafunctions for tag type and arity
15 deep_copy
16 flatten
17 debug utilities
18 introspection with grammars and proto::matches
19]
20
21[/================================================================================]
22[section:intermediate_form Intermediate Form:
23 Understanding and Introspecting Expressions]
24[/================================================================================]
25
26By now, you know a bit about how to build a front-end for your EDSL "compiler" -- you can define terminals and functions that generate expression templates. But we haven't said anything about the expression templates themselves. What do they look like? What can you do with them? In this section we'll see.
27
28[/=========================]
29[heading The [^expr<>] Type]
30[/=========================]
31
32All Proto expressions are an instantiation of a template called _expr_ (or a wrapper around such an instantiation). When we define a terminal as below, we are really initializing an instance of the _expr_ template.
33
34 // Define a placeholder type
35 template<int I>
36 struct placeholder
37 {};
38
39 // Define the Protofied placeholder terminal
40 proto::terminal< placeholder<0> >::type const _1 = {{}};
41
42The actual type of `_1` looks like this:
43
44 proto::expr< proto::tag::terminal, proto::term< placeholder<0> >, 0 >
45
46The _expr_ template is the most important type in Proto. Although you will
47rarely need to deal with it directly, it's always there behind the scenes
48holding your expression trees together. In fact, _expr_ /is/ the expression
49tree -- branches, leaves and all.
50
51The _expr_ template makes up the nodes in expression trees. The first template
52parameter is the node type; in this case, `proto::tag::terminal`. That means
53that `_1` is a leaf-node in the expression tree. The second template parameter
54is a list of child types, or in the case of terminals, the terminal's value
55type. Terminals will always have only one type in the type list. The last
56parameter is the arity of the expression. Terminals have arity 0, unary
57expressions have arity 1, etc.
58
59The _expr_ struct is defined as follows:
60
61 template< typename Tag, typename Args, long Arity = Args::arity >
62 struct expr;
63
64 template< typename Tag, typename Args >
65 struct expr< Tag, Args, 1 >
66 {
67 typedef typename Args::child0 proto_child0;
68 proto_child0 child0;
69 // ...
70 };
71
72The _expr_ struct does not define a constructor, or anything else that would
73prevent static initialization. All _expr_ objects are initialized using
74['aggregate initialization], with curly braces. In our example, `_1` is
75initialized with the initializer `{{}}`. The outer braces are the initializer
76for the _expr_ struct, and the inner braces are for the member `_1.child0` which
77is of type `placeholder<0>`. Note that we use braces to initialize `_1.child0`
78because `placeholder<0>` is also an aggregate.
79
80[/================================]
81[heading Building Expression Trees]
82[/================================]
83
84The `_1` node is an instantiation of _expr_, and expressions containing
85`_1` are also instantiations of _expr_. To use Proto effectively, you
86won't have to bother yourself with the actual types that Proto generates.
87These are details, but you're likely to encounter these types in compiler
88error messages, so it's helpful to be familiar with them. The types look
89like this:
90
91 // The type of the expression -_1
92 typedef
93 proto::expr<
94 proto::tag::negate
95 , proto::list1<
96 proto::expr<
97 proto::tag::terminal
98 , proto::term< placeholder<0> >
99 , 0
100 > const &
101 >
102 , 1
103 >
104 negate_placeholder_type;
105
106 negate_placeholder_type x = -_1;
107
108 // The type of the expression _1 + 42
109 typedef
110 proto::expr<
111 proto::tag::plus
112 , proto::list2<
113 proto::expr<
114 proto::tag::terminal
115 , proto::term< placeholder<0> >
116 , 0
117 > const &
118 , proto::expr<
119 proto::tag::terminal
120 , proto::term< int const & >
121 , 0
122 >
123 >
124 , 2
125 >
126 placeholder_plus_int_type;
127
128 placeholder_plus_int_type y = _1 + 42;
129
130There are a few things to note about these types:
131
132* Terminals have arity zero, unary expressions have arity one and binary
133 expressions have arity two.
134* When one Proto expression is made a child node of another Proto expression,
135 it is held by reference, ['even if it is a temporary object]. This last
136 point becomes important later.
137* Non-Proto expressions, such as the integer literal, are turned into Proto
138 expressions by wrapping them in new `expr<>` terminal objects. These new
139 wrappers are not themselves held by reference, but the object wrapped /is/.
140 Notice that the type of the Protofied `42` literal is `int const &` -- held
141 by reference.
142
143The types make it clear: everything in a Proto expression tree is held by
144reference. That means that building an expression tree is exceptionally cheap.
145It involves no copying at all.
146
147[note An astute reader will notice that the object `y` defined above will be
148left holding a dangling reference to a temporary int. In the sorts of
149high-performance applications Proto addresses, it is typical to build and
150evaluate an expression tree before any temporary objects go out of scope, so
151this dangling reference situation often doesn't arise, but it is certainly
152something to be aware of. Proto provides utilities for deep-copying expression
153trees so they can be passed around as value types without concern for dangling
154references.]
155
156[/========================================================]
157[section:left_right_child Accessing Parts of an Expression]
158[/========================================================]
159
160After assembling an expression into a tree, you'll naturally want to be
161able to do the reverse, and access a node's children. You may even want
162to be able to iterate over the children with algorithms from the
163Boost.Fusion library. This section shows how.
164
165[/==========================================]
166[heading Getting Expression Tags and Arities]
167[/==========================================]
168
169Every node in an expression tree has both a /tag/ type that describes the node, and an /arity/ corresponding to the number of child nodes it has. You can use the _tag_of_ and _arity_of_ metafunctions to fetch them. Consider the following:
170
171 template<typename Expr>
172 void check_plus_node(Expr const &)
173 {
174 // Assert that the tag type is proto::tag::plus
175 BOOST_STATIC_ASSERT((
176 boost::is_same<
177 typename proto::tag_of<Expr>::type
178 , proto::tag::plus
179 >::value
180 ));
181
182 // Assert that the arity is 2
183 BOOST_STATIC_ASSERT( proto::arity_of<Expr>::value == 2 );
184 }
185
186 // Create a binary plus node and use check_plus_node()
187 // to verify its tag type and arity:
188 check_plus_node( proto::lit(1) + 2 );
189
190For a given type `Expr`, you could access the tag and arity directly as `Expr::proto_tag` and `Expr::proto_arity`, where `Expr::proto_arity` is an MPL Integral Constant.
191
192[/==============================]
193[heading Getting Terminal Values]
194[/==============================]
195
196There is no simpler expression than a terminal, and no more basic operation than extracting its value. As we've already seen, that is what _value_ is for.
197
198 proto::terminal< std::ostream & >::type cout_ = {std::cout};
199
200 // Get the value of the cout_ terminal:
201 std::ostream & sout = proto::value( cout_ );
202
203 // Assert that we got back what we put in:
204 assert( &sout == &std::cout );
205
206To compute the return type of the _value_ function, you can use _result_of_value_. When the parameter to _result_of_value_ is a non-reference type, the result type of the metafunction is the type of the value as suitable for storage by value; that is, top-level reference and qualifiers are stripped from it. But when instantiated with a reference type, the result type has a reference /added/ to it, yielding a type suitable for storage by reference. If you want to know the actual type of the terminal's value including whether it is stored by value or reference, you can use `fusion::result_of::value_at<Expr, 0>::type`.
207
208The following table summarizes the above paragraph.
209
210[def _unless_ [footnote If `T` is a reference-to-function type, then the result type is simply `T`.]]
211
212[table Accessing Value Types
213 [[Metafunction Invocation][When the Value Type Is ...][The Result Is ...]]
214 [[`proto::result_of::value<Expr>::type`][`T`][``typename boost::remove_const<
215 typename boost::remove_reference<T>::type
216>::type _unless_``]]
217 [[`proto::result_of::value<Expr &>::type`][`T`][``typename boost::add_reference<T>::type``]]
218 [[`proto::result_of::value<Expr const &>::type`][`T`][``typename boost::add_reference<
219 typename boost::add_const<T>::type
220>::type``]]
221 [[`fusion::result_of::value_at<Expr, 0>::type`][`T`][`T`]]
222]
223
224[/================================]
225[heading Getting Child Expressions]
226[/================================]
227
228Each non-terminal node in an expression tree corresponds to an operator in an expression, and the children correspond to the operands, or arguments of the operator. To access them, you can use the _child_c_ function template, as demonstrated below:
229
230 proto::terminal<int>::type i = {42};
231
232 // Get the 0-th operand of an addition operation:
233 proto::terminal<int>::type &ri = proto::child_c<0>( i + 2 );
234
235 // Assert that we got back what we put in:
236 assert( &i == &ri );
237
238You can use the _result_of_child_c_ metafunction to get the type of the Nth child of an expression node. Usually you don't care to know whether a child is stored by value or by reference, so when you ask for the type of the Nth child of an expression `Expr` (where `Expr` is not a reference type), you get the child's type after references and cv-qualifiers have been stripped from it.
239
240 template<typename Expr>
241 void test_result_of_child_c(Expr const &expr)
242 {
243 typedef typename proto::result_of::child_c<Expr, 0>::type type;
244
245 // Since Expr is not a reference type,
246 // result_of::child_c<Expr, 0>::type is a
247 // non-cv qualified, non-reference type:
248 BOOST_MPL_ASSERT((
249 boost::is_same< type, proto::terminal<int>::type >
250 ));
251 }
252
253 // ...
254 proto::terminal<int>::type i = {42};
255 test_result_of_child_c( i + 2 );
256
257However, if you ask for the type of the Nth child of `Expr &` or `Expr const &`
258(note the reference), the result type will be a reference, regardless of whether
259the child is actually stored by reference or not. If you need to know exactly
260how the child is stored in the node, whether by reference or by value, you can
261use `fusion::result_of::value_at<Expr, N>::type`. The following table summarizes
262the behavior of the _result_of_child_c_ metafunction.
263
264[table Accessing Child Types
265 [[Metafunction Invocation][When the Child Is ...][The Result Is ...]]
266 [[`proto::result_of::child_c<Expr, N>::type`][`T`][``typename boost::remove_const<
267 typename boost::remove_reference<T>::type
268>::type``]]
269 [[`proto::result_of::child_c<Expr &, N>::type`][`T`][``typename boost::add_reference<T>::type``]]
270 [[`proto::result_of::child_c<Expr const &, N>::type`][`T`][``typename boost::add_reference<
271 typename boost::add_const<T>::type
272>::type``]]
273 [[`fusion::result_of::value_at<Expr, N>::type`][`T`][`T`]]
274]
275
276[/=======================]
277[heading Common Shortcuts]
278[/=======================]
279
280Most operators in C++ are unary or binary, so accessing the only operand, or the left and right operands, are very common operations. For this reason, Proto provides the _child_, _left_, and _right_ functions. _child_ and _left_ are synonymous with `proto::child_c<0>()`, and _right_ is synonymous with `proto::child_c<1>()`.
281
282There are also _result_of_child_, _result_of_left_, and _result_of_right_ metafunctions that merely forward to their _result_of_child_c_ counterparts.
283
284[endsect]
285
286[/===============================]
287[section Deep-copying Expressions]
288[/===============================]
289
290When you build an expression template with Proto, all the intermediate child nodes are held /by reference/. The avoids needless copies, which is crucial if you want your EDSL to perform well at runtime. Naturally, there is a danger if the temporary objects go out of scope before you try to evaluate your expression template. This is especially a problem in C++0x with the new `decltype` and `auto` keywords. Consider:
291
292 // OOPS: "ex" is left holding dangling references
293 auto ex = proto::lit(1) + 2;
294
295The problem can happen in today's C++ also if you use `BOOST_TYPEOF()` or `BOOST_AUTO()`, or if you try to pass an expression template outside the scope of its constituents.
296
297In these cases, you want to deep-copy your expression template so that all intermediate nodes and the terminals are held /by value/. That way, you can safely assign the expression template to a local variable or return it from a function without worrying about dangling references. You can do this with _deep_copy_ as fo
298llows:
299
300 // OK, "ex" has no dangling references
301 auto ex = proto::deep_copy( proto::lit(1) + 2 );
302
303If you are using _typeof_, it would look like this:
304
305 // OK, use BOOST_AUTO() and proto::deep_copy() to
306 // store an expression template in a local variable
307 BOOST_AUTO( ex, proto::deep_copy( proto::lit(1) + 2 ) );
308
309For the above code to work, you must include the [headerref boost/proto/proto_typeof.hpp] header, which also defines the _AUTO_ macro which automatically deep-copies its argument. With _AUTO_, the above code can be writen as:
310
311 // OK, BOOST_PROTO_AUTO() automatically deep-copies
312 // its argument:
313 BOOST_PROTO_AUTO( ex, proto::lit(1) + 2 );
314
315When deep-copying an expression tree, all intermediate nodes and all terminals are stored by value. The only exception is terminals that are function references, which are left alone.
316
317[note _deep_copy_ makes no exception for arrays, which it stores by value. That can potentially cause a large amount of data to be copied.]
318
319[endsect]
320
321[/============================]
322[section Debugging Expressions]
323[/============================]
324
325Proto provides a utility for pretty-printing expression trees that comes in very handy when you're trying to debug your EDSL. It's called _display_expr_, and you pass it the expression to print and optionally, an `std::ostream` to which to send the output. Consider:
326
327 // Use display_expr() to pretty-print an expression tree
328 proto::display_expr(
329 proto::lit("hello") + 42
330 );
331
332The above code writes this to `std::cout`:
333
334[pre plus(
335 terminal(hello)
336 , terminal(42)
337)]
338
339In order to call _display_expr_, all the terminals in the expression must be Streamable (that is, they can be written to a `std::ostream`). In addition, the tag types must all be Streamable as well. Here is an example that includes a custom terminal type and a custom tag:
340
341 // A custom tag type that is Streamable
342 struct MyTag
343 {
344 friend std::ostream &operator<<(std::ostream &s, MyTag)
345 {
346 return s << "MyTag";
347 }
348 };
349
350 // Some other Streamable type
351 struct MyTerminal
352 {
353 friend std::ostream &operator<<(std::ostream &s, MyTerminal)
354 {
355 return s << "MyTerminal";
356 }
357 };
358
359 int main()
360 {
361 // Display an expression tree that contains a custom
362 // tag and a user-defined type in a terminal
363 proto::display_expr(
364 proto::make_expr<MyTag>(MyTerminal()) + 42
365 );
366 }
367
368The above code prints the following:
369
370[pre plus(
371 MyTag(
372 terminal(MyTerminal)
373 )
374 , terminal(42)
375)]
376
377[endsect]
378
379[/=============================================================]
380[section:tags_and_metafunctions Operator Tags and Metafunctions]
381[/=============================================================]
382
383The following table lists the overloadable C++ operators, the Proto tag types for each, and the name of the metafunctions for generating the corresponding Proto expression types. And as we'll see later, the metafunctions are also usable as grammars for matching such nodes, as well as pass-through transforms.
384
385[table Operators, Tags and Metafunctions
386 [[Operator]
387 [Proto Tag]
388 [Proto Metafunction]]
389
390 [[unary `+`]
391 [`proto::tag::unary_plus`]
392 [`proto::unary_plus<>`]]
393
394 [[unary `-`]
395 [`proto::tag::negate`]
396 [`proto::negate<>`]]
397
398 [[unary `*`]
399 [`proto::tag::dereference`]
400 [`proto::dereference<>`]]
401
402 [[unary `~`]
403 [`proto::tag::complement`]
404 [`proto::complement<>`]]
405
406 [[unary `&`]
407 [`proto::tag::address_of`]
408 [`proto::address_of<>`]]
409
410 [[unary `!`]
411 [`proto::tag::logical_not`]
412 [`proto::logical_not<>`]]
413
414 [[unary prefix `++`]
415 [`proto::tag::pre_inc`]
416 [`proto::pre_inc<>`]]
417
418 [[unary prefix `--`]
419 [`proto::tag::pre_dec`]
420 [`proto::pre_dec<>`]]
421
422 [[unary postfix `++`]
423 [`proto::tag::post_inc`]
424 [`proto::post_inc<>`]]
425
426 [[unary postfix `--`]
427 [`proto::tag::post_dec`]
428 [`proto::post_dec<>`]]
429
430 [[binary `<<`]
431 [`proto::tag::shift_left`]
432 [`proto::shift_left<>`]]
433
434 [[binary `>>`]
435 [`proto::tag::shift_right`]
436 [`proto::shift_right<>`]]
437
438 [[binary `*`]
439 [`proto::tag::multiplies`]
440 [`proto::multiplies<>`]]
441
442 [[binary `/`]
443 [`proto::tag::divides`]
444 [`proto::divides<>`]]
445
446 [[binary `%`]
447 [`proto::tag::modulus`]
448 [`proto::modulus<>`]]
449
450 [[binary `+`]
451 [`proto::tag::plus`]
452 [`proto::plus<>`]]
453
454 [[binary `-`]
455 [`proto::tag::minus`]
456 [`proto::minus<>`]]
457
458 [[binary `<`]
459 [`proto::tag::less`]
460 [`proto::less<>`]]
461
462 [[binary `>`]
463 [`proto::tag::greater`]
464 [`proto::greater<>`]]
465
466 [[binary `<=`]
467 [`proto::tag::less_equal`]
468 [`proto::less_equal<>`]]
469
470 [[binary `>=`]
471 [`proto::tag::greater_equal`]
472 [`proto::greater_equal<>`]]
473
474 [[binary `==`]
475 [`proto::tag::equal_to`]
476 [`proto::equal_to<>`]]
477
478 [[binary `!=`]
479 [`proto::tag::not_equal_to`]
480 [`proto::not_equal_to<>`]]
481
482 [[binary `||`]
483 [`proto::tag::logical_or`]
484 [`proto::logical_or<>`]]
485
486 [[binary `&&`]
487 [`proto::tag::logical_and`]
488 [`proto::logical_and<>`]]
489
490 [[binary `&`]
491 [`proto::tag::bitwise_and`]
492 [`proto::bitwise_and<>`]]
493
494 [[binary `|`]
495 [`proto::tag::bitwise_or`]
496 [`proto::bitwise_or<>`]]
497
498 [[binary `^`]
499 [`proto::tag::bitwise_xor`]
500 [`proto::bitwise_xor<>`]]
501
502 [[binary `,`]
503 [`proto::tag::comma`]
504 [`proto::comma<>`]]
505
506 [[binary `->*`]
507 [`proto::tag::mem_ptr`]
508 [`proto::mem_ptr<>`]]
509
510 [[binary `=`]
511 [`proto::tag::assign`]
512 [`proto::assign<>`]]
513
514 [[binary `<<=`]
515 [`proto::tag::shift_left_assign`]
516 [`proto::shift_left_assign<>`]]
517
518 [[binary `>>=`]
519 [`proto::tag::shift_right_assign`]
520 [`proto::shift_right_assign<>`]]
521
522 [[binary `*=`]
523 [`proto::tag::multiplies_assign`]
524 [`proto::multiplies_assign<>`]]
525
526 [[binary `/=`]
527 [`proto::tag::divides_assign`]
528 [`proto::divides_assign<>`]]
529
530 [[binary `%=`]
531 [`proto::tag::modulus_assign`]
532 [`proto::modulus_assign<>`]]
533
534 [[binary `+=`]
535 [`proto::tag::plus_assign`]
536 [`proto::plus_assign<>`]]
537
538 [[binary `-=`]
539 [`proto::tag::minus_assign`]
540 [`proto::minus_assign<>`]]
541
542 [[binary `&=`]
543 [`proto::tag::bitwise_and_assign`]
544 [`proto::bitwise_and_assign<>`]]
545
546 [[binary `|=`]
547 [`proto::tag::bitwise_or_assign`]
548 [`proto::bitwise_or_assign<>`]]
549
550 [[binary `^=`]
551 [`proto::tag::bitwise_xor_assign`]
552 [`proto::bitwise_xor_assign<>`]]
553
554 [[binary subscript]
555 [`proto::tag::subscript`]
556 [`proto::subscript<>`]]
557
558 [[ternary `?:`]
559 [`proto::tag::if_else_`]
560 [`proto::if_else_<>`]]
561
562 [[n-ary function call]
563 [`proto::tag::function`]
564 [`proto::function<>`]]
565]
566
567[endsect]
568
569[/======================================]
570[section Expressions as Fusion Sequences]
571[/======================================]
572
573Boost.Fusion is a library of iterators, algorithms, containers and adaptors for manipulating heterogeneous sequences. In essence, a Proto expression is just a heterogeneous sequence of its child expressions, and so Proto expressions are valid Fusion random-access sequences. That means you can apply Fusion algorithms to them, transform them, apply Fusion filters and views to them, and access their elements using `fusion::at()`. The things Fusion can do to heterogeneous sequences are beyond the scope of this users' guide, but below is a simple example. It takes a lazy function invocation like `fun(1,2,3,4)` and uses Fusion to print the function arguments in order.
574
575 struct display
576 {
577 template<typename T>
578 void operator()(T const &t) const
579 {
580 std::cout << t << std::endl;
581 }
582 };
583
584 struct fun_t {};
585 proto::terminal<fun_t>::type const fun = {{}};
586
587 // ...
588 fusion::for_each(
589 fusion::transform(
590 // pop_front() removes the "fun" child
591 fusion::pop_front(fun(1,2,3,4))
592 // Extract the ints from the terminal nodes
593 , proto::functional::value()
594 )
595 , display()
596 );
597
598Recall from the Introduction that types in the `proto::functional` namespace
599define function objects that correspond to Proto's free functions. So
600`proto::functional::value()` creates a function object that is equivalent to
601the `proto::value()` function. The above invocation of `fusion::for_each()`
602displays the following:
603
604[pre
6051
6062
6073
6084
609]
610
611Terminals are also valid Fusion sequences. They contain exactly one element: their value.
612
613[/========================================]
614[heading Flattening Proto Expression Tress]
615[/========================================]
616
617Imagine a slight variation of the above example where, instead of iterating over the arguments of a lazy function invocation, we would like to iterate over the terminals in an addition expression:
618
619 proto::terminal<int>::type const _1 = {1};
620
621 // ERROR: this doesn't work! Why?
622 fusion::for_each(
623 fusion::transform(
624 _1 + 2 + 3 + 4
625 , proto::functional::value()
626 )
627 , display()
628 );
629
630The reason this doesn't work is because the expression `_1 + 2 + 3 + 4` does not describe a flat sequence of terminals --- it describes a binary tree. We can treat it as a flat sequence of terminals, however, using Proto's _flatten_ function. _flatten_ returns a view which makes a tree appear as a flat Fusion sequence. If the top-most node has a tag type `T`, then the elements of the flattened sequence are the child nodes that do /not/ have tag type `T`. This process is evaluated recursively. So the above can correctly be written as:
631
632 proto::terminal<int>::type const _1 = {1};
633
634 // OK, iterate over a flattened view
635 fusion::for_each(
636 fusion::transform(
637 proto::flatten(_1 + 2 + 3 + 4)
638 , proto::functional::value()
639 )
640 , display()
641 );
642
643The above invocation of `fusion::for_each()` displays the following:
644
645[pre
6461
6472
6483
6494
650]
651
652[endsect]
653
654[/============================================================================]
655[section:expression_introspection Expression Introspection: Defining a Grammar]
656[/============================================================================]
657
658Expression trees can have a very rich and complicated structure. Often, you need to know some things about an expression's structure before you can process it. This section describes the tools Proto provides for peering inside an expression tree and discovering its structure. And as you'll see in later sections, all the really interesting things you can do with Proto begin right here.
659
660[/===============================================]
661[section:patterns Finding Patterns in Expressions]
662[/===============================================]
663
664Imagine your EDSL is a miniature I/O facility, with iostream operations that execute lazily. You might want expressions representing input operations to be processed by one function, and output operations to be processed by a different function. How would you do that?
665
666The answer is to write patterns (a.k.a, /grammars/) that match the structure of input and output expressions. Proto provides utilities for defining the grammars, and the _matches_ template for checking whether a given expression type matches the grammar.
667
668First, let's define some terminals we can use in our lazy I/O expressions:
669
670 proto::terminal< std::istream & >::type cin_ = { std::cin };
671 proto::terminal< std::ostream & >::type cout_ = { std::cout };
672
673Now, we can use `cout_` instead of `std::cout`, and get I/O expression trees that we can execute later. To define grammars that match input and output expressions of the form `cin_ >> i` and `cout_ << 1` we do this:
674
675 struct Input
676 : proto::shift_right< proto::terminal< std::istream & >, proto::_ >
677 {};
678
679 struct Output
680 : proto::shift_left< proto::terminal< std::ostream & >, proto::_ >
681 {};
682
683We've seen the template `proto::terminal<>` before, but here we're using it
684without accessing the nested `::type`. When used like this, it is a very simple
685grammar, as are `proto::shift_right<>` and `proto::shift_left<>`. The newcomer
686here is `_` in the `proto` namespace. It is a wildcard that matches anything.
687The `Input` struct is a grammar that matches any right-shift expression that
688has a `std::istream` terminal as its left operand.
689
690We can use these grammars together with the _matches_ template to query at
691compile time whether a given I/O expression type is an input or output
692operation. Consider the following:
693
694 template< typename Expr >
695 void input_output( Expr const & expr )
696 {
697 if( proto::matches< Expr, Input >::value )
698 {
699 std::cout << "Input!\n";
700 }
701
702 if( proto::matches< Expr, Output >::value )
703 {
704 std::cout << "Output!\n";
705 }
706 }
707
708 int main()
709 {
710 int i = 0;
711 input_output( cout_ << 1 );
712 input_output( cin_ >> i );
713
714 return 0;
715 }
716
717This program prints the following:
718
719[pre
720Output!
721Input!
722]
723
724If we wanted to break the `input_output()` function into two functions, one that handles input expressions and one for output expressions, we can use `boost::enable_if<>`, as follows:
725
726 template< typename Expr >
727 typename boost::enable_if< proto::matches< Expr, Input > >::type
728 input_output( Expr const & expr )
729 {
730 std::cout << "Input!\n";
731 }
732
733 template< typename Expr >
734 typename boost::enable_if< proto::matches< Expr, Output > >::type
735 input_output( Expr const & expr )
736 {
737 std::cout << "Output!\n";
738 }
739
740This works as the previous version did. However, the following does not compile at all:
741
742 input_output( cout_ << 1 << 2 ); // oops!
743
744What's wrong? The problem is that this expression does not match our grammar. The expression groups as if it were written like `(cout_ << 1) << 2`. It will not match the `Output` grammar, which expects the left operand to be a terminal, not another left-shift operation. We need to fix the grammar.
745
746We notice that in order to verify an expression as input or output, we'll need to recurse down to the bottom-left-most leaf and check that it is a `std::istream` or `std::ostream`. When we get to the terminal, we must stop recursing. We can express this in our grammar using _or_. Here are the correct `Input` and `Output` grammars:
747
748 struct Input
749 : proto::or_<
750 proto::shift_right< proto::terminal< std::istream & >, proto::_ >
751 , proto::shift_right< Input, proto::_ >
752 >
753 {};
754
755 struct Output
756 : proto::or_<
757 proto::shift_left< proto::terminal< std::ostream & >, proto::_ >
758 , proto::shift_left< Output, proto::_ >
759 >
760 {};
761
762This may look a little odd at first. We seem to be defining the `Input` and `Output` types in terms of themselves. This is perfectly OK, actually. At the point in the grammar that the `Input` and `Output` types are being used, they are /incomplete/, but by the time we actually evaluate the grammar with _matches_, the types will be complete. These are recursive grammars, and rightly so because they must match a recursive data structure!
763
764Matching an expression such as `cout_ << 1 << 2` against the `Output` grammar procedes as follows:
765
766# The first alternate of the _or_ is tried first. It will fail, because the
767 expression `cout_ << 1 << 2` does not match the grammar `proto::shift_left<
768 proto::terminal< std::ostream & >, proto::_ >`.
769# Then the second alternate is tried next. We match the expression against
770 `proto::shift_left< Output, proto::_ >`. The expression is a left-shift, so we
771 next try to match the operands.
772# The right operand `2` matches `proto::_` trivially.
773# To see if the left operand `cout_ << 1` matches `Output`, we must recursively
774 evaluate the `Output` grammar. This time we succeed, because `cout_ << 1` will
775 match the first alternate of the _or_.
776
777We're done -- the grammar matches successfully.
778
779[endsect]
780
781[/===========================================]
782[section Fuzzy and Exact Matches of Terminals]
783[/===========================================]
784
785The terminals in an expression tree could be const or non-const references, or they might not be references at all. When writing grammars, you usually don't have to worry about it because _matches_ gives you a little wiggle room when matching terminals. A grammar such as `proto::terminal<int>` will match a terminal of type `int`, `int &`, or `int const &`.
786
787You can explicitly specify that you want to match a reference type. If you do, the type must match exactly. For instance, a grammar such as `proto::terminal<int &>` will only match an `int &`. It will not match an `int` or an `int const &`.
788
789The table below shows how Proto matches terminals. The simple rule is: if you want to match only reference types, you must specify the reference in your grammar. Otherwise, leave it off and Proto will ignore const and references.
790
791[table proto::matches<> and Reference / CV-Qualification of Terminals
792 [[Terminal] [Grammar] [Matches?]]
793 [[T] [T] [yes]]
794 [[T &] [T] [yes]]
795 [[T const &] [T] [yes]]
796 [[T] [T &] [no]]
797 [[T &] [T &] [yes]]
798 [[T const &] [T &] [no]]
799 [[T] [T const &] [no]]
800 [[T &] [T const &] [no]]
801 [[T const &] [T const &] [yes]]
802]
803
804This begs the question: What if you want to match an `int`, but not an `int &` or an `int const &`? For forcing exact matches, Proto provides the _exact_ template. For instance, `proto::terminal< proto::exact<int> >` would only match an `int` held by value.
805
806Proto gives you extra wiggle room when matching array types. Array types match themselves or the pointer types they decay to. This is especially useful with character arrays. The type returned by `proto::as_expr("hello")` is `proto::terminal<char const[6]>::type`. That's a terminal containing a 6-element character array. Naturally, you can match this terminal with the grammar `proto::terminal<char const[6]>`, but the grammar `proto::terminal<char const *>` will match it as well, as the following code fragment illustrates.
807
808 struct CharString
809 : proto::terminal< char const * >
810 {};
811
812 typedef proto::terminal< char const[6] >::type char_array;
813
814 BOOST_MPL_ASSERT(( proto::matches< char_array, CharString > ));
815
816What if we only wanted `CharString` to match terminals of exactly the type `char const *`? You can use _exact_ here to turn off the fuzzy matching of terminals, as follows:
817
818 struct CharString
819 : proto::terminal< proto::exact< char const * > >
820 {};
821
822 typedef proto::terminal<char const[6]>::type char_array;
823 typedef proto::terminal<char const *>::type char_string;
824
825 BOOST_MPL_ASSERT(( proto::matches< char_string, CharString > ));
826 BOOST_MPL_ASSERT_NOT(( proto::matches< char_array, CharString > ));
827
828Now, `CharString` does not match array types, only character string pointers.
829
830The inverse problem is a little trickier: what if you wanted to match all character arrays, but not character pointers? As mentioned above, the expression `as_expr("hello")` has the type `proto::terminal< char const[ 6 ] >::type`. If you wanted to match character arrays of arbitrary size, you could use `proto::N`, which is an array-size wildcard. The following grammar would match any string literal: `proto::terminal< char const[ proto::N ] >`.
831
832Sometimes you need even more wiggle room when matching terminals. For example, maybe you're building a calculator EDSL and you want to allow any terminals that are convertible to `double`. For that, Proto provides the _convertible_to_ template. You can use it as: `proto::terminal< proto::convertible_to< double > >`.
833
834There is one more way you can perform a fuzzy match on terminals. Consider the problem of trying to match a `std::complex<>` terminal. You can easily match a `std::complex<float>` or a `std::complex<double>`, but how would you match any instantiation of `std::complex<>`? You can use `proto::_` here to solve this problem. Here is the grammar to match any `std::complex<>` instantiation:
835
836 struct StdComplex
837 : proto::terminal< std::complex< proto::_ > >
838 {};
839
840When given a grammar like this, Proto will deconstruct the grammar and the terminal it is being matched against and see if it can match all the constituents.
841
842[endsect]
843
844[/====================================================]
845[section:if_and_not [^if_<>], [^and_<>], and [^not_<>]]
846[/====================================================]
847
848We've already seen how to use expression generators like `proto::terminal<>` and `proto::shift_right<>` as grammars. We've also seen _or_, which we can use to express a set of alternate grammars. There are a few others of interest; in particular, _if_, _and_ and _not_.
849
850The _not_ template is the simplest. It takes a grammar as a template parameter and logically negates it; `not_<Grammar>` will match any expression that `Grammar` does /not/ match.
851
852The _if_ template is used together with a Proto transform that is evaluated against expression types to find matches. (Proto transforms will be described later.)
853
854The _and_ template is like _or_, except that each argument of the _and_ must match in order for the _and_ to match. As an example, consider the definition of `CharString` above that uses _exact_. It could have been written without _exact_ as follows:
855
856 struct CharString
857 : proto::and_<
858 proto::terminal< proto::_ >
859 , proto::if_< boost::is_same< proto::_value, char const * >() >
860 >
861 {};
862
863This says that a `CharString` must be a terminal, /and/ its value type must be the same as `char const *`. Notice the template argument of _if_: `boost::is_same< proto::_value, char const * >()`. This is Proto transform that compares the value type of a terminal to `char const *`.
864
865The _if_ template has a couple of variants. In addition to `if_<Condition>` you can also say `if_<Condition, ThenGrammar>` and `if_<Condition, ThenGrammar, ElseGrammar>`. These let you select one sub-grammar or another based on the `Condition`.
866
867[endsect]
868
869[/=======================================================]
870[section:switch Improving Compile Times With [^switch_<>]]
871[/=======================================================]
872
873When your Proto grammar gets large, you'll start to run into some scalability problems with _or_, the construct you use to specify alternate sub-grammars. First, due to limitations in C++, _or_ can only accept up to a certain number of sub-grammars, controlled by the `BOOST_PROTO_MAX_LOGICAL_ARITY` macro. This macro defaults to eight, and you can set it higher, but doing so will aggravate another scalability problem: long compile times. With _or_, alternate sub-grammars are tried in order -- like a series of cascading `if`'s -- leading to lots of unnecessary template instantiations. What you would prefer instead is something like `switch` that avoids the expense of cascading `if`'s. That's the purpose of _switch_; although less convenient than _or_, it improves compile times for larger grammars and does not have an arbitrary fixed limit on the number of sub-grammars.
874
875Let's illustrate how to use _switch_ by first writing a big grammar with _or_ and then translating it to an equivalent grammar using _switch_:
876
877 // Here is a big, inefficient grammar
878 struct ABigGrammar
879 : proto::or_<
880 proto::terminal<int>
881 , proto::terminal<double>
882 , proto::unary_plus<ABigGrammar>
883 , proto::negate<ABigGrammar>
884 , proto::complement<ABigGrammar>
885 , proto::plus<ABigGrammar, ABigGrammar>
886 , proto::minus<ABigGrammar, ABigGrammar>
887 , proto::or_<
888 proto::multiplies<ABigGrammar, ABigGrammar>
889 , proto::divides<ABigGrammar, ABigGrammar>
890 , proto::modulus<ABigGrammar, ABigGrammar>
891 >
892 >
893 {};
894
895The above might be the grammar to a more elaborate calculator EDSL. Notice that since there are more than eight sub-grammars, we had to chain the sub-grammars with a nested _or_ -- not very nice.
896
897The idea behind _switch_ is to dispatch based on an expression's tag type to a sub-grammar that handles expressions of that type. To use _switch_, you define a struct with a nested `case_<>` template, specialized on tag types. The above grammar can be expressed using _switch_ as follows. It is described below.
898
899 // Redefine ABigGrammar more efficiently using proto::switch_<>
900 struct ABigGrammar;
901
902 struct ABigGrammarCases
903 {
904 // The primary template matches nothing:
905 template<typename Tag>
906 struct case_
907 : proto::not_<_>
908 {};
909 };
910
911 // Terminal expressions are handled here
912 template<>
913 struct ABigGrammarCases::case_<proto::tag::terminal>
914 : proto::or_<
915 proto::terminal<int>
916 , proto::terminal<double>
917 >
918 {};
919
920 // Non-terminals are handled similarly
921 template<>
922 struct ABigGrammarCases::case_<proto::tag::unary_plus>
923 : proto::unary_plus<ABigGrammar>
924 {};
925
926 template<>
927 struct ABigGrammarCases::case_<proto::tag::negate>
928 : proto::negate<ABigGrammar>
929 {};
930
931 template<>
932 struct ABigGrammarCases::case_<proto::tag::complement>
933 : proto::complement<ABigGrammar>
934 {};
935
936 template<>
937 struct ABigGrammarCases::case_<proto::tag::plus>
938 : proto::plus<ABigGrammar, ABigGrammar>
939 {};
940
941 template<>
942 struct ABigGrammarCases::case_<proto::tag::minus>
943 : proto::minus<ABigGrammar, ABigGrammar>
944 {};
945
946 template<>
947 struct ABigGrammarCases::case_<proto::tag::multiplies>
948 : proto::multiplies<ABigGrammar, ABigGrammar>
949 {};
950
951 template<>
952 struct ABigGrammarCases::case_<proto::tag::divides>
953 : proto::divides<ABigGrammar, ABigGrammar>
954 {};
955
956 template<>
957 struct ABigGrammarCases::case_<proto::tag::modulus>
958 : proto::modulus<ABigGrammar, ABigGrammar>
959 {};
960
961 // Define ABigGrammar in terms of ABigGrammarCases
962 // using proto::switch_<>
963 struct ABigGrammar
964 : proto::switch_<ABigGrammarCases>
965 {};
966
967Matching an expression type `E` against `proto::switch_<C>` is equivalent to matching it against `C::case_<E::proto_tag>`. By dispatching on the expression's tag type, we can jump to the sub-grammar that handles expressions of that type, skipping over all the other sub-grammars that couldn't possibly match. If there is no specialization of `case_<>` for a particular tag type, we select the primary template. In this case, the primary template inherits from `proto::not_<_>` which matches no expressions.
968
969Notice the specialization that handles terminals:
970
971 // Terminal expressions are handled here
972 template<>
973 struct ABigGrammarCases::case_<proto::tag::terminal>
974 : proto::or_<
975 proto::terminal<int>
976 , proto::terminal<double>
977 >
978 {};
979
980The `proto::tag::terminal` type by itself isn't enough to select an appropriate sub-grammar, so we use _or_ to list the alternate sub-grammars that match terminals.
981
982[note You might be tempted to define your `case_<>` specializations /in situ/ as follows:
983
984``
985 struct ABigGrammarCases
986 {
987 template<typename Tag>
988 struct case_ : proto::not_<_> {};
989
990 // ERROR: not legal C++
991 template<>
992 struct case_<proto::tag::terminal>
993 /* ... */
994 };
995``
996
997Unfortunately, for arcane reasons, it is not legal to define an explicit nested specialization /in situ/ like this. It is, however, perfectly legal to define /partial/ specializations /in situ/, so you can add a extra dummy template parameter that has a default, as follows:
998
999``
1000 struct ABigGrammarCases
1001 {
1002 // Note extra "Dummy" template parameter here:
1003 template<typename Tag, int Dummy = 0>
1004 struct case_ : proto::not_<_> {};
1005
1006 // OK: "Dummy" makes this a partial specialization
1007 // instead of an explicit specialization.
1008 template<int Dummy>
1009 struct case_<proto::tag::terminal, Dummy>
1010 /* ... */
1011 };
1012``
1013
1014You might find this cleaner than defining explicit `case_<>` specializations outside of their enclosing struct.
1015]
1016
1017[endsect]
1018
1019[/==================================]
1020[section Matching Vararg Expressions]
1021[/==================================]
1022
1023Not all of C++'s overloadable operators are unary or binary. There is the oddball `operator()` -- the function call operator -- which can have any number of arguments. Likewise, with Proto you may define your own "operators" that could also take more that two arguments. As a result, there may be nodes in your Proto expression tree that have an arbitrary number of children (up to _MAX_ARITY_, which is configurable). How do you write a grammar to match such a node?
1024
1025For such cases, Proto provides the _vararg_ class template. Its template argument is a grammar, and the _vararg_ will match the grammar zero or more times. Consider a Proto lazy function called `fun()` that can take zero or more characters as arguments, as follows:
1026
1027 struct fun_tag {};
1028 struct FunTag : proto::terminal< fun_tag > {};
1029 FunTag::type const fun = {{}};
1030
1031 // example usage:
1032 fun();
1033 fun('a');
1034 fun('a', 'b');
1035 ...
1036
1037Below is the grammar that matches all the allowable invocations of `fun()`:
1038
1039 struct FunCall
1040 : proto::function< FunTag, proto::vararg< proto::terminal< char > > >
1041 {};
1042
1043The `FunCall` grammar uses _vararg_ to match zero or more character literals as arguments of the `fun()` function.
1044
1045As another example, can you guess what the following grammar matches?
1046
1047 struct Foo
1048 : proto::or_<
1049 proto::terminal< proto::_ >
1050 , proto::nary_expr< proto::_, proto::vararg< Foo > >
1051 >
1052 {};
1053
1054Here's a hint: the first template parameter to `proto::nary_expr<>` represents the node type, and any additional template parameters represent child nodes. The answer is that this is a degenerate grammar that matches every possible expression tree, from root to leaves.
1055
1056[endsect]
1057
1058[/=============================]
1059[section Defining EDSL Grammars]
1060[/=============================]
1061
1062In this section we'll see how to use Proto to define a grammar for your EDSL and use it to validate expression templates, giving short, readable compile-time errors for invalid expressions.
1063
1064[tip You might think that this is a backwards way of doing things. ["If Proto let
1065me select which operators to overload, my users wouldn't be able to create invalid
1066expressions in the first place, and I wouldn't need a grammar at all!] That may be
1067true, but there are reasons for preferring to do things this way.
1068
1069First, it lets you develop your EDSL rapidly -- all the operators are there for you
1070already! -- and worry about invalid syntax later.
1071
1072Second, it might be the case that some operators are only allowed in certain
1073contexts within your EDSL. This is easy to express with a grammar, and hard to do
1074with straight operator overloading.
1075
1076Third, using an EDSL grammar to flag invalid expressions can often yield better
1077errors than manually selecting the overloaded operators.
1078
1079Fourth, the grammar can be used for more than just validation. You can use your
1080grammar to define ['tree transformations] that convert expression templates into
1081other more useful objects.
1082
1083If none of the above convinces you, you actually /can/ use Proto to control which
1084operators are overloaded within your domain. And to do it, you need to define a
1085grammar!]
1086
1087In a previous section, we used Proto to define an EDSL for a lazily evaluated calculator that allowed any combination of placeholders, floating-point literals, addition, subtraction, multiplication, division and grouping. If we were to write the grammar for this EDSL in [@http://en.wikipedia.org/wiki/Extended_Backus_Naur_Form EBNF], it might look like this:
1088
1089[pre
1090group ::= '(' expression ')'
1091factor ::= double | '_1' | '_2' | group
1092term ::= factor (('\*' factor) | ('/' factor))\*
1093expression ::= term (('+' term) | ('-' term))*
1094]
1095
1096This captures the syntax, associativity and precedence rules of a calculator.
1097Writing the grammar for our calculator EDSL using Proto is /even simpler/.
1098Since we are using C++ as the host language, we are bound to the associativity
1099and precedence rules for the C++ operators. Our grammar can assume them. Also,
1100in C++ grouping is already handled for us with the use of parenthesis, so we
1101don't have to code that into our grammar.
1102
1103Let's begin our grammar for forward-declaring it:
1104
1105 struct CalculatorGrammar;
1106
1107It's an incomplete type at this point, but we'll still be able to use it to
1108define the rules of our grammar. Let's define grammar rules for the terminals:
1109
1110 struct Double
1111 : proto::terminal< proto::convertible_to< double > >
1112 {};
1113
1114 struct Placeholder1
1115 : proto::terminal< placeholder<0> >
1116 {};
1117
1118 struct Placeholder2
1119 : proto::terminal< placeholder<1> >
1120 {};
1121
1122 struct Terminal
1123 : proto::or_< Double, Placeholder1, Placeholder2 >
1124 {};
1125
1126Now let's define the rules for addition, subtraction, multiplication and division.
1127Here, we can ignore issues of associativity and precedence -- the C++ compiler will
1128enforce that for us. We only must enforce that the arguments to the operators must
1129themselves conform to the `CalculatorGrammar` that we forward-declared above.
1130
1131 struct Plus
1132 : proto::plus< CalculatorGrammar, CalculatorGrammar >
1133 {};
1134
1135 struct Minus
1136 : proto::minus< CalculatorGrammar, CalculatorGrammar >
1137 {};
1138
1139 struct Multiplies
1140 : proto::multiplies< CalculatorGrammar, CalculatorGrammar >
1141 {};
1142
1143 struct Divides
1144 : proto::divides< CalculatorGrammar, CalculatorGrammar >
1145 {};
1146
1147Now that we've defined all the parts of the grammar, we can define
1148`CalculatorGrammar`:
1149
1150 struct CalculatorGrammar
1151 : proto::or_<
1152 Terminal
1153 , Plus
1154 , Minus
1155 , Multiplies
1156 , Divides
1157 >
1158 {};
1159
1160That's it! Now we can use `CalculatorGrammar` to enforce that an expression
1161template conforms to our grammar. We can use _matches_ and `BOOST_MPL_ASSERT()`
1162to issue readable compile-time errors for invalid expressions, as below:
1163
1164 template< typename Expr >
1165 void evaluate( Expr const & expr )
1166 {
1167 BOOST_MPL_ASSERT(( proto::matches< Expr, CalculatorGrammar > ));
1168 // ...
1169 }
1170
1171[endsect]
1172
1173[endsect]
1174
1175[endsect]