]>
Commit | Line | Data |
---|---|---|
7c673cae FG |
1 | [/============================================================================== |
2 | Copyright (C) 2001-2015 Joel de Guzman | |
3 | Copyright (C) 2001-2011 Hartmut Kaiser | |
4 | ||
5 | Distributed under the Boost Software License, Version 1.0. (See accompanying | |
6 | file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) | |
7 | ===============================================================================/] | |
8 | [section Syntax Diagram] | |
9 | ||
10 | In the next section, we will deal with Parsing Expression Grammars | |
11 | (PEG) [footnote Bryan Ford: Parsing Expression Grammars: A Recognition-Based | |
12 | Syntactic Foundation, [@http://pdos.csail.mit.edu/~baford/packrat/popl04/]], | |
13 | a variant of Extended Backus-Naur Form (EBNF) [footnote Richard E. Pattis: EBNF: | |
14 | A Notation to Describe Syntax, [@http://www.cs.cmu.edu/~pattis/misc/ebnf.pdf]] | |
15 | with a different interpretation. It is easier to understand PEG using Syntax | |
16 | Diagrams. Syntax diagrams represent a grammar graphically. It was used | |
17 | extensively by Niklaus Wirth [footnote Niklaus Wirth: The Programming Language | |
18 | Pascal. (July 1973)] in the "Pascal User Manual". Syntax Diagrams are easily | |
19 | understandable by programmers due to their similarity to flow charts. The | |
20 | isomorphism of the diagrams and functions make them ideal for representing | |
21 | __rd__ parsers which are essentially mutually recursive functions. | |
22 | ||
23 | [heading Elements] | |
24 | ||
25 | All diagrams have one entry and one exit point. Arrows connect all possible | |
26 | paths through the grammar from the entry point to the exit point. | |
27 | ||
28 | [:__sd_start_stop__] | |
29 | ||
30 | Terminals are represented by round boxes. Terminals are atomic and | |
31 | usually represent plain characters, strings or tokens. | |
32 | ||
33 | [:__sd_terminals__] | |
34 | ||
35 | Non-terminals are represented by boxes. Diagrams are modularized using | |
36 | named non-terminals. A complex diagram can be broken down into a set of | |
37 | non-terminals. Non-terminals also allow recursion (i.e. a non-terminal can call | |
38 | itself). | |
39 | ||
40 | [:__sd_non_terminals__] | |
41 | ||
42 | [heading Constructs] | |
43 | ||
44 | The most basic composition is the Sequence. B follows A: | |
45 | ||
46 | [:__sd_sequence__] | |
47 | ||
48 | The ordered choice henceforth we will call /alternatives/. In PEG, ordered | |
49 | choice and alternatives are not quite the same. PEG allows ambiguity of choice | |
50 | where one or more branches can succeed. In PEG, in case of ambiguity, the first | |
51 | one always wins. | |
52 | ||
53 | [:__sd_choice__] | |
54 | ||
55 | The optional (zero-or-one): | |
56 | ||
57 | [:__sd_optional__] | |
58 | ||
59 | Now, the loops. We have the zero-or-more and one-or-more: | |
60 | ||
61 | [:__sd_kleene__] | |
62 | [:__sd_plus__] | |
63 | ||
64 | Take note that, as in PEG, these loops behave greedily. If there is | |
65 | another 'A' just before the end-point, it will always fail because the | |
66 | preceding loop has already exhausted all 'A's and there is nothing more left. | |
67 | This is a crucial difference between PEG and general Context Free Grammars | |
68 | (CFGs). This behavior is quite obvious with syntax diagrams as they resemble | |
69 | flow-charts. | |
70 | ||
71 | [heading Predicates] | |
72 | ||
73 | Now, the following are Syntax Diagram versions of PEG predicates. These are not | |
74 | traditionally found in Syntax Diagrams. These are special extensions we invented | |
75 | to closely follow PEGs. | |
76 | ||
77 | First, we introduce a new element, the Predicate: | |
78 | ||
79 | [:__sd_predicate__] | |
80 | ||
81 | This is similar to the conditionals in flow charts where the 'No' branch is | |
82 | absent and always signals a failed parse. | |
83 | ||
84 | We have two versions of the predicate, the /And-Predicate/ and the | |
85 | /Not-Predicate/: | |
86 | ||
87 | [:__sd_and_predicate__] | |
88 | [:__sd_not_predicate__] | |
89 | ||
90 | The /And-Predicate/ tries the predicate, P, and succeeds if P succeeds, | |
91 | or otherwise fail. The opposite is true with the /Not-Predicate/. It | |
92 | tries the predicate, P, and fails if P succeeds, or otherwise succeeds. Both | |
93 | versions do a look-ahead but do not consume any input regardless if P succeeds | |
94 | or not. | |
95 | ||
96 | [endsect] |