Commit | Line | Data |
---|---|---|
878ddf1f | 1 | \r |
2 | \r | |
3 | \r | |
4 | ANTLR(1) PCCTS Manual Pages ANTLR(1)\r | |
5 | \r | |
6 | \r | |
7 | \r | |
8 | NAME\r | |
9 | antlr - ANother Tool for Language Recognition\r | |
10 | \r | |
11 | SYNTAX\r | |
12 | antlr [options] grammar_files\r | |
13 | \r | |
14 | DESCRIPTION\r | |
15 | Antlr converts an extended form of context-free grammar into\r | |
16 | a set of C functions which directly implement an efficient\r | |
17 | form of deterministic recursive-descent LL(k) parser.\r | |
18 | Context-free grammars may be augmented with predicates to\r | |
19 | allow semantics to influence parsing; this allows a form of\r | |
20 | context-sensitive parsing. Selective backtracking is also\r | |
21 | available to handle non-LL(k) and even non-LALR(k)\r | |
22 | constructs. Antlr also produces a definition of a lexer which\r | |
23 | can be automatically converted into C code for a DFA-based\r | |
24 | lexer by dlg. Hence, antlr serves a function much like that\r | |
25 | of yacc; however, it is notably more flexible and is more\r | |
26 | integrated with a lexer generator (antlr directly generates\r | |
27 | dlg code, whereas yacc and lex are given independent\r | |
28 | descriptions). Unlike yacc, which accepts LALR(1) grammars,\r | |
29 | antlr accepts LL(k) grammars in an extended BNF notation,\r | |
30 | which eliminates the need for precedence rules.\r | |
31 | \r | |
32 | Like yacc grammars, antlr grammars can use automatically\r | |
33 | maintained symbol attribute values referenced as dollar\r | |
34 | variables. Further, because antlr generates top-down\r | |
35 | parsers, arbitrary values may be inherited from parent rules\r | |
36 | (passed like function parameters). Antlr also has a\r | |
37 | mechanism for creating and manipulating abstract syntax trees.\r | |
38 | \r | |
39 | There are various other niceties in antlr, including the\r | |
40 | ability to spread one grammar over multiple files or even\r | |
41 | multiple grammars in a single file, the ability to generate\r | |
42 | a version of the grammar with actions stripped out (for\r | |
43 | documentation purposes), and lots more.\r | |
44 | \r | |
45 | OPTIONS\r | |
46 | -ck n\r | |
47 | Use up to n symbols of lookahead when using compressed\r | |
48 | (linear approximation) lookahead. This type of lookahead\r | |
49 | is very cheap to compute and is attempted before\r | |
50 | full LL(k) lookahead, which is of exponential complexity\r | |
51 | in the worst case. In general, the compressed lookahead\r | |
52 | can be much deeper (e.g., -ck 10) than the full\r | |
53 | lookahead (which usually must be less than 4).\r | |
54 | \r | |
55 | -CC Generate C++ output from both ANTLR and DLG.\r | |
56 | \r | |
57 | -cr Generate a cross-reference for all rules. For each\r | |
58 | rule, print a list of all other rules that reference\r | |
59 | it.\r | |
60 | \r | |
61 | -e1 Ambiguities/errors shown in low detail (default).\r | |
62 | \r | |
63 | -e2 Ambiguities/errors shown in more detail.\r | |
64 | \r | |
65 | -e3 Ambiguities/errors shown in excruciating detail.\r | |
66 | \r | |
67 | -fe file\r | |
68 | Rename err.c to file.\r | |
69 | \r | |
70 | -fh file\r | |
71 | Rename stdpccts.h header (turns on -gh) to file.\r | |
72 | \r | |
73 | -fl file\r | |
74 | Rename lexical output, parser.dlg, to file.\r | |
75 | \r | |
76 | -fm file\r | |
77 | Rename file with lexical mode definitions, mode.h, to\r | |
78 | file.\r | |
79 | \r | |
80 | -fr file\r | |
81 | Rename file which remaps globally visible symbols,\r | |
82 | remap.h, to file.\r | |
83 | \r | |
84 | -ft file\r | |
85 | Rename tokens.h to file.\r | |
86 | \r | |
87 | -ga Generate ANSI-compatible code (default case). This has\r | |
88 | not been rigorously tested to be ANSI X3J11 C compliant,\r | |
89 | but it is close. The normal output of antlr is\r | |
90 | currently compilable under K&R C, ANSI C, and C++;\r | |
91 | this option does nothing because antlr generates a\r | |
92 | bunch of #ifdef's to do the right thing depending on\r | |
93 | the language.\r | |
94 | \r | |
95 | -gc Indicates that antlr should generate no C code, i.e.,\r | |
96 | only perform analysis on the grammar.\r | |
97 | \r | |
98 | -gd C code is inserted in each of the antlr-generated parsing\r | |
99 | functions to provide for user-defined handling of a\r | |
100 | detailed parse trace. The inserted code consists of\r | |
101 | calls to the user-supplied macros or functions called\r | |
102 | zzTRACEIN and zzTRACEOUT. The only argument is a\r | |
103 | char * pointing to a C-style string which is the grammar\r | |
104 | rule recognized by the current parsing function. If no\r | |
105 | definition is given for the trace functions, upon rule\r | |
106 | entry and exit, a message will be printed indicating\r | |
107 | that a particular rule has been entered or exited.\r | |
108 | \r | |
109 | -ge Generate an error class for each non-terminal.\r | |
110 | \r | |
111 | -gh Generate stdpccts.h for non-ANTLR-generated files to\r | |
112 | include. This file contains all defines needed to\r | |
113 | describe the type of parser generated by antlr (e.g.\r | |
114 | how much lookahead is used and whether or not trees are\r | |
115 | constructed) and contains the header action specified\r | |
116 | by the user.\r | |
117 | \r | |
118 | -gk Generate parsers that delay lookahead fetches until\r | |
119 | needed. Without this option, antlr generates parsers\r | |
120 | which always have k tokens of lookahead available.\r | |
121 | \r | |
122 | -gl Generate line info about grammar actions in the C parser, of\r | |
123 | the form # line "file", which makes error messages from\r | |
124 | the C/C++ compiler make more sense as they will point\r | |
125 | into the grammar file, not the resulting C file.\r | |
126 | Debugging is easier as well, because you will step\r | |
127 | through the grammar, not the C file.\r | |
128 | \r | |
129 | -gs Do not generate sets for token expression lists;\r | |
130 | instead generate a ||-separated sequence of\r | |
131 | LA(1)==token_number. The default is to generate sets.\r | |
132 | \r | |
133 | -gt Generate code for Abstract-Syntax Trees.\r | |
134 | \r | |
135 | -gx Do not create the lexical analyzer files (dlg-related).\r | |
136 | This option should be given when the user wishes to\r | |
137 | provide a customized lexical analyzer. It may also be\r | |
138 | used in make scripts to cause only the parser to be\r | |
139 | rebuilt when a change not affecting the lexical structure\r | |
140 | is made to the input grammars.\r | |
141 | \r | |
142 | -k n Set k of LL(k) to n; i.e., set the number of tokens of lookahead\r | |
143 | (default==1).\r | |
144 | \r | |
145 | -o dir\r | |
146 | Directory where output files should go (default=".").\r | |
147 | This is very nice for keeping the source directory\r | |
148 | clear of ANTLR and DLG spawn.\r | |
149 | \r | |
150 | -p The complete grammar, collected from all input grammar\r | |
151 | files and stripped of all comments and embedded\r | |
152 | actions, is listed to stdout. This is intended to aid\r | |
153 | in viewing the entire grammar as a whole and to elim-\r | |
154 | inate the need to keep actions concisely stated so that\r | |
155 | the grammar is easier to read. Hence, it is preferable\r | |
156 | to embed even complex actions directly in the grammar,\r | |
157 | rather than to call them as subroutines, since the sub-\r | |
158 | routine call overhead will be saved.\r | |
159 | \r | |
160 | -pa This option is the same as -p except that the output is\r | |
161 | annotated with the first sets determined from grammar\r | |
162 | analysis.\r | |
163 | \r | |
164 | -prc on\r | |
165 | Turn on the computation and hoisting of predicate con-\r | |
166 | text.\r | |
167 | \r | |
168 | -prc off\r | |
169 | Turn off the computation and hoisting of predicate con-\r | |
170 | text. This option makes 1.10 behave like the 1.06\r | |
171 | release with option -pr on. Context computation is off\r | |
172 | by default.\r | |
173 | \r | |
174 | -rl n\r | |
175 | Limit the maximum number of tree nodes used by grammar\r | |
176 | analysis to n. Occasionally, antlr is unable to\r | |
177 | analyze a grammar submitted by the user. This rare\r | |
178 | situation can only occur when the grammar is large and\r | |
179 | the amount of lookahead is greater than one. A\r | |
180 | non-linear analysis algorithm is used by PCCTS to handle\r | |
181 | the general case of LL(k) parsing. The average\r | |
182 | complexity of analysis, however, is near linear due to\r | |
183 | some fancy footwork in the implementation which reduces\r | |
184 | the number of calls to the full LL(k) algorithm. An\r | |
185 | error message will be displayed, if this limit is\r | |
186 | reached, which indicates the grammar construct being\r | |
187 | analyzed when antlr hit a non-linearity. Use this\r | |
188 | option if antlr seems to go out to lunch and your disk\r | |
189 | starts thrashing; try n=10000 to start. Once the\r | |
190 | offending construct has been identified, try to remove\r | |
191 | the ambiguity that antlr was trying to overcome with\r | |
192 | large lookahead analysis. The introduction of (...)?\r | |
193 | backtracking blocks eliminates some of these problems -\r | |
194 | antlr does not analyze alternatives that begin with\r | |
195 | (...)? (it simply backtracks, if necessary, at run\r | |
196 | time).\r | |
197 | \r | |
198 | -w1 Set low warning level. Do not warn if semantic\r | |
199 | predicates and/or (...)? blocks are assumed to cover\r | |
200 | ambiguous alternatives.\r | |
201 | \r | |
202 | -w2 Ambiguous parsing decisions yield warnings even if\r | |
203 | semantic predicates or (...)? blocks are used. Warn if\r | |
204 | predicate context computed and semantic predicates\r | |
205 | incompletely disambiguate alternative productions.\r | |
206 | \r | |
207 | - Read grammar from standard input and generate stdin.c\r | |
208 | as the parser file.\r | |
209 | \r | |
210 | SPECIAL CONSIDERATIONS\r | |
211 | Antlr works... we think. There is no implicit guarantee of\r | |
212 | anything. We reserve no legal rights to the software known\r | |
213 | as the Purdue Compiler Construction Tool Set (PCCTS) - PCCTS\r | |
214 | is in the public domain. An individual or company may do\r | |
215 | whatever they wish with source code distributed with PCCTS\r | |
216 | or the code generated by PCCTS, including the incorporation\r | |
217 | of PCCTS, or its output, into commercial software. We\r | |
218 | encourage users to develop software with PCCTS. However, we\r | |
219 | do ask that credit is given to us for developing PCCTS. By\r | |
220 | "credit", we mean that if you incorporate our source code\r | |
221 | into one of your programs (commercial product, research pro-\r | |
222 | ject, or otherwise) that you acknowledge this fact somewhere\r | |
223 | in the documentation, research report, etc... If you like\r | |
224 | PCCTS and have developed a nice tool with the output, please\r | |
225 | mention that you developed it using PCCTS. As long as these\r | |
226 | guidelines are followed, we expect to continue enhancing\r | |
227 | this system and expect to make other tools available as they\r | |
228 | are completed.\r | |
229 | \r | |
230 | FILES\r | |
231 | *.c output C parser.\r | |
232 | \r | |
233 | *.cpp\r | |
234 | output C++ parser when C++ mode is used.\r | |
235 | \r | |
236 | parser.dlg\r | |
237 | output dlg lexical analyzer.\r | |
238 | \r | |
239 | err.c\r | |
240 | token string array, error sets and error support rou-\r | |
241 | tines. Not used in C++ mode.\r | |
242 | \r | |
243 | remap.h\r | |
244 | file that redefines all globally visible parser sym-\r | |
245 | bols. The use of the #parser directive creates this\r | |
246 | file. Not used in C++ mode.\r | |
247 | \r | |
248 | stdpccts.h\r | |
249 | list of definitions needed by C files, not generated by\r | |
250 | PCCTS, that reference PCCTS objects. This is not gen-\r | |
251 | erated by default. Not used in C++ mode.\r | |
252 | \r | |
253 | tokens.h\r | |
254 | output #defines for tokens used and function prototypes\r | |
255 | for functions generated for rules.\r | |
256 | \r | |
257 | \r | |
258 | SEE ALSO\r | |
259 | dlg(1), pccts(1)\r | |
260 | \r | |
261 | \r | |
262 | \r | |
263 | \r | |
264 | \r |
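
The -gd entry above says the generated parsing functions call user-supplied macros or functions named zzTRACEIN and zzTRACEOUT, each taking a char * rule name. The following is a minimal sketch of what such user definitions might look like. Only the two hook names and their single string argument come from the manual page; the helper functions, the log buffer, and the `expr` and `atom` rule functions are hypothetical stand-ins written by hand, not antlr output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical trace state; none of this is part of PCCTS. */
static int  trace_depth = 0;
static char trace_log[256];

/* Append one indented "<tag> <rule>" line to the in-memory log. */
static void trace_event(const char *tag, const char *rule)
{
    char line[64];
    snprintf(line, sizeof line, "%*s%s %s\n", trace_depth * 2, "", tag, rule);
    strncat(trace_log, line, sizeof trace_log - strlen(trace_log) - 1);
}

static void trace_in(const char *rule)
{
    trace_event("enter", rule);
    trace_depth++;
}

static void trace_out(const char *rule)
{
    assert(trace_depth > 0);   /* every exit must match an earlier entry */
    trace_depth--;
    trace_event("exit", rule);
}

/* User-supplied definitions of the two hooks named in the -gd entry. */
#define zzTRACEIN(rule)  trace_in(rule)
#define zzTRACEOUT(rule) trace_out(rule)

/* Hand-written stand-ins for generated rule functions (assumed shape). */
static void atom(void)
{
    zzTRACEIN("atom");
    /* ... token matching would happen here ... */
    zzTRACEOUT("atom");
}

static void expr(void)
{
    zzTRACEIN("expr");
    atom();
    zzTRACEOUT("expr");
}
```

Calling `expr()` and then printing `trace_log` shows nested entry and exit lines, indented two spaces per level of rule nesting. Logging to a buffer rather than directly to stderr is only a convenience for inspecting the trace afterwards; a real definition could just as well print immediately.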