Commit | Line | Data |
---|---|---|
878ddf1f | 1 | .TH ANTLR 1 "September 1995" "ANTLR" "PCCTS Manual Pages"\r |
2 | .SH NAME\r | |
3 | antlr \- ANother Tool for Language Recognition\r | |
4 | .SH SYNTAX\r | |
5 | .LP\r | |
6 | \fBantlr\fR [\fIoptions\fR] \fIgrammar_files\fR\r | |
7 | .SH DESCRIPTION\r | |
8 | .PP\r | |
9 | \fIAntlr\fP converts an extended form of context-free grammar into a\r | |
10 | set of C functions which directly implement an efficient form of\r | |
11 | deterministic recursive-descent LL(k) parser. Context-free grammars\r | |
12 | may be augmented with predicates to allow semantics to influence\r | |
13 | parsing; this allows a form of context-sensitive parsing. Selective\r | |
14 | backtracking is also available to handle non-LL(k) and even\r | |
15 | non-LALR(k) constructs. \fIAntlr\fP also produces a definition of a\r | |
16 | lexer which can be automatically converted into C code for a DFA-based\r | |
17 | lexer by \fIdlg\fR. Hence, \fIantlr\fR serves a function much like\r | |
18 | that of \fIyacc\fR; however, it is notably more flexible and is more\r | |
19 | integrated with a lexer generator (\fIantlr\fR directly generates\r | |
20 | \fIdlg\fR code, whereas \fIyacc\fR and \fIlex\fR are given independent\r | |
21 | descriptions). Unlike \fIyacc\fR which accepts LALR(1) grammars,\r | |
22 | \fIantlr\fR accepts LL(k) grammars in an extended BNF notation \(em\r | |
23 | which eliminates the need for precedence rules.\r | |
24 | .PP\r | |
25 | Like \fIyacc\fR grammars, \fIantlr\fR grammars can use\r | |
26 | automatically-maintained symbol attribute values referenced as dollar\r | |
27 | variables. Further, because \fIantlr\fR generates top-down parsers,\r | |
28 | arbitrary values may be inherited from parent rules (passed like\r | |
29 | function parameters). \fIAntlr\fP also has a mechanism for creating\r | |
30 | and manipulating abstract-syntax-trees.\r | |
31 | .PP\r | |
32 | There are various other niceties in \fIantlr\fR, including the ability to\r | |
33 | spread one grammar over multiple files or even put multiple grammars in a single\r | |
34 | file, the ability to generate a version of the grammar with actions stripped\r | |
35 | out (for documentation purposes), and lots more.\r | |
36 | .SH OPTIONS\r | |
37 | .IP "\fB-ck \fIn\fR"\r | |
38 | Use up to \fIn\fR symbols of lookahead when using compressed (linear\r | |
39 | approximation) lookahead. This type of lookahead is very cheap to\r | |
40 | compute and is attempted before full LL(k) lookahead, which is of\r | |
41 | exponential complexity in the worst case. In general, the compressed\r | |
42 | lookahead can be much deeper (e.g., \f(CW-ck 10\fP) than the full\r | |
43 | lookahead (which usually must be less than 4).\r | |
44 | .IP \fB-CC\fP\r | |
45 | Generate C++ output from both ANTLR and DLG.\r | |
46 | .IP \fB-cr\fP\r | |
47 | Generate a cross-reference for all rules. For each rule, print a list\r | |
48 | of all other rules that reference it.\r | |
49 | .IP \fB-e1\fP\r | |
50 | Ambiguities/errors shown in low detail (default).\r | |
51 | .IP \fB-e2\fP\r | |
52 | Ambiguities/errors shown in more detail.\r | |
53 | .IP \fB-e3\fP\r | |
54 | Ambiguities/errors shown in excruciating detail.\r | |
55 | .IP "\fB-fe\fP file"\r | |
56 | Rename \fBerr.c\fP to file.\r | |
57 | .IP "\fB-fh\fP file"\r | |
58 | Rename \fBstdpccts.h\fP header (turns on \fB-gh\fP) to file.\r | |
59 | .IP "\fB-fl\fP file"\r | |
60 | Rename lexical output, \fBparser.dlg\fP, to file.\r | |
61 | .IP "\fB-fm\fP file"\r | |
62 | Rename file with lexical mode definitions, \fBmode.h\fP, to file.\r | |
63 | .IP "\fB-fr\fP file"\r | |
64 | Rename file which remaps globally visible symbols, \fBremap.h\fP, to file.\r | |
65 | .IP "\fB-ft\fP file"\r | |
66 | Rename \fBtokens.h\fP to file.\r | |
67 | .IP \fB-ga\fP\r | |
68 | Generate ANSI-compatible code (default case). This has not been\r | |
69 | rigorously tested to be ANSI X3J11 C compliant, but it is close. The\r | |
70 | normal output of \fIantlr\fP is currently compilable under K&R C,\r | |
71 | ANSI C, and C++\(emthis option does nothing because \fIantlr\fP\r | |
72 | generates a bunch of #ifdef's to do the right thing depending on the\r | |
73 | language.\r | |
74 | .IP \fB-gc\fP\r | |
75 | Indicates that \fIantlr\fP should generate no C code, i.e., only\r | |
76 | perform analysis on the grammar.\r | |
77 | .IP \fB-gd\fP\r | |
78 | C code is inserted in each of the \fIantlr\fR generated parsing functions to\r | |
79 | provide for user-defined handling of a detailed parse trace. The inserted\r | |
80 | code consists of calls to the user-supplied macros or functions called\r | |
81 | \fBzzTRACEIN\fR and \fBzzTRACEOUT\fP. The only argument is a\r | |
82 | \fIchar *\fR pointing to a C-style string which is the grammar rule\r | |
83 | recognized by the current parsing function. If no definition is given\r | |
84 | for the trace functions, upon rule entry and exit, a message will be\r | |
85 | printed indicating that a particular rule has been entered or exited.\r | |
86 | .IP \fB-ge\fP\r | |
87 | Generate an error class for each non-terminal.\r | |
88 | .IP \fB-gh\fP\r | |
89 | Generate \fBstdpccts.h\fP for non-ANTLR-generated files to include.\r | |
90 | This file contains all defines needed to describe the type of parser\r | |
91 | generated by \fIantlr\fP (e.g. how much lookahead is used and whether\r | |
92 | or not trees are constructed) and contains the \fBheader\fP action\r | |
93 | specified by the user.\r | |
94 | .IP \fB-gk\fP\r | |
95 | Generate parsers that delay lookahead fetches until needed. Without\r | |
96 | this option, \fIantlr\fP generates parsers which always have \fIk\fP\r | |
97 | tokens of lookahead available.\r | |
98 | .IP \fB-gl\fP\r | |
99 | Generate line info about grammar actions in the C parser, of the form\r | |
100 | \fB#\ \fIline\fP\ "\fIfile\fP"\fR, which makes error messages from\r | |
101 | the C/C++ compiler make more sense, as they will \*Qpoint\*U into the\r | |
102 | grammar file, not the resulting C file. Debugging is easier as well,\r | |
103 | because you will step through the grammar, not the C file.\r | |
104 | .IP \fB-gs\fR\r | |
105 | Do not generate sets for token expression lists; instead generate a\r | |
106 | \fB||\fP-separated sequence of \fBLA(1)==\fItoken_number\fR. The\r | |
107 | default is to generate sets.\r | |
108 | .IP \fB-gt\fP\r | |
109 | Generate code for Abstract-Syntax Trees.\r | |
110 | .IP \fB-gx\fP\r | |
111 | Do not create the lexical analyzer files (dlg-related). This option\r | |
112 | should be given when the user wishes to provide a customized lexical\r | |
113 | analyzer. It may also be used in \fImake\fR scripts to cause only the\r | |
114 | parser to be rebuilt when a change not affecting the lexical structure\r | |
115 | is made to the input grammars.\r | |
116 | .IP "\fB-k \fIn\fR"\r | |
117 | Set k of LL(k) to \fIn\fR; i.e., set the number of lookahead tokens (default is 1).\r | |
118 | .IP "\fB-o\fP dir"\r | |
119 | Directory where output files should go (default="."). This is very\r | |
120 | nice for keeping the source directory clear of ANTLR and DLG spawn.\r | |
121 | .IP \fB-p\fP\r | |
122 | The complete grammar, collected from all input grammar files and\r | |
123 | stripped of all comments and embedded actions, is listed to\r | |
124 | \fBstdout\fP. This is intended to aid in viewing the entire grammar\r | |
125 | as a whole and to eliminate the need to keep actions concisely stated\r | |
126 | so that the grammar is easier to read. Hence, it is preferable to\r | |
127 | embed even complex actions directly in the grammar, rather than to\r | |
128 | call them as subroutines, since the subroutine call overhead will be\r | |
129 | saved.\r | |
130 | .IP \fB-pa\fP\r | |
131 | This option is the same as \fB-p\fP except that the output is\r | |
132 | annotated with the first sets determined from grammar analysis.\r | |
133 | .IP "\fB-prc on\fR"\r | |
134 | Turn on the computation and hoisting of predicate context.\r | |
135 | .IP "\fB-prc off\fR"\r | |
136 | Turn off the computation and hoisting of predicate context. This\r | |
137 | option makes 1.10 behave like the 1.06 release with option \fB-pr\fR\r | |
138 | on. Context computation is off by default.\r | |
139 | .IP "\fB-rl \fIn\fR"\r | |
140 | Limit the maximum number of tree nodes used by grammar analysis to\r | |
141 | \fIn\fP. Occasionally, \fIantlr\fP is unable to analyze a grammar\r | |
142 | submitted by the user. This rare situation can only occur when the\r | |
143 | grammar is large and the amount of lookahead is greater than one. A\r | |
144 | nonlinear analysis algorithm is used by PCCTS to handle the general\r | |
145 | case of LL(k) parsing. The average complexity of analysis, however, is\r | |
146 | near linear due to some fancy footwork in the implementation which\r | |
147 | reduces the number of calls to the full LL(k) algorithm. If this\r | |
148 | limit is reached, an error message is displayed that indicates\r | |
149 | the grammar construct being analyzed when \fIantlr\fP hit a\r | |
150 | non-linearity. Use this option if \fIantlr\fP seems to go out to\r | |
151 | lunch and your disk starts thrashing; try \fIn\fP=10000 to start. Once\r | |
152 | the offending construct has been identified, try to remove the\r | |
153 | ambiguity that \fIantlr\fP was trying to overcome with large lookahead\r | |
154 | analysis. The introduction of (...)? backtracking blocks eliminates\r | |
155 | some of these problems\ \(em \fIantlr\fP does not analyze alternatives\r | |
156 | that begin with (...)? (it simply backtracks, if necessary, at run\r | |
157 | time).\r | |
158 | .IP \fB-w1\fR\r | |
159 | Set low warning level. Do not warn if semantic predicates and/or\r | |
160 | (...)? blocks are assumed to cover ambiguous alternatives.\r | |
161 | .IP \fB-w2\fR\r | |
162 | Ambiguous parsing decisions yield warnings even if semantic predicates\r | |
163 | or (...)? blocks are used. Warn if predicate context is computed and\r | |
164 | semantic predicates incompletely disambiguate alternative productions.\r | |
165 | .IP \fB-\fR\r | |
166 | Read grammar from standard input and generate \fBstdin.c\fP as the\r | |
167 | parser file.\r | |
168 | .SH "SPECIAL CONSIDERATIONS"\r | |
169 | .PP\r | |
170 | \fIAntlr\fP works... we think. There is no implicit guarantee of\r | |
171 | anything. We reserve no \fBlegal\fP rights to the software known as\r | |
172 | the Purdue Compiler Construction Tool Set (PCCTS) \(em PCCTS is in the\r | |
173 | public domain. An individual or company may do whatever they wish\r | |
174 | with source code distributed with PCCTS or the code generated by\r | |
175 | PCCTS, including the incorporation of PCCTS, or its output, into\r | |
176 | commercial software. We encourage users to develop software with\r | |
177 | PCCTS. However, we do ask that credit is given to us for developing\r | |
178 | PCCTS. By "credit", we mean that if you incorporate our source code\r | |
179 | into one of your programs (commercial product, research project, or\r | |
180 | otherwise), you acknowledge this fact somewhere in the\r | |
181 | documentation, research report, etc. If you like PCCTS and have\r | |
182 | developed a nice tool with the output, please mention that you\r | |
183 | developed it using PCCTS. As long as these guidelines are followed,\r | |
184 | we expect to continue enhancing this system and expect to make other\r | |
185 | tools available as they are completed.\r | |
186 | .SH FILES\r | |
187 | .IP *.c\r | |
188 | output C parser.\r | |
189 | .IP *.cpp\r | |
190 | output C++ parser when C++ mode is used.\r | |
191 | .IP \fBparser.dlg\fP\r | |
192 | output \fIdlg\fR lexical analyzer.\r | |
193 | .IP \fBerr.c\fP\r | |
194 | token string array, error sets and error support routines. Not used in\r | |
195 | C++ mode.\r | |
196 | .IP \fBremap.h\fP\r | |
197 | file that redefines all globally visible parser symbols. The use of\r | |
198 | the #parser directive creates this file. Not used in\r | |
199 | C++ mode.\r | |
200 | .IP \fBstdpccts.h\fP\r | |
201 | list of definitions needed by C files, not generated by PCCTS, that\r | |
202 | reference PCCTS objects. This is not generated by default. Not used in\r | |
203 | C++ mode.\r | |
204 | .IP \fBtokens.h\fP\r | |
205 | output \fI#defines\fR for tokens used and function prototypes for\r | |
206 | functions generated for rules.\r | |
207 | .SH "SEE ALSO"\r | |
208 | .LP\r | |
209 | dlg(1), pccts(1)\r |
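
As a rough illustration of the -gd trace hooks described under OPTIONS, the following C sketch shows one way user-supplied zzTRACEIN and zzTRACEOUT macros might be defined. Only the two macro names and their single string argument come from the manual page. The helpers my_trace_in, my_trace_out, trace_event, trace_depth, and trace_log are hypothetical, and expr and term are hand-written stand-ins for the parsing functions that antlr would generate. Where the macro definitions must appear relative to the generated code is not stated here, so treat the placement as an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical trace state: nesting depth and a small in-memory log. */
static int  trace_depth = 0;
static char trace_log[256] = "";

/* Print one indented trace line to stderr and append it to trace_log. */
static void trace_event(const char *tag, const char *rule)
{
    char line[64];
    snprintf(line, sizeof line, "%*s%s %s\n", trace_depth * 2, "", tag, rule);
    fputs(line, stderr);
    if (strlen(trace_log) + strlen(line) < sizeof trace_log)
        strcat(trace_log, line);
}

/* Called on rule entry; the argument names the rule being recognized. */
static void my_trace_in(const char *rule)
{
    trace_event("enter", rule);
    trace_depth++;
}

/* Called on rule exit; must pair with an earlier my_trace_in. */
static void my_trace_out(const char *rule)
{
    assert(trace_depth > 0);
    trace_depth--;
    trace_event("exit", rule);
}

/* The macro names below are the ones the manual page says -gd calls. */
#define zzTRACEIN(r)  my_trace_in(r)
#define zzTRACEOUT(r) my_trace_out(r)

/* Stand-ins for generated parsing functions (assumed shape only). */
static void term(void)
{
    zzTRACEIN("term");
    zzTRACEOUT("term");
}

static void expr(void)
{
    zzTRACEIN("expr");
    term();
    zzTRACEOUT("expr");
}
```

Calling expr() once from a test driver writes four lines to stderr, indented by nesting depth, and leaves trace_depth back at zero. Keeping the depth counter in the hooks rather than in the grammar actions means the grammar itself needs no changes to enable or disable tracing.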