.TH ANTLR 1 "September 1995" "ANTLR" "PCCTS Manual Pages"
.SH NAME
antlr \- ANother Tool for Language Recognition
.SH SYNTAX
.LP
\fBantlr\fR [\fIoptions\fR] \fIgrammar_files\fR
.SH DESCRIPTION
.PP
\fIAntlr\fP converts an extended form of context-free grammar into a
set of C functions which directly implement an efficient form of
deterministic recursive-descent LL(k) parser. Context-free grammars
may be augmented with predicates to allow semantics to influence
parsing; this allows a form of context-sensitive parsing. Selective
backtracking is also available to handle non-LL(k) and even
non-LALR(k) constructs. \fIAntlr\fP also produces a definition of a
lexer which can be automatically converted into C code for a DFA-based
lexer by \fIdlg\fR. Hence, \fIantlr\fR serves a function much like
that of \fIyacc\fR; however, it is notably more flexible and is more
integrated with a lexer generator (\fIantlr\fR directly generates
\fIdlg\fR code, whereas \fIyacc\fR and \fIlex\fR are given independent
descriptions). Unlike \fIyacc\fR, which accepts LALR(1) grammars,
\fIantlr\fR accepts LL(k) grammars in an extended BNF notation,
which eliminates the need for precedence rules.
.PP
Like \fIyacc\fR grammars, \fIantlr\fR grammars can use
automatically-maintained symbol attribute values referenced as dollar
variables. Further, because \fIantlr\fR generates top-down parsers,
arbitrary values may be inherited from parent rules (passed like
function parameters). \fIAntlr\fP also has a mechanism for creating
and manipulating abstract-syntax-trees.
.PP
There are various other niceties in \fIantlr\fR, including the ability to
spread one grammar over multiple files or even multiple grammars in a single
file, the ability to generate a version of the grammar with actions stripped
out (for documentation purposes), and lots more.
.SH OPTIONS
.IP "\fB-ck \fIn\fR"
Use up to \fIn\fR symbols of lookahead when using compressed (linear
approximation) lookahead. This type of lookahead is very cheap to
compute and is attempted before full LL(k) lookahead, which is of
exponential complexity in the worst case. In general, the compressed
lookahead can be much deeper (e.g., \f(CW-ck 10\fP) than the full
lookahead (which usually must be less than 4).
.IP \fB-CC\fP
Generate C++ output from both ANTLR and DLG.
.IP \fB-cr\fP
Generate a cross-reference for all rules. For each rule, print a list
of all other rules that reference it.
.IP \fB-e1\fP
Ambiguities/errors shown in low detail (default).
.IP \fB-e2\fP
Ambiguities/errors shown in more detail.
.IP \fB-e3\fP
Ambiguities/errors shown in excruciating detail.
.IP "\fB-fe\fP file"
Rename \fBerr.c\fP to file.
.IP "\fB-fh\fP file"
Rename \fBstdpccts.h\fP header (turns on \fB-gh\fP) to file.
.IP "\fB-fl\fP file"
Rename lexical output, \fBparser.dlg\fP, to file.
.IP "\fB-fm\fP file"
Rename file with lexical mode definitions, \fBmode.h\fP, to file.
.IP "\fB-fr\fP file"
Rename file which remaps globally visible symbols, \fBremap.h\fP, to file.
.IP "\fB-ft\fP file"
Rename \fBtokens.h\fP to file.
.IP \fB-ga\fP
Generate ANSI-compatible code (default case). This has not been
rigorously tested to be ANSI X3J11 C compliant, but it is close. The
normal output of \fIantlr\fP is currently compilable under K&R C,
ANSI C, and C++; this option does nothing because \fIantlr\fP
generates a bunch of #ifdef's to do the right thing depending on the
language.
.IP \fB-gc\fP
Indicates that \fIantlr\fP should generate no C code, i.e., only
perform analysis on the grammar.
.IP \fB-gd\fP
C code is inserted in each of the \fIantlr\fR generated parsing functions to
provide for user-defined handling of a detailed parse trace. The inserted
code consists of calls to the user-supplied macros or functions called
\fBzzTRACEIN\fR and \fBzzTRACEOUT\fP. The only argument is a
\fIchar *\fR pointing to a C-style string which is the grammar rule
recognized by the current parsing function. If no definition is given
for the trace functions, upon rule entry and exit, a message will be
printed indicating that a particular rule has been entered or exited.
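.IP
As a rough sketch of what user-supplied trace macros might look like,
the fragment below logs rule entry and exit with indentation. Only the
names \fBzzTRACEIN\fR and \fBzzTRACEOUT\fP and their single
\fIchar *\fR argument come from the description above; the helper
\fBtrace_note\fP, the two counters, and the hand-written \fBexpr\fP and
\fBterm\fP functions are hypothetical stand-ins for generated code.
.IP
.nf
```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-side trace state; not part of PCCTS. */
static int trace_depth = 0;   /* current rule nesting depth */
static int trace_events = 0;  /* number of trace lines printed */

/* Print one indented trace line such as "enter expr". */
static void trace_note(const char *dir, const char *rule)
{
    int i;
    for (i = 0; i < trace_depth; i++)
        fputs("  ", stdout);
    fputs(dir, stdout);
    puts(rule);               /* puts appends the newline */
    trace_events++;
}

/* User definitions of the two hooks; the argument is the rule name. */
#define zzTRACEIN(r)  do { trace_note("enter ", (r)); trace_depth++; } while (0)
#define zzTRACEOUT(r) do { assert(trace_depth > 0); trace_depth--; trace_note("exit ", (r)); } while (0)

/* Hand-written stand-ins for two generated rule functions. */
static void term(void)
{
    zzTRACEIN("term");
    zzTRACEOUT("term");
}

static void expr(void)
{
    zzTRACEIN("expr");
    term();
    zzTRACEOUT("expr");
}
```
.fi
.IP
Calling \fBexpr()\fP once prints four trace lines, with the two
\fBterm\fP lines indented one level inside the \fBexpr\fP pair.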
.IP \fB-ge\fP
Generate an error class for each non-terminal.
.IP \fB-gh\fP
Generate \fBstdpccts.h\fP for non-ANTLR-generated files to include.
This file contains all defines needed to describe the type of parser
generated by \fIantlr\fP (e.g. how much lookahead is used and whether
or not trees are constructed) and contains the \fBheader\fP action
specified by the user.
.IP \fB-gk\fP
Generate parsers that delay lookahead fetches until needed. Without
this option, \fIantlr\fP generates parsers which always have \fIk\fP
tokens of lookahead available.
.IP \fB-gl\fP
Generate line info about grammar actions in the C parser, of the form
\fB#\ \fIline\fP\ "\fIfile\fP"\fR, which makes error messages from
the C/C++ compiler make more sense, as they will \(lqpoint\(rq into the
grammar file, not the resulting C file. Debugging is easier as well,
because you will step through the grammar, not the C file.
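.IP
The effect of such line information can be seen with the standard C
\fB#line\fP directive, which is the same mechanism in its spelled-out
form. This is a minimal sketch and not actual \fIantlr\fP output: the
file name \fBexpr.g\fP, the line numbers, and the two functions are
made up for illustration.
.IP
.nf
```c
#include <assert.h>
#include <string.h>

/* Report the line number the compiler attributes to the "action". */
static int action_line(void)
{
#line 42 "expr.g"
    return __LINE__;   /* the compiler now counts this line as 42 */
}

/* Report the file name the compiler attributes to the "action". */
static const char *action_file(void)
{
#line 57 "expr.g"
    return __FILE__;   /* reported as expr.g, not the C file */
}
```
.fi
.IP
A compiler diagnostic or debugger breakpoint inside either function
would likewise be reported against \fBexpr.g\fP.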
.IP \fB-gs\fR
Do not generate sets for token expression lists; instead generate a
\fB||\fP-separated sequence of \fBLA(1)==\fItoken_number\fR. The
default is to generate sets.
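.IP
The two styles of membership test can be contrasted in a small sketch.
The token numbers, the bit layout of \fBop_set\fP, and the function
names are assumptions chosen for illustration; the sets actually
emitted by \fIantlr\fP may be laid out differently.
.IP
.nf
```c
#include <assert.h>

/* Hypothetical token numbers; real ones come from tokens.h. */
enum { TOK_PLUS = 3, TOK_MINUS = 4, TOK_STAR = 9 };

/* Set form: one bit per token number, eight tokens per byte. */
static const unsigned char op_set[2] = {
    (1u << TOK_PLUS) | (1u << TOK_MINUS),   /* tokens 0..7  */
    (1u << (TOK_STAR - 8))                  /* tokens 8..15 */
};

/* Test membership with a shift and a mask; tok must be in 0..15. */
static int in_set(const unsigned char *set, int tok)
{
    return (set[tok >> 3] >> (tok & 7)) & 1;
}

/* Comparison form: a ||-separated chain of equality tests. */
static int in_chain(int la1)
{
    return la1 == TOK_PLUS || la1 == TOK_MINUS || la1 == TOK_STAR;
}
```
.fi
.IP
Both functions accept the same tokens; the set form costs the same for
any number of tokens, while the chain grows with the list.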
.IP \fB-gt\fP
Generate code for Abstract-Syntax Trees.
.IP \fB-gx\fP
Do not create the lexical analyzer files (dlg-related). This option
should be given when the user wishes to provide a customized lexical
analyzer. It may also be used in \fImake\fR scripts to cause only the
parser to be rebuilt when a change not affecting the lexical structure
is made to the input grammars.
.IP "\fB-k \fIn\fR"
Set k of LL(k) to \fIn\fR; i.e., use \fIn\fR tokens of lookahead
(default is 1).
.IP "\fB-o\fP dir"
Directory where output files should go (default="."). This is very
nice for keeping the source directory clear of ANTLR and DLG spawn.
.IP \fB-p\fP
The complete grammar, collected from all input grammar files and
stripped of all comments and embedded actions, is listed to
\fBstdout\fP. This is intended to aid in viewing the entire grammar
as a whole and to eliminate the need to keep actions concisely stated
so that the grammar is easier to read. Hence, it is preferable to
embed even complex actions directly in the grammar, rather than to
call them as subroutines, since the subroutine call overhead will be
saved.
.IP \fB-pa\fP
This option is the same as \fB-p\fP except that the output is
annotated with the first sets determined from grammar analysis.
.IP "\fB-prc on\fR"
Turn on the computation and hoisting of predicate context.
.IP "\fB-prc off\fR"
Turn off the computation and hoisting of predicate context. This
option makes 1.10 behave like the 1.06 release with option \fB-pr\fR
on. Context computation is off by default.
.IP "\fB-rl \fIn\fR"
Limit the maximum number of tree nodes used by grammar analysis to
\fIn\fP. Occasionally, \fIantlr\fP is unable to analyze a grammar
submitted by the user. This rare situation can only occur when the
grammar is large and the amount of lookahead is greater than one. A
nonlinear analysis algorithm is used by PCCTS to handle the general
case of LL(k) parsing. The average complexity of analysis, however, is
near linear due to some fancy footwork in the implementation which
reduces the number of calls to the full LL(k) algorithm. If this
limit is reached, an error message is displayed which indicates
the grammar construct being analyzed when \fIantlr\fP hit a
non-linearity. Use this option if \fIantlr\fP seems to go out to
lunch and your disk starts thrashing; try \fIn\fP=10000 to start. Once
the offending construct has been identified, try to remove the
ambiguity that \fIantlr\fP was trying to overcome with large lookahead
analysis. The introduction of (...)? backtracking blocks eliminates
some of these problems; \fIantlr\fP does not analyze alternatives
that begin with (...)? (it simply backtracks, if necessary, at run
time).
.IP \fB-w1\fR
Set low warning level. Do not warn if semantic predicates and/or
(...)? blocks are assumed to cover ambiguous alternatives.
.IP \fB-w2\fR
Ambiguous parsing decisions yield warnings even if semantic predicates
or (...)? blocks are used. Warn if predicate context is computed and
semantic predicates incompletely disambiguate alternative productions.
.IP \fB-\fR
Read grammar from standard input and generate \fBstdin.c\fP as the
parser file.
.SH "SPECIAL CONSIDERATIONS"
.PP
\fIAntlr\fP works... we think. There is no implicit guarantee of
anything. We reserve no \fBlegal\fP rights to the software known as
the Purdue Compiler Construction Tool Set (PCCTS) \(em PCCTS is in the
public domain. An individual or company may do whatever they wish
with source code distributed with PCCTS or the code generated by
PCCTS, including the incorporation of PCCTS, or its output, into
commercial software. We encourage users to develop software with
PCCTS. However, we do ask that credit is given to us for developing
PCCTS. By "credit", we mean that if you incorporate our source code
into one of your programs (commercial product, research project, or
otherwise) that you acknowledge this fact somewhere in the
documentation, research report, etc... If you like PCCTS and have
developed a nice tool with the output, please mention that you
developed it using PCCTS. As long as these guidelines are followed,
we expect to continue enhancing this system and expect to make other
tools available as they are completed.
.SH FILES
.IP *.c
output C parser.
.IP *.cpp
output C++ parser when C++ mode is used.
.IP \fBparser.dlg\fP
output \fIdlg\fR lexical analyzer.
.IP \fBerr.c\fP
token string array, error sets and error support routines. Not used in
C++ mode.
.IP \fBremap.h\fP
file that redefines all globally visible parser symbols. The use of
the #parser directive creates this file. Not used in
C++ mode.
.IP \fBstdpccts.h\fP
list of definitions needed by C files, not generated by PCCTS, that
reference PCCTS objects. This is not generated by default. Not used in
C++ mode.
.IP \fBtokens.h\fP
output \fI#defines\fR for tokens used and function prototypes for
functions generated for rules.
.SH "SEE ALSO"
.LP
dlg(1), pccts(1)