.TH ANTLR 1 "September 1995" "ANTLR" "PCCTS Manual Pages"
.SH NAME
antlr \- ANother Tool for Language Recognition
.SH SYNTAX
.LP
\fBantlr\fR [\fIoptions\fR] \fIgrammar_files\fR
.SH DESCRIPTION
.PP
\fIAntlr\fP converts an extended form of context-free grammar into a
set of C functions which directly implement an efficient form of
deterministic recursive-descent LL(k) parser. Context-free grammars
may be augmented with predicates to allow semantics to influence
parsing; this allows a form of context-sensitive parsing. Selective
backtracking is also available to handle non-LL(k) and even
non-LALR(k) constructs. \fIAntlr\fP also produces a definition of a
lexer which can be automatically converted into C code for a DFA-based
lexer by \fIdlg\fR. Hence, \fIantlr\fR serves a function much like
that of \fIyacc\fR; however, it is notably more flexible and is more
integrated with a lexer generator (\fIantlr\fR directly generates
\fIdlg\fR code, whereas \fIyacc\fR and \fIlex\fR are given independent
descriptions). Unlike \fIyacc\fR, which accepts LALR(1) grammars,
\fIantlr\fR accepts LL(k) grammars in an extended BNF notation \(em
which eliminates the need for precedence rules.
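.PP
As an illustration of the kind of code produced, the following
hand-written C sketch shows the general shape of a recursive-descent
routine for a rule such as \f(CWexpr : NUM ( "+" NUM )* ;\fP.  It is
not actual \fIantlr\fP output; the input string, helper functions, and
names are invented for the example.
.PP
.nf
.ft CW
#include <stdio.h>
#include <ctype.h>

/* Hand-written sketch (not antlr output) of a recursive-descent
 * routine for:  expr : NUM ( "+" NUM )* ;                        */

static const char *input = "1+2+3";   /* hypothetical token stream */
static int pos;

static int la(void) { return input[pos]; }  /* 1 symbol of lookahead */

static int num(void)                        /* NUM : [0-9]+          */
{
    int v = 0;
    while (isdigit(la())) { v = v * 10 + (input[pos++] - '0'); }
    return v;
}

static int expr(void)                       /* expr : NUM ("+" NUM)* */
{
    int v = num();
    while (la() == '+') {                   /* the LL(1) decision    */
        pos++;                              /* match "+"             */
        v += num();
    }
    return v;
}

int main(void)
{
    printf("%s = %d\en", input, expr());
    return 0;
}
.ft R
.fi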
.PP
Like \fIyacc\fR grammars, \fIantlr\fR grammars can use
automatically-maintained symbol attribute values referenced as dollar
variables. Further, because \fIantlr\fR generates top-down parsers,
arbitrary values may be inherited from parent rules (passed like
function parameters). \fIAntlr\fP also has a mechanism for creating
and manipulating abstract syntax trees.
.PP
There are various other niceties in \fIantlr\fR, including the ability to
spread one grammar over multiple files or even to place multiple grammars
in a single file, the ability to generate a version of the grammar with
actions stripped out (for documentation purposes), and lots more.
.SH OPTIONS
.IP "\fB-ck \fIn\fR"
Use up to \fIn\fR symbols of lookahead when using compressed (linear
approximation) lookahead. This type of lookahead is very cheap to
compute and is attempted before full LL(k) lookahead, which is of
exponential complexity in the worst case. In general, the compressed
lookahead can be much deeper (e.g., \f(CW-ck 10\fP) than the full
lookahead (which usually must be less than 4).
.IP \fB-CC\fP
Generate C++ output from both ANTLR and DLG.
.IP \fB-cr\fP
Generate a cross-reference for all rules. For each rule, print a list
of all other rules that reference it.
.IP \fB-e1\fP
Ambiguities/errors shown in low detail (default).
.IP \fB-e2\fP
Ambiguities/errors shown in more detail.
.IP \fB-e3\fP
Ambiguities/errors shown in excruciating detail.
.IP "\fB-fe\fP file"
Rename \fBerr.c\fP to file.
.IP "\fB-fh\fP file"
Rename \fBstdpccts.h\fP header (turns on \fB-gh\fP) to file.
.IP "\fB-fl\fP file"
Rename lexical output, \fBparser.dlg\fP, to file.
.IP "\fB-fm\fP file"
Rename file with lexical mode definitions, \fBmode.h\fP, to file.
.IP "\fB-fr\fP file"
Rename file which remaps globally visible symbols, \fBremap.h\fP, to file.
.IP "\fB-ft\fP file"
Rename \fBtokens.h\fP to file.
.IP \fB-ga\fP
Generate ANSI-compatible code (default case). This has not been
rigorously tested to be ANSI X3J11 C compliant, but it is close. The
normal output of \fIantlr\fP is currently compilable under K&R C,
ANSI C, and C++\(emthis option does nothing because \fIantlr\fP
generates a bunch of #ifdef's to do the right thing depending on the
language.
.IP \fB-gc\fP
Indicates that \fIantlr\fP should generate no C code, i.e., only
perform analysis on the grammar.
.IP \fB-gd\fP
C code is inserted in each of the \fIantlr\fR generated parsing functions to
provide for user-defined handling of a detailed parse trace. The inserted
code consists of calls to the user-supplied macros or functions called
\fBzzTRACEIN\fR and \fBzzTRACEOUT\fP. The only argument is a
\fIchar *\fR pointing to a C-style string which is the grammar rule
recognized by the current parsing function. If no definition is given
for the trace functions, upon rule entry and exit, a message will be
printed indicating that a particular rule has been entered or exited.
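.sp
For example, the trace hooks might be supplied as simple macros that
print the rule name; the sketch below is one possible definition (the
rule function \f(CWexpr\fP and the tiny driver are invented purely to
show where the generated calls appear).
.sp
.nf
.ft CW
#include <stdio.h>

/* Possible user-supplied trace hooks: the single argument is a
 * char * naming the rule being entered or exited.               */
#define zzTRACEIN(r)  fprintf(stderr, "enter rule %s\en", (r))
#define zzTRACEOUT(r) fprintf(stderr, "exit rule  %s\en", (r))

/* Shape of the calls -gd inserts into each generated rule function
 * (hand-written illustration, not actual antlr output).          */
static void expr(void)
{
    zzTRACEIN("expr");
    /* ... body of the rule ... */
    zzTRACEOUT("expr");
}

int main(void) { expr(); return 0; }
.ft R
.fi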
.IP \fB-ge\fP
Generate an error class for each non-terminal.
.IP \fB-gh\fP
Generate \fBstdpccts.h\fP for non-ANTLR-generated files to include.
This file contains all defines needed to describe the type of parser
generated by \fIantlr\fP (e.g., how much lookahead is used and whether
or not trees are constructed) and contains the \fBheader\fP action
specified by the user.
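.sp
A hand-written translation unit that needs to see the parser
definitions would simply include the generated header; the sketch
below assumes C mode with a start rule named \f(CWstart\fP (both the
rule name and the exact invocation are illustrative and depend on the
grammar and on how the parser is normally started in your setup).
.sp
.nf
.ft CW
/* Illustrative only: a non-ANTLR-generated C file that uses the
 * parser.  "start" is a hypothetical start rule; ANTLR() is the
 * usual PCCTS C-mode invocation macro.                           */
#include "stdpccts.h"

int main(void)
{
    ANTLR(start(), stdin);   /* run rule "start" on standard input */
    return 0;
}
.ft R
.fi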
.IP \fB-gk\fP
Generate parsers that delay lookahead fetches until needed. Without
this option, \fIantlr\fP generates parsers which always have \fIk\fP
tokens of lookahead available.
.IP \fB-gl\fP
Generate line info about grammar actions in the C parser of the form
\fB#\ \fIline\fP\ "\fIfile\fP"\fR, which makes error messages from
the C/C++ compiler make more sense, as they will \*Qpoint\*U into the
grammar file, not the resulting C file. Debugging is easier as well,
because you will step through the grammar, not the C file.
.IP \fB-gs\fR
Do not generate sets for token expression lists; instead generate a
\fB||\fP-separated sequence of \fBLA(1)==\fItoken_number\fR comparisons.
The default is to generate sets.
.IP \fB-gt\fP
Generate code for Abstract-Syntax Trees.
.IP \fB-gx\fP
Do not create the lexical analyzer files (dlg-related). This option
should be given when the user wishes to provide a customized lexical
analyzer. It may also be used in \fImake\fR scripts to cause only the
parser to be rebuilt when a change not affecting the lexical structure
is made to the input grammars.
.IP "\fB-k \fIn\fR"
Set k of LL(k) to \fIn\fR; i.e., set the number of tokens of look-ahead
(default==1).
.IP "\fB-o\fP dir"
Directory where output files should go (default="."). This is very
nice for keeping the source directory clear of ANTLR and DLG spawn.
.IP \fB-p\fP
The complete grammar, collected from all input grammar files and
stripped of all comments and embedded actions, is listed to
\fBstdout\fP. This is intended to aid in viewing the entire grammar
as a whole and to eliminate the need to keep actions concisely stated
so that the grammar is easier to read. Hence, it is preferable to
embed even complex actions directly in the grammar, rather than to
call them as subroutines, since the subroutine call overhead will be
saved.
.IP \fB-pa\fP
This option is the same as \fB-p\fP except that the output is
annotated with the first sets determined from grammar analysis.
.IP "\fB-prc on\fR"
Turn on the computation and hoisting of predicate context.
.IP "\fB-prc off\fR"
Turn off the computation and hoisting of predicate context. This
option makes 1.10 behave like the 1.06 release with option \fB-pr\fR
on. Context computation is off by default.
.IP "\fB-rl \fIn\fR"
Limit the maximum number of tree nodes used by grammar analysis to
\fIn\fP. Occasionally, \fIantlr\fP is unable to analyze a grammar
submitted by the user. This rare situation can only occur when the
grammar is large and the amount of lookahead is greater than one. A
nonlinear analysis algorithm is used by PCCTS to handle the general
case of LL(k) parsing. The average complexity of analysis, however, is
near linear due to some fancy footwork in the implementation which
reduces the number of calls to the full LL(k) algorithm. If this limit
is reached, an error message will be displayed indicating the grammar
construct being analyzed when \fIantlr\fP hit the
non-linearity. Use this option if \fIantlr\fP seems to go out to
lunch and your disk starts thrashing; try \fIn\fP=10000 to start. Once
the offending construct has been identified, try to remove the
ambiguity that \fIantlr\fP was trying to overcome with large lookahead
analysis. The introduction of (...)? backtracking blocks eliminates
some of these problems\ \(em \fIantlr\fP does not analyze alternatives
that begin with (...)? (it simply backtracks, if necessary, at run
time).
.IP \fB-w1\fR
Set low warning level. Do not warn if semantic predicates and/or
(...)? blocks are assumed to cover ambiguous alternatives.
.IP \fB-w2\fR
Ambiguous parsing decisions yield warnings even if semantic predicates
or (...)? blocks are used. Warn if the predicate context is computed
and the semantic predicates still incompletely disambiguate the
alternative productions.
.IP \fB-\fR
Read grammar from standard input and generate \fBstdin.c\fP as the
parser file.
.SH "SPECIAL CONSIDERATIONS"
.PP
\fIAntlr\fP works... we think. There is no implicit guarantee of
anything. We reserve no \fBlegal\fP rights to the software known as
the Purdue Compiler Construction Tool Set (PCCTS) \(em PCCTS is in the
public domain. An individual or company may do whatever they wish
with source code distributed with PCCTS or the code generated by
PCCTS, including the incorporation of PCCTS, or its output, into
commercial software. We encourage users to develop software with
PCCTS. However, we do ask that credit be given to us for developing
PCCTS. By "credit", we mean that if you incorporate our source code
into one of your programs (commercial product, research project, or
otherwise), you acknowledge this fact somewhere in the
documentation, research report, etc... If you like PCCTS and have
developed a nice tool with the output, please mention that you
developed it using PCCTS. As long as these guidelines are followed,
we expect to continue enhancing this system and expect to make other
tools available as they are completed.
.SH FILES
.IP *.c
output C parser.
.IP *.cpp
output C++ parser when C++ mode is used.
.IP \fBparser.dlg\fP
output \fIdlg\fR lexical analyzer.
.IP \fBerr.c\fP
token string array, error sets and error support routines. Not used in
C++ mode.
.IP \fBremap.h\fP
file that redefines all globally visible parser symbols. The use of
the #parser directive creates this file. Not used in
C++ mode.
.IP \fBstdpccts.h\fP
list of definitions needed by C files, not generated by PCCTS, that
reference PCCTS objects. This is not generated by default. Not used in
C++ mode.
.IP \fBtokens.h\fP
output \fI#defines\fR for tokens used and function prototypes for
functions generated for rules.
.SH "SEE ALSO"
.LP
dlg(1), pccts(1)