+++ /dev/null
-\r
-\r
-\r
-ANTLR(1) PCCTS Manual Pages ANTLR(1)\r
-\r
-\r
-\r
-NAME\r
- antlr - ANother Tool for Language Recognition\r
-\r
-SYNTAX\r
- antlr [options] grammar_files\r
-\r
-DESCRIPTION\r
- Antlr converts an extended form of context-free grammar into\r
- a set of C functions which directly implement an efficient\r
- form of deterministic recursive-descent LL(k) parser.\r
- Context-free grammars may be augmented with predicates to\r
- allow semantics to influence parsing; this allows a form of\r
- context-sensitive parsing. Selective backtracking is also\r
- available to handle non-LL(k) and even non-LALR(k)\r
- constructs. Antlr also produces a definition of a lexer which\r
- can be automatically converted into C code for a DFA-based\r
- lexer by dlg. Hence, antlr serves a function much like that\r
- of yacc; however, it is notably more flexible and is more\r
- integrated with a lexer generator (antlr directly generates\r
- dlg code, whereas yacc and lex are given independent\r
- descriptions). Unlike yacc, which accepts LALR(1) grammars,\r
- antlr accepts LL(k) grammars in an extended BNF notation,\r
- which eliminates the need for precedence rules.\r
-\r
- Like yacc grammars, antlr grammars can use automatically\r
- maintained symbol attribute values referenced as dollar\r
- variables. Further, because antlr generates top-down\r
- parsers, arbitrary values may be inherited from parent rules\r
- (passed like function parameters). Antlr also has a\r
- mechanism for creating and manipulating abstract-syntax-trees.\r
-\r
- There are various other niceties in antlr, including the\r
- ability to spread one grammar over multiple files or to put\r
- multiple grammars in a single file, the ability to generate\r
- a version of the grammar with actions stripped out (for\r
- documentation purposes), and lots more.\r
-\r
-OPTIONS\r
- -ck n\r
- Use up to n symbols of lookahead when using compressed\r
- (linear approximation) lookahead. This type of lookahead\r
- is very cheap to compute and is attempted before full\r
- LL(k) lookahead, which is of exponential complexity in\r
- the worst case. In general, the compressed lookahead can\r
- be much deeper (e.g., -ck 10) than the full lookahead\r
- (which usually must be less than 4).\r
-\r
- -CC Generate C++ output from both ANTLR and DLG.\r
-\r
- -cr Generate a cross-reference for all rules. For each\r
- rule, print a list of all other rules that reference\r
- it.\r
-\r
- -e1 Ambiguities/errors shown in low detail (default).\r
-\r
- -e2 Ambiguities/errors shown in more detail.\r
-\r
- -e3 Ambiguities/errors shown in excruciating detail.\r
-\r
- -fe file\r
- Rename err.c to file.\r
-\r
- -fh file\r
- Rename stdpccts.h header (turns on -gh) to file.\r
-\r
- -fl file\r
- Rename lexical output, parser.dlg, to file.\r
-\r
- -fm file\r
- Rename file with lexical mode definitions, mode.h, to\r
- file.\r
-\r
- -fr file\r
- Rename file which remaps globally visible symbols,\r
- remap.h, to file.\r
-\r
- -ft file\r
- Rename tokens.h to file.\r
-\r
- -ga Generate ANSI-compatible code (default case). This has\r
- not been rigorously tested to be ANSI X3J11 C compliant,\r
- but it is close. The normal output of antlr is currently\r
- compilable under K&R C, ANSI C, and C++; this option\r
- does nothing because antlr generates a bunch of #ifdef's\r
- to do the right thing depending on the language.\r
-\r
- -gc Indicates that antlr should generate no C code, i.e.,\r
- only perform analysis on the grammar.\r
-\r
- -gd C code is inserted in each of the antlr-generated parsing\r
- functions to provide for user-defined handling of a\r
- detailed parse trace. The inserted code consists of\r
- calls to the user-supplied macros or functions called\r
- zzTRACEIN and zzTRACEOUT. The only argument is a char *\r
- pointing to a C-style string which is the grammar\r
- rule recognized by the current parsing function. If no\r
- definition is given for the trace functions, upon rule\r
- entry and exit, a message will be printed indicating\r
- that a particular rule has been entered or exited.\r
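
The -gd description above can be illustrated with a minimal sketch of
user-supplied trace hooks. Only the names zzTRACEIN and zzTRACEOUT and
their single string argument come from the text; trace_in, trace_out,
trace_depth, trace_events, rule_stat and rule_expr are hypothetical, and
the two rule functions are hand-written stand-ins, not actual antlr output.

```c
#include <stdio.h>

/* Hypothetical bookkeeping for the sketch: nesting depth and event count. */
static int trace_depth = 0;
static int trace_events = 0;

/* Print an indented "enter" line and go one level deeper. */
static void trace_in(const char *rule)
{
    printf("%*senter %s\n", trace_depth * 2, "", rule);
    trace_depth++;
    trace_events++;
}

/* Come back one level and print an indented "exit" line. */
static void trace_out(const char *rule)
{
    trace_depth--;
    printf("%*sexit %s\n", trace_depth * 2, "", rule);
    trace_events++;
}

/* The hook names used by the code that -gd inserts (per the text above). */
#define zzTRACEIN(r)  trace_in(r)
#define zzTRACEOUT(r) trace_out(r)

/* Hand-written stand-ins for two generated rule functions (assumption). */
static void rule_expr(void)
{
    zzTRACEIN("expr");
    /* ... matching for rule expr would go here ... */
    zzTRACEOUT("expr");
}

static void rule_stat(void)
{
    zzTRACEIN("stat");
    rule_expr();
    zzTRACEOUT("stat");
}
```

Calling rule_stat() in this sketch prints one enter and one exit line per
rule, indented by nesting depth, which is roughly the kind of trace a
user-defined handler might produce.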
-\r
- -ge Generate an error class for each non-terminal.\r
-\r
- -gh Generate stdpccts.h for non-ANTLR-generated files to\r
- include. This file contains all defines needed to\r
- describe the type of parser generated by antlr (e.g.,\r
- how much lookahead is used and whether or not trees are\r
- constructed) and contains the header action specified\r
- by the user.\r
-\r
- -gk Generate parsers that delay lookahead fetches until\r
- needed. Without this option, antlr generates parsers\r
- which always have k tokens of lookahead available.\r
-\r
- -gl Generate line info about grammar actions in the C parser,\r
- of the form # line "file", which makes error messages from\r
- the C/C++ compiler make more sense, as they will point\r
- into the grammar file, not the resulting C file.\r
- Debugging is easier as well, because you will step\r
- through the grammar, not the C file.\r
-\r
- -gs Do not generate sets for token expression lists;\r
- instead generate a ||-separated sequence of\r
- LA(1)==token_number. The default is to generate sets.\r
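
The difference that -gs selects between can be sketched in C as follows.
This is an illustration under stated assumptions, not antlr's actual
output: the token numbers, the bit-set layout, and every name here
(TOK_PLUS, op_set, in_set, match_with_set, match_with_chain) are
hypothetical, and la1 stands in for the value of LA(1).

```c
/* Hypothetical token numbers for a token expression list { + - * }. */
enum { TOK_PLUS = 3, TOK_MINUS = 4, TOK_STAR = 9, NUM_TOKENS = 16 };

#define BITS_PER_WORD 8

/* Set form: one bit per token number, packed 8 to a byte. */
static const unsigned char op_set[NUM_TOKENS / BITS_PER_WORD] = {
    (1u << TOK_PLUS) | (1u << TOK_MINUS),   /* token numbers 0..7  */
    (1u << (TOK_STAR - BITS_PER_WORD))      /* token numbers 8..15 */
};

/* Return 1 if tok is a member of the packed bit set, else 0. */
static int in_set(const unsigned char *set, int tok)
{
    if (tok < 0 || tok >= NUM_TOKENS)
        return 0;
    return (set[tok / BITS_PER_WORD] >> (tok % BITS_PER_WORD)) & 1;
}

/* Default style: a single set-membership test. */
static int match_with_set(int la1)
{
    return in_set(op_set, la1);
}

/* -gs style: a ||-separated sequence of comparisons. */
static int match_with_chain(int la1)
{
    return la1 == TOK_PLUS || la1 == TOK_MINUS || la1 == TOK_STAR;
}
```

Both functions accept exactly the same token numbers; the chain form may
be easier to read in generated code, while the set form keeps the test
the same size however many tokens are in the list.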
-\r
- -gt Generate code for Abstract-Syntax Trees.\r
-\r
- -gx Do not create the lexical analyzer files (dlg-related).\r
- This option should be given when the user wishes to\r
- provide a customized lexical analyzer. It may also be\r
- used in make scripts to cause only the parser to be\r
- rebuilt when a change not affecting the lexical struc-\r
- ture is made to the input grammars.\r
-\r
- -k n Set k of LL(k) to n; i.e., set the number of tokens of\r
- lookahead (default=1).\r
-\r
- -o dir\r
- Directory where output files should go (default=".").\r
- This is very nice for keeping the source directory\r
- clear of ANTLR and DLG spawn.\r
-\r
- -p The complete grammar, collected from all input grammar\r
- files and stripped of all comments and embedded\r
- actions, is listed to stdout. This is intended to aid\r
- in viewing the entire grammar as a whole and to elim-\r
- inate the need to keep actions concisely stated so that\r
- the grammar is easier to read. Hence, it is preferable\r
- to embed even complex actions directly in the grammar,\r
- rather than to call them as subroutines, since the sub-\r
- routine call overhead will be saved.\r
-\r
- -pa This option is the same as -p except that the output is\r
- annotated with the first sets determined from grammar\r
- analysis.\r
-\r
- -prc on\r
- Turn on the computation and hoisting of predicate con-\r
- text.\r
-\r
- -prc off\r
- Turn off the computation and hoisting of predicate con-\r
- text. This option makes 1.10 behave like the 1.06\r
- release with option -pr on. Context computation is off\r
- by default.\r
-\r
- -rl n\r
- Limit the maximum number of tree nodes used by grammar\r
- analysis to n. Occasionally, antlr is unable to\r
- analyze a grammar submitted by the user. This rare\r
- situation can only occur when the grammar is large and\r
- the amount of lookahead is greater than one. A\r
- non-linear analysis algorithm is used by PCCTS to handle\r
- the general case of LL(k) parsing. The average\r
- complexity of analysis, however, is near linear due to\r
- some fancy footwork in the implementation which reduces\r
- the number of calls to the full LL(k) algorithm. If this\r
- limit is reached, an error message is displayed which\r
- indicates the grammar construct being analyzed when\r
- antlr hit a non-linearity. Use this option if antlr\r
- seems to go out to lunch and your disk starts thrashing;\r
- try n=10000 to start. Once the offending construct has\r
- been identified, try to remove the ambiguity that antlr\r
- was trying to overcome with large lookahead analysis.\r
- The introduction of (...)? backtracking blocks eliminates\r
- some of these problems: antlr does not analyze\r
- alternatives that begin with (...)? (it simply\r
- backtracks, if necessary, at run time).\r
-\r
- -w1 Set low warning level. Do not warn if semantic\r
- predicates and/or (...)? blocks are assumed to cover\r
- ambiguous alternatives.\r
-\r
- -w2 Ambiguous parsing decisions yield warnings even if\r
- semantic predicates or (...)? blocks are used. Warn if\r
- predicate context computed and semantic predicates\r
- incompletely disambiguate alternative productions.\r
-\r
- - Read grammar from standard input and generate stdin.c\r
- as the parser file.\r
-\r
-SPECIAL CONSIDERATIONS\r
- Antlr works... we think. There is no implicit guarantee of\r
- anything. We reserve no legal rights to the software known\r
- as the Purdue Compiler Construction Tool Set (PCCTS) - PCCTS\r
- is in the public domain. An individual or company may do\r
- whatever they wish with source code distributed with PCCTS\r
- or the code generated by PCCTS, including the incorporation\r
- of PCCTS, or its output, into commercial software. We\r
- encourage users to develop software with PCCTS. However, we\r
- do ask that credit is given to us for developing PCCTS. By\r
- "credit", we mean that if you incorporate our source code\r
- into one of your programs (commercial product, research pro-\r
- ject, or otherwise) that you acknowledge this fact somewhere\r
- in the documentation, research report, etc... If you like\r
- PCCTS and have developed a nice tool with the output, please\r
- mention that you developed it using PCCTS. As long as these\r
- guidelines are followed, we expect to continue enhancing\r
- this system and expect to make other tools available as they\r
- are completed.\r
-\r
-FILES\r
- *.c output C parser.\r
-\r
- *.cpp\r
- output C++ parser when C++ mode is used.\r
-\r
- parser.dlg\r
- output dlg lexical analyzer.\r
-\r
- err.c\r
- token string array, error sets and error support rou-\r
- tines. Not used in C++ mode.\r
-\r
- remap.h\r
- file that redefines all globally visible parser sym-\r
- bols. The use of the #parser directive creates this\r
- file. Not used in C++ mode.\r
-\r
- stdpccts.h\r
- list of definitions needed by C files, not generated by\r
- PCCTS, that reference PCCTS objects. This is not gen-\r
- erated by default. Not used in C++ mode.\r
-\r
- tokens.h\r
- output #defines for tokens used and function prototypes\r
- for functions generated for rules.\r
-\r
-\r
-SEE ALSO\r
- dlg(1), pccts(1)\r
-\r
-\r
-\r
-\r
-\r