Tools/CCode/Source/Pccts/antlr/antlr1.txt

ANTLR(1)               PCCTS Manual Pages                ANTLR(1)

NAME
     antlr - ANother Tool for Language Recognition

SYNTAX
     antlr [options] grammar_files

DESCRIPTION
     Antlr converts an extended form of context-free grammar into
     a set of C functions which directly implement an efficient
     form of deterministic recursive-descent LL(k) parser.
     Context-free grammars may be augmented with predicates to
     allow semantics to influence parsing; this allows a form of
     context-sensitive parsing.  Selective backtracking is also
     available to handle non-LL(k) and even non-LALR(k)
     constructs.  Antlr also produces a definition of a lexer which
     can be automatically converted into C code for a DFA-based
     lexer by dlg.  Hence, antlr serves a function much like that
     of yacc; however, it is notably more flexible and is more
     integrated with a lexer generator (antlr directly generates
     dlg code, whereas yacc and lex are given independent
     descriptions).  Unlike yacc, which accepts LALR(1) grammars,
     antlr accepts LL(k) grammars in an extended BNF notation,
     which eliminates the need for precedence rules.

     Like yacc grammars, antlr grammars can use automatically
     maintained symbol attribute values referenced as dollar
     variables.  Further, because antlr generates top-down
     parsers, arbitrary values may be inherited from parent rules
     (passed like function parameters).  Antlr also has a
     mechanism for creating and manipulating abstract syntax trees.

     There are various other niceties in antlr, including the
     ability to spread one grammar over multiple files or even
     to place multiple grammars in a single file, the ability to
     generate a version of the grammar with actions stripped out
     (for documentation purposes), and lots more.

OPTIONS
     -ck n
          Use up to n symbols of lookahead when using compressed
          (linear approximation) lookahead.  This type of
          lookahead is very cheap to compute and is attempted
          before full LL(k) lookahead, which is of exponential
          complexity in the worst case.  In general, the
          compressed lookahead can be much deeper (e.g., -ck 10)
          than the full lookahead (which usually must be less
          than 4).

     -CC  Generate C++ output from both ANTLR and DLG.

     -cr  Generate a cross-reference for all rules.  For each
          rule, print a list of all other rules that reference
          it.

     -e1  Ambiguities/errors shown in low detail (default).

     -e2  Ambiguities/errors shown in more detail.

     -e3  Ambiguities/errors shown in excruciating detail.

     -fe file
          Rename err.c to file.

     -fh file
          Rename stdpccts.h header (turns on -gh) to file.

     -fl file
          Rename lexical output, parser.dlg, to file.

     -fm file
          Rename file with lexical mode definitions, mode.h, to
          file.

     -fr file
          Rename file which remaps globally visible symbols,
          remap.h, to file.

     -ft file
          Rename tokens.h to file.

     -ga  Generate ANSI-compatible code (default case).  This has
          not been rigorously tested to be ANSI X3J11 C compliant,
          but it is close.  The normal output of antlr is
          currently compilable under K&R C, ANSI C, and C++;
          this option does nothing because antlr generates a
          bunch of #ifdef's to do the right thing depending on
          the language.

     -gc  Indicates that antlr should generate no C code, i.e.,
          only perform analysis on the grammar.

     -gd  C code is inserted in each of the antlr-generated
          parsing functions to provide for user-defined handling
          of a detailed parse trace.  The inserted code consists
          of calls to the user-supplied macros or functions called
          zzTRACEIN and zzTRACEOUT.  The only argument is a char
          * pointing to a C-style string which is the grammar
          rule recognized by the current parsing function.  If no
          definition is given for the trace functions, upon rule
          entry and exit, a message will be printed indicating
          that a particular rule has been entered or exited.

     -ge  Generate an error class for each non-terminal.

     -gh  Generate stdpccts.h for non-ANTLR-generated files to
          include.  This file contains all defines needed to
          describe the type of parser generated by antlr (e.g.,
          how much lookahead is used and whether or not trees are
          constructed) and contains the header action specified
          by the user.

     -gk  Generate parsers that delay lookahead fetches until
          needed.  Without this option, antlr generates parsers
          which always have k tokens of lookahead available.

     -gl  Generate line info about grammar actions in the C
          parser, of the form # line "file", which makes error
          messages from the C/C++ compiler make more sense, as
          they will point into the grammar file, not the
          resulting C file.  Debugging is easier as well, because
          you will step through the grammar, not the C file.

     -gs  Do not generate sets for token expression lists;
          instead generate a ||-separated sequence of
          LA(1)==token_number.  The default is to generate sets.

     -gt  Generate code for abstract syntax trees.

     -gx  Do not create the lexical analyzer files (dlg-related).
          This option should be given when the user wishes to
          provide a customized lexical analyzer.  It may also be
          used in make scripts to cause only the parser to be
          rebuilt when a change not affecting the lexical
          structure is made to the input grammars.

     -k n Set k of LL(k) to n; i.e., set the number of tokens of
          lookahead (default=1).

     -o dir
          Directory where output files should go (default=".").
          This is very nice for keeping the source directory
          clear of ANTLR and DLG spawn.

     -p   The complete grammar, collected from all input grammar
          files and stripped of all comments and embedded
          actions, is listed to stdout.  This is intended to aid
          in viewing the entire grammar as a whole and to
          eliminate the need to keep actions concisely stated so
          that the grammar is easier to read.  Hence, it is
          preferable to embed even complex actions directly in
          the grammar, rather than to call them as subroutines,
          since the subroutine call overhead will be saved.

     -pa  This option is the same as -p except that the output is
          annotated with the first sets determined from grammar
          analysis.

     -prc on
          Turn on the computation and hoisting of predicate
          context.

     -prc off
          Turn off the computation and hoisting of predicate
          context.  This option makes 1.10 behave like the 1.06
          release with option -pr on.  Context computation is off
          by default.

     -rl n
          Limit the maximum number of tree nodes used by grammar
          analysis to n.  Occasionally, antlr is unable to
          analyze a grammar submitted by the user.  This rare
          situation can only occur when the grammar is large and
          the amount of lookahead is greater than one.  A
          nonlinear analysis algorithm is used by PCCTS to handle
          the general case of LL(k) parsing.  The average
          complexity of analysis, however, is near linear due to
          some fancy footwork in the implementation which reduces
          the number of calls to the full LL(k) algorithm.  If
          this limit is reached, an error message is displayed
          which indicates the grammar construct being analyzed
          when antlr hit a nonlinearity.  Use this option if
          antlr seems to go out to lunch and your disk starts
          thrashing; try n=10000 to start.  Once the offending
          construct has been identified, try to remove the
          ambiguity that antlr was trying to overcome with large
          lookahead analysis.  The introduction of (...)?
          backtracking blocks eliminates some of these problems:
          antlr does not analyze alternatives that begin with
          (...)? (it simply backtracks, if necessary, at run
          time).

     -w1  Set low warning level.  Do not warn if semantic
          predicates and/or (...)? blocks are assumed to cover
          ambiguous alternatives.

     -w2  Ambiguous parsing decisions yield warnings even if
          semantic predicates or (...)? blocks are used.  Warn if
          predicate context is computed and semantic predicates
          incompletely disambiguate alternative productions.

     -    Read grammar from standard input and generate stdin.c
          as the parser file.

SPECIAL CONSIDERATIONS
     Antlr works...  we think.  There is no implicit guarantee of
     anything.  We reserve no legal rights to the software known
     as the Purdue Compiler Construction Tool Set (PCCTS) - PCCTS
     is in the public domain.  An individual or company may do
     whatever they wish with source code distributed with PCCTS
     or the code generated by PCCTS, including the incorporation
     of PCCTS, or its output, into commercial software.  We
     encourage users to develop software with PCCTS.  However, we
     do ask that credit is given to us for developing PCCTS.  By
     "credit", we mean that if you incorporate our source code
     into one of your programs (commercial product, research
     project, or otherwise) that you acknowledge this fact
     somewhere in the documentation, research report, etc...  If
     you like PCCTS and have developed a nice tool with the
     output, please mention that you developed it using PCCTS.
     As long as these guidelines are followed, we expect to
     continue enhancing this system and expect to make other
     tools available as they are completed.

FILES
     *.c  output C parser.

     *.cpp
          output C++ parser when C++ mode is used.

     parser.dlg
          output dlg lexical analyzer.

     err.c
          token string array, error sets and error support
          routines.  Not used in C++ mode.

     remap.h
          file that redefines all globally visible parser
          symbols.  The use of the #parser directive creates this
          file.  Not used in C++ mode.

     stdpccts.h
          list of definitions needed by C files, not generated by
          PCCTS, that reference PCCTS objects.  This is not
          generated by default.  Not used in C++ mode.

     tokens.h
          output #defines for tokens used and function prototypes
          for functions generated for rules.

SEE ALSO
     dlg(1), pccts(1)