ANTLR(1)               PCCTS Manual Pages                ANTLR(1)



NAME
     antlr - ANother Tool for Language Recognition

SYNTAX
     antlr [_o_p_t_i_o_n_s] _g_r_a_m_m_a_r__f_i_l_e_s

DESCRIPTION
     _A_n_t_l_r converts an extended form of context-free grammar into
     a set of C functions which directly implement an efficient
     form of deterministic recursive-descent LL(k) parser.
     Context-free grammars may be augmented with predicates to
     allow semantics to influence parsing; this allows a form of
     context-sensitive parsing.  Selective backtracking is also
     available to handle non-LL(k) and even non-LALR(k) con-
     structs.  _A_n_t_l_r also produces a definition of a lexer which
     can be automatically converted into C code for a DFA-based
     lexer by _d_l_g.  Hence, _a_n_t_l_r serves a function much like that
     of _y_a_c_c, however, it is notably more flexible and is more
     integrated with a lexer generator (_a_n_t_l_r directly generates
     _d_l_g code, whereas _y_a_c_c and _l_e_x are given independent
     descriptions).  Unlike _y_a_c_c which accepts LALR(1) grammars,
     _a_n_t_l_r accepts LL(k) grammars in an extended BNF notation -
     which eliminates the need for precedence rules.

     Like _y_a_c_c grammars, _a_n_t_l_r grammars can use automatically-
     maintained symbol attribute values referenced as dollar
     variables.  Further, because _a_n_t_l_r generates top-down
     parsers, arbitrary values may be inherited from parent rules
     (passed like function parameters).  _A_n_t_l_r also has a mechan-
     ism for creating and manipulating abstract-syntax-trees.

     There are various other niceties in antlr, including the
     ability to spread one grammar over multiple files or even
     multiple grammars in a single file, the ability to generate
     a version of the grammar with actions stripped out (for
     documentation purposes), and lots more.

OPTIONS
     -ck _n
          Use up to _n symbols of lookahead when using compressed
          (linear approximation) lookahead.  This type of looka-
          head is very cheap to compute and is attempted before
          full LL(k) lookahead, which is of exponential complex-
          ity in the worst case.  In general, the compressed loo-
          kahead can be much deeper (e.g, -ck 10) _t_h_a_n _t_h_e _f_u_l_l
          _l_o_o_k_a_h_e_a_d (_w_h_i_c_h _u_s_u_a_l_l_y _m_u_s_t _b_e _l_e_s_s _t_h_a_n _4).

     -CC  Generate C++ output from both ANTLR and DLG.

     -cr  Generate a cross-reference for all rules.  For each
          rule, print a list of all other rules that reference
          it.

     -e1  Ambiguities/errors shown in low detail (default).

     -e2  Ambiguities/errors shown in more detail.

     -e3  Ambiguities/errors shown in excruciating detail.

     -fe file
          Rename err.c to file.

     -fh file
          Rename stdpccts.h header (turns on -gh) to file.

     -fl file
          Rename lexical output, parser.dlg, to file.

     -fm file
          Rename file with lexical mode definitions, mode.h, to
          file.

     -fr file
          Rename file which remaps globally visible symbols,
          remap.h, to file.

     -ft file
          Rename tokens.h to file.

     -ga  Generate ANSI-compatible code (default case).  This has
          not been rigorously tested to be ANSI X3J11 C compliant,
          but it is close.  The normal output of antlr is
          currently compilable under K&R C, ANSI C, and C++;
          this option does nothing because antlr generates a
          bunch of #ifdef's to do the right thing depending on
          the language.

     -gc  Indicates that antlr should generate no C code, i.e.,
          only perform analysis on the grammar.

     -gd  C code is inserted in each of the antlr-generated
          parsing functions to provide for user-defined handling
          of a detailed parse trace.  The inserted code consists
          of calls to the user-supplied macros or functions
          called zzTRACEIN and zzTRACEOUT.  The only argument is
          a char * pointing to a C-style string which is the
          grammar rule recognized by the current parsing
          function.  If no definition is given for the trace
          functions, a message will be printed upon rule entry
          and exit indicating that a particular rule has been
          entered or exited.
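
          A minimal sketch of user-supplied trace hooks for -gd
          follows.  Only the names zzTRACEIN and zzTRACEOUT and
          their single char * argument come from the description
          above; the indentation counter and the rule functions
          expr and stat are hypothetical stand-ins for generated
          parsing functions.

```c
#include <stdio.h>

/* Number of rules currently active; used only to indent the trace. */
static int trace_depth = 0;

/* User-supplied trace hooks.  Each receives the rule name as a char *. */
#define zzTRACEIN(rule)  printf("%*senter %s\n", 2 * trace_depth++, "", (rule))
#define zzTRACEOUT(rule) printf("%*sexit  %s\n", 2 * --trace_depth, "", (rule))

/* Hypothetical stand-in for a parsing function generated with -gd. */
static void expr(void)
{
    zzTRACEIN("expr");
    /* ... the body of the rule would go here ... */
    zzTRACEOUT("expr");
}

/* A second hypothetical rule that invokes expr, to show nesting. */
static void stat(void)
{
    zzTRACEIN("stat");
    expr();
    zzTRACEOUT("stat");
}
```

          Calling stat() prints an indented enter/exit line for
          each rule and leaves trace_depth at zero once parsing
          returns.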

     -ge  Generate an error class for each non-terminal.

     -gh  Generate stdpccts.h for non-ANTLR-generated files to
          include.  This file contains all defines needed to
          describe the type of parser generated by antlr (e.g.,
          how much lookahead is used and whether or not trees are
          constructed) and contains the header action specified
          by the user.

     -gk  Generate parsers that delay lookahead fetches until
          needed.  Without this option, antlr generates parsers
          which always have k tokens of lookahead available.
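
          The difference between delayed and eager lookahead can
          be sketched as follows.  This is an illustration under
          assumed names (K, fetch_token, la_delayed, la_eager,
          and the fixed input array are all hypothetical), not
          the code antlr emits.

```c
#include <stddef.h>

#define K 2   /* hypothetical lookahead depth */

/* Hypothetical token stream; 0 marks end of input. */
static const int input[] = { 10, 11, 12, 13, 0 };
static size_t next_pos = 0;   /* next unread input position */
static int fetches = 0;       /* tokens read from the input so far */

static int fetch_token(void)
{
    int t = input[next_pos];
    fetches++;
    if (t != 0)
        next_pos++;
    return t;
}

static int buffer[K];    /* buffered lookahead tokens */
static int filled = 0;   /* number of valid entries in buffer */

/* Delayed style: read only as many tokens as LA(i) requires. */
static int la_delayed(int i)
{
    while (filled < i)
        buffer[filled++] = fetch_token();
    return buffer[i - 1];
}

/* Eager style: make all K tokens available before answering. */
static int la_eager(int i)
{
    while (filled < K)
        buffer[filled++] = fetch_token();
    return buffer[i - 1];
}
```

          With this sketch, la_delayed(1) reads a single token,
          whereas la_eager(1) does not return until K tokens have
          been read.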

     -gl  Generate line info about grammar actions in the C
          parser, of the form # line "file", which makes error
          messages from the C/C++ compiler make more sense, as
          they will point into the grammar file, not the
          resulting C file.  Debugging is easier as well, because
          you will step through the grammar, not the C file.
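
          The effect of such line information can be seen with
          the standard C #line directive.  In the sketch below
          the file name expr.g, the line number 42, and the two
          functions are hypothetical; the exact text antlr emits
          may differ.

```c
/* Pretend the code that follows was copied from line 42 of the
   grammar file expr.g (both values are hypothetical). */
#line 42 "expr.g"
static int action_line(void) { return __LINE__; }
static const char *action_file(void) { return __FILE__; }
```

          The compiler now reports the code after the directive
          as belonging to expr.g, so diagnostics and debugger
          line tables refer to the grammar file rather than to
          the generated C file.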

     -gs  Do not generate sets for token expression lists;
          instead generate a ||-separated sequence of
          LA(1)==token_number.  The default is to generate sets.
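
          The two styles of token test can be sketched as
          follows.  The token numbers, the bit-set layout, and
          the function names are assumptions made for
          illustration; the parameter la1 stands in for LA(1).

```c
#include <stdbool.h>

/* Hypothetical token numbers, all below 16. */
enum { TOK_PLUS = 3, TOK_MINUS = 5, TOK_STAR = 9 };

/* Set style (the default): one bit per token number. */
static const unsigned char op_set[2] = {
    (1u << TOK_PLUS) | (1u << TOK_MINUS),   /* tokens 0..7  */
    (1u << (TOK_STAR - 8))                  /* tokens 8..15 */
};

/* Membership test; tok must be in the range 0..15. */
static bool in_set(const unsigned char *set, int tok)
{
    return (set[tok / 8] >> (tok % 8)) & 1u;
}

static bool match_with_set(int la1)
{
    return in_set(op_set, la1);
}

/* -gs style: a ||-separated sequence of comparisons. */
static bool match_with_compares(int la1)
{
    return la1 == TOK_PLUS || la1 == TOK_MINUS || la1 == TOK_STAR;
}
```

          Both functions accept exactly the same tokens; the set
          form keeps the test to a single lookup however long
          the token list is, while the comparison form is easier
          to read in the generated code.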

     -gt  Generate code for Abstract-Syntax Trees.

     -gx  Do not create the lexical analyzer files (dlg-related).
          This option should be given when the user wishes to
          provide a customized lexical analyzer.  It may also be
          used in make scripts to cause only the parser to be
          rebuilt when a change not affecting the lexical
          structure is made to the input grammars.

     -k _n Set k of LL(k) to _n; i.e. set tokens of look-ahead
          (default==1).

     -o dir
          Directory where output files should go (default=".").
          This is very nice for keeping the source directory
          clear of ANTLR and DLG spawn.

     -p   The complete grammar, collected from all input grammar
          files and stripped of all comments and embedded
          actions, is listed to stdout.  This is intended to aid
          in viewing the entire grammar as a whole and to
          eliminate the need to keep actions concisely stated so
          that the grammar is easier to read.  Hence, it is
          preferable to embed even complex actions directly in
          the grammar, rather than to call them as subroutines,
          since the subroutine call overhead will be saved.

     -pa  This option is the same as -p except that the output is
          annotated with the first sets determined from grammar
          analysis.

     -prc on
          Turn on the computation and hoisting of predicate
          context.

     -prc off
          Turn off the computation and hoisting of predicate
          context.  This option makes 1.10 behave like the 1.06
          release with option -pr on.  Context computation is off
          by default.

     -rl _n
          Limit the maximum number of tree nodes used by grammar
          analysis to _n.  Occasionally, _a_n_t_l_r is unable to
          analyze a grammar submitted by the user.  This rare
          situation can only occur when the grammar is large and
          the amount of lookahead is greater than one.  A non-
          linear analysis algorithm is used by PCCTS to handle
          the general case of LL(k) parsing.  The average com-
          plexity of analysis, however, is near linear due to
          some fancy footwork in the implementation which reduces
          the number of calls to the full LL(k) algorithm.  An
          error message will be displayed, if this limit is
          reached, which indicates the grammar construct being
          analyzed when _a_n_t_l_r hit a non-linearity.  Use this
          option if _a_n_t_l_r seems to go out to lunch and your disk
          start thrashing; try _n=10000 to start.  Once the
          offending construct has been identified, try to remove
          the ambiguity that _a_n_t_l_r was trying to overcome with
          large lookahead analysis.  The introduction of (...)?
          backtracking blocks eliminates some of these problems -
          _a_n_t_l_r does not analyze alternatives that begin with
          (...)? (it simply backtracks, if necessary, at run
          time).

     -w1  Set low warning level.  Do not warn if semantic
          predicates and/or (...)? blocks are assumed to cover
          ambiguous alternatives.

     -w2  Ambiguous parsing decisions yield warnings even if
          semantic predicates or (...)? blocks are used.  Warn if
          predicate context computed and semantic predicates
          incompletely disambiguate alternative productions.

     -    Read grammar from standard input and generate stdin.c
          as the parser file.

SPECIAL CONSIDERATIONS
     _A_n_t_l_r works...  we think.  There is no implicit guarantee of
     anything.  We reserve no legal rights to the software known
     as the Purdue Compiler Construction Tool Set (PCCTS) - PCCTS
     is in the public domain.  An individual or company may do
     whatever they wish with source code distributed with PCCTS
     or the code generated by PCCTS, including the incorporation
     of PCCTS, or its output, into commercial software.  We
     encourage users to develop software with PCCTS.  However, we
     do ask that credit is given to us for developing PCCTS.  By
     "credit", we mean that if you incorporate our source code
     into one of your programs (commercial product, research
     project, or otherwise), you acknowledge this fact somewhere
     in the documentation, research report, etc.  If you like
     PCCTS and have developed a nice tool with the output, please
     mention that you developed it using PCCTS.  As long as these
     guidelines are followed, we expect to continue enhancing
     this system and expect to make other tools available as they
     are completed.

FILES
     *.c  output C parser.

     *.cpp
          output C++ parser when C++ mode is used.

     parser.dlg
          output _d_l_g lexical analyzer.

     err.c
          token string array, error sets and error support
          routines.  Not used in C++ mode.

     remap.h
          file that redefines all globally visible parser
          symbols.  The use of the #parser directive creates this
          file.  Not used in C++ mode.

     stdpccts.h
          list of definitions needed by C files, not generated by
          PCCTS, that reference PCCTS objects.  This is not
          generated by default.  Not used in C++ mode.

     tokens.h
          output #_d_e_f_i_n_e_s for tokens used and function prototypes
          for functions generated for rules.


SEE ALSO
     dlg(1), pccts(1)