1 =======================================================================
2 List of Implemented Fixes and Changes for Maintenance Releases of PCCTS
5 For a summary of the most significant changes see CHANGES_SUMMARY.TXT
7 =======================================================================
11 The software and these notes are provided "as is". They may include
12 typographical or technical errors and their authors disclaim all
13 liability of any kind or nature for damages due to error, fault,
14 defect, or deficiency regardless of cause. All warranties of any
15 kind, either express or implied, including, but not limited to, the
16 implied warranties of merchantability and fitness for a particular
17 purpose are disclaimed.
20 -------------------------------------------------------
21 Note: Items #153 to #1 are now in a separate file named
22 CHANGES_FROM_133_BEFORE_MR13.txt
23 -------------------------------------------------------
25 #261. (Changed in MR19) Defer token fetch for C++ mode
27 Item #216 has been revised to indicate that use of the defer fetch
28 option (ZZDEFER_FETCH) requires dlg option -i.
30 #260. (MR22) Raise default lex buffer size from 8,000 to 32,000 bytes.
32 ZZLEXBUFSIZE is the size (in bytes) of the buffer used by dlg
33 generated lexers. The default value has been raised to 32,000 and
34 the value used by antlr, dlg, and sorcerer has also been raised to
37 #259. (MR22) Default function arguments in C++ mode.
39 If a rule is declared:
43 then the declaration generated by pccts resembles:
47 however, the definition must omit the default argument:
51 In the past the default value was not omitted. In MR22
52 the generated code resembles:
54 void rr(int i /* = 0 */ ) {...}
56 Implemented by Volker H. Simonis (simonis@informatik.uni-tuebingen.de)
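
A minimal hand-written sketch of the rule above. The class name
DemoParser and the body of rr are hypothetical and are not pccts
output. The routine returns int here (rather than void) only so the
effect of the default value can be observed. The point is the placement
of the default argument: it is given once, in the declaration, and is
commented out in the definition.

```cpp
// Hypothetical stand-in for a generated parser class (not pccts output).
class DemoParser {
public:
    // Declaration: the default value appears here, and only here.
    int rr(int i = 0);
};

// Definition: repeating "= 0" here would be rejected by the compiler,
// so the default is kept only as a comment.
int DemoParser::rr(int i /* = 0 */) {
    return i + 1;   // illustrative body only
}
```

A call such as p.rr() then uses the default from the declaration.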
58 #258. (MR22) Using a base class for your parser
60 In item #102 (MR10) the class statement was extended to allow one
61 to specify a base class other than ANTLRParser for the generated
62 parser. It turned out that this was less than useful because
63 the constructor still specified ANTLRParser as the base class.
65 The class statement now uses the first identifier appearing after
66 the ":" as the name of the base class. For example:
68 class MyParser : public FooParser {
70 Generates in MyParser.h:
72 class MyParser : public FooParser {
74 Generates in MyParser.cpp something that resembles:
76 MyParser::MyParser(ANTLRTokenBuffer *input) :
77 FooParser(input,1,0,0,4)
79 token_tbl = _token_tbl;
80 traceOptionValueDefault=1; // MR10 turn trace ON
83 The base class constructor must have a signature similar to
86 #257. (MR21a) Removed dlg statement that -i has no effect in C++ mode.
90 #256. (MR21a) Malformed syntax graph causes crash after error message.
92 In the past, certain kinds of errors in the very first grammar
93 element could cause the construction of a malformed graph
94 representing the grammar. This would eventually result in a
95 fatal internal error. The code has been changed to be more
96 resistant to this particular error.
98 #255. (MR21a) ParserBlackBox(FILE* f)
100 This constructor set openByBlackBox to the wrong value.
102 Reported by Kees Bakker (kees_bakker@tasking.nl).
104 #254. (MR21a) Reporting syntax error at end-of-file
106 When there was a syntax error at the end-of-file the syntax
107 error routine would substitute "<eof>" for the programmer's
108 end-of-file symbol. This substitution is now done only when
109 the programmer does not define his own end-of-file symbol
110 or the symbol begins with the character "@".
112 Reported by Kees Bakker (kees_bakker@tasking.nl).
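
The substitution rule described above can be sketched as follows.
This is not the code in the pccts syntax error routine. The function
name eofDisplayName and its parameter are assumptions made for the
illustration, and "no symbol defined" is modelled as a NULL or empty
string.

```cpp
#include <cstddef>
#include <string>

// Returns the name to print for end-of-file in a syntax error message.
// userEofName is the programmer's own end-of-file symbol, or NULL/empty
// if none was defined (hypothetical parameter, for illustration only).
std::string eofDisplayName(const char *userEofName) {
    if (userEofName == NULL || userEofName[0] == '\0' ||
        userEofName[0] == '@') {
        return "<eof>";            // fall back to the generic name
    }
    return userEofName;            // keep the programmer's symbol
}
```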
114 #253. (MR21) Generation of block preamble (-preamble and -preamble_first)
116 The antlr option -preamble causes antlr to insert the code
117 BLOCK_PREAMBLE at the start of each rule and block. It does
118 not insert code before rule references, token references, or
119 actions. By properly defining the macro BLOCK_PREAMBLE the
120 user can generate code which is specific to the start of blocks.
122 The antlr option -preamble_first is similar, but inserts the
123 code BLOCK_PREAMBLE_FIRST(PreambleFirst_123) where the symbol
124 PreambleFirst_123 is equivalent to the first set defined by
125 the #FirstSetSymbol described in Item #248.
127 I have not investigated how these options interact with guess
128 mode (syntactic predicates).
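
As a rough illustration of the -preamble option, the sketch below
defines BLOCK_PREAMBLE as a statement which counts block entries. The
counter blockEntryCount and the routine demoRule are hypothetical.
demoRule is a hand-written stand-in showing where the macro would
appear according to the description above. It is not code produced by
antlr.

```cpp
// Counter updated by the preamble (hypothetical; any statement would do).
static int blockEntryCount = 0;

// User-supplied definition of the macro inserted by antlr -preamble.
#define BLOCK_PREAMBLE ++blockEntryCount;

// Hand-written stand-in for a generated rule with one nested block.
void demoRule() {
    BLOCK_PREAMBLE          /* start of the rule */
    {
        BLOCK_PREAMBLE      /* start of a nested block */
    }
}
```

Because the macro is expanded by the C preprocessor, it can be defined
as nothing at all when the extra code is not wanted.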
130 #252. (MR21) Check for null pointer in trace routine
132 When some trace options are used when the parser is generated
133 without the trace enabled, the current rule name may be a
134 NULL pointer. A guard was added to check for this in
137 Reported by Douglas E. Forester (dougf@projtech.com).
139 #251. (MR21) Changes to #define zzTRACE_RULES
141 The macro zzTRACE_RULES was being used to pass information to
142 AParser.h. If this preprocessor symbol was not properly
143 set the first time AParser.h was #included, the declaration
144 of zzTRACEdata would be omitted (it is used by the -gd option).
145 Subsequent #includes of AParser.h would be skipped because of
146 the #ifdef guard, so the declaration of zzTracePrevRuleName would
147 never be made. The result was that proper compilation was very
150 The declaration of zzTRACEdata was made unconditional and the
151 problem of removing unused declarations will be left to optimizers.
153 Diagnosed by Douglas E. Forester (dougf@projtech.com).
155 #250. (MR21) Option for EXPERIMENTAL change to error sets for blocks
157 The antlr option -mrblkerr turns on an experimental feature
158 which is supposed to provide more accurate syntax error messages
159 for k=1, ck=1 grammars. When used with k>1 or ck>1 grammars the
160 behavior should be no worse than the current behavior.
162 There is no problem with the matching of elements or the computation
163 of prediction expressions in pccts. The task is only one of listing
164 the most appropriate tokens in the error message. The error sets used
165 in pccts error messages are approximations of the exact error set when
166 optional elements in (...)* or (...)+ are involved. While entirely
167 correct, the error messages are sometimes not 100% accurate.
169 There is also a minor philosophical issue. For example, suppose the
170 grammar expects the token to be an optional A followed by Z, and it
171 is X. X, of course, is neither A nor Z, so an error message is appropriate.
172 Is it appropriate to say "Expected Z" ? It is correct, it is accurate,
173 but it is not complete.
175 When k>1 or ck>1 the problem of providing the exactly correct
176 list of tokens for the syntax error messages ends up becoming
177 equivalent to evaluating the prediction expression for the
178 alternatives twice. However, for k=1 ck=1 grammars the prediction
179 expression can be computed easily and evaluated cheaply, so I
180 decided to try implementing it to satisfy a particular application.
181 This application uses the error set in an interactive command language
182 to provide prompts which list the alternatives available at that
183 point in the parser. The user can then enter additional tokens to
184 complete the command line. To do this required more accurate error
185 sets than previously provided by pccts.
187 In some cases the default pccts behavior may lead to more robust error
188 recovery or clearer error messages than having the exact set of tokens.
189 This is because (a) features like -ge allow the use of symbolic names for
190 certain sets of tokens, so having extra tokens may simply obscure things
191 and (b) the error set is used to resynchronize the parser, so a good
192 choice is sometimes more important than having the exact set.
194 Consider the following example:
196 Note: All example code has been abbreviated
197 to the absolute minimum in order to make the
202 The generated code resembles:
204 old new (with -mrblkerr)
205 ------------- --------------------
206 for (;;) { for (;;) {
209 match(Z); if (! A and ! Z) then
216 old message: Found X, expected Z
217 new message: Found X, expected A, Z
223 old new (with -mrblkerr)
224 ------------- --------------------
225 for (;;) { for (;;) {
226 if (!A and !B) break; if (!A and !B) break;
227 if (...) { if (...) {
228 <same ...> <same ...>
231 FAIL(...{A,B,Z}...) FAIL(...{A,B}...);
234 match(B); if (! A and ! B and !Z) then
240 old message: Found X, expected Z
241 new message: Found X, expected A, B, Z
243 old message: Found X, expected Z
244 new message: Found X, expected A, B, Z
246 This includes the choice of looping back to the
249 The code for plus blocks:
253 The generated code resembles:
255 old new (with -mrblkerr)
256 ------------- --------------------
259 } while (A) } while (A)
260 match(Z); if (! A and ! Z) then
266 old message: Found X, expected Z
267 new message: Found X, expected A, Z
269 This includes the choice of looping back to the
276 old new (with -mrblkerr)
277 ------------- --------------------
281 } else if (B) { <same>
284 if (cnt > 1) break; <same>
285 FAIL(...{A,B,Z}...) FAIL(...{A,B}...);
290 match(Z); if (! A and ! B and !Z) then
296 old message: Found X, expected A, B, Z
297 new message: Found X, expected A, B
299 old message: Found X, expected Z
300 new message: Found X, expected A, B, Z
302 This includes the choice of looping back to the
305 #249. (MR21) Changes for DEC/VMS systems
307 Jean-François Piéronne (jfp@altavista.net) has updated some
308 VMS related command files and fixed some minor problems related
309 to building pccts under the DEC/VMS operating system. For DEC/VMS
310 users the most important differences are:
312 a. Revised makefile.vms
313 b. Revised genMMS for generating VMS style makefiles.
315 #248. (MR21) Generate symbol for first set of an alternative
317 pccts can generate a symbol which represents the tokens which may
318 appear at the start of a block:
320 rr : #FirstSetSymbol(rr_FirstSet) ( Foo | Bar ) ;
322 This will generate the symbol rr_FirstSet of type SetWordType with
323 elements Foo and Bar set. The bits can be tested using code similar
326 if (set_el(Foo, &rr_FirstSet)) { ...
328 This can be combined with the C array zztokens[] or the C++ routine
329 tokenName() to get the print name of the token in the first set.
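
A sketch of how the print names might be listed. Everything in it is a
stand-in: DemoSetWord takes the place of SetWordType, demoSetEl takes
the place of set_el, demoTokenNames takes the place of zztokens[], and
the bit layout (token t in byte t/8, bit t%8) is an assumption made for
the illustration rather than the layout used by pccts.

```cpp
#include <string>

typedef unsigned char DemoSetWord;          // stand-in for SetWordType

enum DemoToken { Eof = 1, Foo = 2, Bar = 3, Baz = 4, DemoTokenCount = 5 };

// Stand-in for the token name table.
static const char *demoTokenNames[DemoTokenCount] =
    { "", "Eof", "Foo", "Bar", "Baz" };

// One bit per token: token t lives in byte t/8, bit t%8 (assumed layout).
static bool demoSetEl(int token, const DemoSetWord *set) {
    return (set[token / 8] & (1u << (token % 8))) != 0;
}

// Build a comma-separated list of the print names of tokens in the set.
std::string firstSetNames(const DemoSetWord *set) {
    std::string out;
    for (int t = 1; t < DemoTokenCount; ++t) {
        if (demoSetEl(t, set)) {
            if (!out.empty()) out += ", ";
            out += demoTokenNames[t];
        }
    }
    return out;
}
```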
331 The size of the set is given by the newly added enum SET_SIZE, a
332 protected member of the generated parser's class. The number of
333 elements in the generated set will not be exactly equal to the
334 value of SET_SIZE because of synthetic tokens created by #tokclass,
335 #errclass, the -ge option, and meta-tokens such as epsilon, and
338 The #FirstSetSymbol must appear immediately before a block
339 such as (...)+, (...)*, {...}, and (...). It may not appear
340 immediately before a token, a rule reference, or action. However
341 a token or rule reference can be enclosed in a (...) in order to
342 make the use of #FirstSetSymbol legal.
344 rr_bad : #FirstSetSymbol(rr_bad_FirstSet) Foo; // Illegal
346 rr_ok : #FirstSetSymbol(rr_ok_FirstSet) (Foo); // Legal
348 Do not confuse FirstSetSymbol sets with the sets used for testing
349 lookahead. The sets used for FirstSetSymbol have one element per bit,
350 so the number of bytes is approximately the largest token number
351 divided by 8. The sets used for testing lookahead store 8 lookahead
352 sets per byte, so the length of the array is approximately the largest
355 If there is demand, a similar routine for follow sets can be added.
357 #247. (MR21) Misleading error message on syntax error for optional elements.
359 Prior to MR21, tokens which were optional did not appear in syntax
360 error messages if the block which immediately followed detected a
363 Consider the following grammar which accepts Number, Word, and Other:
367 For this rule the code resembles:
369 if (LA(1) == Number) {
375 Prior to MR21, the error message for input "$ a" would be:
377 line 1: syntax error at "$" missing Word
379 With MR21 the message will be:
381 line 1: syntax error at "$" expecting Word, Number.
383 The generated code resembles:
385 if ( (LA(1)==Number) ) {
390 if ( (LA(1)==Word) ) {
394 FAIL(... message for both Number and Word ...);
399 The code generated for optional blocks in MR21 is slightly longer
400 than the previous versions, but it should give better error messages.
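
The MR21 behavior can be modelled by hand, assuming a rule of the
shape "optional Number, then Word". The routine below is not generated
code. The name expectedAfterFailure, the DemoTok enum, and the End
marker are hypothetical. It returns an empty string when the input is
accepted and otherwise the list of tokens which would be reported as
expected.

```cpp
#include <string>

enum DemoTok { Number, Word, Other, End };

// Hand-written model of the MR21 shape for "optional Number, then Word".
// la points at the lookahead tokens, terminated by End.
std::string expectedAfterFailure(const DemoTok *la) {
    int i = 0;
    if (la[i] == Number) {
        ++i;                       // optional element is present
        if (la[i] == Word) return "";
        return "Word";             // Number already seen: only Word fits
    } else if (la[i] == Word) {
        return "";                 // optional element is absent
    }
    return "Word, Number";         // neither: report both possibilities
}
```

With the first token neither Number nor Word, the model lists both
tokens, which corresponds to the "expecting Word, Number" message
shown above.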
402 The code generated for:
406 should now be *identical* to:
410 which was not the case prior to MR21.
412 Reported by Sue Marvin (sue@siara.com).
414 #246. (Changed in MR21) Use of $(MAKE) for calls to make
416 Calls to make from the makefiles were replaced with $(MAKE)
417 because of problems when using gmake.
419 Reported with fix by Sunil K.Vallamkonda (sunil@siara.com).
421 #245. (Changed in MR21) Changes to genmk
423 The following command line options have been added to genmk:
427 To add a user's C or C++ files into makefile automatically.
428 The list of files must be enclosed in apostrophes. This
429 option may be specified multiple times.
433 The name of the compiler to use for $(CCC) or $(CC). The
434 default in C++ mode is "CC". The default in C mode is "cc".
438 The value for $(PCCTS), the pccts directory. The default
441 Contributed by Tomasz Babczynski (t.babczynski@ict.pwr.wroc.pl).
443 #244. (Changed in MR21) Rename variable "not" in antlr.g
445 When antlr.g is compiled with a C++ compiler, a variable named
446 "not" causes problems. Reported by Sinan Karasu
447 (sinan.karasu@boeing.com).
449 #243 (Changed in MR21) Replace recursion with iteration in zzfree_ast
451 Another refinement to zzfree_ast in ast.c to limit recursion.
453 NAKAJIMA Mutsuki (muc@isr.co.jp).
456 #242. (Changed in MR21) LineInfoFormatStr
458 Added an #ifndef/#endif around LineInfoFormatStr in pcctscfg.h.
460 #241. (Changed in MR21) Changed macro PURIFY to a no-op
462 ***********************
463 *** NOT IMPLEMENTED ***
464 ***********************
466 The PURIFY macro was changed to a no-op because it was causing
467 problems when passing C++ objects.
471 #define PURIFY(r,s) memset((char *) &(r),'\\0',(s));
475 #define PURIFY(r,s) /* nothing */
478 #240. (Changed in MR21) sorcerer/h/sorcerer.h _MATCH and _MATCHRANGE
480 Added test for NULL token pointer.
482 Suggested by Peter Keller (keller@ebi.ac.uk)
484 #239. (Changed in MR21) C++ mode AParser::traceGuessFail
486 If tracing is turned on when the code has been generated
487 without trace code, a failed guess generates a trace report
488 even though there are no other trace reports. This change
489 makes the behavior consistent with other parts of the
492 Reported by David Wigg (wiggjd@sbu.ac.uk).
494 #238. (Changed in MR21) Namespace version #include files
496 Changed reference from CStdio to cstdio (and other
497 #include file names) in the namespace version of pccts.
498 Should have known better.
500 #237. (Changed in MR21) ParserBlackBox(FILE*)
502 In the past, ParserBlackBox would close the FILE in the dtor
503 even though it was not opened by ParserBlackBox. The problem
504 is that there were two constructors, one which accepted a file
505 name and did an fopen, the other which accepted a FILE and did
506 not do an fopen. There is now an extra member variable which
507 remembers whether ParserBlackBox did the open or not.
509 Suggested by Mike Percy (mpercy@scires.com).
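
The idea of the extra member variable can be sketched as follows.
DemoFileOwner is a hypothetical class written for this note and is not
the ParserBlackBox source. The member name openedHere_ is likewise an
assumption. The destructor closes the FILE only when the object did the
fopen itself.

```cpp
#include <cstdio>

// Hypothetical stand-in; not the real ParserBlackBox.
class DemoFileOwner {
public:
    // Opens the file itself, so it is responsible for closing it.
    explicit DemoFileOwner(const char *fileName)
        : file_(std::fopen(fileName, "r")), openedHere_(true) {}

    // Borrows a FILE opened by the caller; must not close it.
    explicit DemoFileOwner(std::FILE *f)
        : file_(f), openedHere_(false) {}

    ~DemoFileOwner() {
        if (openedHere_ && file_ != NULL) std::fclose(file_);
    }

    bool openedHere() const { return openedHere_; }

private:
    std::FILE *file_;
    bool openedHere_;
    DemoFileOwner(const DemoFileOwner &);             // not copyable
    DemoFileOwner &operator=(const DemoFileOwner &);  // not assignable
};
```

A caller who passes in stdout (or any FILE it opened itself) can keep
using that FILE after the object has been destroyed.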
511 #236. (Changed in MR21) tmake now reports down pointer problem
513 When ASTBase::tmake attempts to update the down pointer of
514 an AST it checks to see if the down pointer is NULL. If it
515 is not NULL it does not do the update and returns NULL.
516 An attempt to update the down pointer is almost always a
517 result of a user error. This can lead to difficult to find
518 problems during tree construction.
520 With this change, the routine calls a virtual function
521 reportOverwriteOfDownPointer() which calls panic to
522 report the problem. Users who want the old behavior can
523 redefine the virtual function in their AST class.
525 Suggested by Sinan Karasu (sinan.karasu@boeing.com)
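
A sketch of the hook, using hypothetical classes DemoNode and
QuietNode rather than ASTBase. In this model the default hook only
counts the report so that the example can be run. According to the
description above the real routine calls panic. The derived class shows
how a user could restore the old, silent behavior.

```cpp
#include <cstddef>

// Hypothetical stand-in for an AST node; not the real ASTBase.
class DemoNode {
public:
    DemoNode() : down_(NULL), overwriteReports_(0) {}
    virtual ~DemoNode() {}

    // Attach a child; refuse to overwrite an existing down pointer.
    bool setDown(DemoNode *child) {
        if (down_ != NULL) {
            reportOverwriteOfDownPointer();
            return false;
        }
        down_ = child;
        return true;
    }

    int overwriteReports() const { return overwriteReports_; }

protected:
    // Default hook: record the problem (the real routine calls panic).
    virtual void reportOverwriteOfDownPointer() { ++overwriteReports_; }

private:
    DemoNode *down_;
    int overwriteReports_;
};

// A user class which restores the old behavior by doing nothing.
class QuietNode : public DemoNode {
protected:
    virtual void reportOverwriteOfDownPointer() {}
};
```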
527 #235. (Changed in MR21) Made ANTLRParser::resynch() virtual
529 Suggested by Jerry Evans (jerry@swsl.co.uk).
531 #234. (Changed in MR21) Implicit int for function return value
533 ATokenBuffer::bufferSize() did not specify a type for the
536 Reported by Hai Vo-Ba (hai@fc.hp.com).
538 #233. (Changed in MR20) Converted to MSVC 6.0
540 Due to external circumstances I have had to convert to MSVC 6.0.
541 The MSVC 5.0 project files (.dsw and .dsp) have been retained as
542 xxx50.dsp and xxx50.dsw. The MSVC 6.0 files are named xxx60.dsp
543 and xxx60.dsw (where xxx is related to the directory/project).
545 #232. (Changed in MR20) Make setwd bit vectors protected in parser.h
547 The access for the setwd array in the parser header was not
548 specified. As a result, it would depend on the code which
549 preceded it. In MR20 it will always have access "protected".
551 Reported by Piotr Eljasiak (eljasiak@zt.gdansk.tpsa.pl).
553 #231. (Changed in MR20) Error in token buffer debug code.
555 When token buffer debugging is selected via the pre-processor
556 symbol DEBUG_TOKENBUFFER there is an erroneous check in
559 #ifdef DEBUG_TOKENBUFFER
560 if (i >= inputTokens->bufferSize() ||
561 inputTokens->minTokens() < LLk ) /* MR20 Was "<=" */
565 Reported by David Wigg (wiggjd@sbu.ac.uk).
567 #230. (Changed in MR20) Fixed problem with #define for -gd option
569 There was an error in setting zzTRACE_RULES for the -gd (trace) option.
571 Reported by Gary Funck (gary@intrepid.com).
573 #229. (Changed in MR20) Additional "const" for literals
575 "const" was added to the token name literal table.
576 "const" was added to some panic() and similar routine
578 #228. (Changed in MR20) dlg crashes on "()"
580 The following token definition will cause DLG to crash.
584 When there is a syntax error in a regular expression
585 many of the dlg routines return a structure which has
586 null pointers. When this is accessed by callers it
589 I have attempted to fix the more common cases.
591 Reported by Mengue Olivier (dolmen@bigfoot.com).
593 #227. (Changed in MR20) Array overwrite
595 Steveh Hand (sassth@unx.sas.com) reported a problem which
596 was traced to a temporary array which was not properly
597 resized for deeply nested blocks. This has been fixed.
599 #226. (Changed in MR20) -pedantic conformance
601 G. Hobbelt (i_a@mbh.org) and THM made many, many minor
602 changes to create prototypes for all the functions and
603 bring antlr, dlg, and sorcerer into conformance with
604 the gcc -pedantic option.
606 This may require users to add pccts/h/pcctscfg.h to some
607 files or makefiles in order to have __USE_PROTOS defined.
609 #225 (Changed in MR20) AST stack adjustment in C mode
611 The fix in #214 for AST stack adjustment in C mode missed
614 Reported with fix by Ger Hobbelt (i_a@mbh.org).
616 #224 (Changed in MR20) LL(1) and LL(2) with #pragma approx
618 This may take a record for the oldest, most trivial, lexical
619 error in pccts. The regular expressions for LL(1) and LL(2)
620 lacked an escape for the left and right parenthesis.
622 Reported by Ger Hobbelt (i_a@mbh.org).
624 #223 (Changed in MR20) Addition of IBM_VISUAL_AGE directory
626 Build files for antlr, dlg, and sorcerer under IBM Visual Age
627 have been contributed by Anton Sergeev (ags@mlc.ru). They have
628 been placed in the pccts/IBM_VISUAL_AGE directory.
630 #222 (Changed in MR20) Replace __STDC__ with __USE_PROTOS
632 Most occurrences of __STDC__ replaced with __USE_PROTOS due to
633 complaints from several users.
635 #221 (Changed in MR20) Added #include for DLexerBase.h to PBlackBox.
637 Added #include for DLexerBase.h to PBlackBox.
639 #220 (Changed in MR19) strcat arguments reversed in #pred parse
641 The arguments to strcat are reversed when creating a print
642 name for a hash table entry for use with #pred feature.
644 Problem diagnosed and fix reported by Scott Harrington
645 (seh4@ix.netcom.com).
647 #219. (Changed in MR19) C Mode routine zzfree_ast
649 Changes to reduce use of recursion for AST trees with only right
650 links or only left links in the C mode routine zzfree_ast.
652 Implemented by SAKAI Kiyotaka (ksakai@isr.co.jp).
654 #218. (Changed in MR19) Changes to support unsigned char in C mode
656 Changes to antlr.h and err.h to fix omissions in use of zzchar_t
658 Implemented by SAKAI Kiyotaka (ksakai@isr.co.jp).
660 #217. (Changed in MR19) Error message when dlg -i and -CC options selected
662 *** This change was rescinded by item #257 ***
664 The parsers generated by pccts in C++ mode are not able to support the
665 interactive lexer option (except, perhaps, when using the deferred fetch
666 parser option, Item #216).
668 DLG now warns when both -i and -CC are selected.
670 This warning was suggested by David Venditti (07751870267-0001@t-online.de).
672 #216. (Changed in MR19) Defer token fetch for C++ mode
674 Implemented by Volker H. Simonis (simonis@informatik.uni-tuebingen.de)
676 Normally, pccts keeps the lookahead token buffer completely filled.
677 This requires max(k,ck) tokens of lookahead. For some applications
678 this can cause deadlock problems. For example, there may be cases
679 when the parser can't tell when the input has been completely consumed
680 until the parse is complete, but the parse can't be completed because
681 the input routines are waiting for additional tokens to fill the
684 When the ANTLRParser class is built with the pre-processor option
685 ZZDEFER_FETCH defined, the fetch of new tokens by consume() is deferred
686 until LA(i) or LT(i) is called.
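
The effect of deferring the fetch can be modelled for k=1 as shown
below. DemoLookahead is a hypothetical class written for this note. It
is not the ANTLRParser implementation and it ignores LT(i), guess mode,
and k>1. The model only shows that consume() records that a token is
owed, and that the token is actually fetched when LA1() is next called.

```cpp
// Hypothetical k=1 model of deferred fetching; not the ANTLRParser code.
class DemoLookahead {
public:
    DemoLookahead(const int *tokens, int count)
        : tokens_(tokens), count_(count), next_(0),
          current_(-1), pending_(1), fetches_(0) {}

    // Deferred: only note that one more token is owed.
    void consume() { ++pending_; }

    // Fetch whatever is owed, then return the current lookahead token.
    int LA1() {
        while (pending_ > 0) {
            current_ = (next_ < count_) ? tokens_[next_++] : -1; // -1 = end
            ++fetches_;
            --pending_;
        }
        return current_;
    }

    // Number of tokens requested from the input so far.
    int fetches() const { return fetches_; }

private:
    const int *tokens_;
    int count_;
    int next_;
    int current_;
    int pending_;
    int fetches_;
};
```

In this model nothing is read from the input until the first call to
LA1(), so a parser which has matched its last token never waits for a
token it does not need.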
688 To test whether this option has been built into the ANTLRParser class
689 use "isDeferFetchEnabled()".
691 Using the -gd trace option with the default tracein() and traceout()
692 routines will defeat the effort to defer the fetch because the
693 trace routines print out information about the lookahead token at
694 the start of the rule.
696 Because the tracein and traceout routines are virtual it is
697 easy to redefine them in your parser:
701 virtual void tracein(ANTLRChar * ruleName)
702 { fprintf(stderr,"Entering: %s\n", ruleName); }
703 virtual void traceout(ANTLRChar * ruleName)
704 { fprintf(stderr,"Leaving: %s\n", ruleName); }
707 The originals for those routines are in pccts/h/AParser.cpp
709 This requires use of the dlg option -i (interactive lexer).
711 This is experimental. The interaction with guess mode (syntactic
712 predicates) is not known.
714 #215. (Changed in MR19) Addition of reset() to DLGLexerBase
716 There was no obvious way to reset the lexer for reuse. The
717 reset() method now does this.
719 Suggested by David Venditti (07751870267-0001@t-online.de).
721 #214. (Changed in MR19) C mode: Adjust AST stack pointer at exit
723 In C mode the AST stack pointer needs to be reset if there will
724 be multiple calls to the ANTLRx macros.
726 Reported with fix by Paul D. Smith (psmith@baynetworks.com).
728 #213. (Changed in MR18) Fatal error with -mrhoistk (k>1 hoisting)
730 When rearranging code I forgot to un-comment a critical line of
731 code that handles hoisting of predicates with k>1 lookahead. This
734 Reported by Reinier van den Born (reinier@vnet.ibm.com).
736 #212. (Changed in MR17) Mac related changes by Kenji Tanaka
738 Kenji Tanaka (kentar@osa.att.ne.jp) has made a number of changes for
741 a. The following Macintosh MPW files aid in installing pccts on Mac:
749 pccts/antlr/antlr68K.make
750 pccts/antlr/antlrPPC.make
753 pccts/dlg/dlg68K.make
754 pccts/dlg/dlgPPC.make
757 pccts/sorcerer/sor68K.make
758 pccts/sorcerer/sorPPC.make
760 They completely replace the previous Mac installation files.
762 b. The most significant is a change in the MAC_FILE_CREATOR symbol
765 old: #define MAC_FILE_CREATOR 'MMCC' /* Metrowerks C/C++ Text files */
766 new: #define MAC_FILE_CREATOR 'CWIE' /* Metrowerks C/C++ Text files */
768 c. Added calls to special_fopen_actions() where necessary.
770 #211. (Changed in MR16a) C++ style comment in dlg
774 #210. (Changed in MR16a) Sor accepts \r\n, \r, or \n for end-of-line
776 A user requested that Sorcerer be changed to accept other forms
779 #209. (Changed in MR16) Name of files changed.
781 Old: CHANGES_FROM_1.33
782 New: CHANGES_FROM_133.txt
785 New: KNOWN_PROBLEMS.txt
787 #208. (Changed in MR16) Change in use of pccts #include files
789 There were problems with MS DevStudio when mixing Sorcerer and
790 PCCTS in the same source file. The problem is caused by the
791 redefinition of setjmp in the MS header file setjmp.h. In
792 setjmp.h the pre-processor symbol setjmp was redefined to be
793 _setjmp. A later effort to execute #include <setjmp.h> resulted
794 in an effort to #include <_setjmp.h>. I'm not sure whether this
795 is a bug or a feature. In any case, I decided to fix it by
796 avoiding the use of pre-processor symbols in #include statements
797 altogether. This has the added benefit of making pre-compiled
800 I've replaced statements:
802 old: #include PCCTS_SETJMP_H
803 new: #include "pccts_setjmp.h"
805 Where pccts_setjmp.h contains:
807 #ifndef __PCCTS_SETJMP_H__
808 #define __PCCTS_SETJMP_H__
810 #ifdef PCCTS_USE_NAMESPACE_STD
818 A similar change has been made for other standard header files
819 required by pccts and sorcerer: stdlib.h, stdarg.h, stdio.h, etc.
821 Reported by Jeff Vincent (JVincent@novell.com) and Dale Davis
822 (DalDavis@spectrace.com).
824 #207. (Changed in MR16) dlg reports an invalid range for: [\0x00-\0xff]
826 dlg will report that this is an invalid range.
828 Diagnosed by Piotr Eljasiak (eljasiak@no-spam.zt.gdansk.tpsa.pl):
830 I think this problem is not specific to unsigned chars
831 because dlg reports no error for the range [\0x00-\0xfe].
833 I've found that information on range is kept in field
834 letter (unsigned char) of Attrib struct. Unfortunately
835 the letter value internally is for some reasons increased
836 by 1, so \0xff is represented here as 0.
838 That's why dlg complains about the range [\0x00-\0xff] in
841 if ($$.letter > $2.letter) {
842 error("invalid range ", zzline);
847 if ($$.letter > $2.letter && 255 != $$2.letter) {
848 error("invalid range ", zzline);
851 #206. (Changed in MR16) Free zzFAILtext in ANTLRParser destructor
853 The ANTLRParser destructor now frees zzFAILtext.
855 Problem and fix reported by Manfred Kogler (km@cast.uni-linz.ac.at).
857 #205. (Changed in MR16) DLGStringReset argument now const
859 Changed: void DLGStringReset(DLGChar *s) {...}
860 To: void DLGStringReset(const DLGChar *s) {...}
862 Suggested by Dale Davis (daldavis@spectrace.com)
864 #204. (Changed in MR15a) Change __WATCOM__ to __WATCOMC__ in pcctscfg.h
866 Reported by Oleg Dashevskii (olegdash@my-dejanews.com).
868 #203. (Changed in MR15) Addition of sorcerer to distribution kit
870 I have finally caved in to popular demand. The pccts 1.33mr15
871 kit will include sorcerer. The separate sorcerer kit will be
874 #202. (Changed in MR15) Organization of MS Dev Studio Projects in Kit
876 Previously there was one workspace that contained projects for
877 all three parts of pccts: antlr, dlg, and sorcerer. Now each
878 part (and directory) has its own workspace/project and there
879 is an additional workspace/project to build a library from the
880 .cpp files in the pccts/h directory.
882 The library build will create pccts_debug.lib or pccts_release.lib
883 according to the configuration selected.
885 If you don't want to build pccts 1.33MR15 you can download a
886 ready-to-run kit for win32 from http://www.polhode.com/win32.zip.
887 The ready-to-run for win32 includes executables, a pre-built static
888 library for the .cpp files in the pccts/h directory, and a sample
891 You will need to define the environment variable PCCTS to point to
892 the root of the pccts directory hierarchy.
894 #201. (Changed in MR15) Several fixes by K.J. Cummings (cummings@peritus.com)
896 Generation of SETJMP rather than SETJMP_H in gen.c.
898 (Sor B19) Declaration of ref_vars_inits for ref_var_inits in
899 pccts/sorcerer/sorcerer.h.
901 #200. (Changed in MR15) Remove operator=() in AToken.h
903 User reported that WatCom couldn't handle use of
904 explicit operator =(). Replace with equivalent
907 #199. (Changed in MR15) Don't allow use of empty #tokclass
909 Change antlr.g to disallow empty #tokclass sets.
911 Reported by Manfred Kogler (km@cast.uni-linz.ac.at).
913 #198. Revised ANSI C grammar due to efforts by Manuel Kessler
915 Manuel Kessler (mlkessler@cip.physik.uni-wuerzburg.de)
917 Allow trailing ... in function parameter lists.
919 Allow old-style function declarations.
920 Support cv-qualified pointers.
921 Better checking of combinations of type specifiers.
922 Release of memory for local symbols on scope exit.
923 Allow input file name on command line as well as by redirection.
925 and other miscellaneous tweaks.
927 This is not part of the pccts distribution kit. It must be
928 downloaded separately from:
930 http://www.polhode.com/ansi_mr15.zip
932 #197. (Changed in MR14) Resetting the lookahead buffer of the parser
934 Explanation and fix by Sinan Karasu (sinan.karasu@boeing.com)
936 Consider the code used to prime the lookahead buffer LA(i)
937 of the parser when init() is called:
944 for(i=1;i<=LLk; i++) consume();
946 //lap = 0; // MR14 - Sinan Karasu (sinan.karasu@boeing.com)
947 //labase = 0; // MR14
951 When the parser is instantiated, lap=0,labase=0 is set.
953 The "for" loop runs LLk times. In consume(), lap = lap +1 (mod LLk) is
954 computed. Therefore, lap(before the loop) == lap (after the loop).
956 Now the only problem comes in when one does an init() of the parser
957 after an Eof has been seen. At that time, lap could be non zero.
958 Assume it was lap==1. Now we do a prime_lookahead(). If LLk is 2,
963 NLA = inputTokens->getToken()->getType();
965 lap = (lap+1)&(LLk-1);
970 token_type[lap&(LLk-1)] = inputTokens->getToken()->getType();
972 lap = (lap+1)&(LLk-1);
974 so now we prime locations 1 and 2. In prime_lookahead it used to set
975 lap=0 and labase=0. Now, the next token will be read from location 0,
976 NOT 1 as it should have been.
978 This was never caught before, because if a parser is just instantiated,
979 then lap and labase are 0, the offending assignment lines are
980 basically no-ops, since the for loop wraps around back to 0.
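
The argument above can be checked with a small model of the circular
lookahead buffer. DemoLookaheadBuffer is a hypothetical simplification
with LLk fixed at 2. The token source is a counter starting at 100,
labase is omitted, and only LA(1) is modelled. It is not the
ANTLRParser code.

```cpp
// Hypothetical model of the circular lookahead buffer (LLk must be a
// power of two so that "& (LLk-1)" acts as "mod LLk").
struct DemoLookaheadBuffer {
    enum { LLk = 2 };
    int token_type[LLk];
    int lap;          // position of LA(1) in the circular buffer
    int nextInput;    // stand-in token source: 100, 101, 102, ...

    DemoLookaheadBuffer() : lap(0), nextInput(100) {
        for (int i = 0; i < LLk; ++i) token_type[i] = 0;
    }

    void consume() {
        token_type[lap & (LLk - 1)] = nextInput++;
        lap = (lap + 1) & (LLk - 1);
    }

    // resetLap == true models the old behavior (lap = 0 after the loop).
    void prime_lookahead(bool resetLap) {
        for (int i = 1; i <= LLk; ++i) consume();
        if (resetLap) lap = 0;
    }

    int LA1() const { return token_type[lap & (LLk - 1)]; }
};
```

Starting from lap==1, priming without the reset leaves LA(1) at the
first token fetched (100). With the reset, LA(1) is read from location
0, which holds the second token fetched (101).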
982 #196. (Changed in MR14) Problems with "(alpha)? beta" guess
984 Consider the following syntactic predicate in a grammar
985 with 2 tokens of lookahead (k=2 or ck=2):
987 rule : ( alpha )? beta ;
994 When antlr computes the prediction expression with one token
995 of lookahead for alts 1 and 2 of rule t it finds an ambiguity.
997 Because the grammar has a lookahead of 2 it tries to compute
998 two tokens of lookahead for alts 1 and 2 of t. Alt 1 clearly
999 has a lookahead of (T U). Alt 2 is one token long so antlr
1000 tries to compute the follow set of alt 2, which means finding
1001 the things which can follow rule t in the context of (alpha)?.
1002 This cannot be computed, because alpha is only part of a rule,
1003 and antlr can't tell what part of beta is matched by alpha and
1004 what part remains to be matched. Thus it is impossible for antlr
1005 to properly determine the follow set of rule t.
1007 Prior to 1.33MR14, the follow of (alpha)? was computed as
1008 FIRST(beta) as a result of the internal representation of
1011 With MR14 the follow set will be the empty set for that context.
1013 Normally, one expects a rule appearing in a guess block to also
1014 appear elsewhere. When the follow context for this other use
1015 is "ored" with the empty set, the context from the other use
1016 results, and a reasonable follow context results. However if
1017 there is *no* other use of the rule, or it is used in a different
1018 manner then the follow context will be inaccurate - it was
1019 inaccurate even before MR14, but it will be inaccurate in a
1022 For the example given earlier, a reasonable way to rewrite the
1025 rule : ( alpha )? beta
1032 If there are no other uses of the rule appearing in the guess
1033 block it will generate a test for EOF - a workaround for
1034 representing a null set in the lookahead tests.
1036 If you encounter such a problem you can use the -alpha option
1037 to get additional information:
1039 line 2: error: not possible to compute follow set for alpha
1040 in an "(alpha)? beta" block.
1042 With the antlr -alpha command line option the following information
1043 is inserted into the generated file:
1047 Trace of references leading to attempt to compute the follow set of
1048 alpha in an "(alpha)? beta" block. It is not possible for antlr to
1049 compute this follow set because it is not known what part of beta has
1050 already been matched by alpha and what part remains to be matched.
1052 Rules which make use of the incorrect follow set will also be incorrect
1054 1 #token T alpha/2 line 7 brief.g
1055 2 end alpha alpha/3 line 8 brief.g
1056 2 end (...)? block at start/1 line 2 brief.g
1060 At the moment, with the -alpha option selected the program marks
1061 any rules which appear in the trace back chain (above) as rules with
1062 possible problems computing follow set.
1064 Reported by Greg Knapen (gregory.knapen@bell.ca).
1066 #195. (Changed in MR14) #line directive not at column 1
1068 Under certain circumstances a predicate test could generate
1069 a #line directive which was not at column 1.
1071 Reported with fix by David Kågedal (davidk@lysator.liu.se)
1072 (http://www.lysator.liu.se/~davidk/).
1074 #194. (Changed in MR14) (C Mode only) Demand lookahead with #tokclass
1076 In C mode with the demand lookahead option there is a bug in the
1077 code which handles matches for #tokclass (zzsetmatch and
1080 The bug causes the lookahead pointer to get out of synchronization
1081 with the current token pointer.
1083 The problem was reported with a fix by Ger Hobbelt (hobbelt@axa.nl).
1085 #193. (Changed in MR14) Use of PCCTS_USE_NAMESPACE_STD
1087 The pcctscfg.h now contains the following definitions:
1089 #ifdef PCCTS_USE_NAMESPACE_STD
1090 #define PCCTS_STDIO_H <cstdio>
1091 #define PCCTS_STDLIB_H <cstdlib>
1092 #define PCCTS_STDARG_H <cstdarg>
1093 #define PCCTS_SETJMP_H <csetjmp>
1094 #define PCCTS_STRING_H <cstring>
1095 #define PCCTS_ASSERT_H <cassert>
1096 #define PCCTS_ISTREAM_H <istream>
1097 #define PCCTS_IOSTREAM_H <iostream>
1098 #define PCCTS_NAMESPACE_STD namespace std {}; using namespace std;
1099 #else
1100 #define PCCTS_STDIO_H <stdio.h>
1101 #define PCCTS_STDLIB_H <stdlib.h>
1102 #define PCCTS_STDARG_H <stdarg.h>
1103 #define PCCTS_SETJMP_H <setjmp.h>
1104 #define PCCTS_STRING_H <string.h>
1105 #define PCCTS_ASSERT_H <assert.h>
1106 #define PCCTS_ISTREAM_H <istream.h>
1107 #define PCCTS_IOSTREAM_H <iostream.h>
1108 #define PCCTS_NAMESPACE_STD
1109 #endif
1111 The runtime support in pccts/h uses these pre-processor symbols
1114 Also, antlr and dlg have been changed to generate code which uses
1115 these pre-processor symbols rather than having the names of the
1116 #include files hard-coded in the generated code.
1118 This required the addition of "#include pcctscfg.h" to a number of
1121 It appears that this sometimes causes problems for MSVC 5 in
1122 combination with the "automatic" option for pre-compiled headers.
1123 In such cases disable the "automatic" pre-compiled headers option.
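As a rough illustration of how code can pick up its headers through
these pre-processor symbols, the sketch below defines a cut-down
stand-in for pcctscfg.h inline. That stand-in, the decision to define
PCCTS_USE_NAMESPACE_STD, and the helper name formatTokenName are all
assumptions made so the example compiles on its own; real code would
include pcctscfg.h instead.

```cpp
// Cut-down stand-in for pcctscfg.h (assumption: real code would simply
// #include "pcctscfg.h"). Only three of the symbols are reproduced.
#define PCCTS_USE_NAMESPACE_STD

#ifdef PCCTS_USE_NAMESPACE_STD
#define PCCTS_STDIO_H   <cstdio>
#define PCCTS_STRING_H  <cstring>
#define PCCTS_ASSERT_H  <cassert>
#define PCCTS_NAMESPACE_STD namespace std {}; using namespace std;
#else
#define PCCTS_STDIO_H   <stdio.h>
#define PCCTS_STRING_H  <string.h>
#define PCCTS_ASSERT_H  <assert.h>
#define PCCTS_NAMESPACE_STD
#endif

// The header names are not hard-coded; they come from the symbols.
#include PCCTS_STDIO_H
#include PCCTS_STRING_H
#include PCCTS_ASSERT_H

PCCTS_NAMESPACE_STD

// Hypothetical helper: writes "TOKEN_<n>" into buf and returns its length.
// snprintf and strlen are found either in std (through the using
// directive) or in the global namespace, depending on the branch above.
size_t formatTokenName(char *buf, size_t bufSize, int tokenNumber) {
    assert(buf != 0 && bufSize > 0);
    snprintf(buf, bufSize, "TOKEN_%d", tokenNumber);
    return strlen(buf);
}
```

The same source compiles with either branch of the #ifdef, which is
the point of routing the #include names through the symbols.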
1125 Suggested by Hubert Holin (Hubert.Holin@Bigfoot.com).
1127 #192. (Changed in MR14) Change setText() to accept "const ANTLRChar *"
1129 Changed ANTLRToken::setText(ANTLRChar *) to setText(const ANTLRChar *).
1130 This allows literal strings to be used to initialize tokens. Since
1131 the usual token implementation (ANTLRCommonToken) makes a copy of the
1132 input string, this was an unnecessary limitation.
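The following sketch shows why a const parameter is sufficient when
the token copies its text. SketchToken is a hypothetical stand-in
written for this note, not the ANTLRCommonToken source, and the
typedef of ANTLRChar to char is an assumption.

```cpp
#include <cassert>
#include <cstring>

typedef char ANTLRChar;   // assumption: plain char in this sketch

// Hypothetical token class that keeps its own copy of the text.
class SketchToken {
public:
    SketchToken() : text_(0) {}
    ~SketchToken() { delete [] text_; }

    // The argument is only read, so "const ANTLRChar *" is enough and
    // string literals can be passed without a cast.
    void setText(const ANTLRChar *s) {
        assert(s != 0);
        ANTLRChar *copy = new ANTLRChar[std::strlen(s) + 1];
        std::strcpy(copy, s);
        delete [] text_;      // release the old copy after making the new one
        text_ = copy;
    }

    const ANTLRChar *getText() const { return text_; }

private:
    SketchToken(const SketchToken &);             // copying not needed here
    SketchToken &operator=(const SketchToken &);
    ANTLRChar *text_;
};
```

With this signature a call such as setText("begin") compiles cleanly,
whereas a non-const parameter would reject or warn about the literal
on conforming compilers.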
1134 Suggested by Bob McWhirter (bob@netwrench.com).
1136 #191. (Changed in MR14) HP/UX aCC compiler compatibility problem
1138 Needed to explicitly declare zzINF_DEF_TOKEN_BUFFER_SIZE and
1139 zzINF_BUFFER_TOKEN_CHUNK_SIZE as ints in pccts/h/AParser.cpp.
1141 Reported by David Cook (dcook@bmc.com).
1143 #190. (Changed in MR14) IBM OS/2 CSet compiler compatibility problem
1145 Name conflict with "_cs" in pccts/h/ATokenBuffer.cpp
1147 Reported by David Cook (dcook@bmc.com).
1149 #189. (Changed in MR14) -gxt switch in C mode
1151 The -gxt switch in C mode didn't work because of incorrect
1154 Reported by Sinan Karasu (sinan@boeing.com).
1156 #188. (Changed in MR14) Added pccts/h/DLG_stream_input.h
1158 This is a DLG stream class based on C++ istreams.
1160 Contributed by Hubert Holin (Hubert.Holin@Bigfoot.com).
1162 #187. (Changed in MR14) Rename config.h to pcctscfg.h
1164 The PCCTS configuration file has been renamed from config.h to
1165 pcctscfg.h. The problem with the original name is that it led
1166 to name collisions when pccts parsers were combined with other
1169 All of the runtime support routines in pccts/h/* have been
1170 changed to use the new name. Existing software can continue
1171 to use pccts/h/config.h. The contents of pccts/h/config.h is
1172 now just: #include "pcctscfg.h"
1174 I don't have a record of the user who suggested this.
1176 #186. (Changed in MR14) Pre-processor symbol DllExportPCCTS class modifier
1178 Classes in the C++ runtime support routines are now declared:
1180 class DllExportPCCTS className ....
1182 By default, the pre-processor symbol is defined as the empty
1183 string. This is for use by MSVC++ users to create DLL classes.
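A minimal sketch of the pattern follows. SketchCounter is a
hypothetical class used only for illustration, and the remark about
__declspec(dllexport) is an assumption about how a DLL build might
define the symbol, not something taken from the PCCTS sources.

```cpp
#include <cassert>

// By default the symbol expands to nothing. A DLL build with MSVC++
// might define it as __declspec(dllexport) before this point
// (assumption).
#ifndef DllExportPCCTS
#define DllExportPCCTS
#endif

// Hypothetical class declared in the style shown above.
class DllExportPCCTS SketchCounter {
public:
    SketchCounter() : count_(0) {}

    // Returns 1 on the first call, 2 on the second, and so on.
    int next() {
        assert(count_ >= 0);
        return ++count_;
    }

private:
    int count_;
};
```

When the symbol is empty the declaration is an ordinary class, so
builds that do not create DLLs are unaffected.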
1185 Suggested by Manfred Kogler (km@cast.uni-linz.ac.at).
1187 #185. (Changed in MR14) Option to not use PCCTS_AST base class for ASTBase
1189 Normally, the ASTBase class is derived from PCCTS_AST which contains
1190 functions useful to Sorcerer. If these are not necessary then the
1191 user can define the pre-processor symbol "PCCTS_NOT_USING_SOR" which
1192 will cause the ASTBase class to replace references to PCCTS_AST with
1193 references to ASTBase where necessary.
1195 The class ASTDoublyLinkedBase will contain a pure virtual function
1196 shallowCopy() that was formerly defined in class PCCTS_AST.
1198 Suggested by Bob McWhirter (bob@netwrench.com).
1200 #184. (Changed in MR14) Grammars with no tokens generate invalid tokens.h
1202 Reported by Hubert Holin (Hubert.Holin@bigfoot.com).
1204 #183. (Changed in MR14) -f to specify file with names of grammar files
1206 In DEC/VMS it is difficult to specify very long command lines.
1207 The -f option allows one to place the names of the grammar files
1208 in a data file in order to bypass limitations of the DEC/VMS
1209 command language interpreter.
1211 Addition supplied by Bernard Giroud (b_giroud@decus.ch).
1213 #182. (Changed in MR14) Output directory option for DEC/VMS
1215 Fix some problems with the -o option under DEC/VMS.
1217 Fix supplied by Bernard Giroud (b_giroud@decus.ch).
1219 #181. (Changed in MR14) Allow chars > 127 in DLGStringInput::nextChar()
1221 Changed DLGStringInput to cast the character using (unsigned char)
1222 so that languages with character codes greater than 127 work
1225 Suggested by Manfred Kogler (km@cast.uni-linz.ac.at).
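The effect of the (unsigned char) cast in item #181 can be sketched
as follows. SketchStringInput is a hypothetical stand-in rather than
the DLGStringInput source, and the use of -1 as the end-of-input
value is an assumption of this example.

```cpp
#include <cassert>

// Hypothetical string input class; not the PCCTS implementation.
class SketchStringInput {
public:
    explicit SketchStringInput(const char *s) : p_(s) { assert(s != 0); }

    // Returns the next character as a non-negative int, or -1 at the
    // end of the string. Without the cast, a char above 127 becomes
    // negative where plain char is signed, and 0xFF would be
    // indistinguishable from -1.
    int nextChar() {
        if (*p_ == '\0') return -1;
        return (int)(unsigned char)*p_++;
    }

private:
    const char *p_;
};
```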
1227 #180. (Added in MR14) ANTLRParser::getEofToken()
1229 Added "ANTLRToken ANTLRParser::getEofToken() const" to match the
1230 setEofToken routine.
1232 Requested by Manfred Kogler (km@cast.uni-linz.ac.at).
1234 #179. (Fixed in MR14) Memory leak for BufFileInput subclass of DLGInputStream
1236 The BufFileInput class described in Item #142 neglected to release
1237 the allocated buffer when an instance was destroyed.
1239 Reported by Manfred Kogler (km@cast.uni-linz.ac.at).
1241 #178. (Fixed in MR14) Bug in "(alpha)? beta" guess blocks first sets
1243 In 1.33 vanilla, and all maintenance releases prior to MR14
1244 there is a bug in the handling of guess blocks which use the
1249 inside a (...)*, (...)+, or {...} block.
1251 This problem does *not* apply to the case where beta is omitted
1252 or when the syntactic predicate is on the leading edge of an
1255 The problem is that both alpha and beta are stored in the
1256 syntax diagram, and that some analysis routines would fail
1257 to skip the alpha portion when it was not on the leading edge.
1258 Consider the following grammar with -ck 2:
1262 | A B /* forces -ck 2 computation for old antlr */
1263 /* reports ambig for alts 1 & 2 */
1265 | B C /* forces -ck 2 computation for new antlr */
1266 /* reports ambig for alts 1 & 3 */
1269 The prediction expression for the first alternative should be
1270 LA(1)={B C} LA(2)={B C D}, but previous versions of antlr
1271 would compute the prediction expression as LA(1)={A C} LA(2)={B D}
1273 Reported by Arpad Beszedes (beszedes@inf.u-szeged.hu) who provided
1274 a very clear example of the problem and identified the probable cause.
1276 #177. (Changed in MR14) #tokdefs and #token with regular expression
1278 In MR13 the change described by Item #162 caused an existing
1279 feature of antlr to fail. Prior to the change it was possible
1280 to give regular expression definitions and actions to tokens
1281 which were defined via the #tokdefs directive.
1283 This now works again.
1285 Reported by Manfred Kogler (km@cast.uni-linz.ac.at).
1287 #176. (Changed in MR14) Support for #line in antlr source code
1289 Note: this was implemented by Arpad Beszedes (beszedes@inf.u-szeged.hu).
1291 In 1.33MR14 it is possible for a pre-processor to generate #line
1292 directives in the antlr source and have those line numbers and file
1293 names used in antlr error messages and in the #line directives
1296 The #line directive may appear in the following forms:
1298 #line ll "sss" xx xx ...
1300 where ll represents a line number, "sss" represents the name of a file
1301 enclosed in quotation marks, and xx represents arbitrary integers.
1303 The following form (without "line") is not supported at the moment:
1305 # ll "sss" xx xx ...
1311 is replaced with ll from the # or #line directive
1315 is updated with the contents of the string (if any)
1316 following the line number
1320 The file-name string following the line number can be a complete
1321 name with a directory-path. Antlr generates the output files from
1322 the input file name (by replacing the extension from the file-name
1325 If the input file (or the file-name from the line-info) contains
1330 the generated source code will be placed in "../grammar.cpp" (i.e.
1331 in the parent directory). This is inconvenient in some cases
1332 (even the -o switch cannot be used) so the path information is
1333 removed from the #line directive. Thus, if the line-info was
1335 #line 2 "../grammar.g"
1337 then the current file-name will become "grammar.g"
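The handling described above can be sketched as a small stand-alone
function. LineInfo and parseLineDirective are hypothetical names
chosen for this example; the code is an illustration of the described
behavior under those assumptions, not the code used inside antlr.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Result of reading one directive: the new line number and file name.
struct LineInfo {
    bool ok;            // false if the text is not a supported directive
    long line;
    std::string file;   // empty when the directive names no file
};

// Hypothetical parser for:  #line ll "sss" xx xx ...
// Trailing integers are ignored and directory components are removed.
LineInfo parseLineDirective(const std::string &text) {
    LineInfo info = { false, 0, "" };
    const std::string key = "#line";
    if (text.compare(0, key.size(), key) != 0) return info;  // "# ll" form unsupported

    const char *p = text.c_str() + key.size();
    char *end = 0;
    long n = std::strtol(p, &end, 10);
    if (end == p) return info;           // no line number present
    info.ok = true;
    info.line = n;

    // Look for a quoted file name after the line number.
    std::string rest(end);
    std::string::size_type open = rest.find('"');
    if (open == std::string::npos) return info;
    std::string::size_type close = rest.find('"', open + 1);
    if (close == std::string::npos) return info;
    assert(close > open);
    std::string name = rest.substr(open + 1, close - open - 1);

    // Drop any directory path so output lands in the current directory.
    std::string::size_type slash = name.find_last_of("/\\");
    info.file = (slash == std::string::npos) ? name : name.substr(slash + 1);
    return info;
}
```

For the directive shown above, the sketch yields line number 2 and
file name "grammar.g".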
1339 In this way, the source code generated from the grammar file
1340 will always be in the current directory, except when the -o switch
1343 #175. (Changed in MR14) Bug when guess block appears at start of (...)*
1345 In 1.33 vanilla and all maintenance releases prior to 1.33MR14
1346 there is a bug when a guess block appears at the start of a (...)*.
1347 Consider the following k=1 (ck=1) grammar:
1350 ( (STAR)? ZIP )* ID ;
1352 Prior to 1.33MR14, the generated code resembled:
1357 if ( ! LA(1)==STAR) break;
1367 Note that the routine uses STAR for the prediction expression
1368 rather than ZIP. With 1.33MR14 the generated code resembles:
1372 if ( ! LA(1)==ZIP) break;
1375 This problem existed only with (...)* blocks and was caused
1376 by the slightly more complicated graph which represents (...)*
1377 blocks. This caused the analysis routine to compute the first
1378 set for the alpha part of the "(alpha)? beta" rather than the
1381 Both (...)+ and {...} blocks handled the guess block correctly.
1383 Reported by Arpad Beszedes (beszedes@inf.u-szeged.hu) who provided
1384 a very clear example of the problem and identified the probable cause.
1386 #174. (Changed in MR14) Bug when action precedes syntactic predicate
1388 In 1.33 vanilla, and all maintenance releases prior to MR14,
1389 there was a bug when a syntactic predicate was immediately
1390 preceded by an action. Consider the following -ck 2 grammar:
1401 Prior to MR14, the code generated for the first alternative
1406 if ( !zzrv && LA(1)==A && LA(2)==A) {
1415 The prediction expression (i.e. LA(1)==A && LA(2)==A) is clearly
1416 wrong because LA(2) should be matched to B (first[2] of beta is {B}).
1418 With 1.33MR14 the prediction expression is:
1421 if ( !zzrv && LA(1)==A && LA(2)==B) {
1430 This will only affect grammars in which alpha is shorter than
1431 max(k,ck) and there is an action immediately preceding
1432 the syntactic predicate.
1434 This problem was reported by Arpad Beszedes
1435 (beszedes@inf.u-szeged.hu) who provided a very clear example
1436 of the problem and identified the presence of the init-action
1437 as the likely culprit.
1439 #173. (Changed in MR13a) -glms for Microsoft style filenames with -gl
1441 With the -gl option antlr generates #line directives using the
1442 exact name of the input files specified on the command line.
1443 An oddity of the Microsoft C and C++ compilers is that they
1444 don't accept file names in #line directives containing "\"
1445 even though these are names from the native file system.
1447 With the -glms option, the "\" in file names appearing in #line
1448 directives is replaced with a "/" in order to conform to
1449 Microsoft compiler requirements.
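The substitution can be sketched as below. The function names
slashName and lineDirective are hypothetical, and the exact layout of
the emitted directive is an assumption for the example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: replace every "\" in a file name with "/".
std::string slashName(const std::string &fileName) {
    std::string out(fileName);
    for (std::string::size_type i = 0; i < out.size(); ++i) {
        if (out[i] == '\\') out[i] = '/';
    }
    return out;
}

// Hypothetical helper: build a #line directive using the rewritten name.
std::string lineDirective(int line, const std::string &fileName) {
    assert(line > 0);
    std::ostringstream out;
    out << "#line " << line << " \"" << slashName(fileName) << "\"";
    return out.str();
}
```

A name given on the command line as src\expr.g would then appear in
the directive as src/expr.g.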
1451 Reported by Erwin Achermann (erwin.achermann@switzerland.org).
1453 #172. (Changed in MR13) \r\n in antlr source counted as one line
1455 Some MS software uses \r\n to indicate a new line. Antlr
1456 now recognizes this in counting lines.
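A minimal sketch of line counting that treats \r\n as one new line
is shown below. The function name countLines is hypothetical, and the
choice not to count a lone \r is an assumption of this example; antlr
itself may differ in that detail.

```cpp
#include <cassert>

// Counts line terminators, treating "\r\n" as a single new line.
// Assumption: a lone '\n' also ends a line; a lone '\r' is not counted.
int countLines(const char *text) {
    assert(text != 0);
    int lines = 0;
    for (const char *p = text; *p != '\0'; ++p) {
        if (*p == '\r' && p[1] == '\n') ++p;   // step onto the '\n' of the pair
        if (*p == '\n') ++lines;
    }
    return lines;
}
```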
1458 Reported by Edward L. Hepler (elh@ece.vill.edu).
1460 #171. (Changed in MR13) #tokclass L..U now allowed
1462 The following is now allowed:
1464 #tokclass ABC { A..B C }
1466 Reported by Dave Watola (dwatola@amtsun.jpl.nasa.gov)
1468 #170. (Changed in MR13) Suppression for predicates with lookahead depth >1
1470 In MR12 the capability for suppression of predicates with lookahead
1471 depth=1 was introduced. With MR13 this has been extended to
1472 predicates with lookahead depth > 1 and released for use by users
1473 on an experimental basis.
1475 Consider the following grammar with -ck 2 and the predicate in rule
1485 a : (A B)? => <<p(LATEXT(2))>>? A B C
1491 Normally, the predicate would be hoisted into rule r1 in order to
1492 determine whether to call rule "ab". However it should *not* be
1493 hoisted because, even if p is false, there is a valid alternative
1494 in rule b. With "-mrhoistk on" the predicate will be suppressed.
1496 If the "-info p" command line option is present, the following information
1497 will appear in the generated code:
1502 Part (or all) of predicate with depth > 1 suppressed by alternative
1505 pred << p(LATEXT(2))>>?
1506 depth=k=2 ("=>" guard) rule a line 8 t1.g
1512 The token sequence which is suppressed: ( A B )
1513 The sequence of references which generate that sequence of tokens:
1515 1 to ab r1/1 line 1 t1.g
1516 2 ab ab/1 line 4 t1.g
1517 3 to b ab/2 line 5 t1.g
1518 4 b b/1 line 11 t1.g
1519 5 #token A b/1 line 11 t1.g
1520 6 #token B b/1 line 11 t1.g
1524 A slightly more complicated example:
1533 a : (A B)? => <<p(LATEXT(2))>>? (A B | D E)
1536 b : <<q(LATEXT(2))>>? D E
1540 In this case, the sequence (D E) in rule "a" which lies behind
1541 the guard is used to suppress the predicate with context (D E)
1544 while ( (LA(1)==A || LA(1)==D)
1547 Part (or all) of predicate with depth > 1 suppressed by alternative
1550 pred << q(LATEXT(2))>>?
1551 depth=k=2 rule b line 11 t2.g
1557 The token sequence which is suppressed: ( D E )
1558 The sequence of references which generate that sequence of tokens:
1560 1 to ab r1/1 line 1 t2.g
1561 2 ab ab/1 line 4 t2.g
1562 3 to a ab/1 line 4 t2.g
1564 5 #token D a/1 line 8 t2.g
1565 6 #token E a/1 line 8 t2.g
1571 pred << p(LATEXT(2))>>?
1572 depth=k=2 ("=>" guard) rule a line 8 t2.g
1580 (! ( LA(1)==A && LA(2)==B ) || p(LATEXT(2)) ) {
1584 #169. (Changed in MR13) Predicate test optimization for depth=1 predicates
1586 When MR12 generated a test of a predicate which had depth 1
1587 it would use the depth >1 routines, resulting in correct but
1588 inefficient behavior. In MR13, a bit test is used.
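The idea of a bit test for a depth=1 context can be sketched as
follows. The token numbers, the table name setBits and the function
inPredicateContext are all hypothetical; the tables and names that
antlr actually generates are different.

```cpp
#include <cassert>

// Hypothetical token numbers for the example.
enum { TOK_EOF = 1, TOK_A = 2, TOK_B = 3, TOK_C = 4, TOK_COUNT = 5 };

// One byte per token; each bit records membership in one token set.
// Bit 0x1 marks the context of a depth-1 predicate: { A, C }.
static const unsigned char setBits[TOK_COUNT] = {
    0x0,   // unused
    0x0,   // TOK_EOF
    0x1,   // TOK_A
    0x0,   // TOK_B
    0x1    // TOK_C
};

// Depth-1 context check: a single table lookup and a mask.
inline bool inPredicateContext(int la1) {
    assert(la1 >= 0 && la1 < TOK_COUNT);
    return (setBits[la1] & 0x1) != 0;
}
```

Since only one token of lookahead is involved, one lookup replaces
the more general machinery needed for token sequences.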
1590 #168. (Changed in MR13) Token expressions in context guards
1592 The token expressions appearing in context guards such as:
1594 (A B)? => <<test(LT(1))>>? someRule
1596 are computed during an early phase of antlr processing. As
1597 a result, prior to MR13, complex expressions such as:
1605 were not computed properly. This resulted in incorrect
1606 context being computed for such expressions.
1608 In MR13 these context guards are verified for proper semantics
1609 in the initial phase and then re-evaluated after complex token
1610 expressions have been computed in order to produce the correct
1613 Reported by Arpad Beszedes (beszedes@inf.u-szeged.hu).
1615 #167. (Changed in MR13) ~L..U
1617 Prior to MR13, the complement of a token range was
1618 not properly computed.
1620 #166. (Changed in MR13) token expression L..U
1622 The token U was represented as an unsigned char, restricting
1623 the use of L..U to cases where U was assigned a token number
1624 less than 256. This is corrected in MR13.
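The restriction can be illustrated with a short sketch. The function
names are hypothetical and an 8-bit char is assumed: a token number
of 300 stored in an unsigned char keeps only its low 8 bits and
becomes 44, so a range with that upper bound no longer covers the
intended tokens.

```cpp
#include <cassert>

// Storing a token number in an unsigned char keeps only the low 8 bits
// (assuming the usual 8-bit char), so 300 is stored as 300 - 256 = 44.
inline int storedAsUnsignedChar(int tokenNumber) {
    assert(tokenNumber >= 0);
    return static_cast<unsigned char>(tokenNumber);
}

// Range test for "L..U" with both bounds held as ints.
inline bool inRange(int token, int lower, int upper) {
    return lower <= token && token <= upper;
}
```

With both bounds kept as ints, token 280 lies in 260..300; with the
upper bound truncated to 44 the same test fails.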
1626 #165. (Changed in MR13) option -newAST
1628 To create ASTs from an ANTLRTokenPtr antlr usually calls
1629 "new AST(ANTLRTokenPtr)". This option generates a call
1630 to "newAST(ANTLRTokenPtr)" instead. This allows a user
1631 to define a parser member function to create an AST object.
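A sketch of what such a parser member function might look like is
given below. All of the class names are hypothetical minimal
stand-ins, and the bookkeeping done inside newAST is only one example
of what a user might want from the hook.

```cpp
#include <cassert>

// Hypothetical minimal stand-ins; the real PCCTS classes are richer.
struct SketchToken { int type; };
typedef SketchToken *SketchTokenPtr;

struct SketchAST {
    explicit SketchAST(SketchTokenPtr t) : token(t), serial(-1) {}
    SketchTokenPtr token;
    int serial;          // -1 means "not created through a parser"
};

class SketchParser {
public:
    SketchParser() : created_(0) {}

    // User-defined creation hook: with the option enabled, generated
    // code would call newAST(tok) where it otherwise uses "new AST(tok)".
    SketchAST *newAST(SketchTokenPtr t) {
        assert(t != 0);
        SketchAST *node = new SketchAST(t);
        node->serial = created_++;   // example bookkeeping per node
        return node;
    }

    int created() const { return created_; }

private:
    int created_;
};
```

Routing creation through a member function gives the parser a single
place to allocate, count, or otherwise customize its AST nodes.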
1633 Similar changes for ASTBase::tmake and ASTBase::link were not
1634 thought necessary since they do not create AST objects, only
1637 #164. (Changed in MR13) Unused variable _astp
1639 For many compilations, we have lived with warnings about
1640 the unused variable _astp. It turns out that this variable
1641 can *never* be used because the code which references it was
1644 This investigation was sparked by a note from Erwin Achermann
1645 (erwin.achermann@switzerland.org).
1647 #163. (Changed in MR13) Incorrect makefiles for testcpp examples
1649 All the examples in pccts/testcpp/* had incorrect definitions
1650 in the makefiles for the symbol "CCC". Instead of CCC=CC they
1653 There was an additional problem in testcpp/1/test.g due to the
1654 change in ANTLRToken::getText() to a const member function
1657 Reported by Maurice Mass (maas@cuci.nl).
1659 #162. (Changed in MR13) Combining #token with #tokdefs
1661 When it became possible to change the print-name of a
1662 #token (Item #148) it became useful to give a #token
1663 statement whose only purpose was to give a print name
1664 to the #token. Prior to this change this could not be
1665 combined with the #tokdefs feature.
1667 #161. (Changed in MR13) Switch -gxt inhibits generation of tokens.h
1669 #160. (Changed in MR13) Omissions in list of names for remap.h
1671 When a user selects the -gp option antlr creates a list
1672 of macros in remap.h to rename some of the standard
1673 antlr routines from zzXXX to userprefixXXX.
1675 There were a number of omissions from the remap.h name
1676 list related to the new trace facility. This was reported,
1677 along with a fix, by Bernie Solomon (bernard@ug.eds.com).
1679 #159. (Changed in MR13) Violations of classic C rules
1681 There were a number of violations of classic C style in
1682 the distribution kit. This was reported, along with fixes,
1683 by Bernie Solomon (bernard@ug.eds.com).
1685 #158. (Changed in MR13) #header causes problem for pre-processors
1687 A user who runs the C pre-processor on antlr source suggested
1688 that another syntax be allowed. With MR13, directives
1689 such as #header, #pragma, etc. may be written as "\#header",
1690 "\#pragma", etc. For escaping pre-processor directives inside
1691 a #header use something like the following:
1698 #157. (Fixed in MR13) empty error sets for rules with infinite recursion
1700 When the first set for a rule cannot be computed due to infinite
1701 left recursion and it is the only alternative for a block then
1702 the error set for the block would be empty. This would result
1705 Reported by Darin Creason (creason@genedax.com)
1707 #156. (Changed in MR13) DLGLexerBase::getToken() now public
1709 #155. (Changed in MR13) Context behind predicates can suppress
1711 With -mrhoist enabled the context behind a guarded predicate can
1712 be used to suppress other predicates. Consider the following grammar:
1719 rp : <<p LATEXT(1)>>? B ;
1720 rq : (A)? => <<q LATEXT(1)>>? (A|B);
1722 In earlier versions both predicates "p" and "q" would be hoisted into
1723 rule r0. With MR12c predicate p is suppressed because the context which
1724 follows predicate q includes "B" which can "cover" predicate "p". In
1725 other words, in trying to decide in r0 whether to call r1, it doesn't
1726 really matter whether p is false or true because, either way, there is
1727 a valid choice within r1.
1729 #154. (Changed in MR13) Making hoist suppression explicit using <<nohoist>>
1731 A common error, even among experienced pccts users, is to code
1732 an init-action to inhibit hoisting rather than a leading action.
1733 An init-action does not inhibit hoisting.
1739 This is what was meant:
1741 rule1 : <<;>> <<;>> rule2
1743 With MR13, the user can code:
1745 rule1 : <<;>> <<nohoist>> rule2
1747 The following will give an error message:
1749 rule1 : <<nohoist>> rule2
1751 If the <<nohoist>> appears as an init-action rather than a leading
1752 action an error message is issued. The meaning of an init-action
1753 containing "nohoist" is unclear: does it apply to just one
1754 alternative or to all alternatives?