+++ /dev/null
-=======================================================================\r
-List of Implemented Fixes and Changes for Maintenance Releases of PCCTS\r
-=======================================================================\r
-\r
- DISCLAIMER\r
-\r
- The software and these notes are provided "as is". They may include\r
- typographical or technical errors and their authors disclaim all\r
- liability of any kind or nature for damages due to error, fault,\r
- defect, or deficiency regardless of cause. All warranties of any\r
- kind, either express or implied, including, but not limited to, the\r
- implied warranties of merchantability and fitness for a particular\r
- purpose are disclaimed.\r
-\r
-\r
- -------------------------------------------------------\r
- Note: Items #153 to #1 are now in a separate file named\r
- CHANGES_FROM_133_BEFORE_MR13.txt\r
- -------------------------------------------------------\r
- \r
-#312. (Changed in MR33) Bug caused by change #299.\r
-\r
- In change #299 a warning message was suppressed when there was\r
- no LT(1) in a semantic predicate and max(k,ck) was 1. That change\r
- caused the default predicate depth for the semantic predicate to be\r
- left as 0 rather than set to 1.\r
- \r
- This manifested as an error at line #1559 of mrhost.c\r
- \r
- Reported by Peter Dulimov.\r
- \r
-#311. (Changed in MR33) Added sorcer/lib to Makefile.\r
-\r
- Reported by Dale Martin.\r
- \r
-#310. (Changed in MR32) In C mode zzastPush was spelled zzastpush in one case.\r
-\r
- Reported by Jean-Claude Durand\r
- \r
-#309. (Changed in MR32) Renamed baseName because of VMS name conflict\r
-\r
- Renamed baseName to pcctsBaseName to avoid library name conflict with\r
- VMS library routine. Reported by Jean-François PIÉRONNE.\r
- \r
-#308. (Changed in MR32) Used "template" as name of formal in C routine\r
-\r
- In astlib.h routine ast_scan a formal was named "template". This caused\r
- problems when the C code was compiled with a C++ compiler. Reported by\r
- Sabyasachi Dey.\r
- \r
-#307. (Changed in MR31) Compiler dependent bug in function prototype generation\r
- \r
- The code which generated function prototypes contained a bug which\r
- was compiler/optimization dependent. Under some circumstances an\r
- extra character would be included in portions of a function prototype.\r
- \r
- Reported by David Cook.\r
- \r
-#306. (Changed in MR30) Validating predicate following a token\r
-\r
- A validating predicate which immediately followed a token match \r
- consumed the token after the predicate rather than before. Prior\r
- to this fix (in the following example) isValidTimeScaleValue() in\r
- the predicate would test the text for TIMESCALE rather than for\r
- NUMBER:\r
- \r
- time_scale :\r
- TIMESCALE\r
- <<isValidTimeScaleValue(LT(1)->getText())>>?\r
- ts:NUMBER\r
- ( us:MICROSECOND << tVal = ...>>\r
- | ns:NANOSECOND << tVal = ... >>\r
- )\r
- \r
- Reported by Adalbert Perbandt.\r
- \r
-#305. (Changed in MR30) Alternatives with guess blocks inside (...)* blocks.\r
-\r
- In MR14 change #175 fixed a bug in the prediction expressions for guess\r
- blocks which were of the form (alpha)? beta. Unfortunately, this\r
- resulted in a new bug as exemplified by the example below, which computed\r
- the first set for r as {B} rather than {B C}:\r
- \r
- r : ( (A)? B\r
- | C\r
- )*\r
- \r
- This example doesn't make any sense as A is not a prefix of B, but it\r
- illustrates the problem. This bug did not appear for:\r
- \r
- r : ( (A)?\r
- | C\r
- )*\r
-\r
- because it does not use the (alpha)? beta form.\r
-\r
- Item #175 fixed an asymmetry in ambiguity messages for the following\r
- constructs which appear to have identical ambiguities (between repeating\r
- the loop vs. exiting the loop). MR30 retains this fix, but the implementation\r
- is slightly different.\r
- \r
- r_star : ( (A B)? )* A ;\r
- r_plus : ( (A B)? )+ A ;\r
-\r
- Reported by Arpad Beszedes (beszedes inf.u-szeged.hu).\r
- \r
-#304. (Changed in MR30) Crash when mismatch between output value counts.\r
-\r
- For a rule such as:\r
- \r
- r1 : r2>[i,j];\r
- r2 >[int i, int j] : A;\r
- \r
- If there were extra actuals for the reference to rule r2 from rule r1\r
- then antlr would crash. This bug was introduced by change #276.\r
-\r
- Reported by Sinan Karasu.\r
- \r
-#303. (Changed in MR30) DLGLexerBase::replchar\r
-\r
- DLGLexerBase::replchar and the C mode routine zzreplchar did not work \r
- properly when the new character was 0.\r
- \r
- Reported with fix by Philippe Laporte\r
-\r
-#302. (Changed in MR28) Fix significant problems in initial release of MR27.\r
-\r
-#301. (Changed in MR27) Default tab stops set to 2 spaces.\r
-\r
- To have antlr generate true tabs rather than spaces, use "antlr -tab 0".\r
- To generate 4 spaces per tab stop use "antlr -tab 4"\r
- \r
-#300. (Changed in MR27)\r
-\r
- Consider the following methods of constructing an AST from ID:\r
- \r
- rule1!\r
- : id:ID << #0 = #[id]; >> ;\r
- \r
- rule2!\r
- : id:ID << #0 = #id; >> ;\r
- \r
- rule3\r
- : ID ;\r
- \r
- rule4\r
- : id:ID << #0 = #id; >> ;\r
- \r
- For rule2, the AST corresponding to id would always be NULL. This\r
- is because the user explicitly suppressed AST construction using the\r
- "!" operator on the rule. In MR27 the use of an AST expression\r
- such as #id overrides the "!" operator and forces construction of\r
- the AST.\r
- \r
- This fix does not apply to C mode ASTs when the ASTs are referenced\r
- using numbers rather than symbols.\r
-\r
- For C mode, this requires that the (optional) function/macro zzmk_ast\r
- be defined. This function copies information from an attribute into\r
- a previously allocated AST.\r
-\r
- Reported by Jan Langer (jan langernetz.de)\r
-\r
-#299. (Changed in MR27) Don't warn if k=1 and semantic predicate missing LT(i)\r
-\r
- If a semantic predicate does not have a reference to LT(i) (or, in C mode,\r
- LATEXT(i)) then pccts doesn't know how many lookahead tokens to use for context.\r
- However, if max(k,ck) is 1 then there is really only one choice and\r
- the warning is unnecessary.\r
- \r
-#298. (Changed in MR27) Removed "register" for lastpos in dlgauto.c zzgettok\r
-\r
-#297. (Changed in MR27) Incorrect prototypes when used with classic C\r
-\r
- There were a number of errors in function headers when antlr was\r
- built with compilers that do not have __STDC__ or __cplusplus set.\r
- \r
- The functions which have variable length argument lists now use\r
- PCCTS_USE_STDARG rather than __USE_PROTOTYPES__ to determine\r
- whether to use stdargs or varargs.\r
-\r
-#296. (Changed in MR27) Complex return types in rules.\r
-\r
- The following return type was not properly handled when \r
- unpacking a struct containing multiple return values:\r
- \r
- rule > [int i, IIR_Bool (IIR_Decl::*constraint)()] : ... \r
-\r
- Instead of using "constraint", the program got lost and used\r
- an empty string.\r
- \r
- Reported by P.A. Wilsey.\r
-\r
-#295. (Changed in MR27) Extra ";" following zzGUESS_DONE sometimes.\r
-\r
- Certain constructs with guess blocks in MR23 led to extra ";"\r
- preceding the "else" clause of an "if".\r
-\r
- Reported by P.A. Wilsey.\r
- \r
-#294. (Changed in MR27) Infinite loop in antlr for nested blocks\r
-\r
- An oversight in detecting an empty alternative sometimes led\r
- to an infinite loop in antlr when it encountered a rule with\r
- nested blocks and guess blocks.\r
- \r
- Reported by P.A. Wilsey.\r
- \r
-#293. (Changed in MR27) Sorcerer optimization of _t->type()\r
-\r
- Sorcerer generated code may contain many calls to _t->type() in a\r
- single statement. This change introduces a temporary variable\r
- to eliminate unnecessary function calls.\r
-\r
- Change implemented by Tom Molteno (tim videoscript.com).\r
-\r
-#292. (Changed in MR27)\r
-\r
- WARNING: Item #267 changes the signature of methods in the AST class.\r
-\r
- **** Be sure to revise your AST functions of the same name ***\r
-\r
-#291. (Changed in MR24)\r
-\r
- Fix to serious code generation error in MR23 for (...)+ block.\r
-\r
-#290. (Changed in MR23) \r
-\r
- Item #247 describes a change in the way {...} blocks handled\r
- an error. Consider:\r
-\r
- r1 : {A} b ;\r
- b : B;\r
- \r
- with input "C".\r
-\r
- Prior to change #247, the error would resemble "expected B -\r
- found C". This is correct but incomplete, and therefore\r
- misleading. In #247 it was changed to "expected A, B - found\r
- C". This was fine, except for users of parser exception\r
- handling because the exception was generated in the epilogue \r
- for {...} block rather than in rule b. This made it difficult\r
- for users of parser exception handling because B was not\r
- expected in that context. Those not using parser exception\r
- handling didn't notice the difference.\r
-\r
- The current change restores the behavior prior to #247 when\r
- parser exceptions are present, but retains the revised behavior\r
- otherwise. This change should be visible only when exceptions\r
- are in use and only for {...} blocks and sub-blocks of the form\r
- (something | something | something | epsilon) where epsilon represents\r
- an empty production and it is the last alternative of a sub-block.\r
- In contrast, (something | epsilon | something) should generate the\r
- same code as before, even when exceptions are used.\r
- \r
- Reported by Philippe Laporte (philippe at transvirtual.com).\r
-\r
-#289. (Changed in MR23) Bug in matching complement of a #tokclass\r
-\r
- Prior to MR23 when a #tokclass was matched in both its complemented form\r
- and uncomplemented form, the bit set generated for its first use was used\r
- for both cases. However, the prediction expression was correctly computed\r
- in both cases. This meant that the second case would never be matched\r
- because, for the second appearance, the prediction expression and the \r
- set to be matched would be complements of each other.\r
- \r
- Consider:\r
- \r
- #token A "a"\r
- #token B "b"\r
- #token C "c"\r
- #tokclass AB {A B}\r
- \r
- r1 : AB /* alt 1x */\r
- | ~AB /* alt 1y */\r
- ;\r
- \r
- Prior to MR23, this resulted in alternative 1y being unreachable. Had it\r
- been written:\r
- \r
- r2 : ~AB /* alt 2x */\r
- | AB /* alt 2y */\r
- ;\r
- \r
- then alternative 2y would have become unreachable. \r
- \r
- This bug was only for the case of complemented #tokclass. For complemented\r
- #token the proper code was generated. \r
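The mismatch described in item #289 can be sketched with ordinary bit sets. This is only an illustration under assumptions: the `Token` enum, `tokclassAB`, and `alternativeCanMatch` are hypothetical names and are not antlr's internal data structures. The sketch suggests why an alternative predicted by ~AB but matched against the reused AB set could never succeed.

```cpp
#include <bitset>
#include <cassert>

// Hypothetical token codes standing in for #token A, B, and C.
enum Token { A, B, C, NUM_TOKENS };
typedef std::bitset<NUM_TOKENS> TokenSet;

// The set corresponding to "#tokclass AB {A B}".
TokenSet tokclassAB() {
    TokenSet s;
    s.set(A);
    s.set(B);
    return s;
}

// An alternative can match only if at least one token passes both the
// prediction test and the match test.
bool alternativeCanMatch(const TokenSet& prediction, const TokenSet& matchSet) {
    return (prediction & matchSet).any();
}
```

With the pre-MR23 behavior modeled as "predict with ~AB, match with AB", the intersection is empty, which corresponds to the unreachable alternative 1y.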
- \r
-#288. (Changed in MR23) #errclass not restricted to choice points\r
-\r
- The #errclass directive is supposed to allow a programmer to define\r
- print strings which should appear in syntax error messages as a replacement\r
- for some combinations of tokens. For instance:\r
- \r
- #errclass Operator {PLUS MINUS TIMES DIVIDE}\r
- \r
- If a syntax message includes all four of these tokens, and there is no\r
- "better" choice of error class, the word "Operator" will be used rather\r
- than a list of the four token names.\r
- \r
- Prior to MR23 the #errclass definitions were used only at choice points\r
- (which call the FAIL macro). In other cases where there was no choice\r
- (e.g. where a single token or token class was matched) the #errclass\r
- information was not used.\r
-\r
- With MR23 the #errclass declarations are used for syntax error messages\r
- when matching a #tokclass, a wildcard (i.e. "*"), or the complement of a\r
- #token or #tokclass (e.g. ~Operator).\r
-\r
- Please note that #errclass may now be defined using #tokclass names \r
- (see Item #284).\r
-\r
- Reported by Philip A. Wilsey.\r
-\r
-#287. (Changed in MR23) Print name for #tokclass\r
-\r
- Item #148 describes how to give a print name to a #token so that, for\r
- example, #token ID could have the expression "identifier" in syntax\r
- error messages. This has been extended to #tokclass:\r
- \r
- #token ID("identifier") "[a-zA-Z]+"\r
- #tokclass Primitive("primitive type") \r
- {INT, FLOAT, CHAR, DOUBLE, BOOL} \r
-\r
- This is really a cosmetic change, since #tokclass names do not appear\r
- in any error messages.\r
- \r
-#286. (Changed in MR23) Makefile change to use of cd\r
-\r
- In cases where a pccts subdirectory name matched a directory identified\r
- in a $CDPATH environment variable the build would fail. All makefile\r
- cd commands have been changed from "cd xyz" to "cd ./xyz" in order\r
- to avoid this problem.\r
- \r
-#285. (Changed in MR23) Check for null pointers in some dlg structures\r
-\r
- An invalid regular expression can cause dlg to build an invalid\r
- structure to represent the regular expression even while it issues \r
- error messages. Additional pointer checks were added.\r
-\r
- Reported by Robert Sherry.\r
-\r
-#284. (Changed in MR23) Allow #tokclass in #errclass definitions\r
-\r
- Previously, a #tokclass reference in the definition of an\r
- #errclass was not handled properly. Instead of being expanded\r
- into the set of tokens represented by the #tokclass it was\r
- treated somewhat like an #errclass. However, in a later phase\r
- when all #errclass were expanded into the corresponding tokens\r
- the #tokclass reference was not expanded (because it wasn't an\r
- #errclass). In effect the reference was ignored.\r
-\r
- This has been fixed.\r
-\r
- Problem reported by Mike Dimmick (mike dimmick.demon.co.uk).\r
-\r
-#283. (Changed in MR23) Option -tmake invokes parser's tmake \r
-\r
- When the string #(...) appears in an action antlr replaces it with\r
- a call to ASTBase::tmake(...) to construct an AST. It is sometimes\r
- useful to change the tmake routine so that it has access to information\r
- in the parser - something which is not possible with a static method\r
- in an application where there may be multiple parsers active.\r
-\r
- The antlr option -tmake replaces the call to ASTBase::tmake with a call\r
- to a user supplied tmake routine.\r
- \r
-#282. (Changed in MR23) Initialization error for DBG_REFCOUNTTOKEN\r
-\r
- When the pre-processor symbol DBG_REFCOUNTTOKEN was defined,\r
- incorrect code was generated to initialize ANTLRRefCountToken::ctor and\r
- dtor.\r
-\r
- Fix reported by Sven Kuehn (sven sevenkuehn.de).\r
- \r
-#281. (Changed in MR23) Addition of -noctor option for Sorcerer\r
-\r
- Added a -noctor option to suppress generation of the blank ctor\r
- for users who wish to define their own ctor.\r
-\r
- Contributed by Jan Langer (jan langernetz.de).\r
-\r
-#280. (Changed in MR23) Syntax error message for EOF token\r
-\r
- The EOF token now receives special treatment in syntax error messages\r
- because there is no text matched by the eof token. The token name\r
- of the eof token is used unless it is "@" - in which case the string\r
- "<eof>" is used.\r
-\r
- Problem reported by Erwin Achermann (erwin.achermann switzerland.org).\r
-\r
-#279. (Changed in MR23) Exception groups\r
-\r
- There was a bug in the way that exception groups were attached to\r
- alternatives which caused problems when there was a block contained\r
- in an alternative. For instance, in the following rule:\r
-\r
- statement : IF S { ELSE S } \r
- exception ....\r
- ;\r
-\r
- the exception would be attached to the {...} block instead of the \r
- entire alternative because it was attached, in error, to the last\r
- alternative instead of the last OPEN alternative.\r
-\r
- Reported by Ty Mordane (tymordane hotmail.com).\r
- \r
-#278. (Changed in MR23) makefile changes\r
-\r
- Contributed by Tomasz Babczynski (faster lab05-7.ict.pwr.wroc.pl).\r
-\r
- The -cfile option is not absolutely needed: when the extension of a\r
- source file is one of the well-known C/C++ extensions, the file is\r
- treated as C/C++ source.\r
-\r
- The gnu make defines the CXX variable as the default C++ compiler\r
- name, so I added a line to copy this (if defined) to the CCC var.\r
-\r
- Added a -sor option: after it any -class command defines the class\r
- name for sorcerer, not for ANTLR. A file extended with .sor is \r
- treated as sorcerer input. Because sorcerer can be called multiple\r
- times, -sor option can be repeated. Any files and classes (one class\r
- per group) after each -sor makes one tree parser.\r
-\r
- Not implemented:\r
-\r
- 1. Generate dependencies for user C/C++ files.\r
- 2. Support for -sor in C mode.\r
-\r
- I have left the old genmk program in the directory as genmk_old.c.\r
-\r
-#277. (Changed in MR23) Change in macro for failed semantic predicates\r
-\r
- In the past, a semantic predicate that failed generated a call to\r
- the macro zzfailed_pred:\r
-\r
- #ifndef zzfailed_pred\r
- #define zzfailed_pred(_p) \\r
- if (guessing) { \\r
- zzGUESS_FAIL; \\r
- } else { \\r
- something(_p) \\r
- }\r
- #endif\r
-\r
- If a user wished to use the failed action option for semantic predicates:\r
-\r
- rule : <<my_predicate>>? [my_fail_action] A\r
- | ...\r
-\r
- \r
- the code for my_fail_action would have to contain logic for handling\r
- the guess part of the zzfailed_pred macro. The user should not have\r
- to be aware of the guess logic in writing the fail action.\r
-\r
- The zzfailed_pred has been rewritten to have three arguments:\r
-\r
- arg 1: the stringized predicate of the semantic predicate\r
- arg 2: 0 => there is no user-defined fail action\r
- 1 => there is a user-defined fail action\r
- arg 3: the user-defined fail action (if defined)\r
- otherwise a no-operation\r
-\r
- The zzfailed_pred macro is now defined as:\r
-\r
- #ifndef zzfailed_pred\r
- #define zzfailed_pred(_p,_hasuseraction,_useraction) \\r
- if (guessing) { \\r
- zzGUESS_FAIL; \\r
- } else { \\r
- zzfailed_pred_action(_p,_hasuseraction,_useraction) \\r
- }\r
- #endif\r
-\r
-\r
- With zzfailed_pred_action defined as:\r
-\r
- #ifndef zzfailed_pred_action\r
- #define zzfailed_pred_action(_p,_hasuseraction,_useraction) \\r
- if (_hasuseraction) { _useraction } else { failedSemanticPredicate(_p); }\r
- #endif\r
-\r
- In C++ mode failedSemanticPredicate() is a virtual function.\r
- In C mode the default action is a fprintf statement.\r
-\r
- Suggested by Erwin Achermann (erwin.achermann switzerland.org).\r
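The split between zzfailed_pred and zzfailed_pred_action in item #277 can be exercised in isolation. The following is a minimal sketch, not the pccts headers: `guessing`, `events`, the stub `zzGUESS_FAIL`, the stub `failedSemanticPredicate`, and the helper `runFailedPred` are all stand-ins invented for illustration. It only shows that the user's fail action is skipped while guessing, without the action itself containing any guess logic.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for parser state; not the real pccts definitions.
static bool guessing = false;
static std::vector<std::string> events;

// Stub: the real zzGUESS_FAIL abandons the guess; here it only records the event.
#define zzGUESS_FAIL events.push_back("guess-fail")

// Stub for the default reporting routine.
static void failedSemanticPredicate(const char* pred) {
    events.push_back(std::string("default:") + pred);
}

#define zzfailed_pred_action(_p,_hasuseraction,_useraction) \
    if (_hasuseraction) { _useraction } else { failedSemanticPredicate(_p); }

#define zzfailed_pred(_p,_hasuseraction,_useraction) \
    if (guessing) { \
        zzGUESS_FAIL; \
    } else { \
        zzfailed_pred_action(_p,_hasuseraction,_useraction) \
    }

// Simulates a failed predicate and returns the recorded events.
std::vector<std::string> runFailedPred(bool isGuessing, bool withUserAction) {
    guessing = isGuessing;
    events.clear();
    if (withUserAction) {
        zzfailed_pred("my_predicate", 1, { events.push_back("user-action"); })
    } else {
        zzfailed_pred("my_predicate", 0, { ; })
    }
    return events;
}
```

In this sketch the user action is an ordinary statement block; it never has to test `guessing` itself.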
-\r
-#276. (Changed in MR23) Addition of return value initialization syntax\r
-\r
- In an attempt to reduce the problems caused by the PURIFY macro I have\r
- added new syntax for initializing the return value of rules and the\r
- antlr option "-nopurify".\r
-\r
- A rule with a single return argument:\r
-\r
- r1 > [Foo f = expr] :\r
-\r
- now generates code that resembles:\r
-\r
- Foo r1(void) {\r
- Foo _retv = expr;\r
- ...\r
- }\r
- \r
- A rule with more than one return argument:\r
-\r
- r2 > [Foo f = expr1, Bar b = expr2 ] :\r
-\r
- generates code that resembles:\r
-\r
- struct _rv1 {\r
- Foo f;\r
- Bar b;\r
- };\r
-\r
- _rv1 r2(void) {\r
- struct _rv1 _retv;\r
- _retv.f = expr1;\r
- _retv.b = expr2;\r
- ...\r
- }\r
-\r
- C++ style comments appearing in the initialization list may cause problems.\r
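As a rough, compilable illustration of the code shapes in item #276, the sketch below uses hypothetical `Foo` and `Bar` types with non-trivial constructors, and wraps everything in a `sketch` namespace. The rule bodies are omitted and the initializer expressions are invented for the example; the code actually generated by antlr may differ in detail.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Hypothetical user types with non-trivial constructors.
struct Foo {
    std::string name;
    explicit Foo(const std::string& n = "unset") : name(n) {}
};
struct Bar {
    int count;
    explicit Bar(int c = -1) : count(c) {}
};

// Resembles the code for:  r1 > [Foo f = Foo("init")] :
Foo r1() {
    Foo _retv = Foo("init");
    // ... rule body would go here ...
    return _retv;
}

// Resembles the code for:  r2 > [Foo f = Foo("a"), Bar b = Bar(2)] :
struct _rv1 {
    Foo f;
    Bar b;
};

_rv1 r2() {
    _rv1 _retv;            // members are default-constructed first
    _retv.f = Foo("a");    // then assigned from the initializer expressions
    _retv.b = Bar(2);
    // ... rule body would go here ...
    return _retv;
}

}  // namespace sketch
```

Because the return values are set through constructors and assignment, no byte-wise zeroing of the objects is involved.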
-\r
-#275. (Changed in MR23) Addition of -nopurify option to antlr\r
-\r
- A long time ago the PURIFY macro was introduced to initialize\r
- return value arguments and get rid of annoying messages from programs\r
- that check for uninitialized variables.\r
-\r
- This has caused significant annoyance for C++ users that had\r
- classes with virtual functions or non-trivial constructors because\r
- it would zero the object, including the pointer to the virtual\r
- function table. This could be defeated by redefining\r
- the PURIFY macro to be empty, but it was a constant surprise to\r
- new C++ users of pccts.\r
-\r
- I would like to remove it, but I fear that some existing programs\r
- depend on it and would break. My temporary solution is to add\r
- an antlr option -nopurify which disables generation of the PURIFY\r
- macro call.\r
-\r
- The PURIFY macro should be avoided in favor of the new syntax\r
- for initializing return arguments described in item #276.\r
-\r
- To avoid name clash, the PURIFY macro has been renamed PCCTS_PURIFY.\r
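The hazard described in item #275 comes from zero-filling objects that are not plain data. The sketch below is not how pccts handles it; it is a hypothetical helper (`zeroFillIfSafe`, with made-up `PlainResult` and `VirtualResult` types) that uses the standard `std::is_trivially_copyable` trait to zero-fill only types for which a byte-wise fill is harmless, and to leave classes with virtual functions untouched.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Plain data: byte-wise zeroing is harmless.
struct PlainResult {
    int value;
    double weight;
};

// Has a virtual function, so it carries hidden bookkeeping (typically a
// pointer to the virtual function table) that must not be overwritten.
struct VirtualResult {
    int value;
    VirtualResult() : value(7) {}
    virtual ~VirtualResult() {}
};

// Selected when T is trivially copyable.
template <typename T>
bool zeroFill(T& obj, std::true_type) {
    std::memset(&obj, 0, sizeof(T));
    return true;
}

// Selected otherwise: leave the object alone.
template <typename T>
bool zeroFill(T&, std::false_type) {
    return false;
}

template <typename T>
bool zeroFillIfSafe(T& obj) {
    return zeroFill(obj, std::is_trivially_copyable<T>());
}
```

The tag-dispatch form is used so that `std::memset` is never even instantiated for the unsafe type.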
-\r
-#274. (Changed in MR23) DLexer.cpp renamed to DLexer.h\r
- (Changed in MR23) ATokPtr.cpp renamed to ATokPtrImpl.h\r
-\r
- These two files had .cpp extensions but acted like .h files because\r
- they were included in other files. This caused problems for many IDEs.\r
- I have renamed them. The ATokPtrImpl.h was necessary because there was\r
- already an ATokPtr.h.\r
-\r
-#273. (Changed in MR23) Default win32 library changed to multi-threaded DLL\r
-\r
- The model used for building the Win32 debug and release libraries has changed\r
- to multi-threaded DLL.\r
-\r
- To make this change in your MSVC 6 project:\r
-\r
- Project -> Settings\r
- Select the C++ tab in the right pane of the dialog box\r
- Select "Category: Code Generation"\r
- Under "Use run-time library" select one of the following:\r
-\r
- Multi-threaded DLL\r
- Debug Multi-threaded DLL\r
- \r
- Suggested by Bill Menees (bill.menees gogallagher.com) \r
- \r
-#272. (Changed in MR23) Failed semantic predicate reported via virtual function\r
-\r
- In the past, a failed semantic predicate reported the problem via a\r
- macro which used fprintf(). The macro now expands into a call on \r
- the virtual function ANTLRParser::failedSemanticPredicate().\r
-\r
-#271. (Changed in MR23) Warning for LT(i), LATEXT(i) in token match actions\r
-\r
- A bug (or at least an oddity) is that a reference to LT(1), LA(1),\r
- or LATEXT(1) in an action which immediately follows a token match\r
- in a rule refers to the token matched, not the token which is in\r
- the lookahead buffer. Consider:\r
-\r
- r : abc <<action alpha>> D <<action beta>> E;\r
-\r
- In this case LT(1) in action alpha will refer to the next token in\r
- the lookahead buffer ("D"), but LT(1) in action beta will refer to\r
- the token matched by D - the preceding token.\r
-\r
- A warning has been added for users about this when an action\r
- following a token match contains a reference to LT(1), LA(1), or LATEXT(1).\r
-\r
- This behavior should be changed, but it appears in too many programs\r
- now. Another problem, perhaps more significant, is that the obvious\r
- fix (moving the consume() call to before the action) could change the \r
- order in which input is requested and output appears in existing programs.\r
-\r
- This problem was reported, along with a fix, by Benjamin Mandel\r
- (beny sd.co.il). However, I felt that changing the behavior was too\r
- dangerous for existing code.\r
-\r
-#270. (Changed in MR23) Removed static objects from PCCTSAST.cpp\r
-\r
- There were some statically allocated objects in PCCTSAST.cpp.\r
- These were changed to non-static.\r
-\r
-#269. (Changed in MR23) dlg output for initializing static array\r
-\r
- The output from dlg contains a construct similar to the\r
- following:\r
- \r
- struct XXX {\r
- static const int size;\r
- static int array1[5];\r
- };\r
-\r
- const int XXX::size = 4;\r
- int XXX::array1[size+1];\r
-\r
- \r
- The problem is that although the expression "size+1" used in\r
- the definition of array1 is equal to 5 (the expression used to\r
- declare array1), it is not considered equivalent by some compilers.\r
-\r
- Reported with fix by Volker H. Simonis (simonis informatik.uni-tuebingen.de)\r
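The fix adopted for item #269 is not spelled out in these notes. One portable arrangement, offered here purely as an assumption, is to give the array bound as the same compile-time expression in both the declaration and the definition. `LexerTables` and `array1Length` are hypothetical names, not dlg output.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a dlg-generated class.
struct LexerTables {
    enum { size = 4 };             // compile-time constant usable in both places
    static int array1[size + 1];   // declaration
};

// Definition: the bound is written with the same expression as the declaration.
int LexerTables::array1[LexerTables::size + 1];

// Number of elements in the array, computed from its type.
std::size_t array1Length() {
    return sizeof(LexerTables::array1) / sizeof(LexerTables::array1[0]);
}
```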
-\r
-#268. (Changed in MR23) syn() routine output when k > 1\r
-\r
- The syn() routine is supposed to print out the text of the\r
- token causing the syntax error. It appears that it always\r
- used the text from the first lookahead token rather than the\r
- appropriate one. The appropriate one is computed by comparing\r
- the token codes of lookahead token i (for i = 1 to k) with\r
- the FIRST(i) set.\r
- \r
- This has been corrected in ANTLRParser::syn().\r
-\r
- Reported by Bill Menees (bill.menees gogallagher.com) \r
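The selection rule in item #268 (compare lookahead token i with the FIRST(i) set, for i = 1 to k) can be sketched as a small search. This is an illustration only: `offendingLookahead` and its container-based arguments are hypothetical and do not reflect the data structures used by ANTLRParser::syn().

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Returns the 1-based index of the first lookahead token whose code is not
// in the corresponding FIRST(i) set, or 0 if every token is acceptable.
std::size_t offendingLookahead(const std::vector<int>& lookahead,
                               const std::vector<std::set<int> >& firstSets) {
    for (std::size_t i = 0; i < lookahead.size() && i < firstSets.size(); ++i) {
        if (firstSets[i].count(lookahead[i]) == 0) {
            return i + 1;
        }
    }
    return 0;
}
```

The text of the token at the returned index, rather than always the first lookahead token, is what a k > 1 error message would then report.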
-\r
-#267. (Changed in MR23) AST traversal functions client data argument\r
-\r
- The AST traversal functions now take an extra (optional) parameter\r
- which can point to client data:\r
-\r
- preorder_action(void* pData = NULL)\r
- preorder_before_action(void* pData = NULL)\r
- preorder_after_action(void* pData = NULL)\r
-\r
- **** Warning: this changes the AST signature. ***\r
- **** Be sure to revise your AST functions of the same name ***\r
-\r
- Bill Menees (bill.menees gogallagher.com) \r
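To show how the client data pointer from item #267 might be used, the sketch below counts nodes during a preorder walk. `SketchNode` is a hypothetical class written for this example; it borrows only the `preorder_action(void* pData = NULL)` signature from the notes and is not the real pccts AST base class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tree node; only the preorder_action signature follows the notes.
struct SketchNode {
    std::vector<SketchNode*> children;

    virtual ~SketchNode() {}

    // Called once per node; pData points at caller-owned client data.
    virtual void preorder_action(void* pData = NULL) {
        if (pData != NULL) {
            ++*static_cast<int*>(pData);   // here the client data is a counter
        }
    }

    // Visit this node, then each child, passing the client data along.
    void preorder(void* pData = NULL) {
        preorder_action(pData);
        for (std::size_t i = 0; i < children.size(); ++i) {
            children[i]->preorder(pData);
        }
    }
};

int countNodes(SketchNode& root) {
    int n = 0;
    root.preorder(&n);
    return n;
}
```

An override that keeps the old parameterless signature would no longer override the base method, which is the reason for the warning above.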
- \r
-#266. (Changed in MR23) virtual function printMessage()\r
-\r
- Bill Menees (bill.menees gogallagher.com) has completed the\r
- tedious task of replacing all calls to fprintf() with calls\r
- to the virtual function printMessage(). For classes which\r
- have a pointer to the parser it forwards the printMessage()\r
- call to the parser's printMessage() routine.\r
-\r
- This should make it significantly easier to redirect pccts\r
- error and warning messages.\r
-\r
-#265. (Changed in MR23) Remove "labase++" in C++ mode\r
-\r
- In C++ mode labase++ is called when a token is matched.\r
- It appears that labase is not used in C++ mode at all, so\r
- this code has been commented out.\r
- \r
-#264. (Changed in MR23) Complete rewrite of ParserBlackBox.h\r
-\r
- The parser black box (PBlackBox.h) was completely rewritten\r
- by Chris Uzdavinis (chris atdesk.com) to improve its robustness.\r
-\r
-#263. (Changed in MR23) -preamble and -preamble_first rescinded\r
-\r
- Changes for item #253 have been rescinded.\r
-\r
-#262. (Changed in MR23) Crash with -alpha option during traceback\r
-\r
- Under some circumstances a -alpha traceback was started at the\r
- "wrong" time. As a result, internal data structures were not\r
- initialized.\r
-\r
- Reported by Arpad Beszedes (beszedes inf.u-szeged.hu).\r
-\r
-#261. (Changed in MR23) Defer token fetch for C++ mode\r
-\r
- Item #216 has been revised to indicate that use of the defer fetch\r
- option (ZZDEFER_FETCH) requires dlg option -i.\r
-\r
-#260. (MR22) Raise default lex buffer size from 8,000 to 32,000 bytes.\r
-\r
- ZZLEXBUFSIZE is the size (in bytes) of the buffer used by dlg \r
- generated lexers. The default value has been raised to 32,000 and\r
- the value used by antlr, dlg, and sorcerer has also been raised to\r
- 32,000.\r
-\r
-#259. (MR22) Default function arguments in C++ mode.\r
-\r
- If a rule is declared:\r
-\r
- rr [int i = 0] : ....\r
-\r
- then the declaration generated by pccts resembles:\r
-\r
- void rr(int i = 0);\r
-\r
- however, the definition must omit the default argument:\r
-\r
- void rr(int i) {...}\r
-\r
- In the past the default value was not omitted. In MR22\r
- the generated code resembles:\r
-\r
- void rr(int i /* = 0 */ ) {...}\r
-\r
- Implemented by Volker H. Simonis (simonis informatik.uni-tuebingen.de)\r
-\r
-\r
- Note: In MR23 this was changed so that nested C style comments\r
- ("/* ... */") would not cause problems.\r
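The declaration/definition split in item #259 follows from the C++ rule that a default argument may not be repeated in a later declaration of the same function. A minimal sketch, using a hypothetical `SketchParser` class and an `int` return type chosen only so the result can be checked:

```cpp
#include <cassert>

class SketchParser {
public:
    int rr(int i = 0);   // the declaration carries the default argument
};

// The definition must not repeat the default, so it survives only as a comment.
int SketchParser::rr(int i /* = 0 */) {
    return i;
}
```

Calling `rr()` with no argument still picks up the default from the declaration.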
-\r
-#258. (MR22) Using a base class for your parser\r
-\r
- In item #102 (MR10) the class statement was extended to allow one\r
- to specify a base class other than ANTLRParser for the generated\r
- parser. It turned out that this was less than useful because\r
- the constructor still specified ANTLRParser as the base class.\r
-\r
- The class statement now uses the first identifier appearing after\r
- the ":" as the name of the base class. For example:\r
-\r
- class MyParser : public FooParser {\r
-\r
- Generates in MyParser.h:\r
-\r
- class MyParser : public FooParser {\r
-\r
- Generates in MyParser.cpp something that resembles:\r
-\r
- MyParser::MyParser(ANTLRTokenBuffer *input) :\r
- FooParser(input,1,0,0,4)\r
- {\r
- token_tbl = _token_tbl;\r
- traceOptionValueDefault=1; // MR10 turn trace ON\r
- }\r
-\r
- The base class constructor must have a signature similar to\r
- that of ANTLRParser.\r
-\r
-#257. (MR21a) Removed dlg statement that -i has no effect in C++ mode.\r
-\r
- This was incorrect.\r
-\r
-#256. (MR21a) Malformed syntax graph causes crash after error message.\r
-\r
- In the past, certain kinds of errors in the very first grammar\r
- element could cause the construction of a malformed graph \r
- representing the grammar. This would eventually result in a\r
- fatal internal error. The code has been changed to be more\r
- resistant to this particular error.\r
-\r
-#255. (MR21a) ParserBlackBox(FILE* f) \r
-\r
- This constructor set openByBlackBox to the wrong value.\r
-\r
- Reported by Kees Bakker (kees_bakker tasking.nl).\r
-\r
-#254. (MR21a) Reporting syntax error at end-of-file\r
-\r
- When there was a syntax error at the end-of-file the syntax\r
- error routine would substitute "<eof>" for the programmer's\r
- end-of-file symbol. This substitution is now done only when\r
- the programmer does not define his own end-of-file symbol\r
- or the symbol begins with the character "@".\r
-\r
- Reported by Kees Bakker (kees_bakker tasking.nl).\r
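The substitution rule in item #254 amounts to a small decision on the end-of-file token's name. The helper below is a hypothetical sketch (`eofDisplayName` is not a pccts function) and assumes that "no user-defined symbol" is represented by an empty string.

```cpp
#include <cassert>
#include <string>

// Name to show for the end-of-file token in a syntax error message.
// Assumption: an empty string means the programmer defined no EOF symbol.
std::string eofDisplayName(const std::string& userEofName) {
    if (userEofName.empty() || userEofName[0] == '@') {
        return "<eof>";
    }
    return userEofName;
}
```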
-\r
-#253. (MR21) Generation of block preamble (-preamble and -preamble_first)\r
-\r
- *** This change was rescinded by item #263 ***\r
-\r
- The antlr option -preamble causes antlr to insert the code\r
- BLOCK_PREAMBLE at the start of each rule and block. It does\r
- not insert code before rules references, token references, or\r
- actions. By properly defining the macro BLOCK_PREAMBLE the\r
- user can generate code which is specific to the start of blocks.\r
-\r
- The antlr option -preamble_first is similar, but inserts the\r
- code BLOCK_PREAMBLE_FIRST(PreambleFirst_123) where the symbol\r
- PreambleFirst_123 is equivalent to the first set defined by\r
- the #FirstSetSymbol described in Item #248.\r
-\r
- I have not investigated how these options interact with guess\r
- mode (syntactic predicates).\r
-\r
-#252. (MR21) Check for null pointer in trace routine\r
-\r
- When some trace options are used when the parser is generated\r
- without the trace enabled, the current rule name may be a\r
- NULL pointer. A guard was added to check for this in\r
- restoreState.\r
-\r
- Reported by Douglas E. Forester (dougf projtech.com).\r
-\r
-#251. (MR21) Changes to #define zzTRACE_RULES\r
-\r
- The macro zzTRACE_RULES was being used to pass information to\r
- AParser.h. If this preprocessor symbol was not properly\r
- set the first time AParser.h was #included, the declaration\r
- of zzTRACEdata would be omitted (it is used by the -gd option).\r
- Subsequent #includes of AParser.h would be skipped because of \r
- the #ifdef guard, so the declaration of zzTracePrevRuleName would\r
- never be made. The result was that proper compilation was very \r
- order dependent.\r
-\r
- The declaration of zzTRACEdata was made unconditional and the\r
- problem of removing unused declarations will be left to optimizers.\r
- \r
- Diagnosed by Douglas E. Forester (dougf projtech.com).\r
-\r
-#250. (MR21) Option for EXPERIMENTAL change to error sets for blocks\r
-\r
- The antlr option -mrblkerr turns on an experimental feature\r
- which is supposed to provide more accurate syntax error messages\r
- for k=1, ck=1 grammars. When used with k>1 or ck>1 grammars the\r
- behavior should be no worse than the current behavior.\r
-\r
- There is no problem with the matching of elements or the computation\r
- of prediction expressions in pccts. The task is only one of listing\r
- the most appropriate tokens in the error message. The error sets used\r
- in pccts error messages are approximations of the exact error set when\r
- optional elements in (...)* or (...)+ are involved. While entirely\r
- correct, the error messages are sometimes not 100% accurate. \r
-\r
- There is also a minor philosophical issue. For example, suppose the\r
- grammar expects the token to be an optional A followed by Z, and it \r
- is X. X, of course, is neither A nor Z, so an error message is appropriate.\r
- Is it appropriate to say "Expected Z" ? It is correct, it is accurate,\r
- but it is not complete. \r
-\r
- When k>1 or ck>1 the problem of providing the exactly correct\r
- list of tokens for the syntax error messages ends up becoming\r
- equivalent to evaluating the prediction expression for the\r
- alternatives twice. However, for k=1 ck=1 grammars the prediction\r
- expression can be computed easily and evaluated cheaply, so I\r
- decided to try implementing it to satisfy a particular application.\r
- This application uses the error set in an interactive command language\r
- to provide prompts which list the alternatives available at that\r
- point in the parser. The user can then enter additional tokens to\r
- complete the command line. To do this required more accurate error \r
- sets than previously provided by pccts.\r
-\r
- In some cases the default pccts behavior may lead to more robust error\r
- recovery or clearer error messages than having the exact set of tokens.\r
- This is because (a) features like -ge allow the use of symbolic names for\r
- certain sets of tokens, so having extra tokens may simply obscure things\r
- and (b) the error set is used to resynchronize the parser, so a good\r
- choice is sometimes more important than having the exact set.\r
-\r
- Consider the following example:\r
-\r
- Note: All example code has been abbreviated\r
- to the absolute minimum in order to make the\r
- examples concise.\r
-\r
- star1 : (A)* Z;\r
-\r
- The generated code resembles:\r
-\r
- old                             new (with -mrblkerr)\r
- -------------                   --------------------\r
- for (;;) {                      for (;;) {\r
-     match(A);                       match(A);\r
- }                               }\r
- match(Z);                       if (! A and ! Z) then\r
-                                     FAIL(...{A,Z}...);\r
-                                 }\r
-                                 match(Z);\r
-\r
-\r
- With input X\r
- old message: Found X, expected Z\r
- new message: Found X, expected A, Z\r
-\r
- For the example:\r
-\r
- star2 : (A|B)* Z;\r
-\r
- old                             new (with -mrblkerr)\r
- -------------                   --------------------\r
- for (;;) {                      for (;;) {\r
-     if (!A and !B) break;           if (!A and !B) break;\r
-     if (...) {                      if (...) {\r
-         <same ...>                      <same ...>\r
-     }                               }\r
-     else {                          else {\r
-         FAIL(...{A,B,Z}...)             FAIL(...{A,B}...);\r
-     }                               }\r
- }                               }\r
- match(Z);                       if (! A and ! B and !Z) then\r
-                                     FAIL(...{A,B,Z}...);\r
-                                 }\r
-                                 match(Z);\r
-\r
- With input X\r
- old message: Found X, expected Z\r
- new message: Found X, expected A, B, Z\r
- With input A X\r
- old message: Found X, expected Z\r
- new message: Found X, expected A, B, Z\r
-\r
- This includes the choice of looping back to the\r
- star block.\r
-\r
- The code for plus blocks:\r
-\r
- plus1 : (A)+ Z;\r
-\r
- The generated code resembles:\r
-\r
- old                             new (with -mrblkerr)\r
- -------------                   --------------------\r
- do {                            do {\r
-     match(A);                       match(A);\r
- } while (A)                     } while (A)\r
- match(Z);                       if (! A and ! Z) then\r
-                                     FAIL(...{A,Z}...);\r
-                                 }\r
-                                 match(Z);\r
-\r
- With input A X\r
- old message: Found X, expected Z\r
- new message: Found X, expected A, Z\r
-\r
- This includes the choice of looping back to the\r
- plus block.\r
-\r
- For the example:\r
-\r
- plus2 : (A|B)+ Z;\r
-\r
- old                             new (with -mrblkerr)\r
- -------------                   --------------------\r
- do {                            do {\r
-     if (A) {                        <same>\r
-         match(A);                   <same>\r
-     } else if (B) {                 <same>\r
-         match(B);                   <same>\r
-     } else {                        <same>\r
-         if (cnt > 1) break;         <same>\r
-         FAIL(...{A,B,Z}...)             FAIL(...{A,B}...);\r
-     }                               }\r
-     cnt++;                          <same>\r
- }                               }\r
-\r
- match(Z);                       if (! A and ! B and !Z) then\r
-                                     FAIL(...{A,B,Z}...);\r
-                                 }\r
-                                 match(Z);\r
-\r
- With input X\r
- old message: Found X, expected A, B, Z\r
- new message: Found X, expected A, B\r
- With input A X\r
- old message: Found X, expected Z\r
- new message: Found X, expected A, B, Z\r
-\r
- This includes the choice of looping back to the\r
- plus block.\r
- \r
-#249. (MR21) Changes for DEC/VMS systems\r
-\r
- Jean-François Piéronne (jfp altavista.net) has updated some\r
- VMS related command files and fixed some minor problems related\r
- to building pccts under the DEC/VMS operating system. For DEC/VMS\r
- users the most important differences are:\r
-\r
- a. Revised makefile.vms\r
- b. Revised genMMS for generating VMS style makefiles.\r
-\r
-#248. (MR21) Generate symbol for first set of an alternative\r
-\r
- pccts can generate a symbol which represents the tokens which may\r
- appear at the start of a block:\r
-\r
- rr : #FirstSetSymbol(rr_FirstSet) ( Foo | Bar ) ;\r
-\r
- This will generate the symbol rr_FirstSet of type SetWordType with\r
- elements Foo and Bar set. The bits can be tested using code similar \r
- to the following:\r
-\r
- if (set_el(Foo, &rr_FirstSet)) { ...\r
-\r
- This can be combined with the C array zztokens[] or the C++ routine\r
- tokenName() to get the print name of the token in the first set.\r
-\r
- The size of the set is given by the newly added enum SET_SIZE, a \r
- protected member of the generated parser's class. The number of\r
- elements in the generated set will not be exactly equal to the \r
- value of SET_SIZE because of synthetic tokens created by #tokclass,\r
- #errclass, the -ge option, and meta-tokens such as epsilon and\r
- end-of-file.\r
-\r
- The #FirstSetSymbol must appear immediately before a block\r
- such as (...)+, (...)*, {...}, or (...). It may not appear\r
- immediately before a token, a rule reference, or an action. However,\r
- a token or rule reference can be enclosed in a (...) in order to\r
- make the use of #FirstSetSymbol legal.\r
-\r
- rr_bad : #FirstSetSymbol(rr_bad_FirstSet) Foo; // Illegal\r
-\r
- rr_ok : #FirstSetSymbol(rr_ok_FirstSet) (Foo); // Legal\r
- \r
- Do not confuse FirstSetSymbol sets with the sets used for testing\r
- lookahead. The sets used for FirstSetSymbol have one element per bit,\r
- so the number of bytes is approximately the largest token number\r
- divided by 8. The sets used for testing lookahead store 8 lookahead \r
- sets per byte, so the length of the array is approximately the largest\r
- token number.\r
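-\r
- The one-bit-per-token layout can be sketched as follows. This is an\r
- illustration only: makeSet() and hasToken() are hypothetical helpers\r
- (hasToken() is not the pccts set_el routine), the token numbers are\r
- made up, and the bit order within a byte is an assumption.\r
-\r
```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

typedef unsigned char SetWordType;   // assumption: one byte per set word

// Hypothetical token numbers; real values come from the generated parser.
enum { Foo = 5, Bar = 12, LargestToken = 12 };

// Build a set with one bit per token number 0..LargestToken, so the
// byte count is roughly the largest token number divided by 8.
std::vector<SetWordType> makeSet(std::initializer_list<unsigned> tokens) {
    std::vector<SetWordType> set(LargestToken / 8 + 1);   // zero-filled
    for (unsigned tok : tokens) {
        assert(tok <= LargestToken);
        set[tok / 8] |= static_cast<SetWordType>(1u << (tok % 8));
    }
    return set;
}

// Membership test: out-of-range token numbers are simply not members.
bool hasToken(unsigned tok, const std::vector<SetWordType>& set) {
    return tok / 8 < set.size() && (set[tok / 8] & (1u << (tok % 8))) != 0;
}
```
-\r
- With these made-up numbers the set for ( Foo | Bar ) occupies two\r
- bytes, and hasToken(Foo, makeSet({Foo, Bar})) is true.\r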
-\r
- If there is demand, a similar routine for follow sets can be added.\r
-\r
-#247. (MR21) Misleading error message on syntax error for optional elements.\r
-\r
- ===================================================\r
- The behavior has been revised when parser exception\r
- handling is used. See Item #290\r
- ===================================================\r
-\r
- Prior to MR21, tokens which were optional did not appear in syntax\r
- error messages if the block which immediately followed detected a \r
- syntax error.\r
-\r
- Consider the following grammar which accepts Number, Word, and Other:\r
-\r
- rr : {Number} Word;\r
-\r
- For this rule the code resembles:\r
-\r
- if (LA(1) == Number) {\r
- match(Number);\r
- consume();\r
- }\r
- match(Word);\r
-\r
- Prior to MR21, the error message for input "$ a" would be:\r
-\r
- line 1: syntax error at "$" missing Word\r
-\r
- With MR21 the message will be:\r
-\r
- line 1: syntax error at "$" expecting Word, Number.\r
-\r
- The generated code resembles:\r
-\r
- if ( (LA(1)==Number) ) {\r
- zzmatch(Number);\r
- consume();\r
- }\r
- else {\r
- if ( (LA(1)==Word) ) {\r
- /* nothing */\r
- }\r
- else {\r
- FAIL(... message for both Number and Word ...);\r
- }\r
- }\r
- match(Word);\r
- \r
- The code generated for optional blocks in MR21 is slightly longer\r
- than the previous versions, but it should give better error messages.\r
-\r
- The code generated for:\r
-\r
- { a | b | c }\r
-\r
- should now be *identical* to:\r
-\r
- ( a | b | c | )\r
-\r
- which was not the case prior to MR21.\r
-\r
- Reported by Sue Marvin (sue siara.com).\r
-\r
-#246. (Changed in MR21) Use of $(MAKE) for calls to make\r
-\r
- Calls to make from the makefiles were replaced with $(MAKE)\r
- because of problems when using gmake.\r
-\r
- Reported with fix by Sunil K.Vallamkonda (sunil siara.com).\r
-\r
-#245. (Changed in MR21) Changes to genmk\r
-\r
- The following command line options have been added to genmk:\r
-\r
- -cfiles ... \r
- \r
- To add a user's C or C++ files into makefile automatically.\r
- The list of files must be enclosed in apostrophes. This\r
- option may be specified multiple times.\r
-\r
- -compiler ...\r
- \r
- The name of the compiler to use for $(CCC) or $(CC). The\r
- default in C++ mode is "CC". The default in C mode is "cc".\r
-\r
- -pccts_path ...\r
-\r
- The value for $(PCCTS), the pccts directory. The default\r
- is /usr/local/pccts.\r
-\r
- Contributed by Tomasz Babczynski (t.babczynski ict.pwr.wroc.pl).\r
-\r
-#244. (Changed in MR21) Rename variable "not" in antlr.g\r
-\r
- When antlr.g is compiled with a C++ compiler, a variable named\r
- "not" causes problems. Reported by Sinan Karasu\r
- (sinan.karasu boeing.com).\r
-\r
-#243. (Changed in MR21) Replace recursion with iteration in zzfree_ast\r
-\r
- Another refinement to zzfree_ast in ast.c to limit recursion.\r
-\r
- NAKAJIMA Mutsuki (muc isr.co.jp).\r
-\r
-\r
-#242. (Changed in MR21) LineInfoFormatStr\r
-\r
- Added an #ifndef/#endif around LineInfoFormatStr in pcctscfg.h.\r
-\r
-#241. (Changed in MR21) Changed macro PURIFY to a no-op\r
-\r
- ***********************\r
- *** NOT IMPLEMENTED ***\r
- ***********************\r
-\r
- The PURIFY macro was changed to a no-op because it was causing \r
- problems when passing C++ objects.\r
- \r
- The old definition:\r
- \r
- #define PURIFY(r,s) memset((char *) &(r),'\\0',(s));\r
- \r
- The new definition:\r
- \r
- #define PURIFY(r,s) /* nothing */\r
-#endif\r
-\r
-#240. (Changed in MR21) sorcerer/h/sorcerer.h _MATCH and _MATCHRANGE\r
-\r
- Added test for NULL token pointer.\r
-\r
- Suggested by Peter Keller (keller ebi.ac.uk)\r
-\r
-#239. (Changed in MR21) C++ mode AParser::traceGuessFail\r
-\r
- If tracing is turned on when the code has been generated\r
- without trace code, a failed guess generates a trace report\r
- even though there are no other trace reports. This change\r
- makes the behavior consistent with other parts of the\r
- trace system.\r
-\r
- Reported by David Wigg (wiggjd sbu.ac.uk).\r
-\r
-#238. (Changed in MR21) Namespace version #include files\r
-\r
- Changed reference from CStdio to cstdio (and other\r
- #include file names) in the namespace version of pccts.\r
- Should have known better.\r
-\r
-#237. (Changed in MR21) ParserBlackBox(FILE*)\r
- \r
- In the past, ParserBlackBox would close the FILE in the dtor\r
- even though it was not opened by ParserBlackBox. The problem\r
- is that there were two constructors, one which accepted a file \r
- name and did an fopen, the other which accepted a FILE and did\r
- not do an fopen. There is now an extra member variable which\r
- remembers whether ParserBlackBox did the open or not.\r
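-\r
- The ownership rule can be sketched in a few lines of C++. InputFile\r
- is a hypothetical stand-in, not the ParserBlackBox source; only the\r
- idea of remembering who opened the FILE is taken from the description\r
- above.\r
-\r
```cpp
#include <cstdio>

// Hypothetical stand-in for the file handling inside a parser wrapper.
class InputFile {
public:
    // The file is opened here, so it is closed here.
    explicit InputFile(const char* name)
        : fp_(std::fopen(name, "r")), openedHere_(fp_ != nullptr) {}

    // The FILE is supplied by the caller, who remains responsible for it.
    explicit InputFile(std::FILE* fp) : fp_(fp), openedHere_(false) {}

    // Close only what this object opened.
    ~InputFile() { if (openedHere_) std::fclose(fp_); }

    InputFile(const InputFile&) = delete;
    InputFile& operator=(const InputFile&) = delete;

    std::FILE* get() const { return fp_; }
    bool ownsFile() const { return openedHere_; }

private:
    std::FILE* fp_;
    bool openedHere_;   // the extra member that records who did the open
};
```
-\r
- An object built from stdin, for example, reports ownsFile() == false\r
- and leaves stdin open when it is destroyed.\r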
-\r
- Suggested by Mike Percy (mpercy scires.com).\r
-\r
-#236. (Changed in MR21) tmake now reports down pointer problem\r
-\r
- When ASTBase::tmake attempts to update the down pointer of \r
- an AST it checks to see if the down pointer is NULL. If it\r
- is not NULL it does not do the update and returns NULL.\r
- An attempt to update the down pointer is almost always a\r
- result of a user error. This can lead to difficult to find\r
- problems during tree construction.\r
-\r
- With this change, the routine calls a virtual function\r
- reportOverwriteOfDownPointer() which calls panic to\r
- report the problem. Users who want the old behavior can\r
- redefine the virtual function in their AST class.\r
-\r
- Suggested by Sinan Karasu (sinan.karasu boeing.com)\r
-\r
-#235. (Changed in MR21) Made ANTLRParser::resynch() virtual\r
-\r
- Suggested by Jerry Evans (jerry swsl.co.uk).\r
-\r
-#234. (Changed in MR21) Implicit int for function return value\r
-\r
- ATokenBuffer::bufferSize() did not specify a type for the\r
- return value.\r
-\r
- Reported by Hai Vo-Ba (hai fc.hp.com).\r
-\r
-#233. (Changed in MR20) Converted to MSVC 6.0\r
-\r
- Due to external circumstances I have had to convert to MSVC 6.0.\r
- The MSVC 5.0 project files (.dsw and .dsp) have been retained as\r
- xxx50.dsp and xxx50.dsw. The MSVC 6.0 files are named xxx60.dsp\r
- and xxx60.dsw (where xxx is related to the directory/project).\r
-\r
-#232. (Changed in MR20) Make setwd bit vectors protected in parser.h\r
-\r
- The access for the setwd array in the parser header was not\r
- specified. As a result, it would depend on the code which \r
- preceded it. In MR20 it will always have access "protected".\r
-\r
- Reported by Piotr Eljasiak (eljasiak zt.gdansk.tpsa.pl).\r
-\r
-#231. (Changed in MR20) Error in token buffer debug code.\r
-\r
- When token buffer debugging is selected via the pre-processor\r
- symbol DEBUG_TOKENBUFFER there is an erroneous check in\r
- AParser.cpp:\r
-\r
- #ifdef DEBUG_TOKENBUFFER\r
- if (i >= inputTokens->bufferSize() ||\r
- inputTokens->minTokens() < LLk ) /* MR20 Was "<=" */\r
- ...\r
- #endif\r
-\r
- Reported by David Wigg (wiggjd sbu.ac.uk).\r
-\r
-#230. (Changed in MR20) Fixed problem with #define for -gd option\r
-\r
- There was an error in setting zzTRACE_RULES for the -gd (trace) option.\r
-\r
- Reported by Gary Funck (gary intrepid.com).\r
-\r
-#229. (Changed in MR20) Additional "const" for literals\r
-\r
- "const" was added to the token name literal table.\r
- "const" was added to some panic() and similar routines.\r
-\r
-#228. (Changed in MR20) dlg crashes on "()"\r
-\r
- The following token definition will cause DLG to crash.\r
-\r
- #token "()"\r
-\r
- When there is a syntax error in a regular expression\r
- many of the dlg routines return a structure which has\r
- null pointers. When this is accessed by callers it\r
- generates the crash.\r
-\r
- I have attempted to fix the more common cases.\r
-\r
- Reported by Mengue Olivier (dolmen bigfoot.com).\r
-\r
-#227. (Changed in MR20) Array overwrite\r
-\r
- Steveh Hand (sassth unx.sas.com) reported a problem which\r
- was traced to a temporary array which was not properly\r
- resized for deeply nested blocks. This has been fixed.\r
-\r
-#226. (Changed in MR20) -pedantic conformance\r
- \r
- G. Hobbelt (i_a mbh.org) and THM made many, many minor \r
- changes to create prototypes for all the functions and\r
- bring antlr, dlg, and sorcerer into conformance with\r
- the gcc -pedantic option.\r
-\r
- This may require users to add pccts/h/pcctscfg.h to some\r
- files or makefiles in order to have __USE_PROTOS defined.\r
-\r
-#225. (Changed in MR20) AST stack adjustment in C mode\r
-\r
- The fix in #214 for AST stack adjustment in C mode missed \r
- some cases.\r
-\r
- Reported with fix by Ger Hobbelt (i_a mbh.org).\r
-\r
-#224. (Changed in MR20) LL(1) and LL(2) with #pragma approx\r
-\r
- This may take a record for the oldest, most trivial, lexical\r
- error in pccts. The regular expressions for LL(1) and LL(2)\r
- lacked an escape for the left and right parenthesis.\r
-\r
- Reported by Ger Hobbelt (i_a mbh.org).\r
-\r
-#223. (Changed in MR20) Addition of IBM_VISUAL_AGE directory\r
-\r
- Build files for antlr, dlg, and sorcerer under IBM Visual Age \r
- have been contributed by Anton Sergeev (ags mlc.ru). They have\r
- been placed in the pccts/IBM_VISUAL_AGE directory.\r
-\r
-#222. (Changed in MR20) Replace __STDC__ with __USE_PROTOS\r
-\r
- Most occurrences of __STDC__ were replaced with __USE_PROTOS due to\r
- complaints from several users.\r
-\r
-#221. (Changed in MR20) Added #include for DLexerBase.h to PBlackBox.\r
-\r
- Added #include for DLexerBase.h to PBlackBox.\r
-\r
-#220. (Changed in MR19) strcat arguments reversed in #pred parse\r
-\r
- The arguments to strcat are reversed when creating a print\r
- name for a hash table entry for use with #pred feature.\r
-\r
- Problem diagnosed and fix reported by Scott Harrington \r
- (seh4 ix.netcom.com).\r
-\r
-#219. (Changed in MR19) C Mode routine zzfree_ast\r
-\r
- Changes to reduce use of recursion for AST trees with only right\r
- links or only left links in the C mode routine zzfree_ast.\r
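-\r
- A minimal sketch of the idea, assuming a node with "down" (first\r
- child) and "right" (next sibling) links. Node, makeNode(),\r
- rightChain() and freeTree() are hypothetical names; this is not the\r
- zzfree_ast code. Chains that use only one kind of link are walked\r
- with a loop, and recursion is used only where a node has both links.\r
-\r
```cpp
#include <cstddef>

// Hypothetical AST node in child-sibling form.
struct Node {
    Node* down;    // first child
    Node* right;   // next sibling
};

Node* makeNode(Node* down = nullptr, Node* right = nullptr) {
    return new Node{down, right};
}

// Build a chain of n nodes joined only by right links.
Node* rightChain(std::size_t n) {
    Node* head = nullptr;
    for (std::size_t i = 0; i < n; ++i) head = makeNode(nullptr, head);
    return head;
}

// Free a whole tree and return the number of nodes released.
std::size_t freeTree(Node* t) {
    std::size_t freed = 0;
    while (t != nullptr) {
        Node* next;
        if (t->down == nullptr) {
            next = t->right;                 // only a right link: iterate
        } else if (t->right == nullptr) {
            next = t->down;                  // only a down link: iterate
        } else {
            freed += freeTree(t->down);      // both links: recurse on one side
            next = t->right;
        }
        delete t;
        ++freed;
        t = next;
    }
    return freed;
}
```
-\r
- In this sketch a long list such as rightChain(200000) is released\r
- without any nested calls, so its length does not affect stack depth.\r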
-\r
- Implemented by SAKAI Kiyotaka (ksakai isr.co.jp).\r
-\r
-#218. (Changed in MR19) Changes to support unsigned char in C mode\r
-\r
- Changes to antlr.h and err.h to fix omissions in use of zzchar_t\r
-\r
- Implemented by SAKAI Kiyotaka (ksakai isr.co.jp).\r
-\r
-#217. (Changed in MR19) Error message when dlg -i and -CC options selected\r
- \r
- *** This change was rescinded by item #257 ***\r
-\r
- The parsers generated by pccts in C++ mode are not able to support the\r
- interactive lexer option (except, perhaps, when using the deferred fetch\r
- parser option; see Item #216).\r
-\r
- DLG now warns when both -i and -CC are selected.\r
-\r
- This warning was suggested by David Venditti (07751870267-0001 t-online.de).\r
-\r
-#216. (Changed in MR19) Defer token fetch for C++ mode\r
-\r
- Implemented by Volker H. Simonis (simonis informatik.uni-tuebingen.de)\r
-\r
- Normally, pccts keeps the lookahead token buffer completely filled.\r
- This requires max(k,ck) tokens of lookahead. For some applications\r
- this can cause deadlock problems. For example, there may be cases\r
- when the parser can't tell when the input has been completely consumed\r
- until the parse is complete, but the parse can't be completed because \r
- the input routines are waiting for additional tokens to fill the\r
- lookahead buffer.\r
- \r
- When the ANTLRParser class is built with the pre-processor option \r
- ZZDEFER_FETCH defined, the fetch of new tokens by consume() is deferred\r
- until LA(i) or LT(i) is called. \r
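-\r
- The effect of deferring the fetch can be sketched with a small C++\r
- model. DeferredLookahead and counter() are hypothetical and are not\r
- part of ANTLRParser; the model only shows that consume() reads\r
- nothing from the input and that tokens are read when LA(i) asks for\r
- them.\r
-\r
```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical lookahead buffer with deferred fetch.
class DeferredLookahead {
public:
    DeferredLookahead(int k, std::function<int()> nextToken)
        : k_(k), pendingSkips_(0), fetched_(0), next_(std::move(nextToken)) {}

    // Discard the current token without reading a replacement.
    void consume() {
        if (buf_.empty()) ++pendingSkips_;   // not read yet: skip it later
        else buf_.pop_front();
    }

    // Read from the token source only as far as needed to answer LA(i).
    int LA(int i) {
        assert(i >= 1 && i <= k_);
        for (; pendingSkips_ > 0; --pendingSkips_) fetch();
        while (static_cast<int>(buf_.size()) < i) buf_.push_back(fetch());
        return buf_[i - 1];
    }

    // Number of tokens requested from the source so far.
    int fetched() const { return fetched_; }

private:
    int fetch() { ++fetched_; return next_(); }

    int k_;
    int pendingSkips_;
    int fetched_;
    std::function<int()> next_;
    std::deque<int> buf_;
};

// Token source for the demonstration: yields 1, 2, 3, ...
std::function<int()> counter() {
    int n = 0;
    return [n]() mutable { return ++n; };
}
```
-\r
- A buffer that is kept completely filled would read k tokens before\r
- the first LA(1); in this model nothing is read until LA(i) is called,\r
- which is the property that avoids waiting on input that is not yet\r
- needed.\r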
-\r
- To test whether this option has been built into the ANTLRParser class\r
- use "isDeferFetchEnabled()".\r
-\r
- Using the -gd trace option with the default tracein() and traceout()\r
- routines will defeat the effort to defer the fetch because the\r
- trace routines print out information about the lookahead token at\r
- the start of the rule.\r
- \r
- Because the tracein and traceout routines are virtual it is \r
- easy to redefine them in your parser:\r
-\r
- class MyParser {\r
- <<\r
- virtual void tracein(ANTLRChar * ruleName)\r
- { fprintf(stderr,"Entering: %s\n", ruleName); }\r
- virtual void traceout(ANTLRChar * ruleName)\r
- { fprintf(stderr,"Leaving: %s\n", ruleName); }\r
- >>\r
- \r
- The originals for those routines are in pccts/h/AParser.cpp.\r
- \r
- This requires use of the dlg option -i (interactive lexer).\r
-\r
- This is implemented only for C++ mode.\r
-\r
- This is experimental. The interaction with guess mode (syntactic\r
- predicates) is not known.\r
-\r
-#215. (Changed in MR19) Addition of reset() to DLGLexerBase\r
-\r
- There was no obvious way to reset the lexer for reuse. The\r
- reset() method now does this.\r
-\r
- Suggested by David Venditti (07751870267-0001 t-online.de).\r
-\r
-#214. (Changed in MR19) C mode: Adjust AST stack pointer at exit\r
-\r
- In C mode the AST stack pointer needs to be reset if there will\r
- be multiple calls to the ANTLRx macros.\r
-\r
- Reported with fix by Paul D. Smith (psmith baynetworks.com).\r
-\r
-#213. (Changed in MR18) Fatal error with -mrhoistk (k>1 hoisting)\r
-\r
- When rearranging code I forgot to un-comment a critical line of\r
- code that handles hoisting of predicates with k>1 lookahead. This\r
- is now fixed.\r
-\r
- Reported by Reinier van den Born (reinier vnet.ibm.com).\r
-\r
-#212. (Changed in MR17) Mac related changes by Kenji Tanaka\r
-\r
- Kenji Tanaka (kentar osa.att.ne.jp) has made a number of changes for\r
- Macintosh users.\r
-\r
- a. The following Macintosh MPW files aid in installing pccts on Mac:\r
-\r
- pccts/MPW_Read_Me\r
-\r
- pccts/install68K.mpw\r
- pccts/installPPC.mpw\r
-\r
- pccts/antlr/antlr.r\r
- pccts/antlr/antlr68K.make\r
- pccts/antlr/antlrPPC.make\r
-\r
- pccts/dlg/dlg.r\r
- pccts/dlg/dlg68K.make\r
- pccts/dlg/dlgPPC.make\r
-\r
- pccts/sorcerer/sor.r\r
- pccts/sorcerer/sor68K.make\r
- pccts/sorcerer/sorPPC.make\r
- \r
- They completely replace the previous Mac installation files.\r
- \r
- b. The most significant is a change in the MAC_FILE_CREATOR symbol\r
- in pcctscfg.h:\r
-\r
- old: #define MAC_FILE_CREATOR 'MMCC' /* Metrowerks C/C++ Text files */\r
- new: #define MAC_FILE_CREATOR 'CWIE' /* Metrowerks C/C++ Text files */\r
-\r
- c. Added calls to special_fopen_actions() where necessary.\r
-\r
-#211. (Changed in MR16a) C++ style comment in dlg\r
-\r
- This has been fixed.\r
-\r
-#210. (Changed in MR16a) Sor accepts \r\n, \r, or \n for end-of-line\r
-\r
- A user requested that Sorcerer be changed to accept other forms\r
- of end-of-line.\r
-\r
-#209. (Changed in MR16) Name of files changed.\r
-\r
- Old: CHANGES_FROM_1.33\r
- New: CHANGES_FROM_133.txt\r
-\r
- Old: KNOWN_PROBLEMS\r
- New: KNOWN_PROBLEMS.txt\r
-\r
-#208. (Changed in MR16) Change in use of pccts #include files\r
-\r
- There were problems with MS DevStudio when mixing Sorcerer and\r
- PCCTS in the same source file. The problem is caused by the\r
- redefinition of setjmp in the MS header file setjmp.h. In\r
- setjmp.h the pre-processor symbol setjmp was redefined to be\r
- _setjmp. A later effort to execute #include <setjmp.h> resulted \r
- in an effort to #include <_setjmp.h>. I'm not sure whether this\r
- is a bug or a feature. In any case, I decided to fix it by\r
- avoiding the use of pre-processor symbols in #include statements\r
- altogether. This has the added benefit of making pre-compiled\r
- headers work again.\r
-\r
- I've replaced statements:\r
-\r
- old: #include PCCTS_SETJMP_H\r
- new: #include "pccts_setjmp.h"\r
-\r
- Where pccts_setjmp.h contains:\r
-\r
- #ifndef __PCCTS_SETJMP_H__\r
- #define __PCCTS_SETJMP_H__\r
- \r
- #ifdef PCCTS_USE_NAMESPACE_STD\r
- #include <Csetjmp>\r
- #else\r
- #include <setjmp.h>\r
- #endif\r
-\r
- #endif\r
- \r
- A similar change has been made for other standard header files\r
- required by pccts and sorcerer: stdlib.h, stdarg.h, stdio.h, etc.\r
-\r
- Reported by Jeff Vincent (JVincent novell.com) and Dale Davis\r
- (DalDavis spectrace.com).\r
-\r
-#207. (Changed in MR16) dlg reports an invalid range for: [\0x00-\0xff]\r
-\r
- -----------------------------------------------------------------\r
- Note from MR23: This fix does not work. I am investigating why.\r
- -----------------------------------------------------------------\r
-\r
- dlg will report that this is an invalid range.\r
-\r
- Diagnosed by Piotr Eljasiak (eljasiak no-spam.zt.gdansk.tpsa.pl):\r
-\r
- I think this problem is not specific to unsigned chars\r
- because dlg reports no error for the range [\0x00-\0xfe].\r
-\r
- I've found that information on range is kept in field\r
- letter (unsigned char) of Attrib struct. Unfortunately\r
- the letter value internally is for some reasons increased\r
- by 1, so \0xff is represented here as 0.\r
-\r
- That's why dlg complains about the range [\0x00-\0xff] in\r
- dlg_p.g:\r
-\r
- if ($$.letter > $2.letter) {\r
- error("invalid range ", zzline);\r
- } \r
-\r
- The fix is:\r
-\r
- if ($$.letter > $2.letter && 255 != $2.letter) {\r
- error("invalid range ", zzline);\r
- } \r
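-\r
- The wrap-around described in the diagnosis can be reproduced in\r
- isolation. The sketch below is hypothetical (storedLetter() and\r
- invalidRangeOld() are not dlg routines) and assumes an 8-bit\r
- unsigned char; it models only the "value plus one" storage and the\r
- original range test.\r
-\r
```cpp
#include <cassert>

// Assumption for illustration: a range endpoint is stored as the
// character code plus one in an unsigned char.
unsigned char storedLetter(unsigned code) {
    assert(code <= 0xff);
    return static_cast<unsigned char>(code + 1);   // 0xff wraps to 0
}

// The original test: the range is rejected when the stored lower bound
// is greater than the stored upper bound.
bool invalidRangeOld(unsigned lo, unsigned hi) {
    return storedLetter(lo) > storedLetter(hi);
}
```
-\r
- Under these assumptions invalidRangeOld(0x00, 0xff) is true while\r
- invalidRangeOld(0x00, 0xfe) is false, which is the behavior reported\r
- for [\0x00-\0xff] and [\0x00-\0xfe].\r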
-\r
-#206. (Changed in MR16) Free zzFAILtext in ANTLRParser destructor\r
-\r
- The ANTLRParser destructor now frees zzFAILtext.\r
-\r
- Problem and fix reported by Manfred Kogler (km cast.uni-linz.ac.at).\r
-\r
-#205. (Changed in MR16) DLGStringReset argument now const\r
-\r
- Changed: void DLGStringReset(DLGChar *s) {...}\r
- To: void DLGStringReset(const DLGChar *s) {...}\r
-\r
- Suggested by Dale Davis (daldavis spectrace.com)\r
-\r
-#204. (Changed in MR15a) Change __WATCOM__ to __WATCOMC__ in pcctscfg.h\r
- \r
- Reported by Oleg Dashevskii (olegdash my-dejanews.com).\r
-\r
-#203. (Changed in MR15) Addition of sorcerer to distribution kit\r
-\r
- I have finally caved in to popular demand. The pccts 1.33mr15\r
- kit will include sorcerer. The separate sorcerer kit will be\r
- discontinued.\r
-\r
-#202. (Changed in MR15) Organization of MS Dev Studio Projects in Kit\r
-\r
- Previously there was one workspace that contained projects for\r
- all three parts of pccts: antlr, dlg, and sorcerer. Now each\r
- part (and directory) has its own workspace/project and there\r
- is an additional workspace/project to build a library from the\r
- .cpp files in the pccts/h directory.\r
-\r
- The library build will create pccts_debug.lib or pccts_release.lib\r
- according to the configuration selected. \r
-\r
- If you don't want to build pccts 1.33MR15 you can download a\r
- ready-to-run kit for win32 from http://www.polhode.com/win32.zip.\r
- The ready-to-run for win32 includes executables, a pre-built static\r
- library for the .cpp files in the pccts/h directory, and a sample\r
- application.\r
-\r
- You will need to define the environment variable PCCTS to point to\r
- the root of the pccts directory hierarchy.\r
-\r
-#201. (Changed in MR15) Several fixes by K.J. Cummings (cummings peritus.com)\r
-\r
- Generation of SETJMP rather than SETJMP_H in gen.c.\r
-\r
- (Sor B19) Declaration of ref_vars_inits for ref_var_inits in\r
- pccts/sorcerer/sorcerer.h.\r
-\r
-#200. (Changed in MR15) Remove operator=() in AToken.h\r
-\r
- A user reported that Watcom couldn't handle use of an\r
- explicit operator=(). It was replaced with an equivalent\r
- using a cast operator.\r
-\r
-#199. (Changed in MR15) Don't allow use of empty #tokclass\r
-\r
- Change antlr.g to disallow empty #tokclass sets.\r
-\r
- Reported by Manfred Kogler (km cast.uni-linz.ac.at).\r
-\r
-#198. Revised ANSI C grammar due to efforts by Manuel Kessler\r
-\r
- Manuel Kessler (mlkessler cip.physik.uni-wuerzburg.de)\r
-\r
- Allow trailing ... in function parameter lists.\r
- Add bit fields.\r
- Allow old-style function declarations.\r
- Support cv-qualified pointers.\r
- Better checking of combinations of type specifiers.\r
- Release of memory for local symbols on scope exit.\r
- Allow input file name on command line as well as by redirection.\r
-\r
- and other miscellaneous tweaks.\r
-\r
- This is not part of the pccts distribution kit. It must be\r
- downloaded separately from:\r
-\r
- http://www.polhode.com/ansi_mr15.zip\r
-\r
-#197. (Changed in MR14) Resetting the lookahead buffer of the parser\r
-\r
- Explanation and fix by Sinan Karasu (sinan.karasu boeing.com)\r
-\r
- Consider the code used to prime the lookahead buffer LA(i)\r
- of the parser when init() is called:\r
-\r
- void\r
- ANTLRParser::\r
- prime_lookahead()\r
- {\r
- int i;\r
- for(i=1;i<=LLk; i++) consume();\r
- dirty=0;\r
- //lap = 0; // MR14 - Sinan Karasu (sinan.karasu boeing.com)\r
- //labase = 0; // MR14\r
- labase=lap; // MR14\r
- }\r
-\r
- When the parser is instantiated, lap=0,labase=0 is set.\r
-\r
- The "for" loop runs LLk times. In consume(), lap = lap +1 (mod LLk) is\r
- computed. Therefore, lap(before the loop) == lap (after the loop).\r
-\r
- Now the only problem comes in when one does an init() of the parser\r
- after an Eof has been seen. At that time, lap could be non zero.\r
- Assume it was lap==1. Now we do a prime_lookahead(). If LLk is 2,\r
- then\r
-\r
- consume()\r
- {\r
- NLA = inputTokens->getToken()->getType();\r
- dirty--;\r
- lap = (lap+1)&(LLk-1);\r
- }\r
-\r
- or expanding NLA,\r
-\r
- token_type[lap&(LLk-1)] = inputTokens->getToken()->getType();\r
- dirty--;\r
- lap = (lap+1)&(LLk-1);\r
-\r
- so now we prime locations 1 and 2. In prime_lookahead it used to set\r
- lap=0 and labase=0. Now, the next token will be read from location 0,\r
- NOT 1 as it should have been.\r
-\r
- This was never caught before, because if a parser is just instantiated,\r
- then lap and labase are 0, the offending assignment lines are\r
- basically no-ops, since the for loop wraps around back to 0.\r
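-\r
- The indexing problem can be checked with a small model of the ring\r
- buffer. The Lookahead struct and firstTokenAfterReinit() are\r
- hypothetical, tokens are numbered 1, 2, ... in input order, and LA(i)\r
- is assumed to read slot (lap+i-1) mod LLk; none of this is the\r
- AParser.cpp source.\r
-\r
```cpp
#include <cassert>

// Hypothetical model of the lookahead ring buffer.
struct Lookahead {
    enum { LLk = 2 };   // must be a power of 2 because of the bit mask
    int tokenType[LLk];
    int lap;            // slot written by the next consume()
    int labase;
    int nextInput;      // stand-in for the token stream: 1, 2, 3, ...

    Lookahead() : tokenType(), lap(0), labase(0), nextInput(0) {}

    void consume() {
        tokenType[lap & (LLk - 1)] = ++nextInput;
        lap = (lap + 1) & (LLk - 1);
    }

    // resetToZero == true reproduces the assignments used before MR14.
    void primeLookahead(bool resetToZero) {
        for (int i = 1; i <= LLk; i++) consume();
        if (resetToZero) { lap = 0; labase = 0; }
        else             { labase = lap; }
    }

    int LA(int i) const { return tokenType[(lap + i - 1) & (LLk - 1)]; }
};

// LA(1) after re-priming a parser whose lap was left at startLap.
int firstTokenAfterReinit(int startLap, bool resetToZero) {
    assert(startLap >= 0 && startLap < Lookahead::LLk);
    Lookahead p;
    p.lap = startLap;
    p.primeLookahead(resetToZero);
    return p.LA(1);
}
```
-\r
- In this model a start at lap==0 gives token 1 either way, which is\r
- why a freshly instantiated parser hid the problem. A start at lap==1\r
- gives token 2 with the old reset and token 1 with labase=lap.\r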
-\r
-#196. (Changed in MR14) Problems with "(alpha)? beta" guess\r
-\r
- Consider the following syntactic predicate in a grammar\r
- with 2 tokens of lookahead (k=2 or ck=2):\r
-\r
- rule : ( alpha )? beta ;\r
- alpha : S t ;\r
- t : T U\r
- | T\r
- ;\r
- beta : S t Z ;\r
-\r
- When antlr computes the prediction expression with one token\r
- of lookahead for alts 1 and 2 of rule t it finds an ambiguity.\r
-\r
- Because the grammar has a lookahead of 2 it tries to compute\r
- two tokens of lookahead for alts 1 and 2 of t. Alt 1 clearly\r
- has a lookahead of (T U). Alt 2 is one token long so antlr\r
- tries to compute the follow set of alt 2, which means finding\r
- the things which can follow rule t in the context of (alpha)?.\r
- This cannot be computed, because alpha is only part of a rule,\r
- and antlr can't tell what part of beta is matched by alpha and\r
- what part remains to be matched. Thus it is impossible for antlr\r
- to properly determine the follow set of rule t.\r
-\r
- Prior to 1.33MR14, the follow of (alpha)? was computed as\r
- FIRST(beta) as a result of the internal representation of\r
- guess blocks.\r
-\r
- With MR14 the follow set will be the empty set for that context.\r
-\r
- Normally, one expects a rule appearing in a guess block to also\r
- appear elsewhere. When the follow context for this other use\r
- is "ored" with the empty set, only the context from the other use\r
- remains, which gives a reasonable follow context. However, if\r
- there is *no* other use of the rule, or it is used in a different\r
- manner, then the follow context will be inaccurate. It was\r
- inaccurate even before MR14, but it will be inaccurate in a\r
- different way.\r
-\r
- For the example given earlier, a reasonable way to rewrite the\r
- grammar is:\r
-\r
- rule : ( alpha )? beta ;\r
- alpha : S t ;\r
- t : T U\r
- | T\r
- ;\r
- beta : alpha Z ;\r
-\r
- If there are no other uses of the rule appearing in the guess\r
- block it will generate a test for EOF - a workaround for\r
- representing a null set in the lookahead tests.\r
-\r
- If you encounter such a problem you can use the -alpha option\r
- to get additional information:\r
-\r
- line 2: error: not possible to compute follow set for alpha\r
- in an "(alpha)? beta" block.\r
-\r
- With the antlr -alpha command line option the following information\r
- is inserted into the generated file:\r
-\r
- #if 0\r
-\r
- Trace of references leading to attempt to compute the follow set of\r
- alpha in an "(alpha)? beta" block. It is not possible for antlr to\r
- compute this follow set because it is not known what part of beta has\r
- already been matched by alpha and what part remains to be matched.\r
-\r
- Rules which make use of the incorrect follow set will also be incorrect\r
-\r
- 1 #token T alpha/2 line 7 brief.g\r
- 2 end alpha alpha/3 line 8 brief.g\r
- 2 end (...)? block at start/1 line 2 brief.g\r
-\r
- #endif\r
-\r
- At the moment, with the -alpha option selected the program marks\r
- any rules which appear in the trace back chain (above) as rules with\r
- possible problems computing follow set.\r
-\r
- Reported by Greg Knapen (gregory.knapen bell.ca).\r
-\r
-#195. (Changed in MR14) #line directive not at column 1\r
-\r
- Under certain circumstances a predicate test could generate\r
- a #line directive which was not at column 1.\r
-\r
- Reported with fix by David Kågedal (davidk lysator.liu.se)\r
- (http://www.lysator.liu.se/~davidk/).\r
-\r
-#194. (Changed in MR14) (C Mode only) Demand lookahead with #tokclass\r
-\r
- In C mode with the demand lookahead option there is a bug in the\r
- code which handles matches for #tokclass (zzsetmatch and\r
- zzsetmatch_wsig).\r
-\r
- The bug causes the lookahead pointer to get out of synchronization\r
- with the current token pointer.\r
-\r
- The problem was reported with a fix by Ger Hobbelt (hobbelt axa.nl).\r
-\r
-#193. (Changed in MR14) Use of PCCTS_USE_NAMESPACE_STD\r
-\r
- The pcctscfg.h now contains the following definitions:\r
-\r
- #ifdef PCCTS_USE_NAMESPACE_STD\r
- #define PCCTS_STDIO_H <Cstdio>\r
- #define PCCTS_STDLIB_H <Cstdlib>\r
- #define PCCTS_STDARG_H <Cstdarg>\r
- #define PCCTS_SETJMP_H <Csetjmp>\r
- #define PCCTS_STRING_H <Cstring>\r
- #define PCCTS_ASSERT_H <Cassert>\r
- #define PCCTS_ISTREAM_H <istream>\r
- #define PCCTS_IOSTREAM_H <iostream>\r
- #define PCCTS_NAMESPACE_STD namespace std {}; using namespace std;\r
- #else\r
- #define PCCTS_STDIO_H <stdio.h>\r
- #define PCCTS_STDLIB_H <stdlib.h>\r
- #define PCCTS_STDARG_H <stdarg.h>\r
- #define PCCTS_SETJMP_H <setjmp.h>\r
- #define PCCTS_STRING_H <string.h>\r
- #define PCCTS_ASSERT_H <assert.h>\r
- #define PCCTS_ISTREAM_H <istream.h>\r
- #define PCCTS_IOSTREAM_H <iostream.h>\r
- #define PCCTS_NAMESPACE_STD\r
- #endif\r
-\r
- The runtime support in pccts/h uses these pre-processor symbols\r
- consistently.\r
-\r
- Also, antlr and dlg have been changed to generate code which uses\r
- these pre-processor symbols rather than having the names of the\r
- #include files hard-coded in the generated code.\r
-\r
- This required the addition of "#include pcctscfg.h" to a number of\r
- files in pccts/h.\r
-\r
- It appears that this sometimes causes problems for MSVC 5 in\r
- combination with the "automatic" option for pre-compiled headers.\r
- In such cases disable the "automatic" pre-compiled headers option.\r
-\r
- Suggested by Hubert Holin (Hubert.Holin Bigfoot.com).\r
-\r
-#192. (Changed in MR14) Change setText() to accept "const ANTLRChar *"\r
-\r
- Changed ANTLRToken::setText(ANTLRChar *) to setText(const ANTLRChar *).\r
- This allows literal strings to be used to initialize tokens. Since\r
- the usual token implementation (ANTLRCommonToken) makes a copy of the\r
- input string, this was an unnecessary limitation.\r
-\r
- Suggested by Bob McWhirter (bob netwrench.com).\r
-\r
-#191. (Changed in MR14) HP/UX aCC compiler compatibility problem\r
-\r
- Needed to explicitly declare zzINF_DEF_TOKEN_BUFFER_SIZE and\r
- zzINF_BUFFER_TOKEN_CHUNK_SIZE as ints in pccts/h/AParser.cpp.\r
-\r
- Reported by David Cook (dcook bmc.com).\r
-\r
-#190. (Changed in MR14) IBM OS/2 CSet compiler compatibility problem\r
-\r
- Name conflict with "_cs" in pccts/h/ATokenBuffer.cpp\r
-\r
- Reported by David Cook (dcook bmc.com).\r
-\r
-#189. (Changed in MR14) -gxt switch in C mode\r
-\r
- The -gxt switch in C mode didn't work because of incorrect\r
- initialization.\r
-\r
- Reported by Sinan Karasu (sinan boeing.com).\r
-\r
-#188. (Changed in MR14) Added pccts/h/DLG_stream_input.h\r
-\r
- This is a DLG stream class based on C++ istreams.\r
-\r
- Contributed by Hubert Holin (Hubert.Holin Bigfoot.com).\r
-\r
-#187. (Changed in MR14) Rename config.h to pcctscfg.h\r
-\r
- The PCCTS configuration file has been renamed from config.h to\r
- pcctscfg.h. The problem with the original name is that it led\r
- to name collisions when pccts parsers were combined with other\r
- software.\r
-\r
- All of the runtime support routines in pccts/h/* have been\r
- changed to use the new name. Existing software can continue\r
- to use pccts/h/config.h. The content of pccts/h/config.h is\r
- now just: #include "pcctscfg.h"\r
-\r
- I don't have a record of the user who suggested this.\r
-\r
-#186. (Changed in MR14) Pre-processor symbol DllExportPCCTS class modifier\r
-\r
- Classes in the C++ runtime support routines are now declared:\r
-\r
- class DllExportPCCTS className ....\r
-\r
- By default, the pre-processor symbol is defined as the empty\r
- string. This is for use by MSVC++ users to create DLL classes.\r
-\r
- Suggested by Manfred Kogler (km cast.uni-linz.ac.at).\r
-\r
-#185. (Changed in MR14) Option to not use PCCTS_AST base class for ASTBase\r
-\r
- Normally, the ASTBase class is derived from PCCTS_AST which contains\r
- functions useful to Sorcerer. If these are not necessary then the\r
- user can define the pre-processor symbol "PCCTS_NOT_USING_SOR" which\r
- will cause the ASTBase class to replace references to PCCTS_AST with\r
- references to ASTBase where necessary.\r
-\r
- The class ASTDoublyLinkedBase will contain a pure virtual function\r
- shallowCopy() that was formerly defined in class PCCTS_AST.\r
-\r
- Suggested by Bob McWhirter (bob netwrench.com).\r
-\r
-#184. (Changed in MR14) Grammars with no tokens generate invalid tokens.h\r
-\r
- Reported by Hubert Holin (Hubert.Holin bigfoot.com).\r
-\r
-#183. (Changed in MR14) -f to specify file with names of grammar files\r
-\r
- In DEC/VMS it is difficult to specify very long command lines.\r
- The -f option allows one to place the names of the grammar files\r
- in a data file in order to bypass limitations of the DEC/VMS\r
- command language interpreter.\r
-\r
- Addition supplied by Bernard Giroud (b_giroud decus.ch).\r
-\r
-#182. (Changed in MR14) Output directory option for DEC/VMS\r
-\r
- Fix some problems with the -o option under DEC/VMS.\r
-\r
- Fix supplied by Bernard Giroud (b_giroud decus.ch).\r
-\r
-#181. (Changed in MR14) Allow chars > 127 in DLGStringInput::nextChar()\r
-\r
- Changed DLGStringInput to cast the character using (unsigned char)\r
- so that languages with character codes greater than 127 work\r
- without changes.\r
-\r
- Suggested by Manfred Kogler (km cast.uni-linz.ac.at).\r
-\r
-#180. (Added in MR14) ANTLRParser::getEofToken()\r
-\r
- Added "ANTLRToken ANTLRParser::getEofToken() const" to match the\r
- setEofToken routine.\r
-\r
- Requested by Manfred Kogler (km cast.uni-linz.ac.at).\r
-\r
-#179. (Fixed in MR14) Memory leak for BufFileInput subclass of DLGInputStream\r
-\r
- The BufFileInput class described in Item #142 neglected to release\r
- the allocated buffer when an instance was destroyed.\r
-\r
- Reported by Manfred Kogler (km cast.uni-linz.ac.at).\r
-\r
-#178. (Fixed in MR14) Bug in "(alpha)? beta" guess blocks first sets\r
-\r
- In 1.33 vanilla, and all maintenance releases prior to MR14\r
- there is a bug in the handling of guess blocks which use the\r
- "long" form:\r
-\r
- (alpha)? beta\r
-\r
- inside a (...)*, (...)+, or {...} block.\r
-\r
- This problem does *not* apply to the case where beta is omitted\r
- or when the syntactic predicate is on the leading edge of an\r
- alternative.\r
-\r
- The problem is that both alpha and beta are stored in the\r
- syntax diagram, and that some analysis routines would fail\r
- to skip the alpha portion when it was not on the leading edge.\r
- Consider the following grammar with -ck 2:\r
-\r
- r : ( (A)? B )* C D\r
-\r
- | A B /* forces -ck 2 computation for old antlr */\r
- /* reports ambig for alts 1 & 2 */\r
-\r
- | B C /* forces -ck 2 computation for new antlr */\r
- /* reports ambig for alts 1 & 3 */\r
- ;\r
-\r
- The prediction expression for the first alternative should be\r
- LA(1)={B C} LA(2)={B C D}, but previous versions of antlr\r
- would compute the prediction expression as LA(1)={A C} LA(2)={B D}\r
-\r
- Reported by Arpad Beszedes (beszedes inf.u-szeged.hu) who provided\r
- a very clear example of the problem and identified the probable cause.\r
-\r
-#177. (Changed in MR14) #tokdefs and #token with regular expression\r
-\r
- In MR13 the change described by Item #162 caused an existing\r
- feature of antlr to fail. Prior to the change it was possible\r
- to give regular expression definitions and actions to tokens\r
- which were defined via the #tokdefs directive.\r
-\r
- This now works again.\r
-\r
- Reported by Manfred Kogler (km cast.uni-linz.ac.at).\r
-\r
-#176. (Changed in MR14) Support for #line in antlr source code\r
-\r
- Note: this was implemented by Arpad Beszedes (beszedes inf.u-szeged.hu).\r
-\r
- In 1.33MR14 it is possible for a pre-processor to generate #line\r
- directives in the antlr source and have those line numbers and file\r
- names used in antlr error messages and in the #line directives\r
- generated by antlr.\r
-\r
- The #line directive may appear in the following forms:\r
-\r
- #line ll "sss" xx xx ...\r
-\r
- where ll represents a line number, "sss" represents the name of a file\r
- enclosed in quotation marks, and xx are arbitrary integers.\r
-\r
- The following form (without "line") is not supported at the moment:\r
-\r
- # ll "sss" xx xx ...\r
-\r
- The result:\r
-\r
- zzline\r
-\r
- is replaced with ll from the # or #line directive\r
-\r
- FileStr[CurFile]\r
-\r
- is updated with the contents of the string (if any)\r
- following the line number\r
-\r
- Note\r
- ----\r
- The file-name string following the line number can be a complete\r
- name with a directory-path. Antlr generates the output files from\r
- the input file name (by replacing the extension from the file-name\r
- with .c or .cpp).\r
-\r
- If the input file (or the file-name from the line-info) contains\r
- a path:\r
-\r
- "../grammar.g"\r
-\r
- the generated source code will be placed in "../grammar.cpp" (i.e.\r
- in the parent directory). This is inconvenient in some cases\r
- (even the -o switch can not be used) so the path information is\r
- removed from the #line directive. Thus, if the line-info was\r
-\r
- #line 2 "../grammar.g"\r
-\r
- then the current file-name will become "grammar.g"\r
-\r
- In this way, the generated source code according to the grammar file\r
- will always be in the current directory, except when the -o switch\r
- is used.\r
-\r
-#175. (Changed in MR14) Bug when guess block appears at start of (...)*\r
-\r
- In 1.33 vanilla and all maintenance releases prior to 1.33MR14\r
- there is a bug when a guess block appears at the start of a (...)*.\r
- Consider the following k=1 (ck=1) grammar:\r
-\r
- rule :\r
- ( (STAR)? ZIP )* ID ;\r
-\r
- Prior to 1.33MR14, the generated code resembled:\r
-\r
- ...\r
- zzGUESS_BLOCK\r
- while ( 1 ) {\r
- if ( ! LA(1)==STAR) break;\r
- zzGUESS\r
- if ( !zzrv ) {\r
- zzmatch(STAR);\r
- zzCONSUME;\r
- zzGUESS_DONE\r
- zzmatch(ZIP);\r
- zzCONSUME;\r
- ...\r
-\r
- Note that the routine uses STAR for the prediction expression\r
- rather than ZIP. With 1.33MR14 the generated code resembles:\r
-\r
- ...\r
- while ( 1 ) {\r
- if ( ! LA(1)==ZIP) break;\r
- ...\r
-\r
- This problem existed only with (...)* blocks and was caused\r
- by the slightly more complicated graph which represents (...)*\r
- blocks. This caused the analysis routine to compute the first\r
- set for the alpha part of the "(alpha)? beta" rather than the\r
- beta part.\r
-\r
- Both (...)+ and {...} blocks handled the guess block correctly.\r
-\r
- Reported by Arpad Beszedes (beszedes inf.u-szeged.hu) who provided\r
- a very clear example of the problem and identified the probable cause.\r
-\r
-#174. (Changed in MR14) Bug when action precedes syntactic predicate\r
-\r
- In 1.33 vanilla, and all maintenance releases prior to MR14,\r
- there was a bug when a syntactic predicate was immediately\r
- preceded by an action. Consider the following -ck 2 grammar:\r
-\r
- rule :\r
- <<int i;>>\r
- (alpha)? beta C\r
- | A B\r
- ;\r
-\r
- alpha : A ;\r
- beta : A B;\r
-\r
- Prior to MR14, the code generated for the first alternative\r
- resembled:\r
-\r
- ...\r
- zzGUESS\r
- if ( !zzrv && LA(1)==A && LA(2)==A) {\r
- alpha();\r
- zzGUESS_DONE\r
- beta();\r
- zzmatch(C);\r
- zzCONSUME;\r
- } else {\r
- ...\r
-\r
- The prediction expression (i.e. LA(1)==A && LA(2)==A) is clearly\r
- wrong because LA(2) should be matched to B (first[2] of beta is {B}).\r
-\r
- With 1.33MR14 the prediction expression is:\r
-\r
- ...\r
- if ( !zzrv && LA(1)==A && LA(2)==B) {\r
- alpha();\r
- zzGUESS_DONE\r
- beta();\r
- zzmatch(C);\r
- zzCONSUME;\r
- } else {\r
- ...\r
-\r
- This only affects grammars in which alpha is shorter than\r
- max(k,ck) and an action immediately precedes the syntactic\r
- predicate.\r
-\r
- This problem was reported by Arpad Beszedes\r
- (beszedes inf.u-szeged.hu) who provided a very clear example\r
- of the problem and identified the presence of the init-action\r
- as the likely culprit.\r
-\r
-#173. (Changed in MR13a) -glms for Microsoft style filenames with -gl\r
-\r
- With the -gl option antlr generates #line directives using the\r
- exact name of the input files specified on the command line.\r
- An oddity of the Microsoft C and C++ compilers is that they\r
- don't accept file names in #line directives containing "\"\r
- even though these are names from the native file system.\r
-\r
- With -glms option, the "\" in file names appearing in #line\r
- directives is replaced with a "/" in order to conform to\r
- Microsoft compiler requirements.\r
-\r
- Reported by Erwin Achermann (erwin.achermann switzerland.org).\r
-\r
-#172. (Changed in MR13) \r\n in antlr source counted as one line\r
-\r
- Some MS software uses \r\n to indicate a new line. Antlr\r
- now recognizes this in counting lines.\r
-\r
- Reported by Edward L. Hepler (elh ece.vill.edu).\r
-\r
-#171. (Changed in MR13) #tokclass L..U now allowed\r
-\r
- The following is now allowed:\r
-\r
- #tokclass ABC { A..B C }\r
-\r
- Reported by Dave Watola (dwatola amtsun.jpl.nasa.gov)\r
-\r
-#170. (Changed in MR13) Suppression for predicates with lookahead depth >1\r
-\r
- In MR12 the capability for suppression of predicates with lookahead\r
- depth=1 was introduced. With MR13 this has been extended to\r
- predicates with lookahead depth > 1 and released for use by users\r
- on an experimental basis.\r
-\r
- Consider the following grammar with -ck 2 and the predicate in rule\r
- "a" with depth 2:\r
-\r
- r1 : (ab)* "@"\r
- ;\r
-\r
- ab : a\r
- | b\r
- ;\r
-\r
- a : (A B)? => <<p(LATEXT(2))>>? A B C\r
- ;\r
-\r
- b : A B C\r
- ;\r
-\r
- Normally, the predicate would be hoisted into rule r1 in order to\r
- determine whether to call rule "ab". However it should *not* be\r
- hoisted because, even if p is false, there is a valid alternative\r
- in rule b. With "-mrhoistk on" the predicate will be suppressed.\r
-\r
- If "-info p" command line option is present the following information\r
- will appear in the generated code:\r
-\r
- while ( (LA(1)==A)\r
- #if 0\r
-\r
- Part (or all) of predicate with depth > 1 suppressed by alternative\r
- without predicate\r
-\r
- pred << p(LATEXT(2))>>?\r
- depth=k=2 ("=>" guard) rule a line 8 t1.g\r
- tree context:\r
- (root = A\r
- B\r
- )\r
-\r
- The token sequence which is suppressed: ( A B )\r
- The sequence of references which generate that sequence of tokens:\r
-\r
- 1 to ab r1/1 line 1 t1.g\r
- 2 ab ab/1 line 4 t1.g\r
- 3 to b ab/2 line 5 t1.g\r
- 4 b b/1 line 11 t1.g\r
- 5 #token A b/1 line 11 t1.g\r
- 6 #token B b/1 line 11 t1.g\r
-\r
- #endif\r
-\r
- A slightly more complicated example:\r
-\r
- r1 : (ab)* "@"\r
- ;\r
-\r
- ab : a\r
- | b\r
- ;\r
-\r
- a : (A B)? => <<p(LATEXT(2))>>? (A B | D E)\r
- ;\r
-\r
- b : <<q(LATEXT(2))>>? D E\r
- ;\r
-\r
-\r
- In this case, the sequence (D E) in rule "a" which lies behind\r
- the guard is used to suppress the predicate with context (D E)\r
- in rule b.\r
-\r
- while ( (LA(1)==A || LA(1)==D)\r
- #if 0\r
-\r
- Part (or all) of predicate with depth > 1 suppressed by alternative\r
- without predicate\r
-\r
- pred << q(LATEXT(2))>>?\r
- depth=k=2 rule b line 11 t2.g\r
- tree context:\r
- (root = D\r
- E\r
- )\r
-\r
- The token sequence which is suppressed: ( D E )\r
- The sequence of references which generate that sequence of tokens:\r
-\r
- 1 to ab r1/1 line 1 t2.g\r
- 2 ab ab/1 line 4 t2.g\r
- 3 to a ab/1 line 4 t2.g\r
- 4 a a/1 line 8 t2.g\r
- 5 #token D a/1 line 8 t2.g\r
- 6 #token E a/1 line 8 t2.g\r
-\r
- #endif\r
- &&\r
- #if 0\r
-\r
- pred << p(LATEXT(2))>>?\r
- depth=k=2 ("=>" guard) rule a line 8 t2.g\r
- tree context:\r
- (root = A\r
- B\r
- )\r
-\r
- #endif\r
-\r
- (! ( LA(1)==A && LA(2)==B ) || p(LATEXT(2)) ) {\r
- ab();\r
- ...\r
-\r
-#169. (Changed in MR13) Predicate test optimization for depth=1 predicates\r
-\r
- When MR12 generated a test of a predicate which had depth 1\r
- it would use the depth >1 routines, resulting in correct but\r
- inefficient behavior. In MR13, a bit test is used.\r
-\r
-#168. (Changed in MR13) Token expressions in context guards\r
-\r
- The token expressions appearing in context guards such as:\r
-\r
- (A B)? => <<test(LT(1))>>? someRule\r
-\r
- are computed during an early phase of antlr processing. As\r
- a result, prior to MR13, complex expressions such as:\r
-\r
- ~B\r
- L..U\r
- ~L..U\r
- TokClassName\r
- ~TokClassName\r
-\r
- were not computed properly. This resulted in incorrect\r
- context being computed for such expressions.\r
-\r
- In MR13 these context guards are verified for proper semantics\r
- in the initial phase and then re-evaluated after complex token\r
- expressions have been computed in order to produce the correct\r
- behavior.\r
-\r
- Reported by Arpad Beszedes (beszedes inf.u-szeged.hu).\r
-\r
-#167. (Changed in MR13) ~L..U\r
-\r
- Prior to MR13, the complement of a token range was\r
- not properly computed.\r
-\r
-#166. (Changed in MR13) token expression L..U\r
-\r
- The token U was represented as an unsigned char, restricting\r
- the use of L..U to cases where U was assigned a token number\r
- less than 256. This is corrected in MR13.\r
-\r
-#165. (Changed in MR13) option -newAST\r
-\r
- To create ASTs from an ANTLRTokenPtr antlr usually calls\r
- "new AST(ANTLRTokenPtr)". This option generates a call\r
- to "newAST(ANTLRTokenPtr)" instead. This allows a user\r
- to define a parser member function to create an AST object.\r
-\r
- Similar changes for ASTBase::tmake and ASTBase::link were not\r
- thought necessary since they do not create AST objects, only\r
- use existing ones.\r
-\r
-#164. (Changed in MR13) Unused variable _astp\r
-\r
- For many compilations, we have lived with warnings about\r
- the unused variable _astp. It turns out that this variable\r
- can *never* be used because the code which references it was\r
- commented out.\r
-\r
- This investigation was sparked by a note from Erwin Achermann\r
- (erwin.achermann switzerland.org).\r
-\r
-#163. (Changed in MR13) Incorrect makefiles for testcpp examples\r
-\r
- All the examples in pccts/testcpp/* had incorrect definitions\r
- in the makefiles for the symbol "CCC". Instead of CCC=CC they\r
- had CC=$(CCC).\r
-\r
- There was an additional problem in testcpp/1/test.g due to the\r
- change in ANTLRToken::getText() to a const member function\r
- (Item #137).\r
-\r
- Reported by Maurice Maas (maas cuci.nl).\r
-\r
-#162. (Changed in MR13) Combining #token with #tokdefs\r
-\r
- When it became possible to change the print-name of a\r
- #token (Item #148) it became useful to give a #token\r
- statement whose only purpose was to give a print name\r
- to the #token. Prior to this change this could not be\r
- combined with the #tokdefs feature.\r
-\r
-#161. (Changed in MR13) Switch -gxt inhibits generation of tokens.h\r
-\r
-#160. (Changed in MR13) Omissions in list of names for remap.h\r
-\r
- When a user selects the -gp option antlr creates a list\r
- of macros in remap.h to rename some of the standard\r
- antlr routines from zzXXX to userprefixXXX.\r
-\r
- There were a number of omissions from the remap.h name\r
- list related to the new trace facility. This was reported,\r
- along with a fix, by Bernie Solomon (bernard ug.eds.com).\r
-\r
-#159. (Changed in MR13) Violations of classic C rules\r
-\r
- There were a number of violations of classic C style in\r
- the distribution kit. This was reported, along with fixes,\r
- by Bernie Solomon (bernard ug.eds.com).\r
-\r
-#158. (Changed in MR13) #header causes problem for pre-processors\r
-\r
- A user who runs the C pre-processor on antlr source suggested\r
- that another syntax be allowed. With MR13 directives\r
- such as #header, #pragma, etc. may be written as "\#header",\r
- "\#pragma", etc. For escaping pre-processor directives inside\r
- a #header use something like the following:\r
-\r
- \#header\r
- <<\r
- \#include <stdio.h>\r
- >>\r
-\r
-#157. (Fixed in MR13) empty error sets for rules with infinite recursion\r
-\r
- When the first set for a rule cannot be computed due to infinite\r
- left recursion and it is the only alternative for a block then\r
- the error set for the block would be empty. This would result\r
- in a fatal error.\r
-\r
- Reported by Darin Creason (creason genedax.com)\r
-\r
-#156. (Changed in MR13) DLGLexerBase::getToken() now public\r
-\r
-#155. (Changed in MR13) Context behind predicates can suppress\r
-\r
- With -mrhoist enabled the context behind a guarded predicate can\r
- be used to suppress other predicates. Consider the following grammar:\r
-\r
- r0 : (r1)+;\r
-\r
- r1 : rp\r
- | rq\r
- ;\r
- rp : <<p LATEXT(1)>>? B ;\r
- rq : (A)? => <<q LATEXT(1)>>? (A|B);\r
-\r
- In earlier versions both predicates "p" and "q" would be hoisted into\r
- rule r0. With MR12c predicate p is suppressed because the context which\r
- follows predicate q includes "B" which can "cover" predicate "p". In\r
- other words, in trying to decide in r0 whether to call r1, it doesn't\r
- really matter whether p is false or true because, either way, there is\r
- a valid choice within r1.\r
-\r
-#154. (Changed in MR13) Making hoist suppression explicit using <<nohoist>>\r
-\r
- A common error, even among experienced pccts users, is to code\r
- an init-action to inhibit hoisting rather than a leading action.\r
- An init-action does not inhibit hoisting.\r
-\r
- This was coded:\r
-\r
- rule1 : <<;>> rule2\r
-\r
- This is what was meant:\r
-\r
- rule1 : <<;>> <<;>> rule2\r
-\r
- With MR13, the user can code:\r
-\r
- rule1 : <<;>> <<nohoist>> rule2\r
-\r
- The following will give an error message:\r
-\r
- rule1 : <<nohoist>> rule2\r
-\r
- If the <<nohoist>> appears as an init-action rather than a leading\r
- action an error message is issued. The meaning of an init-action\r
- containing "nohoist" is unclear: does it apply to just one\r
- alternative or to all alternatives ?\r
-\r
-\r
-\r
-\r
-\r
-\r
-\r
-\r
- -------------------------------------------------------\r
- Note: Items #153 to #1 are now in a separate file named\r
- CHANGES_FROM_133_BEFORE_MR13.txt\r
- -------------------------------------------------------\r