+++ /dev/null
-\r
- =======================================================\r
- Known Problems In PCCTS - Last revised 14 November 1998\r
- =======================================================\r
-\r
-#17. The dlg fix for handling characters up to 255 is incorrect.\r
-\r
- See item #207.\r
-\r
- Reported by Frank Hartmann.\r
- \r
-#16. A note about "&&" predicates (Mike Dimmick)\r
-\r
- Mike Dimmick has pointed out a potential pitfall in the use of the\r
- "&&" style predicate. Consider:\r
- \r
- r0: (g)? => <<P>>? r1\r
- | ...\r
- ;\r
- r1: A | B;\r
- \r
- If the context guard g is not a subset of the lookahead context for r1\r
- (in other words g is neither A nor B) then the code may execute r1 \r
- even when the lookahead context is not satisfied. This is an error\r
- by the person coding the grammar, and the error should be reported to\r
- the user, but it isn't. Some examples I've run seem to\r
- indicate that such an error actually results in the rule becoming\r
- unreachable.\r
- \r
- When g is properly coded the generated code is correct; the problem\r
- arises only when g is not properly coded.\r
- \r
- A second problem reported by Mike Dimmick is that the test for a\r
- failed validation predicate is equivalent to a test on the predicate\r
- alone. In other words, if the "&&" has not been hoisted then it may\r
- falsely report a validation error.\r
-\r
-#15. (Changed in MR23) Warning for LT(i), LATEXT(i) in token match actions\r
-\r
- A bug (or at least an oddity) is that a reference to LT(1), LA(1),\r
- or LATEXT(1) in an action which immediately follows a token match\r
- in a rule refers to the token matched, not the token which is in\r
- the lookahead buffer. Consider:\r
-\r
- r : abc <<action alpha>> D <<action beta>> E;\r
-\r
- In this case LT(1) in action alpha will refer to the next token in\r
- the lookahead buffer ("D"), but LT(1) in action beta will refer to\r
- the token matched by D - the preceding token.\r
-\r
- A warning is now issued when an action following a token match\r
- contains a reference to LT(1), LA(1), or LATEXT(1).\r
-\r
- This behavior should be changed, but it appears in too many programs\r
- now. Another problem, perhaps more significant, is that the obvious\r
- fix (moving the consume() call to before the action) could change the \r
- order in which input is requested and output appears in existing programs.\r
-\r
- This problem was reported, along with a fix, by Benjamin Mandel\r
- (beny@sd.co.il). However, I felt that changing the behavior was too\r
- dangerous for existing code.\r
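-\r
- The ordering issue can be illustrated with a small stand-alone C model.\r
- This is only a sketch: the Stream type and the functions lt1(),\r
- consume(), action_then_consume() and consume_then_action() are\r
- hypothetical names invented for this note and are not part of the\r
- PCCTS runtime.\r
-\r
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token stream; not the data structures PCCTS really uses. */
typedef struct {
    const char **tokens;   /* token names in input order */
    size_t count;          /* number of tokens */
    size_t pos;            /* index of the token LT(1) currently denotes */
} Stream;

/* LT(1): the token at the current position, or "EOF" past the end. */
static const char *lt1(const Stream *s) {
    return s->pos < s->count ? s->tokens[s->pos] : "EOF";
}

/* Advance the stream by one token. */
static void consume(Stream *s) {
    if (s->pos < s->count) s->pos++;
}

/* Current order: match the token, run the action, then consume.
   Returns what the action observes as LT(1): the token just matched. */
const char *action_then_consume(Stream *s, const char *expected) {
    const char *seen;
    assert(strcmp(lt1(s), expected) == 0);  /* the token match */
    seen = lt1(s);                           /* the action runs here */
    consume(s);
    return seen;
}

/* The "obvious fix": match, consume, then run the action.
   Returns what the action observes as LT(1): the following token. */
const char *consume_then_action(Stream *s, const char *expected) {
    assert(strcmp(lt1(s), expected) == 0);  /* the token match */
    consume(s);
    return lt1(s);                           /* the action runs here */
}
```
-\r
- For the input D E, the first function reports D to the action and the\r
- second reports E. In the second ordering the next token is requested\r
- before the action runs, which is the change in input ordering\r
- mentioned above.\r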
-\r
-#14. Parsing bug in dlg\r
-\r
- THM: I have been unable to reproduce this problem.\r
-\r
- Reported by Rick Howard, Mijenix Corporation (rickh@mijenix.com).\r
-\r
- The regular expression parser (in rexpr.c) fails while\r
- trying to parse the following regular expression:\r
-\r
- {[a-zA-Z]:}(\\\\[a-zA-Z0-9]*)+\r
-\r
- See my comment in the following excerpt from rexpr.c:\r
-\r
- /*\r
- * <regExpr> ::= <andExpr> ( '|' {<andExpr>} )*\r
- *\r
- * Return -1 if syntax error\r
- * Return 0 if none found\r
- * Return 1 if a regExpr was found\r
- */\r
- static\r
- regExpr(g)\r
- GraphPtr g;\r
- {\r
- Graph g1, g2;\r
- \r
- if ( andExpr(&g1) == -1 )\r
- {\r
- return -1;\r
- }\r
- \r
- while ( token == '|' )\r
- {\r
- int a;\r
- next();\r
- a = andExpr(&g2);\r
- if ( a == -1 ) return -1; /* syntax error below */\r
- else if ( !a ) return 1; /* empty alternative */\r
- g1 = BuildNFA_AorB(g1, g2);\r
- }\r
- \r
- if ( token!='\0' ) return -1;\r
- *****\r
- ***** It appears to fail here because token is 125 - the closing '}'\r
- ***** If I change it to:\r
- ***** if ( token!='\0' && token!='}' && token!= ')' ) return -1;\r
- *****\r
- ***** It succeeds, but I'm not sure this is the correct approach.\r
- *****\r
- *g = g1;\r
- return 1;\r
- }\r
-\r
-#13. dlg reports an invalid range for: [\0x00-\0xff]\r
-\r
- Diagnosed by Piotr Eljasiak (eljasiak@no-spam.zt.gdansk.tpsa.pl):\r
-\r
- Fixed in MR16.\r
-\r
-#12. Strings containing comment actions\r
-\r
- Sequences that look like C-style comments and appear in string\r
- literals are improperly parsed by antlr/dlg.\r
-\r
- << fprintf(out," /* obsolete */ ");\r
-\r
- For this case use:\r
-\r
- << fprintf(out," \/\* obsolete \*\/ ");\r
-\r
- Reported by K.J. Cummings (cummings@peritus.com).\r
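-\r
- Another possibility, offered only as an untested assumption as far as\r
- antlr/dlg is concerned, is to rely on the C rule that adjacent string\r
- literals are concatenated, so that no single literal contains a\r
- comment delimiter. The helper names obsolete_marker() and\r
- write_marker() below are hypothetical.\r
-\r
```c
#include <assert.h>
#include <stdio.h>

/* Adjacent string literals are joined by the C compiler, so the slash
   and the star never sit next to each other inside one literal in the
   source text. */
const char *obsolete_marker(void) {
    return " /" "* obsolete *" "/ ";
}

/* Writes the marker to out; returns the value returned by fprintf. */
int write_marker(FILE *out) {
    assert(out != NULL);
    return fprintf(out, "%s", obsolete_marker());
}
```
-\r
- The resulting string is identical to the one in the original\r
- fprintf call; only its spelling in the source differs.\r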
-\r
-#11. User hook for deallocation of variables on guess fail\r
-\r
- The mechanism outlined in Item #108 works only for\r
- heap allocated variables.\r
-\r
-#10. Label re-initialization in ( X {y:Y} )*\r
-\r
- If a label assignment is optional and appears in a\r
- (...)* or (...)+ block it will not be reset to NULL\r
- when it is skipped by a subsequent iteration.\r
-\r
- Consider the example:\r
-\r
- ( X { y:Y })* Z\r
-\r
- with input:\r
-\r
- X Y X Z\r
-\r
- The first time through the block Y will be matched and\r
- y will be set to point to the token. On the second\r
- iteration of the (...)* block there is no match for Y.\r
- But y will not be reset to NULL, as the user might\r
- expect; it will still contain a reference to the Y that was\r
- matched in the first iteration.\r
-\r
- The work-around is to manually reset y:\r
-\r
- ( X << y = NULL; >> { y:Y } )* Z\r
-\r
- or\r
-\r
- ( X ( y:Y | << y = NULL; >> /* epsilon */ ) )* Z\r
-\r
- Reported by Jeff Vincent (JVincent@novell.com).\r
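-\r
- The effect can be reproduced outside of antlr with a small C model of\r
- the loop. This is a sketch under stated assumptions: run_loop() and\r
- its reset_each_pass flag are hypothetical and only imitate the\r
- control flow of ( X { y:Y } )*; they are not generated parser code.\r
-\r
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Imitates ( X { y:Y } )* over an array of token names and returns the
   value of label y when the loop exits. When reset_each_pass is
   non-zero, y is cleared on every iteration, like << y = NULL; >>. */
const char *run_loop(const char **toks, size_t n, int reset_each_pass) {
    const char *y = NULL;
    size_t i = 0;

    assert(toks != NULL);
    while (i < n && strcmp(toks[i], "X") == 0) {
        i++;                              /* match X */
        if (reset_each_pass) {
            y = NULL;                     /* the manual work-around */
        }
        if (i < n && strcmp(toks[i], "Y") == 0) {
            y = toks[i];                  /* y:Y, the optional element */
            i++;
        }
    }
    return y;
}
```
-\r
- With the input X Y X Z and reset_each_pass set to 0, the function\r
- returns the Y from the first iteration; with reset_each_pass set to 1\r
- it returns NULL.\r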
-\r
-#9. PCCTSAST.h PCCTSAST::setType() is a no-op\r
-\r
-#8. #tokdefs with ~Token and .\r
-\r
- THM: I have been unable to reproduce this problem.\r
-\r
- When antlr uses #tokdefs to define tokens the fields of\r
- #errclass and #tokclass do not get properly defined.\r
- When it subsequently attempts to take the complement of\r
- the set of tokens (using ~Token or .) it can refer to\r
- tokens which don't have names, generating a fatal error.\r
-\r
-#7. DLG crashes on some invalid inputs\r
-\r
- THM: In MR20 I have fixed the most common cases.\r
-\r
- The following token definition will cause DLG to crash.\r
-\r
- #token "()"\r
-\r
- Reported by Mengue Olivier (dolmen@bigfoot.com).\r
-\r
-#6. On MS systems \n\r is treated as two new lines\r
-\r
- Fixed.\r
-\r
-#5. Token expressions in #tokclass\r
-\r
- #errclass does not support TOK1..TOK2 or ~TOK syntax.\r
- #tokclass does not support ~TOKEN syntax.\r
-\r
- A workaround for #errclass TOK1..TOK2 is to use a\r
- #tokclass.\r
-\r
- Reported by Dave Watola (dwatola@amtsun.jpl.nasa.gov).\r
-\r
-#4. A #tokdef must appear "early" in the grammar file.\r
-\r
- The "early" section of the grammar file is the only\r
- place where the following directives may appear:\r
-\r
- #header\r
- #first\r
- #tokdefs\r
- #parser\r
-\r
- Any other kind of statement signifies the end of the\r
- "early" section.\r
-\r
-#3. Use of PURIFY macro for C++ mode\r
-\r
- Item #93 of the CHANGES_FROM_1.33 describes the use of\r
- the PURIFY macro to zero arguments to be passed by\r
- upward inheritance.\r
-\r
- #define PURIFY(r, s) memset((char *) &(r), '\0', (s));\r
-\r
- This may not be the right thing to do for C++ objects that\r
- have constructors. Reported by Bonny Rais (bonny@werple.net.au).\r
-\r
- For those cases one should #define PURIFY to be an empty macro\r
- in the #header or #first actions.\r
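-\r
- A plain C analogy of the problem, given as a sketch only: zeroing an\r
- object after it has been initialized destroys whatever state the\r
- initializer established, which is what happens to a constructed C++\r
- object. The names Buffer, make_buffer(), PURIFY_ZERO and PURIFY_NONE\r
- are hypothetical; the two macros stand in for the two possible\r
- definitions of PURIFY so that both can be shown in one file.\r
-\r
```c
#include <assert.h>
#include <string.h>

/* The memset-based definition quoted above, without the trailing
   semicolon so that it can be used as an ordinary statement. */
#define PURIFY_ZERO(r, s) memset((char *) &(r), '\0', (s))

/* An empty replacement: the argument is left untouched. */
#define PURIFY_NONE(r, s) ((void) 0)

/* Hypothetical object whose initializer sets a non-zero field. */
typedef struct {
    int capacity;
} Buffer;

/* Plays the role of a C++ constructor. */
Buffer make_buffer(void) {
    Buffer b;
    b.capacity = 16;
    return b;
}

/* Initializes a Buffer, zeroes it, and reports the surviving capacity. */
int capacity_after_zeroing(void) {
    Buffer b = make_buffer();
    assert(b.capacity > 0);
    PURIFY_ZERO(b, sizeof b);
    return b.capacity;
}

/* Same sequence with the empty macro; the initialized state survives. */
int capacity_after_noop(void) {
    Buffer b = make_buffer();
    assert(b.capacity > 0);
    PURIFY_NONE(b, sizeof b);
    return b.capacity;
}
```
-\r
- The first function returns 0 because the initialized field has been\r
- wiped; the second returns the initialized value of 16.\r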
-\r
-#2. Fixed in 1.33MR10 - See CHANGES_FROM_1.33 Item #80.\r
-\r
-#1. The quality of support for systems with 8.3 file names leaves\r
- much to be desired. Since the kit is distributed using long file\r
- names and the makefile also uses long file names, building it on\r
- such systems requires some effort. This will probably not be changed due\r
- to the large number of systems already written using the long\r
- file names.\r