CHANGES FROM 1.31

This file contains the migration of PCCTS from 1.31 in the order that
changes were made. 1.32b7 is the last beta before full 1.32.
Terence Parr, Parr Research Corporation 1995.


======================================================================
1.32b1
Added Russell Quong to banner, changed banner for output slightly
Fixed it so that you have before / after actions for C++ in class def
Fixed bug in optimizer that made it sometimes forget to set internal
        token pointers. Only showed up when a {...} was in the "wrong spot".

======================================================================
1.32b2
Added fixes by Dave Seidel for PC compilers in 32 bit mode (config.h
and set.h).

======================================================================
1.32b3
Fixed hideous bug in code generator for wildcard and for ~token op.

from Dave Seidel

   Added pcnames.bat
   1. in antlr/main.c: change strcasecmp() to stricmp()

   2. in dlg/output.c: use DLEXER_C instead of "DLexer.C"

   3. in h/PBlackBox.h: use <iostream.h> instead of <stream.h>

======================================================================
1.32b4
When the -ft option was used, any path prefix screwed up
the gate on the .h files

Fixed yet another bug due to the optimizer.

The exception handling thing was a bit wacko:

a : ( A B )? A B
  | A C
  ;
  exception ...

caused an exception if "A C" was the input. In other words,
it found that A C didn't match the (A B)? pred and caused
an exception rather than trying the next alt. All I did
was to change the zzmatch_wsig() macros.
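
A toy recognizer makes the intended behaviour concrete. This is only a
sketch under stated assumptions: tokens are single characters, and
guessAB() and ruleA() are made-up names, not anything ANTLR generates.
A failed (A B)? guess merely rules out the first alternative; a mismatch
is reported only when no alternative fits.

```cpp
#include <cassert>
#include <string>

// Toy recognizer for:  a : ( A B )? A B | A C ;
// Tokens are single characters ('A', 'B', 'C'). Illustrative stand-in
// only, not ANTLR-generated code.

// Syntactic predicate: look ahead without consuming and report whether
// the input starts with "A B".
static bool guessAB(const std::string &in) {
    return in.size() >= 2 && in[0] == 'A' && in[1] == 'B';
}

// Returns 1 or 2 for the alternative that matched, 0 for a mismatch.
// A failed guess does not signal anything by itself; the next
// alternative is tried first.
int ruleA(const std::string &in) {
    if (guessAB(in)) return 1;                                     // ( A B )? A B
    if (in.size() >= 2 && in[0] == 'A' && in[1] == 'C') return 2;  // A C
    return 0;                                                      // nothing matched
}
```

With this shape, the input "A C" selects the second alternative instead
of raising an exception.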

Fixed some problems in gen.c relating to the name of token
class bit sets in the output.

Added the tremendously cool generalized predicate. For the
moment, I'll give this brief description.

a : <<predicate>>? blah
  | foo
  ;

This implies that (assuming blah and foo are syntactically
ambiguous) "predicate" indicates the semantic validity of
applying "blah". If "predicate" is false, "foo" is attempted.

Previously, you had to say:

a : <<LA(1)==ID ? predicate : 1>>? ID
  | ID
  ;

Now, you can simply use "predicate" without the ?: operator
if you turn on ANTLR command line option: "-prc on". This
tells ANTLR to compute that all by itself. It computes n
tokens of lookahead where LT(n) or LATEXT(n) is the farthest
ahead you look.

If you give a predicate using "-prc on" that is followed
by a construct that can recognize more than one n-sequence,
you will get a warning from ANTLR. For example,

a : <<isTypeName(LT(1)->getText())>>? (ID|INT)
  ;

This is wrong because the predicate will be applied to INTs
as well as IDs. You should use this syntax to make
the predicate more specific:

a : (ID)? => <<isTypeName(LT(1)->getText())>>? (ID|INT)
  ;

which says "don't apply the predicate unless ID is the
current lookahead context".
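
The difference between the unguarded and the context-guarded predicate
can be sketched in plain C++. Everything here is hypothetical (the Token
struct, the token kinds, and this isTypeName() stand-in); it only mirrors
the logic described above, not the code ANTLR emits.

```cpp
#include <cassert>
#include <string>

enum TokenType { ID, INT };                    // hypothetical token kinds
struct Token { TokenType type; std::string text; };

// Stand-in for a user-supplied semantic predicate.
bool isTypeName(const std::string &s) { return s == "size_t"; }

// Unguarded: the predicate runs for INT as well as ID, which is the
// situation ANTLR warns about.
bool unguarded(const Token &la1) { return isTypeName(la1.text); }

// Guarded, in the spirit of "(ID)? => <<pred>>?": the predicate is
// consulted only when the lookahead matches the stated context.
bool guarded(const Token &la1) {
    if (la1.type != ID) return true;   // outside the context: not applied
    return isTypeName(la1.text);
}
```

For an INT token the unguarded form wrongly rejects the alternative,
while the guarded form lets it through; for ID tokens both agree.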

You cannot currently have anything in the "(context)? =>"
except sequences such as:

( LPAREN ID | LPAREN SCOPE )? => <<pred>>?

I haven't tested this THAT much, but it does work for the
C++ grammar.

======================================================================
1.32b5

Added getLine() to the ANTLRTokenBase and DLGBasedToken classes;
left line() in place for backward compatibility.
----
Removed SORCERER_TRANSFORM from the ast.h stuff.
-------
Fixed bug in code gen of ANTLR such that nested syn preds work more
efficiently now. The ANTLRTokenBuffer was getting very large
with nested predicates.
------
Memory leak is now gone from ANTLRTokenBuf; all tokens are deleted.
For backward compatibility reasons, you have to say parser->deleteTokens()
or mytokenbuffer->deleteTokens() but later it will be the default mode.
Say this after the parser is constructed. E.g.,

    ParserBlackBox<DLGLexer, MyParser, ANTLRToken> p(stdin);
    p.parser()->deleteTokens();
    p.parser()->start_symbol();


==============================
1.32b6

Changed deleteTokens() so that it deletes each token through an
(ANTLRTokenBase *) cast of the ptr. This invokes the virtual destructor.
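
Why the virtual destructor matters can be shown with a minimal sketch.
TokenBase and MyToken below are hypothetical stand-ins for
ANTLRTokenBase and a user token class; the counter exists only so the
effect can be observed.

```cpp
#include <cassert>

static int derivedDestroyed = 0;   // counts runs of the derived destructor

// Hypothetical stand-in for the token base class.
struct TokenBase {
    virtual ~TokenBase() {}        // virtual, so delete via base pointer is safe
};

// Hypothetical user token with its own cleanup.
struct MyToken : TokenBase {
    ~MyToken() override { ++derivedDestroyed; }
};

// The buffer only knows the base type; because the destructor is
// virtual, the derived destructor still runs.
void deleteToken(TokenBase *p) {
    delete p;
}
```

Without the virtual keyword on the base destructor, deleting a MyToken
through a TokenBase pointer would be undefined behaviour in C++.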

Fixed some weird things in the C++ header files (a few return types).

Made the AST routines correspond to the book and SORCERER stuff.

New token stuff: See testcpp/14/test.g

ANTLR accepts a #pragma gc_tokens which says
[1] Generate label = copy(LT(1)) instead of label=LT(1) for
    all labeled token references.
[2] User now has to define ANTLRTokenPtr (as a class or a typedef
    to just a pointer) as well as the ANTLRToken class itself.
    See the example.

To delete tokens in token buffer, use deleteTokens() message on parser.

        All tokens that fall off the ANTLRTokenBuffer get deleted
        which is what currently happens when deleteTokens() message
        has been sent to token buffer.

We always generate ANTLRTokenPtr instead of 'ANTLRToken *' now.
Then if no pragma set, ANTLR generates a

        class ANTLRToken;
        typedef ANTLRToken *ANTLRTokenPtr;

in each file.

Added a warning for x:rule_ref <<$x>>; there is still no warning for
$i's, however. For example,

class BB {

a : x:b y:A <<$x
$y>>
  ;

b : B;

}

generates

Antlr parser generator Version 1.32b6 1989-1995
test.g, line 3: error: There are no token ptrs for rule references: '$x'

===================
1.32b7:

[With respect to token object garbage collection (GC), 1.32b7
 backtracks from 1.32b6, but results in better and less intrusive GC.
 This is the last beta version before full 1.32.]

BIGGEST CHANGES:

o   The "#pragma gc_tokens" is no longer used.

o   .C files are now .cpp files (hence, makefiles will have to
    be changed; or you can rerun genmk). This is a good move,
    but causes some backward incompatibility problems. You can
    avoid this by changing CPP_FILE_SUFFIX to ".C" in pccts/h/config.h.

o   The token object class hierarchy has been flattened to include
    only three classes: ANTLRAbstractToken, ANTLRCommonToken, and
    ANTLRCommonNoRefCountToken. The common token now does garbage
    collection via ref counting.

o   "Smart" pointers are now used for garbage collection. That is,
    ANTLRTokenPtr is used instead of "ANTLRToken *".

o   The antlr.1 man page has been cleaned up slightly.

o   The SUN C++ compiler now complains less about C++ support code.

o   Grammars which subclass ANTLRCommonToken must wrap all token
    pointer references in mytoken(token_ptr). This is the only
    serious backward incompatibility. See below.


MINOR CHANGES:

--------------------------------------------------------
1 deleteTokens()

The deleteTokens() message to the parser or token buffer has been
replaced by one of:

    void noGarbageCollectTokens()   { inputTokens->noGarbageCollectTokens(); }
    void garbageCollectTokens()     { inputTokens->garbageCollectTokens(); }

The token buffer deletes all non-referenced tokens by default now.

--------------------------------------------------------
2 makeToken()

The makeToken() message returns a new type. The function should look
like:

    virtual ANTLRAbstractToken *makeToken(ANTLRTokenType tt,
                                          ANTLRChar *txt,
                                          int line)
    {
        ANTLRAbstractToken *t = new ANTLRCommonToken(tt,txt);
        t->setLine(line);
        return t;
    }

--------------------------------------------------------
3 TokenType

Renamed TokenType to ANTLRTokenType (this often forces changes in AST
definitions, because the #[] constructor calls AST(tokentype, string)).

--------------------------------------------------------
4 AST()

You must define AST(ANTLRTokenPtr t) now in your AST class definition.
You might also have to include ATokPtr.h above the definition; e.g.,
if AST is defined in a separate file, such as AST.h, it's a good idea
to include ATOKPTR_H (ATokPtr.h). For example,

    #include ATOKPTR_H
    class AST : public ASTBase {
    protected:
        ANTLRTokenPtr token;
    public:
        AST(ANTLRTokenPtr t) { token = t; }
        void preorder_action() {
            char *s = token->getText();
            printf(" %s", s);
        }
    };

Note the use of smart pointers rather than "ANTLRToken *".

--------------------------------------------------------
5 SUN C++

From Bob Bailey <robertb@oakhill.sps.mot.com>. Changed ANTLR C++ output
to avoid an error in Sun C++ 3.0.1. The structs created to hold
multiple return values are now public.

--------------------------------------------------------
6 genmk

Fixed genmk so that target List.* is not included anymore. It's
called SList.* anyway.

--------------------------------------------------------
7 \r vs \n

Scott Vorthmann <vorth@cmu.edu> fixed antlr.g in ANTLR so that \r
is allowed as the return character as well as \n.
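
As a rough illustration of accepting either return character, here is a
small line counter. It is not taken from antlr.g; countNewlines() is a
made-up helper, and treating the pair "\r\n" as a single terminator is
an assumption of this sketch.

```cpp
#include <cassert>
#include <string>

// Counts line terminators, accepting '\n', '\r', or the pair "\r\n".
// Collapsing "\r\n" into one terminator is an assumption of this sketch.
int countNewlines(const std::string &text) {
    int lines = 0;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] == '\r') {
            ++lines;
            // Swallow the '\n' of a "\r\n" pair so it is not counted twice.
            if (i + 1 < text.size() && text[i + 1] == '\n') ++i;
        } else if (text[i] == '\n') {
            ++lines;
        }
    }
    return lines;
}
```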

--------------------------------------------------------
8 Exceptions

Bug in exceptions attached to labeled token/tokclass references. Didn't gen
code for exceptions. This didn't work:

a : "help" x:ID
  ;
  exception[x]
  catch MismatchedToken : <<printf("eh?\n");>>

Now ANTLR generates (which is kinda big, but necessary):

    if ( !_match_wsig(ID) ) {
        if ( guessing ) goto fail;
        _signal=MismatchedToken;
        switch ( _signal ) {
        case MismatchedToken :
            printf("eh?\n");
            _signal = NoSignal;
            break;
        default :
            goto _handler;
        }
    }

which implies that you can recover and continue parsing after a missing/bad
token reference.

--------------------------------------------------------
9 genmk

genmk now correctly uses config file for CPP_FILE_SUFFIX stuff.

--------------------------------------------------------
10 general cleanup / PURIFY

Anthony Green <green@vizbiz.com> suggested a bunch of good general
clean up things for the code; he also suggested a few things to
help out the "PURIFY" memory allocation checker.

--------------------------------------------------------
11 $-variable references.

Manuel ORNATO indicated that a $-variable outside of a rule caused
ANTLR to crash. I fixed this.

12 Tom Moog suggestion

Fail action of semantic predicate needs "{}" envelope. FIXED.

13 references to LT(1).

I have enclosed all assignments such as:

    _t22 = (ANTLRTokenPtr)LT(1);

in "if ( !guessing )" so that during backtracking the reference count
for token objects is not increased.
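
The effect of the "if ( !guessing )" wrapper can be sketched with
std::shared_ptr standing in for ANTLRTokenPtr (a swapped-in technique,
not what PCCTS uses). Token, TokenPtr and matchStep() are hypothetical
names for this illustration.

```cpp
#include <cassert>
#include <memory>

struct Token { int type; };                 // hypothetical token record
typedef std::shared_ptr<Token> TokenPtr;    // stand-in for ANTLRTokenPtr

// Sketch of a generated match step. While guessing (backtracking) the
// label is left untouched, so no extra reference to the token is taken.
void matchStep(const TokenPtr &lt1, bool guessing, TokenPtr &label) {
    if (!guessing) {
        label = lt1;    // copying the smart pointer bumps the reference count
    }
}
```

Only the non-guessing call adds a reference, so tokens seen during a
guess are not kept alive by labels.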


TOKEN OBJECT GARBAGE COLLECTION

1 INTRODUCTION

The class ANTLRCommonToken is now garbage collected through a "smart"
pointer called ANTLRTokenPtr using reference counting. Any token
object not referenced by your grammar actions is destroyed by the
ANTLRTokenBuffer when it must make room for more token objects.
Referenced tokens are then destroyed in your parser when local
ANTLRTokenPtr objects are deleted. For example,

a : label:ID ;

would be converted to something like:

void yourclass::a(void)
{
    zzRULE;
    ANTLRTokenPtr label=NULL;   // used to be ANTLRToken *label;
    zzmatch(ID);
    label = (ANTLRTokenPtr)LT(1);
    consume();
    ...
}

When the "label" object is destroyed (it's just a pointer to your
input token object LT(1)), it decrements the reference count on the
object created for the ID. If the count goes to zero, the object
pointed to by label is deleted.

To correctly manage the garbage collection, you should use
ANTLRTokenPtr instead of "ANTLRToken *". Most ANTLR support code
(visible to the user) has been modified to use the smart pointers.
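
The counting scheme described above can be sketched as an intrusive
reference-counted pointer. This is a minimal illustration, not the
PCCTS source: the Token and TokenPtr classes and the liveTokens counter
are hypothetical, and the real ANTLRTokenPtr may differ in detail.

```cpp
#include <cassert>
#include <cstddef>

static int liveTokens = 0;   // instrumentation for this sketch only

// Hypothetical stand-in for a reference-counted token.
class Token {
public:
    Token() : refs(0) { ++liveTokens; }
    ~Token() { --liveTokens; }
    void ref() { ++refs; }
    bool deref() { return --refs == 0; }   // true when the last reference is gone
private:
    int refs;
};

// Hypothetical stand-in for ANTLRTokenPtr.
class TokenPtr {
public:
    TokenPtr(Token *t = NULL) : ptr(t) { if (ptr) ptr->ref(); }
    TokenPtr(const TokenPtr &o) : ptr(o.ptr) { if (ptr) ptr->ref(); }
    ~TokenPtr() { release(); }
    TokenPtr &operator=(const TokenPtr &o) {
        Token *incoming = o.ptr;
        if (incoming) incoming->ref();   // take the new reference first (self-assignment safe)
        release();                       // then drop the old one
        ptr = incoming;
        return *this;
    }
    Token *operator->() const { return ptr; }
private:
    // Drops this pointer's reference and deletes the token if it was the last.
    void release() {
        if (ptr && ptr->deref()) delete ptr;
        ptr = NULL;
    }
    Token *ptr;
};
```

A token stays alive for as long as any TokenPtr refers to it and is
deleted when the last such pointer is destroyed or reassigned.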

***************************************************************
Remember that any local objects that you create are not deleted when a
longjmp() is executed. Unfortunately, the syntactic predicates (...)?
use setjmp()/longjmp(). There are some situations when a few tokens
will "leak".
***************************************************************
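
The leak can be reproduced in miniature. The sketch below uses a plain
integer as the reference count on purpose, because jumping over objects
with non-trivial destructors is undefined behaviour in standard C++.
guessBody() and tryGuess() are hypothetical names.

```cpp
#include <cassert>
#include <csetjmp>

static std::jmp_buf guessEnv;   // where a failed guess jumps back to
static int refCount = 0;        // manual reference count on one token

// Takes a reference, then optionally fails the guess. When longjmp()
// fires, the release on the last line is skipped.
static void guessBody(bool fail) {
    ++refCount;                              // like a label grabbing LT(1)
    if (fail) std::longjmp(guessEnv, 1);     // guess failed: jump straight out
    --refCount;                              // normal cleanup
}

// Returns true when the guess completed, false when it bailed out.
bool tryGuess(bool fail) {
    if (setjmp(guessEnv) == 0) {
        guessBody(fail);
        return true;
    }
    return false;
}
```

After a failed guess the count is still raised, which is the "leak"
referred to in the box above.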

2 DETAILS

o   The default is to perform token object garbage collection.
    You may use parser->noGarbageCollectTokens() to turn off
    garbage collection.

o   The type ANTLRTokenPtr is always defined now (automatically).
    If you do not wish to use smart pointers, you will have to
    redefine ANTLRTokenPtr by subclassing, changing the header
    file or changing ANTLR's code generation (easy enough to
    do in gen.c).

o   If you don't use ParserBlackBox, the new initialization sequence is:

        ANTLRTokenPtr aToken = new ANTLRToken;
        scan.setToken(mytoken(aToken));

    where mytoken(aToken) gets an ANTLRToken * from the smart pointer.

o   Define C++ preprocessor symbol DBG_REFCOUNTTOKEN to see a bunch of
    debugging stuff for reference counting if you suspect something.


3 WHY DO I HAVE TO TYPECAST ALL MY TOKEN POINTERS NOW??????

If you subclass ANTLRCommonToken and then attempt to refer to one of
your token members via a token pointer in your grammar actions, the
C++ compiler will complain that your token object does not have that
member. For example, if you used to do this

<<
class ANTLRToken : public ANTLRCommonToken {
    int muck;
    ...
};
>>

class Foo {
a : t:ID << t->muck = ...; >> ;
}

Now, you must change the t->muck reference to:

a : t:ID << mytoken(t)->muck = ...; >> ;

in order to downcast 't' to be an "ANTLRToken *" not the
"ANTLRAbstractToken *" resulting from ANTLRTokenPtr::operator->().
The macro is defined as:

/*
 * Since you cannot redefine operator->() to return one of the user's
 * token object types, we must down cast. This is a drag. Here's
 * a macro that helps. template: "mytoken(a-smart-ptr)->myfield".
 */
#define mytoken(tp) ((ANTLRToken *)(tp.operator->()))

You have to use macro mytoken(grammar-label) now because smart
pointers are not specific to a parser's token objects. In other
words, the ANTLRTokenPtr class has a pointer to a generic
ANTLRAbstractToken not your ANTLRToken; the ANTLR support code must
use smart pointers too, but be able to work with any kind of
ANTLRToken. Sorry about this, but it's C++'s fault not mine. Some
nebulous future version of the C++ compilers should obviate the need
to downcast smart pointers with runtime type checking (and by allowing
different return types of overridden functions).

A way to have backward compatible code is to shut off the token object
garbage collection; i.e., use parser->noGarbageCollectTokens() and
change the definition of ANTLRTokenPtr (that's why you get source code
<wink>).
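
The downcast problem is easy to reproduce outside of PCCTS. In this
sketch AbstractToken, MyToken and TokenPtr are hypothetical stand-ins,
and my_token() has the same shape as the mytoken() macro quoted above
but is renamed to make clear it belongs to the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the abstract token and a user subclass.
struct AbstractToken {
    virtual ~AbstractToken() {}
};
struct MyToken : AbstractToken {
    int muck;
    MyToken() : muck(0) {}
};

// Stand-in for ANTLRTokenPtr (no counting here): it only knows the
// abstract base, so operator->() cannot expose MyToken members.
class TokenPtr {
public:
    TokenPtr(AbstractToken *t = NULL) : ptr(t) {}
    AbstractToken *operator->() const { return ptr; }
private:
    AbstractToken *ptr;
};

// Unchecked downcast helper; valid only when the pointee really is a
// MyToken.
#define my_token(tp) (static_cast<MyToken *>((tp).operator->()))
```

Writing t->muck on a TokenPtr would not compile, because the type
returned by operator->() has no such member; my_token(t)->muck does.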


PARSER EXCEPTION HANDLING

I've noticed some weird stuff with the exception handling. I intend
to give this top priority for the "book release" of ANTLR.

==========
1.32 Full Release

o   Changed Token class hierarchy to be (Thanks to Tom Moog):

        ANTLRAbstractToken
          ANTLRRefCountToken
             ANTLRCommonToken
          ANTLRNoRefCountCommonToken

o   Added virtual panic() to ANTLRAbstractToken. Made ANTLRParser::panic()
    virtual also.

o   Cleaned up the dup() stuff in AST hierarchy to use shallowCopy() to
    make node copies. John Farr at Medtronic suggested this. I.e.,
    if you want to use dup() with either ANTLR or SORCERER or -transform
    mode with SORCERER, you must define shallowCopy() as:

        virtual PCCTS_AST *shallowCopy()
        {
            AST *p = new AST;
            p->setDown(NULL);
            p->setRight(NULL);
            return p;
        }

    or

        virtual PCCTS_AST *shallowCopy()
        {
            return new AST(*this);
        }

    if you have defined a copy constructor such as

        AST(const AST &t)   // shallow copy constructor
        {
            token = t.token;
            iconst = t.iconst;
            setDown(NULL);
            setRight(NULL);
        }

o   Added a warning when -CC and -gk are used together. This is broken,
    hence a warning is appropriate.

o   Added warning when #-stuff is used w/o -gt option.

o   Updated MPW installation.

o   "Miller, Philip W." <MILLERPW@f1groups.fsd.jhuapl.edu> suggested
    that genmk use RENAME_OBJ_FLAG and RENAME_EXE_FLAG instead of
    hardcoding "-o" in genmk.c.

o   Made all exit() calls use EXIT_SUCCESS or EXIT_FAILURE.
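
How dup() can be built on top of shallowCopy(), as the item on node
copies above describes, is sketched here with a cut-down node class.
Node, dup() and destroy() are hypothetical; in particular, copying
siblings as well as children is an assumption about what a full tree
duplicate should do, not a statement about the PCCTS implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, cut-down tree node: "down" is the first child and
// "right" the next sibling.
class Node {
public:
    explicit Node(int v) : value(v), down(NULL), right(NULL) {}
    // Shallow copy constructor: copies the payload, not the links.
    Node(const Node &n) : value(n.value), down(NULL), right(NULL) {}
    virtual ~Node() {}
    virtual Node *shallowCopy() const { return new Node(*this); }

    // Deep copy built on shallowCopy(): duplicate this node, then its
    // children and siblings, recursively.
    Node *dup() const {
        Node *copy = shallowCopy();
        if (down) copy->down = down->dup();
        if (right) copy->right = right->dup();
        return copy;
    }

    int value;
    Node *down;
    Node *right;
};

// Frees a whole tree built from Node objects.
void destroy(Node *n) {
    if (!n) return;
    destroy(n->down);
    destroy(n->right);
    delete n;
}
```

Because shallowCopy() is virtual, a subclass only has to override that
one function for dup() to produce nodes of the right type.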

===========================================================================
1.33

EXIT_FAILURE and EXIT_SUCCESS were not always defined. I had to modify
a bunch of files to use PCCTS_EXIT_XXX, which forces a new version. Sorry
about that.
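
A wrapper of this kind might look roughly like the following. The
SKETCH_EXIT_* names and the fallback values 0 and 1 are assumptions
made for this illustration; the real definitions live in the PCCTS
configuration headers and may differ.

```cpp
#include <cassert>
#include <cstdlib>

// Illustrative fallback only: use the standard macros when the
// platform provides them, otherwise fall back to assumed values.
#ifdef EXIT_SUCCESS
#define SKETCH_EXIT_SUCCESS EXIT_SUCCESS
#else
#define SKETCH_EXIT_SUCCESS 0
#endif

#ifdef EXIT_FAILURE
#define SKETCH_EXIT_FAILURE EXIT_FAILURE
#else
#define SKETCH_EXIT_FAILURE 1
#endif

// Maps a boolean outcome onto a portable process exit status.
int exitStatus(bool ok) {
    return ok ? SKETCH_EXIT_SUCCESS : SKETCH_EXIT_FAILURE;
}
```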