CHANGES FROM 1.31

This file contains the migration of PCCTS from 1.31 in the order that
changes were made. 1.32b7 is the last beta before full 1.32.
Terence Parr, Parr Research Corporation 1995.


======================================================================
1.32b1
Added Russell Quong to banner, changed banner for output slightly
Fixed it so that you have before / after actions for C++ in class def
Fixed bug in optimizer that made it sometimes forget to set internal
token pointers. Only showed up when a {...} was in the "wrong spot".

======================================================================
1.32b2
Added fixes by Dave Seidel for PC compilers in 32 bit mode (config.h
and set.h).

======================================================================
1.32b3
Fixed hideous bug in code generator for wildcard and for ~token op.

from Dave Seidel

Added pcnames.bat
1. in antlr/main.c: change strcasecmp() to stricmp()

2. in dlg/output.c: use DLEXER_C instead of "DLexer.C"

3. in h/PBlackBox.h: use <iostream.h> instead of <stream.h>

======================================================================
1.32b4
When the -ft option was used, any path prefix screwed up
the gate on the .h files

Fixed yet another bug due to the optimizer.

The exception handling thing was a bit wacko:

    a : ( A B )? A B
      | A C
      ;
      exception ...

caused an exception if "A C" was the input. In other words,
it found that A C didn't match the (A B)? pred and caused
an exception rather than trying the next alt. All I did
was to change the zzmatch_wsig() macros.

Fixed some problems in gen.c relating to the name of token
class bit sets in the output.

Added the tremendously cool generalized predicate. For the
moment, I'll give this brief description.

    a : <<predicate>>? blah
      | foo
      ;

This implies that (assuming blah and foo are syntactically
ambiguous) "predicate" indicates the semantic validity of
applying "blah". If "predicate" is false, "foo" is attempted.

Previously, you had to say:

    a : <<LA(1)==ID ? predicate : 1>>? ID
      | ID
      ;

Now, you can simply use "predicate" without the ?: operator
if you turn on ANTLR command line option: "-prc on". This
tells ANTLR to compute that all by itself. It computes n
tokens of lookahead where LT(n) or LATEXT(n) is the farthest
ahead you look.
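The effect of "-prc on" can be pictured in plain C++. This is only a
sketch of the idea, not ANTLR's generated code: the TokenType enum,
isTypeName(), manualPredicate() and contextGuardedPredicate() are
hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical token types; a real grammar gets these from ANTLR.
enum TokenType { ID, INT };

// Hypothetical semantic test supplied by the grammar writer.
bool isTypeName(const std::string &text) { return text == "size_t"; }

// Hand-written form: the action checks the lookahead context itself and
// evaluates to true (the "1" in the ?: expression) outside that context.
bool manualPredicate(TokenType la1, const std::string &text)
{
    return la1 == ID ? isTypeName(text) : true;
}

// With "-prc on" the grammar writer supplies only the predicate; the
// context test (here, "the next token is an ID") is assumed to be
// computed by the tool and wrapped around it, giving the same result.
bool contextGuardedPredicate(TokenType la1, const std::string &text)
{
    const bool inContext = (la1 == ID);   // computed lookahead context
    return !inContext || isTypeName(text);
}
```

Both functions agree for every input, which is the point: the guard is
the same, only who writes it changes.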

If you give a predicate using "-prc on" that is followed
by a construct that can recognize more than one n-sequence,
you will get a warning from ANTLR. For example,

    a : <<isTypeName(LT(1)->getText())>>? (ID|INT)
      ;

This is wrong because the predicate will be applied to INTs
as well as IDs. You should use this syntax to make
the predicate more specific:

    a : (ID)? => <<isTypeName(LT(1)->getText())>>? (ID|INT)
      ;

which says "don't apply the predicate unless ID is the
current lookahead context".

You cannot currently have anything in the "(context)? =>"
except sequences such as:

    ( LPAREN ID | LPAREN SCOPE )? => <<pred>>?

I haven't tested this THAT much, but it does work for the
C++ grammar.

======================================================================
1.32b5

Added getLine() to the ANTLRTokenBase and DLGBasedToken classes;
left line() for backward compatibility.
----
Removed SORCERER_TRANSFORM from the ast.h stuff.
-------
Fixed bug in code gen of ANTLR such that nested syn preds work more
efficiently now. The ANTLRTokenBuffer was getting very large
with nested predicates.
------
Memory leak is now gone from ANTLRTokenBuf; all tokens are deleted.
For backward compatibility reasons, you have to say parser->deleteTokens()
or mytokenbuffer->deleteTokens() but later it will be the default mode.
Say this after the parser is constructed. E.g.,

    ParserBlackBox<DLGLexer, MyParser, ANTLRToken> p(stdin);
    p.parser()->deleteTokens();
    p.parser()->start_symbol();


==============================
1.32b6

Changed so that deleteTokens() will do a delete ((ANTLRTokenBase *))
on the ptr. This gets the virtual destructor.

Fixed some weird things in the C++ header files (a few return types).

Made the AST routines correspond to the book and SORCERER stuff.

New token stuff: See testcpp/14/test.g

ANTLR accepts a #pragma gc_tokens which says
[1] Generate label = copy(LT(1)) instead of label=LT(1) for
    all labeled token references.
[2] User now has to define ANTLRTokenPtr (as a class or a typedef
    to just a pointer) as well as the ANTLRToken class itself.
    See the example.

To delete tokens in token buffer, use deleteTokens() message on parser.

All tokens that fall off the ANTLRTokenBuffer get deleted
which is what currently happens when deleteTokens() message
has been sent to token buffer.

We always generate ANTLRTokenPtr instead of 'ANTLRToken *' now.
Then if no pragma set, ANTLR generates a

    class ANTLRToken;
    typedef ANTLRToken *ANTLRTokenPtr;

in each file.

Made a warning for x:rule_ref <<$x>>; still no warning for $i's, however.

    class BB {

    a : x:b y:A <<$x
    $y>>
    ;

    b : B;

    }

generates

    Antlr parser generator Version 1.32b6 1989-1995
    test.g, line 3: error: There are no token ptrs for rule references: '$x'

===================
1.32b7:

[With respect to token object garbage collection (GC), 1.32b7
backtracks from 1.32b6, but results in better and less intrusive GC.
This is the last beta version before full 1.32.]

BIGGEST CHANGES:

o The "#pragma gc_tokens" is no longer used.

o .C files are now .cpp files (hence, makefiles will have to
  be changed; or you can rerun genmk). This is a good move,
  but causes some backward incompatibility problems. You can
  avoid this by changing CPP_FILE_SUFFIX to ".C" in pccts/h/config.h.

o The token object class hierarchy has been flattened to include
  only three classes: ANTLRAbstractToken, ANTLRCommonToken, and
  ANTLRCommonNoRefCountToken. The common token now does garbage
  collection via ref counting.

o "Smart" pointers are now used for garbage collection. That is,
  ANTLRTokenPtr is used instead of "ANTLRToken *".

o The antlr.1 man page has been cleaned up slightly.

o The SUN C++ compiler now complains less about C++ support code.

o Grammars which subclass ANTLRCommonToken must wrap all token
  pointer references in mytoken(token_ptr). This is the only
  serious backward incompatibility. See below.


MINOR CHANGES:

--------------------------------------------------------
1 deleteTokens()

The deleteTokens() message to the parser or token buffer has been changed
to one of:

    void noGarbageCollectTokens() { inputTokens->noGarbageCollectTokens(); }
    void garbageCollectTokens() { inputTokens->garbageCollectTokens(); }

The token buffer deletes all non-referenced tokens by default now.

--------------------------------------------------------
2 makeToken()

The makeToken() message returns a new type. The function should look
like:

    virtual ANTLRAbstractToken *makeToken(ANTLRTokenType tt,
                                          ANTLRChar *txt,
                                          int line)
    {
        ANTLRAbstractToken *t = new ANTLRCommonToken(tt,txt);
        t->setLine(line);
        return t;
    }

--------------------------------------------------------
3 TokenType

Changed TokenType -> ANTLRTokenType (often forces changes in AST defs due
to #[] constructor called to AST(tokentype, string)).

--------------------------------------------------------
4 AST()

You must define AST(ANTLRTokenPtr t) now in your AST class definition.
You might also have to include ATokPtr.h above the definition; e.g.,
if AST is defined in a separate file, such as AST.h, it's a good idea
to include ATOKPTR_H (ATokPtr.h). For example,

    #include ATOKPTR_H
    class AST : public ASTBase {
    protected:
        ANTLRTokenPtr token;
    public:
        AST(ANTLRTokenPtr t) { token = t; }
        void preorder_action() {
            char *s = token->getText();
            printf(" %s", s);
        }
    };

Note the use of smart pointers rather than "ANTLRToken *".

--------------------------------------------------------
5 SUN C++

From robertb oakhill.sps.mot.com Bob Bailey. Changed ANTLR C++ output
to avoid an error in Sun C++ 3.0.1. Made the return value structs
(created to hold multiple return values) public.

--------------------------------------------------------
6 genmk

Fixed genmk so that target List.* is not included anymore. It's
called SList.* anyway.

--------------------------------------------------------
7 \r vs \n

Scott Vorthmann <vorth cmu.edu> fixed antlr.g in ANTLR so that \r
is allowed as the return character as well as \n.

--------------------------------------------------------
8 Exceptions

Bug in exceptions attached to labeled token/tokclass references: ANTLR
didn't generate code for the exceptions. This didn't work:

    a : "help" x:ID
      ;
      exception[x]
      catch MismatchedToken : <<printf("eh?\n");>>

Now ANTLR generates (which is kinda big, but necessary):

    if ( !_match_wsig(ID) ) {
        if ( guessing ) goto fail;
        _signal=MismatchedToken;
        switch ( _signal ) {
        case MismatchedToken :
            printf("eh?\n");
            _signal = NoSignal;
            break;
        default :
            goto _handler;
        }
    }

which implies that you can recover and continue parsing after a missing/bad
token reference.

--------------------------------------------------------
9 genmk

genmk now correctly uses config file for CPP_FILE_SUFFIX stuff.

--------------------------------------------------------
10 general cleanup / PURIFY

Anthony Green <green vizbiz.com> suggested a bunch of good general
clean up things for the code; he also suggested a few things to
help out the "PURIFY" memory allocation checker.

--------------------------------------------------------
11 $-variable references.

Manuel ORNATO indicated that a $-variable outside of a rule caused
ANTLR to crash. I fixed this.

--------------------------------------------------------
12 Tom Moog suggestion

Fail action of semantic predicate needs "{}" envelope. FIXED.

--------------------------------------------------------
13 references to LT(1).

I have enclosed all assignments such as:

    _t22 = (ANTLRTokenPtr)LT(1);

in "if ( !guessing )" so that during backtracking the reference count
for token objects is not increased.


TOKEN OBJECT GARBAGE COLLECTION

1 INTRODUCTION

The class ANTLRCommonToken is now garbage collected through a "smart"
pointer called ANTLRTokenPtr using reference counting. Any token
object not referenced by your grammar actions is destroyed by the
ANTLRTokenBuffer when it must make room for more token objects.
Referenced tokens are then destroyed in your parser when local
ANTLRTokenPtr objects are deleted. For example,

    a : label:ID ;

would be converted to something like:

    void yourclass::a(void)
    {
        zzRULE;
        ANTLRTokenPtr label=NULL; // used to be ANTLRToken *label;
        zzmatch(ID);
        label = (ANTLRTokenPtr)LT(1);
        consume();
        ...
    }

When the "label" object is destroyed (it's just a pointer to your
input token object LT(1)), it decrements the reference count on the
object created for the ID. If the count goes to zero, the object
pointed to by label is deleted.

To correctly manage the garbage collection, you should use
ANTLRTokenPtr instead of "ANTLRToken *". Most ANTLR support code
(visible to the user) has been modified to use the smart pointers.
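The reference counting described above can be sketched in a few lines
of C++. This is an illustration under stated assumptions, not the PCCTS
implementation: SketchToken, SketchTokenPtr, liveTokens and
refCountInsideRule() are hypothetical stand-ins for ANTLRCommonToken,
ANTLRTokenPtr and a generated rule body.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a reference-counted token object.
struct SketchToken {
    int refCount;
    static int liveTokens;          // number of tokens not yet deleted
    SketchToken() : refCount(0) { ++liveTokens; }
    ~SketchToken() { --liveTokens; }
};
int SketchToken::liveTokens = 0;

// Hypothetical stand-in for the smart pointer.
class SketchTokenPtr {
    SketchToken *tok;
    void attach(SketchToken *t) { tok = t; if (tok) ++tok->refCount; }
    void release() {
        // Delete the token when the last reference goes away.
        if (tok && --tok->refCount == 0) delete tok;
        tok = NULL;
    }
public:
    SketchTokenPtr(SketchToken *t = NULL) { attach(t); }
    SketchTokenPtr(const SketchTokenPtr &o) { attach(o.tok); }
    SketchTokenPtr &operator=(const SketchTokenPtr &o) {
        if (tok != o.tok) { release(); attach(o.tok); }
        return *this;
    }
    ~SketchTokenPtr() { release(); }
    SketchToken *operator->() const { return tok; }
};

// Mimics a rule body: the label holds a reference until the rule returns.
int refCountInsideRule(const SketchTokenPtr &lookahead)
{
    SketchTokenPtr label = lookahead;   // like: label = (ANTLRTokenPtr)LT(1);
    return label->refCount;             // caller's reference plus the label's
}
```

In this sketch the token survives as long as any pointer refers to it
and is deleted when the last pointer is destroyed, which is the
behaviour the text attributes to ANTLRTokenPtr.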

***************************************************************
Remember that any local objects that you create are not deleted when a
longjmp() is executed. Unfortunately, the syntactic predicates (...)?
use setjmp()/longjmp(). There are some situations when a few tokens
will "leak".
***************************************************************

2 DETAILS

o The default is to perform token object garbage collection.
  You may use parser->noGarbageCollectTokens() to turn off
  garbage collection.


o The type ANTLRTokenPtr is always defined now (automatically).
  If you do not wish to use smart pointers, you will have to
  redefine ANTLRTokenPtr by subclassing, changing the header
  file or changing ANTLR's code generation (easy enough to
  do in gen.c).

o If you don't use ParserBlackBox, the new initialization sequence is:

      ANTLRTokenPtr aToken = new ANTLRToken;
      scan.setToken(mytoken(aToken));

  where mytoken(aToken) gets an ANTLRToken * from the smart pointer.

o Define C++ preprocessor symbol DBG_REFCOUNTTOKEN to see a bunch of
  debugging stuff for reference counting if you suspect something.


3 WHY DO I HAVE TO TYPECAST ALL MY TOKEN POINTERS NOW??????

If you subclass ANTLRCommonToken and then attempt to refer to one of
your token members via a token pointer in your grammar actions, the
C++ compiler will complain that your token object does not have that
member. For example, if you used to do this

    <<
    class ANTLRToken : public ANTLRCommonToken {
        int muck;
        ...
    };
    >>

    class Foo {
        a : t:ID << t->muck = ...; >> ;
    }

Now, you must change the t->muck reference to:

    a : t:ID << mytoken(t)->muck = ...; >> ;

in order to downcast 't' to be an "ANTLRToken *" not the
"ANTLRAbstractToken *" resulting from ANTLRTokenPtr::operator->().
The macro is defined as:

    /*
     * Since you cannot redefine operator->() to return one of the user's
     * token object types, we must down cast. This is a drag. Here's
     * a macro that helps. template: "mytoken(a-smart-ptr)->myfield".
     */
    #define mytoken(tp) ((ANTLRToken *)(tp.operator->()))

You have to use macro mytoken(grammar-label) now because smart
pointers are not specific to a parser's token objects. In other
words, the ANTLRTokenPtr class has a pointer to a generic
ANTLRAbstractToken not your ANTLRToken; the ANTLR support code must
use smart pointers too, but be able to work with any kind of
ANTLRToken. Sorry about this, but it's C++'s fault not mine. Some
nebulous future version of the C++ compilers should obviate the need
to downcast smart pointers with runtime type checking (and by allowing
different return type of overridden functions).
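The need for the downcast can be reproduced outside ANTLR. The sketch
below uses hypothetical names (AbstractTok, UserTok, TokPtr,
sketch_mytoken, setAndReadMuck) in place of ANTLRAbstractToken, the
user's ANTLRToken, ANTLRTokenPtr and mytoken(); it only illustrates the
shape of the problem and leaves out reference counting.

```cpp
#include <cassert>

// Hypothetical miniature of the hierarchy described above.
struct AbstractTok { virtual ~AbstractTok() {} };
struct UserTok : AbstractTok {       // plays the role of the user's ANTLRToken
    int muck;
    UserTok() : muck(0) {}
};

// The smart pointer only knows about the abstract base class.
class TokPtr {
    AbstractTok *p;
public:
    explicit TokPtr(AbstractTok *t) : p(t) {}
    AbstractTok *operator->() const { return p; }
};

// Same shape as the mytoken() macro quoted above, with the argument
// parenthesized and the sketch's own type names.
#define sketch_mytoken(tp) ((UserTok *)((tp).operator->()))

int setAndReadMuck(TokPtr t, int value)
{
    // t->muck = value;   // would not compile: AbstractTok has no 'muck'
    sketch_mytoken(t)->muck = value;
    return sketch_mytoken(t)->muck;
}
```

The cast is unchecked, so it is only safe when every token reaching the
action really is of the user's token class.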

A way to have backward compatible code is to shut off the token object
garbage collection; i.e., use parser->noGarbageCollectTokens() and
change the definition of ANTLRTokenPtr (that's why you get source code
<wink>).


PARSER EXCEPTION HANDLING

I've noticed some weird stuff with the exception handling. I intend
to give this top priority for the "book release" of ANTLR.

==========
1.32 Full Release

o Changed Token class hierarchy to be (Thanks to Tom Moog):

      ANTLRAbstractToken
        ANTLRRefCountToken
          ANTLRCommonToken
        ANTLRCommonNoRefCountToken

o Added virtual panic() to ANTLRAbstractToken. Made ANTLRParser::panic()
  virtual also.

o Cleaned up the dup() stuff in AST hierarchy to use shallowCopy() to
  make node copies. John Farr at Medtronic suggested this. I.e.,
  if you want to use dup() with either ANTLR or SORCERER or -transform
  mode with SORCERER, you must define shallowCopy() as:

      virtual PCCTS_AST *shallowCopy()
      {
          AST *p = new AST(*this);
          p->setDown(NULL);
          p->setRight(NULL);
          return p;
      }

  or

      virtual PCCTS_AST *shallowCopy()
      {
          return new AST(*this);
      }

  if you have defined a copy constructor such as

      AST(const AST &t) // shallow copy constructor
      {
          token = t.token;
          iconst = t.iconst;
          setDown(NULL);
          setRight(NULL);
      }
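A self-contained sketch of the copy-constructor variant follows. The
class SketchAST and its down()/right() accessors are hypothetical; a
real AST class derives from the PCCTS base class and inherits
setDown()/setRight() from it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node class standing in for a user's AST.
class SketchAST {
    SketchAST *down_, *right_;
public:
    int iconst;
    explicit SketchAST(int v) : down_(NULL), right_(NULL), iconst(v) {}
    // Shallow copy constructor: copy the payload, drop the links.
    SketchAST(const SketchAST &t) : down_(NULL), right_(NULL), iconst(t.iconst) {}
    virtual ~SketchAST() {}
    void setDown(SketchAST *d) { down_ = d; }
    void setRight(SketchAST *r) { right_ = r; }
    SketchAST *down() const { return down_; }
    SketchAST *right() const { return right_; }
    // The copy is a fresh node with the same data and no children or siblings.
    virtual SketchAST *shallowCopy() { return new SketchAST(*this); }
};
```

The copy carries the node's data but none of its tree links, so a
dup()-style routine can rebuild the links itself while walking the tree.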

o Added a warning when -CC and -gk are used together. This is broken,
  hence a warning is appropriate.

o Added warning when #-stuff is used w/o -gt option.

o Updated MPW installation.

o "Miller, Philip W." <MILLERPW f1groups.fsd.jhuapl.edu> suggested
  that genmk use RENAME_OBJ_FLAG and RENAME_EXE_FLAG instead of
  hardcoding "-o" in genmk.c.

o Made all exit() calls use EXIT_SUCCESS or EXIT_FAILURE.

===========================================================================
1.33

EXIT_FAILURE and EXIT_SUCCESS were not always defined. I had to modify
a bunch of files to use PCCTS_EXIT_XXX, which forces a new version. Sorry
about that.
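One common way to get portable exit codes is a pair of wrapper macros
with fallbacks. The sketch below shows that general technique only; the
SKETCH_EXIT_* names, the fallback values 0 and 1, and exitStatusFor()
are assumptions, and the actual PCCTS_EXIT_XXX definitions in the PCCTS
headers may differ.

```cpp
#include <cassert>
#include <cstdlib>

// Assumed fallback scheme: use the platform's macros when they exist,
// otherwise fall back to the conventional 0 and 1.
#ifdef EXIT_SUCCESS
#define SKETCH_EXIT_SUCCESS EXIT_SUCCESS
#else
#define SKETCH_EXIT_SUCCESS 0
#endif

#ifdef EXIT_FAILURE
#define SKETCH_EXIT_FAILURE EXIT_FAILURE
#else
#define SKETCH_EXIT_FAILURE 1
#endif

// Maps a tool's error count to a process exit status.
int exitStatusFor(int errorCount)
{
    return errorCount == 0 ? SKETCH_EXIT_SUCCESS : SKETCH_EXIT_FAILURE;
}
```

Code written against the wrapper names compiles the same way whether or
not the platform headers define the standard macros.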