CHANGES FROM 1.31\r
\r
This file contains the migration of PCCTS from 1.31 in the order that\r
changes were made.  1.32b7 is the last beta before full 1.32.\r
Terence Parr, Parr Research Corporation 1995.\r
\r
\r
======================================================================\r
1.32b1\r
Added Russell Quong to banner, changed banner for output slightly\r
Fixed it so that you have before / after actions for C++ in class def\r
Fixed bug in optimizer that made it sometimes forget to set internal\r
        token pointers.  Only showed up when a {...} was in the "wrong spot".\r
\r
======================================================================\r
1.32b2\r
Added fixes by Dave Seidel for PC compilers in 32 bit mode (config.h\r
and set.h).\r
\r
======================================================================\r
1.32b3\r
Fixed hideous bug in code generator for wildcard and for ~token op.\r
\r
from Dave Seidel\r
\r
   Added pcnames.bat\r
   1. in antlr/main.c: change strcasecmp() to stricmp()\r
\r
   2. in dlg/output.c: use DLEXER_C instead of "DLexer.C"
\r
   3. in h/PBlackBox.h: use <iostream.h> instead of <stream.h>\r
\r
======================================================================\r
1.32b4\r
When the -ft option was used, any path prefix screwed up
the gate on the .h files.
\r
Fixed yet another bug due to the optimizer.\r
\r
The exception handling thing was a bit wacko:\r
\r
a : ( A B )? A B\r
  | A C\r
  ;\r
  exception ...\r
\r
caused an exception if "A C" was the input.  In other words,\r
it found that A C didn't match the (A B)? pred and caused\r
an exception rather than trying the next alt.  All I did\r
was to change the zzmatch_wsig() macros.\r
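
The intended behavior can be pictured with a small hand-written sketch.
It is not ANTLR output: the Tok enum and the matchSeq() and ruleA()
helpers are hypothetical names used only for illustration.  The point is
that a failed (A B)? guess merely rules out the first alternative; it is
not a syntax error by itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical token types for the grammar  a : (A B)? A B | A C ;
enum Tok { A, B, C, END };

// Returns true if the input, starting at pos, begins with the given sequence.
bool matchSeq(const std::vector<Tok> &in, std::size_t pos,
              const std::vector<Tok> &seq)
{
    if (pos + seq.size() > in.size()) return false;
    for (std::size_t i = 0; i < seq.size(); ++i)
        if (in[pos + i] != seq[i]) return false;
    return true;
}

// Returns the alternative chosen for rule a (1 or 2), or 0 for a real
// syntax error.  A failed guess only rules out alternative 1.
int ruleA(const std::vector<Tok> &in)
{
    const std::vector<Tok> ab = {A, B};
    const std::vector<Tok> ac = {A, C};
    if (matchSeq(in, 0, ab)) return 1;   // (A B)? succeeded: parse A B
    if (matchSeq(in, 0, ac)) return 2;   // guess failed: try the next alt
    return 0;                            // neither alternative applies
}
```

With input "A C" the sketch selects alternative 2, which is what the
corrected zzmatch_wsig() behavior is meant to allow.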
\r
Fixed some problems in gen.c relating to the name of token\r
class bit sets in the output.\r
\r
Added the tremendously cool generalized predicate.  For the\r
moment, I'll give this brief description.
\r
a : <<predicate>>? blah\r
  | foo\r
  ;\r
\r
This implies that (assuming blah and foo are syntactically\r
ambiguous) "predicate" indicates the semantic validity of\r
applying "blah".  If "predicate" is false, "foo" is attempted.\r
\r
Previously, you had to say:\r
\r
a : <<LA(1)==ID ? predicate : 1>>? ID\r
  | ID\r
  ;\r
\r
Now, you can simply use "predicate" without the ?: operator\r
if you turn on ANTLR command line option: "-prc on".  This\r
tells ANTLR to compute that all by itself.  It computes n\r
tokens of lookahead where LT(n) or LATEXT(n) is the farthest\r
ahead you look.\r
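
As a rough picture of what "-prc on" automates, the sketch below writes
the lookahead test by hand in plain C++.  The Lookahead struct,
isTypeName() and hoistedPredicate() are hypothetical stand-ins, not
ANTLR support code.

```cpp
#include <cassert>
#include <string>

enum TokType { ID, INT };

// Hypothetical one-token lookahead: a token type and its text.
struct Lookahead {
    TokType type;
    std::string text;
};

// Hypothetical semantic predicate: only "T" names a type here.
bool isTypeName(const std::string &s) { return s == "T"; }

// Hand-written form needed without "-prc on":  LA(1)==ID ? predicate : 1
// The predicate is evaluated only in the ID context; for any other
// lookahead it yields true so that it does not block the alternative.
bool hoistedPredicate(const Lookahead &la)
{
    return la.type == ID ? isTypeName(la.text) : true;
}
```

An INT lookahead therefore passes the test without the predicate ever
being consulted.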
\r
If you give a predicate using "-prc on" that is followed\r
by a construct that can recognize more than one n-sequence,\r
you will get a warning from ANTLR.  For example,\r
\r
a : <<isTypeName(LT(1)->getText())>>? (ID|INT)\r
  ;\r
\r
This is wrong because the predicate will be applied to INTs\r
as well as ID's.  You should use this syntax to make\r
the predicate more specific:\r
\r
a : (ID)? => <<isTypeName(LT(1)->getText())>>? (ID|INT)\r
  ;\r
\r
which says "don't apply the predicate unless ID is the\r
current lookahead context".\r
\r
You cannot currently have anything in the "(context)? =>"\r
except sequences such as:\r
\r
( LPAREN ID | LPAREN SCOPE )? => <<pred>>?\r
\r
I haven't tested this THAT much, but it does work for the\r
C++ grammar.\r
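
A two-token context such as the one above can be sketched in plain C++
as follows.  This only illustrates the idea; inContext() and
guardedPredicate() are hypothetical names and do not appear in ANTLR's
output.

```cpp
#include <cassert>
#include <vector>

enum TokType { LPAREN, ID, SCOPE, INT };

// True when the lookahead begins with one of the context sequences
// LPAREN ID  or  LPAREN SCOPE.
bool inContext(const std::vector<TokType> &la)
{
    return la.size() >= 2 && la[0] == LPAREN &&
           (la[1] == ID || la[1] == SCOPE);
}

// (context)? => <<pred>>?  : outside the context the predicate is not
// applied, so the guarded test is simply true.
bool guardedPredicate(const std::vector<TokType> &la, bool pred)
{
    return inContext(la) ? pred : true;
}
```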
\r
======================================================================\r
1.32b5\r
\r
Added getLine() to the ANTLRTokenBase and DLGBasedToken classes;
left line() for backward compatibility.
----\r
Removed SORCERER_TRANSFORM from the ast.h stuff.\r
-------\r
Fixed bug in code gen of ANTLR such that nested syn preds work more\r
efficiently now.  The ANTLRTokenBuffer was getting very large\r
with nested predicates.\r
------\r
Memory leak is now gone from ANTLRTokenBuf; all tokens are deleted.\r
For backward compatibility reasons, you have to say parser->deleteTokens()\r
or mytokenbuffer->deleteTokens() but later it will be the default mode.\r
Say this after the parser is constructed. E.g.,\r
\r
    ParserBlackBox<DLGLexer, MyParser, ANTLRToken> p(stdin);\r
    p.parser()->deleteTokens();\r
    p.parser()->start_symbol();\r
\r
\r
==============================\r
1.32b6\r
\r
Changed so that deleteTokens() will do a delete ((ANTLRTokenBase *))\r
on the ptr.  This gets the virtual destructor.\r
\r
Fixed some weird things in the C++ header files (a few return types).\r
\r
Made the AST routines correspond to the book and SORCERER stuff.\r
\r
New token stuff:  See testcpp/14/test.g\r
\r
ANTLR accepts a #pragma gc_tokens which says\r
[1]     Generate label = copy(LT(1)) instead of label=LT(1) for\r
        all labeled token references.\r
[2]     User now has to define ANTLRTokenPtr (as a class or a typedef\r
        to just a pointer) as well as the ANTLRToken class itself.\r
		See the example.\r
\r
To delete tokens in token buffer, use deleteTokens() message on parser.\r
\r
        All tokens that fall off the ANTLRTokenBuffer get deleted\r
        which is what currently happens when deleteTokens() message\r
        has been sent to token buffer.\r
\r
We always generate ANTLRTokenPtr instead of 'ANTLRToken *' now.\r
Then if no pragma set, ANTLR generates a\r
\r
        class ANTLRToken;\r
        typedef ANTLRToken *ANTLRTokenPtr;\r
\r
in each file.\r
\r
Made a warning for x:rule_ref <<$x>>; still no warning for $i's, however.\r
class BB {\r
\r
a : x:b y:A <<$x\r
$y>>\r
  ;\r
\r
b : B;\r
\r
}\r
generates\r
Antlr parser generator   Version 1.32b6   1989-1995\r
test.g, line 3: error: There are no token ptrs for rule references: '$x'\r
\r
===================\r
1.32b7:\r
\r
[With respect to token object garbage collection (GC), 1.32b7\r
 backtracks from 1.32b6, but results in better and less intrusive GC.\r
 This is the last beta version before full 1.32.]\r
\r
BIGGEST CHANGES:\r
\r
o	The "#pragma gc_tokens" is no longer used.\r
\r
o	.C files are now .cpp files (hence, makefiles will have to\r
	be changed; or you can rerun genmk).  This is a good move,\r
	but causes some backward incompatibility problems.  You can\r
	avoid this by changing CPP_FILE_SUFFIX to ".C" in pccts/h/config.h.\r
\r
o	The token object class hierarchy has been flattened to include\r
	only three classes: ANTLRAbstractToken, ANTLRCommonToken, and\r
	ANTLRCommonNoRefCountToken.  The common token now does garbage\r
	collection via ref counting.\r
\r
o	"Smart" pointers are now used for garbage collection.  That is,\r
	ANTLRTokenPtr is used instead of "ANTLRToken *".\r
\r
o	The antlr.1 man page has been cleaned up slightly.\r
\r
o	The SUN C++ compiler now complains less about C++ support code.\r
\r
o	Grammars which subclass ANTLRCommonToken must wrap all token\r
	pointer references in mytoken(token_ptr).  This is the only\r
	serious backward incompatibility.  See below.\r
\r
\r
MINOR CHANGES:\r
\r
--------------------------------------------------------\r
1	deleteTokens()\r
\r
The deleteTokens() message to the parser or token buffer has been changed\r
to one of:\r
\r
    void noGarbageCollectTokens()   { inputTokens->noGarbageCollectTokens(); }\r
    void garbageCollectTokens()     { inputTokens->garbageCollectTokens(); }\r
\r
The token buffer deletes all non-referenced tokens by default now.\r
\r
--------------------------------------------------------\r
2	makeToken()\r
\r
The makeToken() message returns a new type.  The function should look\r
like:\r
\r
    virtual ANTLRAbstractToken *makeToken(ANTLRTokenType tt,\r
                                          ANTLRChar *txt,\r
                                          int line)\r
    {\r
        ANTLRAbstractToken *t = new ANTLRCommonToken(tt,txt);\r
        t->setLine(line);\r
        return t;\r
    }\r
\r
--------------------------------------------------------\r
3	TokenType\r
\r
Renamed TokenType to ANTLRTokenType (this often forces changes in AST
definitions because the #[] constructor turns into a call to
AST(tokentype, string)).
\r
--------------------------------------------------------\r
4	AST()\r
\r
You must define AST(ANTLRTokenPtr t) now in your AST class definition.\r
You might also have to include ATokPtr.h above the definition; e.g.,\r
if AST is defined in a separate file, such as AST.h, it's a good idea\r
to include ATOKPTR_H (ATokPtr.h).  For example,\r
\r
	#include ATOKPTR_H\r
	class AST : public ASTBase {\r
	protected:\r
	    ANTLRTokenPtr token;\r
	public:\r
	    AST(ANTLRTokenPtr t) { token = t; }\r
	    void preorder_action() {\r
	        char *s = token->getText();\r
	        printf(" %s", s);\r
	    }\r
	};\r
\r
Note the use of smart pointers rather than "ANTLRToken *".\r
\r
--------------------------------------------------------\r
5	SUN C++\r
\r
From robertb oakhill.sps.mot.com Bob Bailey. Changed ANTLR C++ output
to avoid an error in Sun C++ 3.0.1.  The structs created to hold
multiple return values are now declared "public".
\r
--------------------------------------------------------\r
6	genmk\r
\r
Fixed genmk so that target List.* is not included anymore.  It's\r
called SList.* anyway.\r
\r
--------------------------------------------------------\r
7	\r vs \n\r
\r
Scott Vorthmann <vorth cmu.edu> fixed antlr.g in ANTLR so that \r\r
is allowed as the return character as well as \n.\r
\r
--------------------------------------------------------\r
8	Exceptions\r
\r
Fixed a bug in exceptions attached to labeled token/tokclass references:
ANTLR didn't generate code for the exception.  This didn't work:
\r
a : "help" x:ID\r
  ;\r
        exception[x]\r
        catch MismatchedToken : <<printf("eh?\n");>>\r
\r
Now ANTLR generates (which is kinda big, but necessary):\r
\r
        if ( !_match_wsig(ID) ) {\r
                if ( guessing ) goto fail;\r
                _signal=MismatchedToken;\r
                switch ( _signal ) {\r
                case MismatchedToken :\r
                        printf("eh?\n");\r
                        _signal = NoSignal;\r
                        break;\r
                default :\r
                        goto _handler;\r
                }\r
        }\r
\r
which implies that you can recover and continue parsing after a missing/bad\r
token reference.\r
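
The control flow of that generated fragment can be tried out in
isolation with the sketch below.  It is a simplified model under stated
assumptions: matchWsig() and matchLabeledId() are hypothetical stand-ins
for _match_wsig() and the generated rule body, and the guessing branch
is left out.

```cpp
#include <cassert>

enum Signal { NoSignal, MismatchedToken };
enum TokType { HELP, ID, INT };

// Hypothetical stand-in for _match_wsig(): true when the token matches.
bool matchWsig(TokType expected, TokType actual) { return expected == actual; }

// Mirrors the shape of the generated code.  Returns the signal left
// pending after the labeled reference; NoSignal means parsing can
// continue.  'handled' is set when the user's catch action ran.
Signal matchLabeledId(TokType actual, bool &handled)
{
    Signal signal = NoSignal;
    handled = false;
    if (!matchWsig(ID, actual)) {
        signal = MismatchedToken;
        switch (signal) {
        case MismatchedToken:
            handled = true;      // the catch action would run here
            signal = NoSignal;   // recovered: clear the signal
            break;
        default:
            break;               // unhandled: left for an outer handler
        }
    }
    return signal;
}
```

A mismatched token runs the handler and leaves NoSignal behind, so the
caller carries on as if the reference had matched.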
\r
--------------------------------------------------------\r
9	genmk\r
\r
genmk now correctly uses config file for CPP_FILE_SUFFIX stuff.\r
\r
--------------------------------------------------------\r
10	general cleanup / PURIFY\r
\r
Anthony Green <green vizbiz.com> suggested a bunch of good general\r
clean up things for the code; he also suggested a few things to\r
help out the "PURIFY" memory allocation checker.\r
\r
--------------------------------------------------------\r
11	$-variable references.\r
\r
Manuel ORNATO indicated that a $-variable outside of a rule caused\r
ANTLR to crash.  I fixed this.\r
\r
--------------------------------------------------------
12	Tom Moog suggestion
\r
Fail action of semantic predicate needs "{}" envelope.  FIXED.\r
\r
--------------------------------------------------------
13	references to LT(1).
\r
I have enclosed all assignments such as:\r
\r
             _t22 = (ANTLRTokenPtr)LT(1);\r
\r
in "if ( !guessing )" so that during backtracking the reference count\r
for token objects is not increased.\r
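
The effect of the guard can be observed with a small sketch.  It swaps
in std::shared_ptr as a stand-in for ANTLRTokenPtr, which is an
assumption made purely for illustration; Token and
referencesAfterLabel() are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Token { std::string text; };
typedef std::shared_ptr<Token> TokenPtr;   // stand-in for ANTLRTokenPtr

// Stand-in for a labeled token reference inside a rule.  While guessing,
// the label is left unset, so the token gains no extra reference.
// Returns the reference count seen while the label is still in scope.
long referencesAfterLabel(const TokenPtr &lt1, bool guessing)
{
    TokenPtr label;
    if (!guessing) label = lt1;
    return lt1.use_count();
}
```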
\r
\r
TOKEN OBJECT GARBAGE COLLECTION\r
\r
1	INTRODUCTION\r
\r
The class ANTLRCommonToken is now garbage collected through a "smart"
pointer called ANTLRTokenPtr using reference counting.  Any token\r
object not referenced by your grammar actions is destroyed by the\r
ANTLRTokenBuffer when it must make room for more token objects.\r
Referenced tokens are then destroyed in your parser when local\r
ANTLRTokenPtr objects are deleted.  For example,\r
\r
a : label:ID ;\r
\r
would be converted to something like:\r
\r
void yourclass::a(void)\r
{\r
	zzRULE;\r
	ANTLRTokenPtr label=NULL;	// used to be ANTLRToken *label;\r
        zzmatch(ID);\r
        label = (ANTLRTokenPtr)LT(1);\r
	consume();\r
	...\r
}\r
\r
When the "label" object is destroyed (it's just a pointer to your\r
input token object LT(1)), it decrements the reference count on the\r
object created for the ID.  If the count goes to zero, the object\r
pointed to by label is deleted.
\r
To correctly manage the garbage collection, you should use\r
ANTLRTokenPtr instead of "ANTLRToken *".  Most ANTLR support code\r
(visible to the user) has been modified to use the smart pointers.\r
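
For readers unfamiliar with the technique, here is a minimal sketch of
reference counting through a smart pointer.  It is not the PCCTS
implementation: the Token and TokenPtr classes below are hypothetical
and much simpler than ANTLRCommonToken and ANTLRTokenPtr.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reference-counted token; 'live' counts objects still alive.
struct Token {
    static int live;
    int refs;
    Token() : refs(0) { ++live; }
    ~Token() { --live; }
};
int Token::live = 0;

// Minimal intrusive smart pointer in the spirit of ANTLRTokenPtr.
class TokenPtr {
    Token *p;
    void attach(Token *q) { p = q; if (p) ++p->refs; }
    void release() { if (p && --p->refs == 0) delete p; p = NULL; }
public:
    TokenPtr(Token *q = NULL) { attach(q); }
    TokenPtr(const TokenPtr &o) { attach(o.p); }
    TokenPtr &operator=(const TokenPtr &o)
    {
        Token *q = o.p;
        if (q) ++q->refs;   // take the new reference first (self-assignment safe)
        release();
        p = q;
        return *this;
    }
    ~TokenPtr() { release(); }
    Token *operator->() const { return p; }
};

// Creates a token, references it from a local label, and returns how
// many tokens are alive while the label exists.
int liveWhileLabeled()
{
    TokenPtr label = new Token;
    return Token::live;
}
```

Once liveWhileLabeled() returns, the label has been destroyed, the count
has dropped to zero, and the token has been deleted.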
\r
***************************************************************\r
Remember that any local objects that you create are not deleted when a\r
longjmp() is executed.  Unfortunately, the syntactic predicates (...)?
use setjmp()/longjmp().  There are some situations when a few tokens\r
will "leak".\r
***************************************************************\r
\r
2	DETAILS\r
\r
o	The default is to perform token object garbage collection.\r
	You may use parser->noGarbageCollectTokens() to turn off\r
	garbage collection.\r
\r
\r
o	The type ANTLRTokenPtr is always defined now (automatically).\r
	If you do not wish to use smart pointers, you will have to\r
	redefine ANTLRTokenPtr by subclassing, changing the header
	file or changing ANTLR's code generation (easy enough to\r
	do in gen.c).\r
\r
o	If you don't use ParserBlackBox, the new initialization sequence is:\r
\r
	    ANTLRTokenPtr aToken = new ANTLRToken;\r
	    scan.setToken(mytoken(aToken));\r
\r
	where mytoken(aToken) gets an ANTLRToken * from the smart pointer.\r
\r
o	Define C++ preprocessor symbol DBG_REFCOUNTTOKEN to see a bunch of\r
	debugging stuff for reference counting if you suspect something.\r
\r
\r
3	WHY DO I HAVE TO TYPECAST ALL MY TOKEN POINTERS NOW??????\r
\r
If you subclass ANTLRCommonToken and then attempt to refer to one of\r
your token members via a token pointer in your grammar actions, the\r
C++ compiler will complain that your token object does not have that\r
member.  For example, if you used to do this\r
\r
<<\r
class ANTLRToken : public ANTLRCommonToken {\r
        int muck;\r
	...\r
};\r
>>\r
\r
class Foo {\r
a : t:ID << t->muck = ...; >> ;\r
}\r
\r
Now, you must change the t->muck reference to:
\r
a : t:ID << mytoken(t)->muck = ...; >> ;\r
\r
in order to downcast 't' to be an "ANTLRToken *" not the\r
"ANTLRAbstractToken *" resulting from ANTLRTokenPtr::operator->().\r
The macro is defined as:\r
\r
/*\r
 * Since you cannot redefine operator->() to return one of the user's\r
 * token object types, we must down cast.  This is a drag.  Here's\r
 * a macro that helps.  template: "mytoken(a-smart-ptr)->myfield".\r
 */\r
#define mytoken(tp) ((ANTLRToken *)(tp.operator->()))\r
\r
You have to use macro mytoken(grammar-label) now because smart\r
pointers are not specific to a parser's token objects.  In other\r
words, the ANTLRTokenPtr class has a pointer to a generic\r
ANTLRAbstractToken not your ANTLRToken; the ANTLR support code must\r
use smart pointers too, but be able to work with any kind of\r
ANTLRToken.  Sorry about this, but it's C++'s fault not mine.  Some\r
nebulous future version of the C++ compilers should obviate the need
to downcast smart pointers with runtime type checking (and by allowing
different return types for overridden functions).
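
The situation can be reproduced outside of ANTLR with a few lines of
C++.  Everything in this sketch is hypothetical (AbstractToken, MyToken,
TokenPtr, and the mytoken_sketch() macro); it only shows why
operator->() cannot reach a member of the subclass without a downcast.

```cpp
#include <cassert>

struct AbstractToken { virtual ~AbstractToken() {} };

// User token subclass with an extra member, like 'muck' above.
struct MyToken : AbstractToken {
    int muck;
    MyToken() : muck(0) {}
};

// Non-owning pointer wrapper: operator->() only knows the abstract type.
class TokenPtr {
    AbstractToken *p;
public:
    explicit TokenPtr(AbstractToken *q) : p(q) {}
    AbstractToken *operator->() const { return p; }
};

// Same shape as the mytoken() macro: downcast the result of operator->().
#define mytoken_sketch(tp) (static_cast<MyToken *>((tp).operator->()))

// Stores a value through the downcast pointer and reads it back.
int setAndReadMuck(int value)
{
    MyToken tok;
    TokenPtr t(&tok);
    // t->muck would not compile: AbstractToken has no member 'muck'.
    mytoken_sketch(t)->muck = value;
    return tok.muck;
}
```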
\r
A way to have backward compatible code is to shut off the token object\r
garbage collection; i.e., use parser->noGarbageCollectTokens() and\r
change the definition of ANTLRTokenPtr (that's why you get source code\r
<wink>).\r
\r
\r
PARSER EXCEPTION HANDLING\r
\r
I've noticed some weird stuff with the exception handling.  I intend\r
to give this top priority for the "book release" of ANTLR.\r
\r
==========\r
1.32 Full Release\r
\r
o	Changed Token class hierarchy to be (Thanks to Tom Moog):\r
\r
        ANTLRAbstractToken\r
          ANTLRRefCountToken\r
             ANTLRCommonToken\r
          ANTLRNoRefCountCommonToken\r
\r
o	Added virtual panic() to ANTLRAbstractToken.  Made ANTLRParser::panic()\r
	virtual also.\r
\r
o	Cleaned up the dup() stuff in AST hierarchy to use shallowCopy() to\r
	make node copies.  John Farr at Medtronic suggested this.  I.e.,\r
	if you want to use dup() with either ANTLR or SORCERER or -transform\r
	mode with SORCERER, you must define shallowCopy() as:
\r
	virtual PCCTS_AST *shallowCopy()\r
	{\r
	    AST *p = new AST;	// needs a default constructor
	    *p = *this;		// copy this node's own fields
	    p->setDown(NULL);\r
	    p->setRight(NULL);\r
	    return p;\r
	}\r
\r
	or\r
\r
	virtual PCCTS_AST *shallowCopy()\r
	{\r
	    return new AST(*this);\r
	}\r
	\r
	if you have defined a copy constructor such as\r
\r
	AST(const AST &t)	// shallow copy constructor\r
	{\r
		token = t.token;\r
		iconst = t.iconst;\r
		setDown(NULL);\r
		setRight(NULL);\r
	}\r
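
	To show how a dup()-style deep copy can be built on top of
	shallowCopy(), here is a self-contained sketch.  It does not use
	the PCCTS classes: Node, dupTree() and freeTree() are hypothetical,
	and the real dup() may treat siblings of the root differently.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical tree node with child ('down') and sibling ('right') links.
struct Node {
    int value;
    Node *down;
    Node *right;
    explicit Node(int v) : value(v), down(NULL), right(NULL) {}

    // Copies this node only; the copy starts with no child or sibling.
    virtual Node *shallowCopy() const { return new Node(value); }
    virtual ~Node() {}
};

// Deep copy built from shallowCopy(): duplicate the node, then its
// children and siblings.
Node *dupTree(const Node *t)
{
    if (t == NULL) return NULL;
    Node *copy = t->shallowCopy();
    copy->down = dupTree(t->down);
    copy->right = dupTree(t->right);
    return copy;
}

// Frees a tree built from Node objects.
void freeTree(Node *t)
{
    if (t == NULL) return;
    freeTree(t->down);
    freeTree(t->right);
    delete t;
}
```

	Because shallowCopy() is virtual, a subclass only has to override
	that one function for the deep copy to produce nodes of its own type.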
\r
o	Added a warning when -CC and -gk are used together.  This is broken,
	hence a warning is appropriate.\r
\r
o	Added warning when #-stuff is used w/o -gt option.\r
\r
o	Updated MPW installation.\r
\r
o	"Miller, Philip W." <MILLERPW f1groups.fsd.jhuapl.edu> suggested\r
	that genmk use RENAME_OBJ_FLAG and RENAME_EXE_FLAG instead of
	hardcoding "-o" in genmk.c.\r
\r
o	Made all exit() calls use EXIT_SUCCESS or EXIT_FAILURE.
\r
===========================================================================\r
1.33\r
\r
EXIT_FAILURE and EXIT_SUCCESS were not always defined.  I had to modify\r
a bunch of files to use PCCTS_EXIT_XXX, which forces a new version.  Sorry\r
about that.\r
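
The usual way to cope with hosts that lack these macros is a guarded
fallback definition.  The sketch below shows the pattern; the
SKETCH_EXIT_* names and exitStatus() are hypothetical and are not the
names used in the PCCTS headers.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical names patterned on PCCTS_EXIT_XXX; fall back to 0 and 1
// when the host's <stdlib.h> does not provide the standard macros.
#ifdef EXIT_SUCCESS
#define SKETCH_EXIT_SUCCESS EXIT_SUCCESS
#else
#define SKETCH_EXIT_SUCCESS 0
#endif

#ifdef EXIT_FAILURE
#define SKETCH_EXIT_FAILURE EXIT_FAILURE
#else
#define SKETCH_EXIT_FAILURE 1
#endif

// Maps a success flag to a process exit status.
int exitStatus(bool ok)
{
    return ok ? SKETCH_EXIT_SUCCESS : SKETCH_EXIT_FAILURE;
}
```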
\r
522	\r