PL_BEGIN(kext: extending tokens, lexers, and parsers) Using _BF(kext), generated _TIs, _LIs, and _YIs are extended to produce new interfaces with extended _TT(ParseType)s. Additionally, user code can extend lexer expression methods and parser reduction methods. Declarations and code can also be added to the body of the new interface and implementation.
_A(input)_H(extfile) An _IT(extfile) is a file having the _TT(.e) suffix, and which begins with the following two lines
%source MyLang.t [MyLang.l] [MyLang.y] %import MyLangTok [MyLangLex] [MyLangParse]where _TT([]) indicates optional arguments. The _TT(%source) line names the specifications of the interfaces to be extended, and the _TT(%import) line gives the names of those interfaces. The output interface is always a _TI, and if a _LI or _YI was _TT(%import)ed as a second argument, then the output interface will also be the same kind of interface. This allows tokens to be extended either alone in a _TI, or if desired in a _LI or _YI.
As with any lexer or parser, an extended lexer or parser is _LN(ktok.html#compat, compatible) with another lexer or parser if it imports the same _TI, even if the _TI is itself a _LI or _YI.
_H(extending _TT(ParseType)s) The following line in an extfile:
parseType: {val: INTEGER; t: TEXT}means _TT(parseType) is extended with the fields _TT(val) and _TT(t) added to the object type. If _TT(parseType) is not a _TT(Token) then the fields can have initializations (current implementation restriction). _H(extending lexers) The following lines in an extfile:
exprName {RETURN NEW(STRING, val := 15, t := $)} exprName {$R STRING{$$ := 15; $$.t := $}}both mean that the expression method named _TT(exprName) is overridden to return a new _TT(STRING) token whose _TT(val) field is initialized to _TT(15), and whose _TT(t) field is initialized to the _TT(TEXT) of the matched token. The second version might not call _TT(NEW) if previously allocated tokens of type _TT(STRING) have been _TT(discard())ed but not _TT(detach())ed (see below).
As in other _BF(lex)es, the type of the result can depend upon the matched text, so long as the result is a _TT(Token). This means, for example, that the returned type need not be the same as in the default method. _H(extending parsers) The following line:
ruleName {$$ := $1 + $2}assumes that the last line of the form _TT(parseType: {...}) refers to the return type of a rule declared as _TT(ruleName). The meaning is that the reduction method (named _TT(ruleName_returnType)) is overridden. _TT("$1") and _TT("$2") refer respectively to the first and second _LN(ktok.html#const,nonconstant) symbols appearing in the reduction method _LN(kyacc.html#reducmeth,declaration). It is assumed that the types of both of these symbols and the return symbol have a field named _TT(val), which in this case must be of type _TT(INTEGER). The line could have been written equivalently as
ruleName {$$.val := $1.val + $2.val}To refer to the _TT(ParseType) itself, _TT("$x") is written _TT("$x.detach()"), and _TT("$$") is written _TT("result"). For example, consider a rule which reduces its only nonconstant symbol to its own type (imagine removing balanced parentheses). The following two lines in an extfile
ruleName {$$ := $1} ruleName {result := $1.detach()}do not quite mean the same thing. Even if the only field in _TT(returnType) is _TT(val), the generated interface may be further extended, so that in the second version copying the _TT(ParseType) copies more fields than just _TT(val). Additionally, if _TT(ruleName) is extended then the extended method will see _TT(result = $1.detach()).
Detaching _TT(ParseType)s gives a simple way to construct parse trees whose nodes are the _TT(ParseType)s themselves. For an example, see _LN(calc.html#CPT,_TT(CalcParseTree.e)).
_H(extra interface and module code) Sometimes it is necessary to add code to the generated interface and module, for example to _TT(IMPORT) interfaces defining types referenced in the extended types and methods. Also it is common to add _TT(init) methods to parsers, etc. The following is the form of such code:
%place { (* your Modula-3 code here *) }For a _TI, _TT(%place) can be either _TT(%interface) or _TT(%module). For a _LI or _YI, _TT(%place) can also be _TT(%public), _TT(%private), or _TT(%overrides). The code is inserted into the generated interface as follows:
INTERFACE MyLangThingSpecial; IMPORT ...; %interface TYPE T <: Public Public = MyLangThing.T OBJECT %public END; ...and code is inserted into the generated implementation as follows:
MODULE MyLangThingSpecial; IMPORT ...; %module REVEAL T = Public BRANDED OBJECT %private ... OVERRIDES %overrides ... END; ...PL_END $Id: kext.html,v 1.2 2001-09-19 15:31:35 wagner Exp $ HTML_END