LY_BEGIN(lex, klex, lexer, MyLangLex, Lexer,l) A _IT(lexer specification) is a file with the _TT(.l) suffix which specifies regular expressions to be converted to tokens. The syntax differs from that of a UNIX _TT(.l) file in that each regular expression is associated with an _IT(expression method), which must be given a name:
_A(defmeth)_H(default method construction)
The longest suffix of the method name matching a token name causes that token type to be returned by default. For example, if there is a token named _TT(STRING), then any methods named _TT(bare_STRING) or _TT(quoted_STRING) will by default be assigned a procedure which returns a new token of type _TT(STRING).
The default method named _TT(char) returns a constant token whose value equals the character code of the first matched character. If a method name is not _TT(char) and does not have a suffix matching a token name, the default method returns _TT(NIL) (instructing the lexer to skip the token) and a warning is printed. The warning is not printed, however, if the method is named _TT(skip); in that case skipping is assumed to be the desired default behavior.
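The default-method rules above can be modeled in a few lines. The following Python sketch is illustrative only and is not how klex is implemented: the function name `default_action` and the result tuple are hypothetical, and checking for the name `char` before looking for a token suffix is an assumption, since the text does not say which rule wins when both apply.

```python
from typing import Iterable, Optional, Tuple


def default_action(method_name: str,
                   token_names: Iterable[str]) -> Tuple[str, Optional[str], bool]:
    """Model the default chosen for an expression method.

    Returns (kind, token_type, warn), a hypothetical encoding where kind is
    "char_code", "token" or "skip", token_type is set only for "token",
    and warn says whether a warning would be printed.
    """
    if method_name == "char":  # assumption: the char rule is checked first
        return ("char_code", None, False)

    # Collect every token name that is a suffix of the method name.
    matches = [t for t in token_names if t and method_name.endswith(t)]
    if matches:
        longest = max(matches, key=len)  # the longest matching suffix wins
        return ("token", longest, False)

    # No suffix matched: the token is skipped (NIL in the text above).
    # The warning is suppressed only for a method named skip.
    return ("skip", None, method_name != "skip")
```

With tokens `STRING` and `ERROR`, the method names `bare_STRING` and `quoted_STRING` both resolve to `STRING`, `skip` is skipped silently, and an unrelated name such as `comment` is skipped with a warning.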
In addition it is customary to define a token named _TT(ERROR), which does not ordinarily match any grammar rules. Thus a lexer specification will typically end with the following 3 lines:
char {%char}
skip [ \t]*
ERROR [^]
The behavior of any default method can be changed by overriding the method, for example using _EXT.
_H(regular expressions)
_BF(klex) supports the following subset of _LN(#bib,flex) regular expressions, in order of decreasing precedence:
#define _RS _TT(r)'s
#define _R _TT(r)
The type _TT(T) representing a lexer is declared as an opaque subtype of the _LN(ktok.html#RdLexer,_TT(RdLexer)) instantiated in the token interface. Hence the following uses are possible:
_TT(myLexer := NEW(MyLangLex.T).setRd(rd);) |
Initialize the new lexer _TT(myLexer) using the reader _TT(rd: Rd.T). |
  |
_TT(start := myParser.parse(NEW(MyLangLex.T).fromText(text));) |
Parse _TT(text: TEXT), using a new lexer and _TT(myParser). The interface which was used to initialize _TT(myParser) must be _LN(ktok.html#compat,compatible) with _TT(MyLangLex.i3). |
There is no method named _TT(init); this leaves extended lexers free to define their own initialization parameters.
_A(bib)_H(see also)
M. E. Lesk and E. Schmidt, _IT(LEX - Lexical Analyzer Generator)
Vern Paxson et al., _IT(flex - fast lexical analyzer generator)
A. Aho, R. Sethi and J. Ullman, Compilers: Principles, Techniques and Tools
PL_END
$Id: klex.html,v 1.2 2001-09-19 15:31:35 wagner Exp $
HTML_END