Documentation
Overview
Package lexer defines a lexical analyzer.
Index
Constants
const (
	// ErrorTokenType is the type for fake tokens capturing broken lexemes (e.g. incorrect string literals).
	// The purpose of these tokens is to generate more informative error messages.
	// Lexer will never return a token of this type; an error with a message containing the token text is returned instead.
	ErrorTokenType = LowestTokenType - 1

	// ErrorTokenName is the type name for ErrorTokenType.
	ErrorTokenName = "-error-"
)
Error codes used by lexer:

const (
	// WrongCharError indicates that lexer cannot fetch any token at current position.
	// Error message contains the rune at current source position.
	WrongCharError = llx.LexicalErrors + iota
	// BadTokenError indicates that lexer has fetched a token of ErrorTokenType.
	BadTokenError
)
const (
	// EofTokenType is a fake token type indicating the end of a source file.
	// Line and column (if present) mark the position right after the last rune of the source file.
	EofTokenType = -2
	// EofTokenName is the type name for EofTokenType.
	EofTokenName = "-end-of-file-"

	// EoiTokenType is a fake token type indicating the absence of queued sources (i.e. all sources are processed).
	EoiTokenType = -3
	// EoiTokenName is the type name for EoiTokenType.
	EoiTokenName = "-end-of-input-"

	LowestTokenType = -3
)
Variables
This section is empty.
Functions
This section is empty.
Types
type Lexer
type Lexer struct {
// contains filtered or unexported fields
}
Lexer performs lexical analysis of the current source in a source.Queue using a regexp.Regexp. Lexer itself is immutable, stateless, and safe for concurrent use (i.e. the same Lexer instance may be used with different queues by different goroutines), but it does affect queue state. Each token type that may be returned by the lexer maps to its own regexp capturing group index. A match containing no captured groups is treated as an insignificant lexeme (e.g. whitespace); in this case the lexer tries to fetch a token again at the new position. Every byte of the source file must belong to some lexeme.
func New
New creates a new Lexer. The n-th element of types describes the token type for the (n+1)-th regexp capturing group. A group that has no description is treated as ErrorTokenType.
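The group-to-type mapping described above can be illustrated with a self-contained sketch that uses only the standard regexp package. The token-type values, the pattern, and the lex helper below are hypothetical illustrations, not part of this package's API:

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical token-type values, used only for this illustration.
const (
	numberType = 1
	identType  = 2
)

// lex scans src with a pattern whose capturing group n+1 corresponds to
// types[n], mimicking the mapping described for New. A match that captures
// no group (here: whitespace) is an insignificant lexeme and is skipped.
func lex(src string) []string {
	re := regexp.MustCompile(`^(?:(\d+)|([A-Za-z]\w*)|\s+)`)
	types := []int{numberType, identType}
	var out []string
	pos := 0
	for pos < len(src) {
		m := re.FindStringSubmatchIndex(src[pos:])
		if m == nil {
			out = append(out, "error") // no lexeme starts here: lexical error
			break
		}
		for g := 1; 2*g < len(m); g++ {
			if m[2*g] >= 0 { // group g participated in the match
				text := src[pos+m[2*g] : pos+m[2*g+1]]
				out = append(out, fmt.Sprintf("%d:%s", types[g-1], text))
				break
			}
		}
		pos += m[1] // advance past the whole match, significant or not
	}
	return out
}

func main() {
	fmt.Println(lex("foo 42 bar")) // [2:foo 1:42 2:bar]
}
```

Note that the whitespace alternative deliberately has no capturing group, so runs of spaces advance the position without producing a token, matching the "insignificant lexeme" behavior described for Lexer.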
func (*Lexer) Next
Next fetches the token starting at the current source position and advances the current position. Returns a nil token and an llx.Error, making no changes, if there is a lexical error. Returns an EoI token if the queue is empty. Returns an EoF token and discards the current source if the current position is beyond the end of the current source.
func (*Lexer) Shrink
Shrink tries to fetch a token which starts at the same position as the given one and is at least one byte shorter. On success it adjusts the current position and returns the shrunk token. It makes no changes and returns nil if the given token has no captured source and position information, was fetched from a source other than the current one, or a lexical error occurs.
type Token
type Token struct {
// contains filtered or unexported fields
}
Token represents a lexeme, either fetched from a source file or an "external" one. It contains the token type, text, source, and starting position (if known). Immutable.
func (*Token) Col
Col returns the 1-based column number of the first byte of the token. Returns 0 if the source is not known.
func (*Token) Line
Line returns the 1-based line number of the first byte of the token. Returns 0 if the source is not known.
func (*Token) SourceName
SourceName returns the source file name. Returns an empty string if the source is not known.