Documentation ¶
Overview ¶
Package lexer is a subpackage of ptk that contains various implementations of ILexer, the interface describing lexers, along with related support types and code such as Token. A lexer is an object that can be queried for the next token in a token stream; generally, a lexer takes a character stream, such as that generated by the scanners in the scanner subpackage, and groups the characters into words with semantic meaning. Each query returns a Token, which contains the token type, its location (to be used for error reporting), and any meaning associated with the token, such as the numerical value of a numeric literal.
Index ¶
Constants ¶
const ChanLexerSize = 20
ChanLexerSize is the size of the input channel.
const TrackAll = -1
TrackAll is a special value for the max argument to NewBackTracker and BackTracker.SetMax that indicates that all characters should be tracked.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type BackTracker ¶ added in v0.4.0
type BackTracker struct {
    Src scanner.Scanner // The source scanner
    // contains filtered or unexported fields
}
BackTracker is an implementation of scanner.Scanner that includes backtracking capability. A BackTracker wraps another scanner.Scanner (including another instance of BackTracker), but provides additional methods for controlling backtracking.
func NewBackTracker ¶
func NewBackTracker(src scanner.Scanner, max int) *BackTracker
NewBackTracker wraps another scanner (which may also be a BackTracker, if desired) in a BackTracker. The max parameter indicates the maximum number of characters to track; use 0 to track no characters, and TrackAll to track all characters.
func (*BackTracker) Accept ¶ added in v0.4.0
func (bt *BackTracker) Accept(leave int)
Accept accepts characters from the backtracking queue, leaving only the specified number of characters on the queue.
func (*BackTracker) BackTrack ¶ added in v0.4.0
func (bt *BackTracker) BackTrack()
BackTrack resets to the beginning of the backtracking queue.
func (*BackTracker) Len ¶ added in v0.4.0
func (bt *BackTracker) Len() int
Len returns the number of characters saved so far on the backtracking queue.
func (*BackTracker) More ¶ added in v0.4.0
func (bt *BackTracker) More() bool
More is used to determine if there are any more characters available for Next to return, given the current state of the BackTracker.
func (*BackTracker) Next ¶ added in v0.5.0
func (bt *BackTracker) Next() (ch scanner.Char, err error)
Next returns the next character from the stream as a Char, which will include the character's location. If an error was encountered, that will also be returned.
func (*BackTracker) Pos ¶ added in v0.4.0
func (bt *BackTracker) Pos() int
Pos returns the position of the most recently returned character within the saved character list.
func (*BackTracker) SetMax ¶ added in v0.4.0
func (bt *BackTracker) SetMax(max int)
SetMax allows updating the maximum number of characters to allow backtracking over. Setting a TrackAll value will allow all newly returned characters to be backtracked over. If the new value for max is less than the previous value, characters at the front of the backtracking queue will be discarded to bring the size down to max.
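The methods above combine into a speculate-then-decide workflow: read characters with Next, rewind with BackTrack if the speculation fails, and discard consumed characters with Accept once they are no longer needed. The following is a minimal sketch of that workflow; the module import paths (github.com/klmitch/ptk/...) are an assumption, and the source scanner is left abstract.

package example

import (
    "fmt"

    "github.com/klmitch/ptk/lexer"
    "github.com/klmitch/ptk/scanner"
)

// speculate reads a few characters, rewinds, re-reads them, and then
// discards them from the backtracking queue.
func speculate(src scanner.Scanner) error {
    bt := lexer.NewBackTracker(src, lexer.TrackAll) // track every character

    // Speculatively read up to three characters; each is saved on the queue.
    n := 0
    for i := 0; i < 3 && bt.More(); i++ {
        ch, err := bt.Next()
        if err != nil {
            return err
        }
        fmt.Printf("read %v (queue length %d, position %d)\n", ch, bt.Len(), bt.Pos())
        n++
    }

    // Rewind: subsequent Next calls replay the saved characters.
    bt.BackTrack()

    // Re-read the replayed characters, this time for real.
    for i := 0; i < n; i++ {
        if _, err := bt.Next(); err != nil {
            return err
        }
    }

    // Accept everything consumed so far, leaving nothing on the queue.
    bt.Accept(0)
    return nil
}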
type BaseState ¶ added in v0.5.0
type BaseState struct {
    Cls Classifier // The classifier for the lexer
}
BaseState is a basic implementation of the State interface. It assumes a fixed Classifier for the lifetime of the lexer's operation.
func (*BaseState) Classifier ¶ added in v0.5.0
func (bs *BaseState) Classifier() Classifier
Classifier returns the classifier to use. For BaseState, this is always the Classifier stored in the Cls field.
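In practice a BaseState is simply constructed with the classifier it should hand back; a minimal sketch, assuming the imports from the BackTracker sketch above (cls stands in for any Classifier implementation, such as the one sketched under Classifier below):

    var cls lexer.Classifier // some Classifier implementation
    st := &lexer.BaseState{Cls: cls}
    _ = st.Classifier() // always returns the value stored in Cls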
type ChanLexer ¶ added in v0.5.0
type ChanLexer struct {
    Chan chan *Token // The input channel
}
ChanLexer is a trivial implementation of ILexer that uses a channel to retrieve tokens. It implements an extra Push method that allows pushing tokens onto the lexer, as well as a Done method to signal the lexer that all tokens have been pushed.
func (*ChanLexer) Done ¶ added in v0.5.0
func (q *ChanLexer) Done()
Done indicates to the lexer that there will be no more tokens pushed onto the queue.
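A sketch of the producer side: push tokens from a goroutine, then call Done. The direct construction from the exported Chan field, the use of ChanLexerSize as the buffer size, and Push's exact signature are assumptions based on the description above; the package may provide a constructor not shown in this section. Consumption then uses Next as with any ILexer (see the loop sketched under ILexer below).

    lx := &lexer.ChanLexer{
        Chan: make(chan *lexer.Token, lexer.ChanLexerSize),
    }

    // Producer: push tokens, then signal that no more are coming.
    go func() {
        for _, tok := range toks { // toks is some []*lexer.Token
            lx.Push(tok) // Push is assumed to accept a *Token
        }
        lx.Done()
    }()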
type Classifier ¶
type Classifier interface {
    // Classify takes a lexer, a state, and a backtracking scanner
    // and determines one or more recognizers to extract a token
    // or a set of tokens from the lexer input.
    Classify(lexer *Lexer) []Recognizer

    // Error is called by the lexer if all recognizers returned by
    // Classify return without success.
    Error(lexer *Lexer)
}
Classifier represents a character classification tool. A classifier has a Classify method that is passed the lexer (through which the state and the backtracking scanner are available) and returns a list of recognizers, which the lexer then runs in order until one of them succeeds.
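A minimal sketch of a Classifier implementation, assuming the imports from the BackTracker sketch above; the recognizer fields are hypothetical and would be populated with Recognizer implementations such as the one sketched under Recognizer below.

// myClassifier chooses which recognizers the lexer should try next.
type myClassifier struct {
    ws    lexer.Recognizer // hypothetical whitespace recognizer
    punct lexer.Recognizer // hypothetical punctuation recognizer
}

// Classify returns the recognizers to try, in order, for the next token.
// A real classifier would typically peek at the next character through
// l.Scanner to narrow this list.
func (c *myClassifier) Classify(l *lexer.Lexer) []lexer.Recognizer {
    return []lexer.Recognizer{c.ws, c.punct}
}

// Error is called when every recognizer returned by Classify fails; the
// lexer then discards the offending character.
func (c *myClassifier) Error(l *lexer.Lexer) {
    // Report or record the bad character here.
}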
type IBackTracker ¶ added in v0.5.0
type IBackTracker interface {
    scanner.Scanner

    // More is used to determine if there are any more characters
    // available for Next to return, given the current state of
    // the BackTracker.
    More() bool

    // SetMax allows updating the maximum number of characters to
    // allow backtracking over. Setting a TrackAll value will
    // allow all newly returned characters to be backtracked over.
    // If the new value for max is less than the previous value,
    // characters at the front of the backtracking queue will be
    // discarded to bring the size down to max.
    SetMax(max int)

    // Accept accepts characters from the backtracking queue,
    // leaving only the specified number of characters on the
    // queue.
    Accept(leave int)

    // Len returns the number of characters saved so far on the
    // backtracking queue.
    Len() int

    // Pos returns the position of the most recently returned
    // character within the saved character list.
    Pos() int

    // BackTrack resets to the beginning of the backtracking
    // queue.
    BackTrack()
}
IBackTracker is an interface for a backtracker, a scanner.Scanner that also provides the ability to back up to an earlier character in the stream.
type ILexer ¶ added in v0.5.0
type ILexer interface {
    // Next returns the next token. At the end of the lexer, a
    // nil should be returned.
    Next() *Token
}
ILexer presents a stream of tokens. The basic lexer does not provide token pushback.
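The nil-at-end contract leads to the usual consumption loop; lx here stands in for any ILexer value, and the imports are the ones assumed in the sketches above:

    for tok := lx.Next(); tok != nil; tok = lx.Next() {
        fmt.Printf("%s at %v\n", tok.Type, tok.Loc)
    }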
func NewAsyncLexer ¶ added in v0.5.0
NewAsyncLexer wraps another lexer, using a ChanLexer so that the wrapped lexer can be run in a separate goroutine.
type Lexer ¶
type Lexer struct {
    Scanner IBackTracker // The character source, wrapped in a BackTracker
    State   State        // The state of the lexer
    // contains filtered or unexported fields
}
Lexer is an implementation of ILexer.
type ListLexer ¶ added in v0.5.0
type ListLexer struct {
    // contains filtered or unexported fields
}
ListLexer is an implementation of ILexer that is initialized with a list of tokens and simply returns those tokens in sequence.
func NewListLexer ¶ added in v0.5.0
NewListLexer returns a lexer that retrieves its tokens from the list passed to the function. This actually uses a ChanLexer under the covers.
type Recognizer ¶
type Recognizer interface {
    // Recognize is called by the lexer on the objects returned by
    // the Classifier. Each will be called in turn until one of
    // the methods returns a boolean true value. If no recognizer
    // returns true, or if the Classifier returns an empty list,
    // then the Error recognizer will be called, if one is
    // declared, after which the character will be discarded. The
    // Recognize method will be called with the lexer, the state,
    // and a backtracking scanner.
    Recognize(lexer *Lexer) bool
}
Recognizer describes a recognizer. A recognizer is an object returned by the Classify method of a Classifier; its Recognize method is passed the lexer (which provides access to the state and the backtracking scanner), and it should read input from the backtracker until it has a complete lexeme (think "word" in your grammar). Assuming that lexeme is a valid token (a comment or a run of whitespace would not be), the Recognize method should then use Lexer.Push to push one or more tokens.
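A sketch of a simple Recognizer, assuming the imports from the sketches above plus unicode. The Rune and Loc fields on scanner.Char and the exact signature of Lexer.Push are assumptions; only the calls documented in this section are otherwise used.

// punctRecognizer recognizes a single punctuation character and pushes a
// one-character token whose type is the character itself.
type punctRecognizer struct{}

func (punctRecognizer) Recognize(l *lexer.Lexer) bool {
    if !l.Scanner.More() {
        return false
    }
    ch, err := l.Scanner.Next()
    if err != nil || !unicode.IsPunct(ch.Rune) { // Rune field assumed
        l.Scanner.BackTrack() // rewind so another recognizer can try
        return false
    }
    l.Scanner.Accept(0) // consume the character for good

    // Push the resulting token; Push's signature is assumed here.
    l.Push(&lexer.Token{
        Type: string(ch.Rune),
        Loc:  ch.Loc, // Loc field assumed
        Text: string(ch.Rune),
    })
    return true
}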
type State ¶
type State interface {
    // Classifier must return the classifier to use. It is safe
    // for the application to return different Classifier
    // implementations depending on the lexer state.
    Classifier() Classifier
}
State represents the state of the lexer. This is an interface; an implementation must be provided by the user. A base implementation is available as BaseState.
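A sketch of a State that swaps classifiers as the lexer moves in and out of a string-literal mode; BaseState covers the simpler fixed-classifier case. The mode flag would be flipped by the application's own recognizers.

// modalState returns a different classifier inside string literals.
type modalState struct {
    normal   lexer.Classifier // classifier for ordinary input
    strLit   lexer.Classifier // classifier used inside string literals
    inString bool             // flipped by the application's recognizers
}

func (s *modalState) Classifier() lexer.Classifier {
    if s.inString {
        return s.strLit
    }
    return s.normal
}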
type Token ¶ added in v0.5.0
type Token struct {
    Type  string           // The type of token
    Loc   scanner.Location // The location of the token
    Value interface{}      // The semantic value of the token; optional
    Text  string           // The original text of the token; optional
}
Token represents a single token emitted by the lexical analyzer. A token has a type, a location, and optionally the original text and a semantic value.
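A sketch of filling in a Token for an integer literal; loc stands in for a scanner.Location taken from the characters that made up the lexeme.

    tok := &lexer.Token{
        Type:  "INT",
        Loc:   loc,  // where the literal started
        Value: 42,   // semantic value: the parsed integer
        Text:  "42", // original text of the lexeme
    }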