Documentation ¶
Overview ¶
Package lexer defines interfaces and implementations used by Participle to perform lexing.
The primary interfaces are Definition and Lexer. There are three implementations of these interfaces:
TextScannerLexer is based on text/scanner. This is the fastest, but least flexible, in that tokens are restricted to those supported by that package. It can scan about 5M tokens/second on a late 2013 15" MacBook Pro.
The second lexer is constructed via the Regexp() function, mapping regexp capture groups to tokens. The complete input source is read into memory, so it is unsuitable for large inputs.
The final lexer provided accepts a lexical grammar in EBNF. Each capitalised production is a lexical token supported by the resulting Lexer. This is very flexible, but a bit slower, scanning around 730K tokens/second on the same machine, though it is currently completely unoptimised. This could/should be converted to a table-based lexer.
Lexer implementations must use Panic/Panicf to report errors.
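As a quick orientation, here is a minimal sketch (not taken from the package itself) that builds a Regexp-based Definition, lexes a string, and prints each token. It assumes the import path github.com/alecthomas/participle/lexer and that ConsumeAll returns ([]Token, error), as described below:

    package main

    import (
        "fmt"
        "strings"

        "github.com/alecthomas/participle/lexer"
    )

    func main() {
        // Build a Definition from named regexp capture groups (see Regexp below).
        def := lexer.Must(lexer.Regexp(`(?P<Ident>[a-z]+)|(\s+)|(?P<Number>\d+)`))

        // Definition.Lex accepts any io.Reader.
        lex, err := def.Lex(strings.NewReader("hello 123 world"))
        if err != nil {
            panic(err)
        }

        // Drain the lexer into a slice of tokens (see ConsumeAll below).
        tokens, err := lexer.ConsumeAll(lex)
        if err != nil {
            panic(err)
        }
        for _, tok := range tokens {
            fmt.Printf("%d %q\n", tok.Type, tok.Value)
        }
    }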
Index ¶
- Constants
- func FormatError(pos Position, message string) string
- func MakeSymbolTable(def Definition, types ...string) (map[rune]bool, error)
- func NameOfReader(r interface{}) string
- func SymbolsByRune(def Definition) map[rune]string
- type Definition
- type Error
- type Lexer
- type PeekingLexer
- type Position
- type Token
Constants ¶
const (
	// EOF represents an end of file.
	EOF rune = -(iota + 1)
)
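For example, a token-consuming loop can compare Token.Type against the EOF pseudo-rune to detect end of input. A minimal sketch, assuming p is a *PeekingLexer (see below) and process is a hypothetical handler:

    for {
        tok, err := p.Next()
        if err != nil {
            return err
        }
        if tok.Type == lexer.EOF {
            break // end of input reached
        }
        process(tok) // hypothetical: handle one token
    }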
Variables ¶
This section is empty.
Functions ¶
func FormatError ¶ added in v0.4.0
func FormatError(pos Position, message string) string

FormatError formats an error in the form "[<filename>:][<line>:<pos>:] <message>".
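A minimal sketch, assuming tok is a Token (see below) carrying its source Position:

    msg := lexer.FormatError(tok.Pos, "unexpected token")
    fmt.Println(msg) // e.g. "main.go:3:7: unexpected token"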
func MakeSymbolTable ¶
func MakeSymbolTable(def Definition, types ...string) (map[rune]bool, error)
MakeSymbolTable builds a lookup table for checking token ID existence.
For each symbolic name in "types", the returned map will contain the corresponding token ID as a key.
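A minimal sketch, assuming def is a Definition whose Symbols() include "Ident" and "Number", and tok is a Token produced by its Lexer:

    operands, err := lexer.MakeSymbolTable(def, "Ident", "Number")
    if err != nil {
        panic(err)
    }
    if operands[tok.Type] {
        // tok is an Ident or a Number.
    }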
func NameOfReader ¶
func NameOfReader(r interface{}) string
NameOfReader attempts to retrieve the filename of a reader.
func SymbolsByRune ¶
func SymbolsByRune(def Definition) map[rune]string
SymbolsByRune returns a map of lexer symbol names keyed by rune.
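This is handy for printing human-readable token names in error messages. A minimal sketch, assuming def and tok as above:

    names := lexer.SymbolsByRune(def)
    return fmt.Errorf("unexpected %s %q", names[tok.Type], tok.Value)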
Types ¶
type Definition ¶
type Definition interface {
	// Lex an io.Reader.
	Lex(io.Reader) (Lexer, error)
	// Symbols returns a map of symbolic names to the corresponding pseudo-runes for those symbols.
	// This is the same approach as used by text/scanner. For example, "EOF" might have the rune
	// value of -1, "Ident" might be -2, and so on.
	Symbols() map[string]rune
}
Definition provides the parser with metadata for a lexer.
var (
	TextScannerLexer Definition = &defaultDefinition{}

	// DefaultDefinition defines properties for the default lexer.
	DefaultDefinition = TextScannerLexer
)
TextScannerLexer is a lexer that uses the text/scanner module.
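To illustrate the interface, here is a minimal sketch of a custom Definition that splits its input into whitespace-separated "Word" tokens. wordDefinition and wordLexer are hypothetical names, and the sketch assumes a Lexer is satisfied by a Next() (Token, error) method, matching PeekingLexer.Next below (imports: "io", "strings"):

    type wordDefinition struct{}

    func (wordDefinition) Symbols() map[string]rune {
        return map[string]rune{"EOF": lexer.EOF, "Word": -2}
    }

    func (wordDefinition) Lex(r io.Reader) (lexer.Lexer, error) {
        data, err := io.ReadAll(r)
        if err != nil {
            return nil, err
        }
        return &wordLexer{words: strings.Fields(string(data))}, nil
    }

    type wordLexer struct{ words []string }

    func (l *wordLexer) Next() (lexer.Token, error) {
        if len(l.words) == 0 {
            return lexer.Token{Type: lexer.EOF}, nil
        }
        w := l.words[0]
        l.words = l.words[1:]
        // Pos is left zero here; a real implementation should track Position.
        return lexer.Token{Type: -2, Value: w}, nil
    }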
func Must ¶
func Must(def Definition, err error) Definition
Must takes the result of a Definition constructor call and returns the definition, but panics if the constructor returned an error.

e.g.
lex = lexer.Must(lexer.Build(`Symbol = "symbol" .`))
func Regexp ¶
func Regexp(pattern string) (Definition, error)
Regexp creates a lexer definition from a regular expression.
Each named sub-expression in the regular expression matches a token. Anonymous sub-expressions will be matched and discarded.
e.g.
def, err := Regexp(`(?P<Ident>[a-z]+)|(\s+)|(?P<Number>\d+)`)
type Error ¶
Error represents an error while parsing.
func ErrorWithTokenf ¶ added in v0.4.2
ErrorWithTokenf creates a new Error with the given token as context.
type Lexer ¶
A Lexer returns tokens from a source.
func Lex ¶
Lex an io.Reader with text/scanner.Scanner.
This provides very fast lexing of source code compatible with Go tokens.
Note that this differs from text/scanner.Scanner in that string tokens will be unquoted.
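A minimal sketch using the TextScannerLexer Definition described above, together with ConsumeAll:

    l, err := lexer.TextScannerLexer.Lex(strings.NewReader(`ident "a string" 42`))
    if err != nil {
        panic(err)
    }
    tokens, err := lexer.ConsumeAll(l)
    if err != nil {
        panic(err)
    }
    // The second token's Value is the already-unquoted `a string`.
    fmt.Println(tokens[1].Value)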
type PeekingLexer ¶
type PeekingLexer struct {
// contains filtered or unexported fields
}
PeekingLexer supports arbitrary lookahead as well as cloning.
func Upgrade ¶
func Upgrade(lex Lexer) (*PeekingLexer, error)
Upgrade a Lexer to a PeekingLexer with arbitrary lookahead.
func (*PeekingLexer) Clone ¶ added in v0.4.0
func (p *PeekingLexer) Clone() *PeekingLexer
Clone creates a clone of this PeekingLexer at its current token.
The parent and clone are completely independent.
func (*PeekingLexer) Cursor ¶ added in v0.4.0
func (p *PeekingLexer) Cursor() int
Cursor returns the current cursor position, in tokens.
func (*PeekingLexer) Length ¶ added in v0.5.0
func (p *PeekingLexer) Length() int
Length returns the number of tokens consumed by the lexer.
func (*PeekingLexer) Next ¶ added in v0.4.0
func (p *PeekingLexer) Next() (Token, error)
Next consumes and returns the next token.
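Together these methods support speculative parsing. A minimal sketch of backtracking, assuming l is an existing Lexer and "(" is the token we hope to see:

    p, err := lexer.Upgrade(l)
    if err != nil {
        panic(err)
    }
    save := p.Clone() // independent snapshot at the current token
    tok, err := p.Next()
    if err != nil || tok.Value != "(" {
        p = save // backtrack: resume from the snapshot
    }
    fmt.Println(p.Cursor(), p.Length())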
type Token ¶
type Token struct {
	// Type of token. This is the value keyed by symbol as returned by Definition.Symbols().
	Type  rune
	Value string
	Pos   Position
}
A Token returned by a Lexer.
func ConsumeAll ¶
ConsumeAll reads all tokens from a Lexer.
Directories ¶
Path | Synopsis
---|---
ebnf | Package ebnf is an EBNF lexer for Participle.
ebnf/internal | Package internal is a library for EBNF grammars.
regex | Package regex provides a regex based lexer using a readable list of named patterns.
stateful | Package stateful defines a nested stateful lexer.