lexer

package
v0.0.0-...-9e081f2
Published: Mar 8, 2024 License: MIT Imports: 5 Imported by: 0

Documentation

Overview

Package lexer defines a lexical analyzer.

Index

Constants

View Source
const (
	// ErrorTokenType is the type for fake tokens capturing broken lexemes (e.g. incorrect string literals).
	// The purpose of these tokens is to generate more informative error messages.
	// Lexer will never return a token of this type, an error with message containing token text will be returned instead.
	ErrorTokenType = LowestTokenType - 1

	// ErrorTokenName is the type name for ErrorTokenType.
	ErrorTokenName = "-error-"
)
View Source
const (
	// WrongCharError indicates that lexer cannot fetch any token at current position.
	// Error message contains the rune at current source position.
	WrongCharError = llx.LexicalErrors + iota

	// BadTokenError indicates that lexer has fetched a token of ErrorTokenType.
	BadTokenError
)

Error codes used by the lexer.

View Source
const (
	// EofTokenType is a fake token indicating the end of source file.
	// Line and column (if present) mark the position right after the last rune of source file.
	EofTokenType = -2

	// EofTokenName is the type name for EofTokenType
	EofTokenName = "-end-of-file-"

	// EoiTokenType is a fake token indicating absence of queued sources (i.e. all sources are processed).
	EoiTokenType = -3

	// EoiTokenName is the type name for EoiTokenType
	EoiTokenName = "-end-of-input-"

	LowestTokenType = -3
)

Variables

This section is empty.

Functions

This section is empty.

Types

type Lexer

type Lexer struct {
	// contains filtered or unexported fields
}

Lexer performs lexical analysis of the current source in a source.Queue using a regexp.Regexp. Lexer itself is immutable, stateless, and safe for concurrent use (i.e. the same Lexer instance may be used with different queues by different goroutines), but it does affect queue state. Each token type that may be returned by the lexer maps to its own regexp capturing group index. A match containing no captured groups is treated as an insignificant lexeme (e.g. whitespace); in this case the lexer tries to fetch a token again at the new position. Every byte of the source file must belong to some lexeme.

func New

func New(re *regexp.Regexp, types []TokenType) *Lexer

New creates a new Lexer. The element of types at index n describes the token type for the (n+1)-th regexp capturing group. A group that has no description is treated as ErrorTokenType.

func (*Lexer) Next

func (l *Lexer) Next(q *source.Queue) (*Token, error)

Next fetches the token starting at the current source position and advances that position. On a lexical error it returns a nil token and an llx.Error and makes no changes. Returns an EoI token if the queue is empty. Returns an EoF token and discards the current source if the current position is beyond the end of the current source.

func (*Lexer) Shrink

func (l *Lexer) Shrink(q *source.Queue, tok *Token) *Token

Shrink tries to fetch a token that starts at the same position as the given one and is at least one byte shorter. On success it adjusts the current position and returns the shrunk token. It makes no changes and returns nil if the given token has no captured source and position information, was fetched from a source other than the current one, or a lexical error occurs.

type Token

type Token struct {
	// contains filtered or unexported fields
}

Token represents a lexeme, either fetched from a source file or an "external" one. It contains the token type, the text, and the source and starting position (if known). Immutable.

func EofToken

func EofToken(s *source.Source) *Token

EofToken creates a token of EofTokenType. s may be nil.

func EoiToken

func EoiToken() *Token

EoiToken creates a token of EoiTokenType.

func NewToken

func NewToken(tokenType int, typeName string, content []byte, sp source.Pos) *Token

NewToken creates a token. Expects the zero value for sp if the token source is not known.

func (*Token) Col

func (t *Token) Col() int

Col returns 1-based column number of the first byte of the token. Returns 0 if source is not known.

func (*Token) Content

func (t *Token) Content() []byte

Content returns token content.

func (*Token) Line

func (t *Token) Line() int

Line returns 1-based line number of the first byte of the token. Returns 0 if source is not known.

func (*Token) Pos

func (t *Token) Pos() source.Pos

Pos returns captured source position.

func (*Token) Source

func (t *Token) Source() *source.Source

Source returns captured source. Returns nil if source is not known.

func (*Token) SourceName

func (t *Token) SourceName() string

SourceName returns source file name. Returns empty string if source is not known.

func (*Token) Text

func (t *Token) Text() string

Text returns the lexeme body converted to a string. The conversion occurs on the first call; the resulting string is stored and reused to minimize the number of allocations.

func (*Token) Type

func (t *Token) Type() int

Type returns token type.

func (*Token) TypeName

func (t *Token) TypeName() string

TypeName returns token type name.

type TokenType

type TokenType struct {
	// Type contains the token type; it may be any value, but ErrorTokenType is treated specially.
	Type int

	// TypeName contains the token type name; it may be any value.
	TypeName string
}

TokenType describes the token type for a specific capturing group of the regular expression.
