lexing

package
v0.0.0-...-0195ecf Latest
Warning

This package is not in the latest version of its module.

Published: Apr 3, 2022 License: AGPL-3.0 Imports: 6 Imported by: 24

Documentation

Constants

const (
	EOF = -1 - iota
	Comment
	Illegal
)

Standard token types

const (
	Word = iota
	Punc
)

Token types for the example lexer.
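
The `-1 - iota` expression counts downward from -1, so the standard token types stay negative and cannot collide with caller-defined types that start at 0. The sketch below copies the two constant blocks into a local program (nothing is imported from the package) to show the resulting values.

```go
package main

import "fmt"

// Local copies of the two constant blocks from the listing, reproduced
// here so the example compiles without importing the package.
const (
	EOF     = -1 - iota // iota is 0, so EOF is -1
	Comment             // -1 - 1 = -2
	Illegal             // -1 - 2 = -3
)

const (
	Word = iota // 0
	Punc        // 1
)

func main() {
	fmt.Println(EOF, Comment, Illegal) // -1 -2 -3
	fmt.Println(Word, Punc)            // 0 1
}
```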

Variables

This section is empty.

Functions

func FprintErrs

func FprintErrs(w io.Writer, errs []*Error, workDir string)

FprintErrs prints a list of errors.

func IsDigit

func IsDigit(r rune) bool

IsDigit returns true when the rune is in 0-9

func IsHexDigit

func IsHexDigit(r rune) bool

IsHexDigit returns true when the rune is in 0-9, a-f or A-F

func IsIdentLetter

func IsIdentLetter(r rune) bool

IsIdentLetter checks if rune r can start an identifier.

func IsLetter

func IsLetter(r rune) bool

IsLetter returns true if the rune is in a-z or A-Z

func IsPkgName

func IsPkgName(s string) bool

IsPkgName checks if a literal is a valid package name.

func IsWhite

func IsWhite(r rune) bool

IsWhite is the default IsWhite function for a lexer. Returns true on spaces, \t and \r. Returns false on \n.

func IsWhiteOrEndl

func IsWhiteOrEndl(r rune) bool

IsWhiteOrEndl is another IsWhite function that also returns true for \n.
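
The difference between the two white-space predicates matters for line-oriented grammars: with IsWhite, a newline reaches the LexFunc as an ordinary rune. The following is a minimal sketch that follows the documented behavior; `isWhite` and `isWhiteOrEndl` are hypothetical local stand-ins, not the package's own code.

```go
package main

import "fmt"

// isWhite follows the documented behavior of IsWhite:
// true for space, \t and \r, false for \n.
func isWhite(r rune) bool {
	return r == ' ' || r == '\t' || r == '\r'
}

// isWhiteOrEndl additionally treats \n as white space.
func isWhiteOrEndl(r rune) bool {
	return isWhite(r) || r == '\n'
}

func main() {
	for _, r := range []rune{' ', '\t', '\r', '\n', 'a'} {
		fmt.Printf("%q white=%t whiteOrEndl=%t\n", r, isWhite(r), isWhiteOrEndl(r))
	}
}
```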

func KeywordSet

func KeywordSet(words ...string) map[string]struct{}

KeywordSet creates a keyword set.
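
The return type `map[string]struct{}` is the usual Go idiom for a set: the empty struct value takes no storage and membership is tested with the two-value map lookup. A minimal sketch, assuming the function simply inserts every word; `keywordSet` is a hypothetical local stand-in with the same signature.

```go
package main

import "fmt"

// keywordSet builds a set of words, mirroring the signature of KeywordSet.
func keywordSet(words ...string) map[string]struct{} {
	set := make(map[string]struct{}, len(words))
	for _, w := range words {
		set[w] = struct{}{}
	}
	return set
}

func main() {
	kw := keywordSet("func", "var", "return")
	_, found := kw["func"]
	_, other := kw["foo"]
	fmt.Println(found, other) // true false
}
```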

func LogError

func LogError(log Logger, e error) bool

LogError adds an error to the logger if the error is not nil and returns true. If the error is nil, it returns false.

func Tokens

func Tokens(tokener Tokener) ([]*Token, []*Error)

Tokens takes a lexer that is already setup and returns its tokens and errors.

Types

type Error

type Error struct {
	Pos  *Pos   // Pos can be nil for errors not related to any position
	Err  error  // Err is the error message, human friendly.
	Code string // Code is the error code, machine friendly.
}

Error is a parsing error

func CodeErrorf

func CodeErrorf(c string, f string, args ...interface{}) *Error

CodeErrorf creates an Error with error code c.

func Errorf

func Errorf(f, c string, args ...interface{}) *Error

Errorf creates an Error, similar to fmt.Errorf.

func SingleCodeErr

func SingleCodeErr(code string, err error) []*Error

SingleCodeErr returns an error slice containing one error with the given error code.

func SingleErr

func SingleErr(err error) []*Error

SingleErr returns an error slice containing one error.

func (*Error) Error

func (e *Error) Error() string

Error returns the error string.

func (*Error) ErrorRelFile

func (e *Error) ErrorRelFile(workDir string) string

ErrorRelFile returns the error relative to the given workDir

func (*Error) JSON

func (e *Error) JSON() interface{}

JSON returns a JSON-marshalable object of the error.

type ErrorList

type ErrorList struct {
	Max int
	// contains filtered or unexported fields
}

ErrorList saves a list of errors.

func NewErrorList

func NewErrorList() *ErrorList

NewErrorList creates a new error list with default (20) maximum lines of errors.

func (*ErrorList) Add

func (lst *ErrorList) Add(e *Error)

Add appends the error to the list and changes the state to "in jail".

func (*ErrorList) AddAll

func (lst *ErrorList) AddAll(es []*Error)

AddAll adds a list of errors into the list.

func (*ErrorList) BailOut

func (lst *ErrorList) BailOut()

BailOut clears the "in jail" state.

func (*ErrorList) CodeErrorf

func (lst *ErrorList) CodeErrorf(p *Pos, c, f string, args ...interface{})

CodeErrorf appends a new error with an error code.

func (*ErrorList) Errorf

func (lst *ErrorList) Errorf(p *Pos, f string, args ...interface{})

Errorf appends a new error with particular position and format.

func (*ErrorList) Errs

func (lst *ErrorList) Errs() []*Error

Errs returns the errors in the list.

func (*ErrorList) InJail

func (lst *ErrorList) InJail() bool

InJail checks if a new error has been added since creation or the last bail out.

func (*ErrorList) Jail

func (lst *ErrorList) Jail()

Jail puts it in jail without generating a new error message
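
The "in jail" flag described by Add, Jail, BailOut and InJail is a small state machine. The sketch below models it with a hypothetical `errList` type that stores plain strings instead of `*Error` values. It assumes that BailOut keeps the recorded errors and only clears the flag; the documentation above does not say otherwise.

```go
package main

import "fmt"

// errList is a simplified, hypothetical stand-in for ErrorList.
type errList struct {
	errs   []string
	inJail bool
}

// Add appends the error and enters the "in jail" state.
func (l *errList) Add(e string) {
	l.errs = append(l.errs, e)
	l.inJail = true
}

// Jail enters the "in jail" state without recording an error.
func (l *errList) Jail() { l.inJail = true }

// BailOut clears the "in jail" state. Assumption: recorded errors are kept.
func (l *errList) BailOut() { l.inJail = false }

// InJail reports whether the list entered jail since creation or the
// last BailOut.
func (l *errList) InJail() bool { return l.inJail }

func main() {
	lst := new(errList)
	fmt.Println(lst.InJail()) // false
	lst.Add("unexpected token")
	fmt.Println(lst.InJail()) // true
	lst.BailOut()
	fmt.Println(lst.InJail(), len(lst.errs)) // false 1
}
```

A caller can use the flag to suppress follow-on errors: report the first problem, skip to a recovery point, then bail out and continue.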

func (*ErrorList) Print

func (lst *ErrorList) Print(w io.Writer) error

Print prints to the writer (at most lst.Max errors).

type Keyworder

type Keyworder struct {
	Keywords map[string]struct{}
	Ident    int
	Keyword  int
	// contains filtered or unexported fields
}

Keyworder converts idents into keywords.

func NewKeyworder

func NewKeyworder(tok Tokener) *Keyworder

NewKeyworder creates a new tokener that changes the type of a token to the keyword type if the token is in the keyword map.

func (*Keyworder) Errs

func (kw *Keyworder) Errs() []*Error

Errs returns the error list on tokening.

func (*Keyworder) Token

func (kw *Keyworder) Token() *Token

Token returns the next token, replacing the ident type with the keyword type if the token is in the keyword set.
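
A Keyworder is a filter that wraps another tokener. The sketch below shows the idea with hypothetical local types (`token`, `tokener`, `sliceTokener`, `keyworder`); Pos and Errs are omitted for brevity, and the token type values are illustrative only.

```go
package main

import "fmt"

// token and tokener are trimmed-down stand-ins for Token and Tokener.
type token struct {
	Type int
	Lit  string
}

type tokener interface{ Token() *token }

// Illustrative token types; real values are chosen by the caller.
const (
	typEOF     = -1
	typIdent   = 0
	typKeyword = 1
)

// sliceTokener replays a fixed slice and then returns EOF forever.
type sliceTokener struct {
	toks []*token
	i    int
}

func (s *sliceTokener) Token() *token {
	if s.i >= len(s.toks) {
		return &token{Type: typEOF}
	}
	t := s.toks[s.i]
	s.i++
	return t
}

// keyworder retypes ident tokens whose literal is in Keywords.
type keyworder struct {
	tok      tokener
	Keywords map[string]struct{}
	Ident    int
	Keyword  int
}

func (kw *keyworder) Token() *token {
	t := kw.tok.Token()
	if t.Type != kw.Ident {
		return t // only idents are candidates
	}
	if _, ok := kw.Keywords[t.Lit]; !ok {
		return t
	}
	return &token{Type: kw.Keyword, Lit: t.Lit}
}

func main() {
	src := &sliceTokener{toks: []*token{
		{Type: typIdent, Lit: "return"},
		{Type: typIdent, Lit: "x"},
	}}
	kw := &keyworder{
		tok:      src,
		Keywords: map[string]struct{}{"return": {}},
		Ident:    typIdent,
		Keyword:  typKeyword,
	}
	for t := kw.Token(); t.Type != typEOF; t = kw.Token() {
		fmt.Println(t.Type, t.Lit) // "1 return", then "0 x"
	}
}
```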

type LexFunc

type LexFunc func(x *Lexer) *Token

LexFunc is a function type that takes a lexer and returns the next token.

type Lexer

type Lexer struct {
	IsWhite WhiteFunc
	LexFunc LexFunc
	// contains filtered or unexported fields
}

Lexer parses a file input stream into tokens.

func MakeLexer

func MakeLexer(file string, r io.Reader, f LexFunc) *Lexer

MakeLexer creates a lexer with the particular lexer func.

func NewCommentLexer

func NewCommentLexer(file string, r io.Reader) *Lexer

NewCommentLexer returns a lexer that parses only comments.

func NewLexer

func NewLexer(file string, r io.Reader) *Lexer

NewLexer creates a new lexer.

func NewWordLexer

func NewWordLexer(file string, r io.Reader) *Lexer

NewWordLexer returns an example lexer that parses a file into words and punctuations.

func (*Lexer) Buffered

func (x *Lexer) Buffered() string

Buffered returns the current buffered string in the scanner

func (*Lexer) CodeErrorf

func (x *Lexer) CodeErrorf(c, f string, args ...interface{})

CodeErrorf adds an error into the error list with error code.

func (*Lexer) Discard

func (x *Lexer) Discard()

Discard clears the scanning buffer

func (*Lexer) Ended

func (x *Lexer) Ended() bool

Ended returns true when the scanning stops.

func (*Lexer) Errorf

func (x *Lexer) Errorf(f string, args ...interface{})

Errorf adds an error into the error list with the current position.

func (*Lexer) Errs

func (x *Lexer) Errs() []*Error

Errs returns the lexing errors.

func (*Lexer) MakeToken

func (x *Lexer) MakeToken(t int) *Token

MakeToken accepts the runes in the scanning buffer and returns it as a token of type t.

func (*Lexer) Next

func (x *Lexer) Next() (rune, error)

Next pushes the current rune into the scanning buffer, and returns the next rune.

func (*Lexer) Rune

func (x *Lexer) Rune() rune

Rune returns the current rune.

func (*Lexer) See

func (x *Lexer) See(r rune) bool

See returns true when the current rune is r.

func (*Lexer) SkipWhite

func (x *Lexer) SkipWhite()

SkipWhite is a helper function that skips every rune for which the IsWhite function returns true. The buffer is discarded after the skipping.

func (*Lexer) Token

func (x *Lexer) Token() *Token

Token returns the next parsed token. It ends with a token with type EOF.

type Logger

type Logger interface {
	Errorf(p *Pos, fmt string, args ...interface{})
}

Logger is an error logging interface

type Parser

type Parser struct {
	// contains filtered or unexported fields
}

Parser provides the common parser functions for parsing. It does NOT provide a working parser for any grammar.

func NewParser

func NewParser(t Tokener, types *Types) *Parser

NewParser creates a new parser around a tokener

func (*Parser) Accept

func (p *Parser) Accept(t int) bool

Accept shifts the tokener by one token and returns true if the current token is of type t. It otherwise returns false and nothing happens.

func (*Parser) BailOut

func (p *Parser) BailOut()

BailOut bails out the parser from an error state.

func (*Parser) CodeErrorf

func (p *Parser) CodeErrorf(pos *Pos, c, f string, args ...interface{})

CodeErrorf adds a new parser error with an error code.

func (*Parser) CodeErrorfHere

func (p *Parser) CodeErrorfHere(c, f string, args ...interface{})

CodeErrorfHere adds a new parser error at the current token position

func (*Parser) Errorf

func (p *Parser) Errorf(pos *Pos, f string, args ...interface{})

Errorf adds a new parser error to the parser's error list at a particular position.

func (*Parser) ErrorfHere

func (p *Parser) ErrorfHere(f string, args ...interface{})

ErrorfHere adds a new parser error at the current token position

func (*Parser) Errs

func (p *Parser) Errs() []*Error

Errs returns the parsing error list if the lexing has no errors. If the lexing has errors, it returns the lexing error list instead.

func (*Parser) Expect

func (p *Parser) Expect(t int) *Token

Expect checks if the current token is type t. If it is, the token is accepted, the current token is shifted, and it returns the accepted token. If it is not, the call reports an error, enters the parser into error state, and returns nil. If the parser is already in error state, the call returns nil immediately, and nothing is checked.

func (*Parser) ExpectLit

func (p *Parser) ExpectLit(t int, lit string) *Token

ExpectLit checks if the current token is type t and has literal lit. If it is, the token is accepted, the current token is shifted, and it returns the accepted token. If it is not, the call reports an error, enters the parser into error state, and returns nil. If the parser is already in error state, the call returns nil immediately, and nothing is checked.
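
The Expect contract has three outcomes: accept and shift, report and enter error state, or return nil immediately when already in error state. The sketch below models them with a hypothetical `parser` type that reads from a pre-lexed slice; the token type values and the error message format are assumptions made for illustration.

```go
package main

import "fmt"

type token struct {
	Type int
	Lit  string
}

// Illustrative token types.
const (
	typEOF  = -1
	typWord = 0
	typPunc = 1
)

// parser is a simplified stand-in for Parser.
type parser struct {
	toks    []*token // must end with an EOF token
	cur     int
	errs    []string
	inError bool
}

func (p *parser) token() *token { return p.toks[p.cur] }

// shift advances by one token (staying on EOF) and returns the token
// that was current before the call.
func (p *parser) shift() *token {
	t := p.token()
	if t.Type != typEOF {
		p.cur++
	}
	return t
}

// expect follows the documented behavior of Expect.
func (p *parser) expect(typ int) *token {
	if p.inError {
		return nil // already in error state: nothing is checked
	}
	if p.token().Type != typ {
		msg := fmt.Sprintf("expect type %d, got %q", typ, p.token().Lit)
		p.errs = append(p.errs, msg)
		p.inError = true
		return nil
	}
	return p.shift()
}

func main() {
	p := &parser{toks: []*token{
		{typWord, "x"}, {typWord, "y"}, {typEOF, ""},
	}}
	fmt.Println(p.expect(typWord).Lit)    // x
	fmt.Println(p.expect(typPunc) == nil) // true: reports an error
	fmt.Println(p.expect(typWord) == nil) // true: still in error state
	fmt.Println(len(p.errs))              // 1
}
```

Returning nil without checking while in error state means one mistake in the input yields one message instead of a cascade.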

func (*Parser) InError

func (p *Parser) InError() bool

InError checks if the parser is in error state. A parser can enter error state by adding a parser error with Errorf() or ErrorfHere(). A parser leaves error state by calling BailOut().

func (*Parser) Jail

func (p *Parser) Jail()

Jail puts the parser in error state without adding an error message.

func (*Parser) Next

func (p *Parser) Next() *Token

Next shifts the tokener by one token and returns the new current token.

func (*Parser) See

func (p *Parser) See(t int) bool

See checks if the current token is of type t.

func (*Parser) SeeLit

func (p *Parser) SeeLit(t int, lit string) bool

SeeLit checks if the current token is of type t and the lit is exactly lit.

func (*Parser) Shift

func (p *Parser) Shift() *Token

Shift shifts the tokener by one token and returns the token that was current before the shift.

func (*Parser) SkipErrStmt

func (p *Parser) SkipErrStmt(sep int) bool

SkipErrStmt skips tokens until it meets a token of type sep or the end of file (token EOF) and returns true, but only when the parser is in error state. If the parser is not in error state, it returns false and nothing is skipped.
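
SkipErrStmt supports statement-level error recovery: after an error, the parser drops the rest of the broken statement and resumes at the next one. The sketch below uses a hypothetical `parser` type over a pre-lexed slice. Two points are assumptions, not taken from the documentation: the separator itself is consumed, and the error state is cleared after skipping.

```go
package main

import "fmt"

type token struct {
	Type int
	Lit  string
}

// Illustrative token types.
const (
	typEOF  = -1
	typWord = 0
	typSemi = 1
)

// parser is a simplified stand-in for Parser.
type parser struct {
	toks    []*token // must end with an EOF token
	cur     int
	inError bool
}

// skipErrStmt skips to the next sep or EOF, but only in error state.
func (p *parser) skipErrStmt(sep int) bool {
	if !p.inError {
		return false // nothing is skipped
	}
	for p.toks[p.cur].Type != sep && p.toks[p.cur].Type != typEOF {
		p.cur++
	}
	if sep != typEOF && p.toks[p.cur].Type == sep {
		p.cur++ // assumption: the separator itself is consumed
	}
	p.inError = false // assumption: skipping also leaves the error state
	return true
}

func main() {
	p := &parser{
		toks: []*token{
			{typWord, "a"}, {typWord, "b"}, {typSemi, ";"},
			{typWord, "c"}, {typEOF, ""},
		},
		cur:     1, // pretend an error was reported while looking at "b"
		inError: true,
	}
	fmt.Println(p.skipErrStmt(typSemi), p.toks[p.cur].Lit) // true c
	fmt.Println(p.skipErrStmt(typSemi))                    // false
}
```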

func (*Parser) Token

func (p *Parser) Token() *Token

Token returns the current token.

func (*Parser) TypeStr

func (p *Parser) TypeStr(t int) string

TypeStr returns the name of a type used by the type register of this parser.

type Pos

type Pos struct {
	File string
	Line int
	Col  int
}

Pos is a line and column position in a file.

func (*Pos) String

func (p *Pos) String() string

type Recorder

type Recorder struct {
	Tokener
	// contains filtered or unexported fields
}

Recorder is a token filter that records all the tokens a tokener generates.

func NewRecorder

func NewRecorder(t Tokener) *Recorder

NewRecorder creates a new recorder that filters the tokener

func (*Recorder) Token

func (r *Recorder) Token() *Token

Token implements the Tokener interface by relaying the call to the internal Tokener.

func (*Recorder) Tokens

func (r *Recorder) Tokens() []*Token

Tokens returns the slice of recorded tokens.

type Remover

type Remover struct {
	Tokener
	// contains filtered or unexported fields
}

Remover removes a particular type of token from a token stream

func NewCommentRemover

func NewCommentRemover(t Tokener) *Remover

NewCommentRemover creates a new remover that removes comment tokens.

func NewRemover

func NewRemover(t Tokener, typ int) *Remover

NewRemover creates a new remover that removes tokens of type typ.

func (*Remover) Token

func (r *Remover) Token() *Token

Token implements the Tokener interface but only returns tokens that are not of the removed type.
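
Because Remover embeds a Tokener and overrides only Token, it composes with other filters in a pipeline. A minimal sketch of the same pattern, using hypothetical local types (`token`, `tokener`, `sliceTokener`, `remover`) with Pos and Errs omitted:

```go
package main

import "fmt"

type token struct {
	Type int
	Lit  string
}

type tokener interface{ Token() *token }

// Illustrative token types.
const (
	typEOF     = -1
	typComment = -2
	typWord    = 0
)

// sliceTokener replays a fixed slice and then returns EOF forever.
type sliceTokener struct {
	toks []*token
	i    int
}

func (s *sliceTokener) Token() *token {
	if s.i >= len(s.toks) {
		return &token{Type: typEOF}
	}
	t := s.toks[s.i]
	s.i++
	return t
}

// remover embeds a tokener and drops tokens of one type.
type remover struct {
	tokener
	typ int
}

// Token pulls from the wrapped tokener until it sees a token that is
// not of the removed type.
func (r *remover) Token() *token {
	for {
		t := r.tokener.Token()
		if t.Type != r.typ {
			return t
		}
	}
}

func main() {
	src := &sliceTokener{toks: []*token{
		{typWord, "a"}, {typComment, "/* note */"}, {typWord, "b"},
	}}
	r := &remover{tokener: src, typ: typComment}
	for t := r.Token(); t.Type != typEOF; t = r.Token() {
		fmt.Println(t.Lit) // "a", then "b"
	}
}
```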

type Token

type Token struct {
	Type int
	Lit  string
	Pos  *Pos
}

Token defines a token structure.

func LexComment

func LexComment(x *Lexer) *Token

LexComment lexes a C-style comment. It is not a complete LexFunc: as a precondition, it assumes that a "/" is already buffered in the lexer.

func LexIdent

func LexIdent(x *Lexer, t int) *Token

LexIdent lexes a typical C/Go language identifier.

func LexNumber

func LexNumber(x *Lexer, tokInt, tokFloat int) *Token

LexNumber lexes a number using Go's number format.

func LexRawString

func LexRawString(x *Lexer, t int) *Token

LexRawString parses a raw string token with type t, which is quoted in a pair of backquotes (`).

func LexString

func LexString(x *Lexer, t int, q rune) *Token

LexString parses a string token with type t.

func TokenAll

func TokenAll(t Tokener) []*Token

TokenAll returns all the tokens fetched out of a tokener.
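
Draining a tokener is a loop that calls Token until a token of type EOF appears. The sketch below uses hypothetical local types and a toy `countdown` tokener. Whether the real TokenAll includes the final EOF token in its result is not stated above; this sketch includes it, as an assumption.

```go
package main

import "fmt"

type token struct {
	Type int
	Lit  string
}

type tokener interface{ Token() *token }

const typEOF = -1

// countdown is a toy tokener that emits n word tokens, then EOF forever.
type countdown struct{ n int }

func (c *countdown) Token() *token {
	if c.n == 0 {
		return &token{Type: typEOF}
	}
	c.n--
	return &token{Type: 0, Lit: fmt.Sprintf("w%d", c.n)}
}

// tokenAll collects tokens up to and including the first EOF token.
func tokenAll(t tokener) []*token {
	var ret []*token
	for {
		tok := t.Token()
		ret = append(ret, tok)
		if tok.Type == typEOF {
			return ret
		}
	}
}

func main() {
	toks := tokenAll(&countdown{n: 3})
	fmt.Println(len(toks)) // 4: three words plus EOF
}
```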

func (*Token) String

func (t *Token) String() string

type Tokener

type Tokener interface {
	// Token returns the next token
	Token() *Token

	// Errs returns the error list on tokening
	Errs() []*Error
}

Tokener is a token-emitting interface.

func NewTokener

func NewTokener(f string, r io.Reader, x LexFunc, w WhiteFunc) Tokener

NewTokener creates a new tokener from LexFunc x and WhiteFunc w.

type Types

type Types struct {
	// contains filtered or unexported fields
}

Types is a registrar of token type names

func NewTypes

func NewTypes() *Types

NewTypes makes a new registrar. It will auto register the default tokens.

func (*Types) Name

func (types *Types) Name(t int) string

Name resolves the name of a type.

func (*Types) Register

func (types *Types) Register(t int, name string)

Register registers a type with a name. If the type is already registered, it panics.
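
Panicking on a duplicate registration catches two token types that accidentally share one integer value. The sketch below models the registrar with a hypothetical `types` struct backed by a map. It does not pre-register the default tokens, and the fallback string returned for an unknown type is an assumption.

```go
package main

import "fmt"

// types is a simplified stand-in for Types.
type types struct{ names map[int]string }

func newTypes() *types {
	return &types{names: make(map[int]string)}
}

// register panics if typ already has a name.
func (t *types) register(typ int, name string) {
	if _, dup := t.names[typ]; dup {
		panic(fmt.Sprintf("type %d already registered", typ))
	}
	t.names[typ] = name
}

// name resolves the name of a type. The fallback for unknown types is
// an assumption made for this sketch.
func (t *types) name(typ int) string {
	if n, ok := t.names[typ]; ok {
		return n
	}
	return fmt.Sprintf("<type %d>", typ)
}

// panics reports whether calling f panics.
func panics(f func()) (p bool) {
	defer func() { p = recover() != nil }()
	f()
	return false
}

func main() {
	ts := newTypes()
	ts.register(0, "word")
	fmt.Println(ts.name(0))                                // word
	fmt.Println(panics(func() { ts.register(0, "again") })) // true
}
```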

type WhiteFunc

type WhiteFunc func(r rune) bool

WhiteFunc is a function type that checks if a rune is white space.
