parser

package module
v1.0.2
Published: Mar 15, 2023 License: ISC Imports: 6 Imported by: 14

README

parser

-- import "vimagination.zapto.org/parser"

Package parser is a simple helper package for parsing strings, byte slices and io.Readers.


Documentation

Overview

Package parser is a simple helper package for parsing strings, byte slices and io.Readers.

Index

Examples

Constants

This section is empty.

Variables

var (
	ErrNoState      = errors.New("no state")
	ErrUnknownError = errors.New("unknown error")
)

Errors

Functions

This section is empty.

Types

type Parser

type Parser struct {
	Tokeniser
	// contains filtered or unexported fields
}

Parser is a type used to get tokens or phrases (collections of tokens) from an input.

func New

func New(t Tokeniser) Parser

New creates a new Parser from the given Tokeniser.

func (*Parser) Accept

func (p *Parser) Accept(types ...TokenType) bool

Accept will accept a token with one of the given types, returning true if one is read and false otherwise.

func (*Parser) AcceptRun

func (p *Parser) AcceptRun(types ...TokenType) TokenType

AcceptRun will keep Accepting tokens as long as they match one of the given types.

It will return the type of the token that made it stop.

func (*Parser) AcceptToken

func (p *Parser) AcceptToken(tokens ...Token) bool

AcceptToken will accept a token matching one of the ones provided exactly, returning true if one is read and false otherwise.

func (*Parser) Done

func (p *Parser) Done() (Phrase, PhraseFunc)

Done is a PhraseFunc that is used to indicate that there are no more phrases to parse.

func (*Parser) Error

func (p *Parser) Error() (Phrase, PhraseFunc)

Error represents an error state for the phraser.

The error value should be set in Parser.Err and then this func should be called.

func (*Parser) Except

func (p *Parser) Except(types ...TokenType) bool

Except will Accept a token that is not one of the types given. Returns true if it Accepted a token.

func (*Parser) ExceptRun

func (p *Parser) ExceptRun(types ...TokenType) TokenType

ExceptRun will keep Accepting tokens as long as they do not match one of the given types.

It will return the type of the token that made it stop.

func (*Parser) Get

func (p *Parser) Get() []Token

Get retrieves a slice of the Tokens that have been read so far.

func (*Parser) GetPhrase

func (p *Parser) GetPhrase() (Phrase, error)

GetPhrase runs the state machine and retrieves a single Phrase and possibly an error.

func (*Parser) GetToken

func (p *Parser) GetToken() (Token, error)

GetToken runs the state machine and retrieves a single Token and possibly an error.

If a Token has already been peeked, that token will be returned without running the state machine.

func (*Parser) Len

func (p *Parser) Len() int

Len returns how many tokens have been read.

func (*Parser) Peek

func (p *Parser) Peek() Token

Peek takes a look at the upcoming Token and returns it.

func (*Parser) PhraserState

func (p *Parser) PhraserState(pf PhraseFunc)

PhraserState allows the internal state of the Phraser to be set.

type Phrase

type Phrase struct {
	Type PhraseType
	Data []Token
}

Phrase represents a collection of tokens that have meaning together.

type PhraseFunc

type PhraseFunc func(*Parser) (Phrase, PhraseFunc)

PhraseFunc is the type that the worker funcs implement in order to be used by the Phraser.
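As a minimal sketch of how a PhraseFunc chains with a TokenFunc: the constants TokenWord, TokenOther and PhraseWord, and the tokenState, wordPhrase and phrases funcs below are inventions of this example, not part of the package; only the documented Tokeniser and Parser methods are relied on.

package main

import (
	"fmt"

	"vimagination.zapto.org/parser"
)

// Hypothetical token and phrase types for this sketch.
const (
	TokenWord parser.TokenType = iota
	TokenOther
)

const PhraseWord parser.PhraseType = iota

const letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

// tokenState splits the input into word and non-word tokens.
func tokenState(t *parser.Tokeniser) (parser.Token, parser.TokenFunc) {
	if t.Accept(letters) {
		t.AcceptRun(letters)
		return parser.Token{Type: TokenWord, Data: t.Get()}, tokenState
	}

	t.ExceptRun(letters)

	if t.Len() == 0 {
		return t.Done() // nothing consumed: end of input
	}

	return parser.Token{Type: TokenOther, Data: t.Get()}, tokenState
}

// wordPhrase is a PhraseFunc: it discards non-word tokens, then
// wraps each word token in its own phrase, returning itself as
// the next state.
func wordPhrase(p *parser.Parser) (parser.Phrase, parser.PhraseFunc) {
	p.AcceptRun(TokenOther)
	p.Get() // discard the non-word tokens

	if !p.Accept(TokenWord) {
		return p.Done()
	}

	return parser.Phrase{Type: PhraseWord, Data: p.Get()}, wordPhrase
}

// phrases drives the state machine and collects one word per phrase.
func phrases(src string) []string {
	p := parser.New(parser.NewStringTokeniser(src))
	p.TokeniserState(tokenState)
	p.PhraserState(wordPhrase)

	var out []string

	for {
		ph, err := p.GetPhrase()
		if err != nil || ph.Type == parser.PhraseDone {
			return out
		}

		out = append(out, ph.Data[0].Data)
	}
}

func main() {
	fmt.Println(phrases("Hello, World!"))
}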

type PhraseType

type PhraseType int

PhraseType represents the type of phrase being read.

Negative values are reserved for this package.

const (
	PhraseDone PhraseType = -1 - iota
	PhraseError
)

Constants PhraseError (-2) and PhraseDone (-1)

type Token

type Token struct {
	Type TokenType
	Data string
}

Token represents data parsed from the stream.

type TokenFunc

type TokenFunc func(*Tokeniser) (Token, TokenFunc)

TokenFunc is the type that the worker funcs implement in order to be used by the tokeniser.
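A TokenFunc hands back the next TokenFunc (often itself) along with each token. The following sketch shows a word lexer driven via TokeniserState and GetToken; the TokenWord constant and the wordState and words funcs are inventions of this example, not part of the package.

package main

import (
	"fmt"

	"vimagination.zapto.org/parser"
)

// TokenWord is a hypothetical token type for this sketch.
const TokenWord parser.TokenType = iota

const letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

// wordState is a TokenFunc: each call skips non-letters, emits one
// word token, and returns itself as the next state.
func wordState(t *parser.Tokeniser) (parser.Token, parser.TokenFunc) {
	t.ExceptRun(letters) // skip punctuation and spaces
	t.Get()              // discard what was skipped

	if !t.Accept(letters) {
		return t.Done() // no more letters: end of input
	}

	t.AcceptRun(letters)

	return parser.Token{Type: TokenWord, Data: t.Get()}, wordState
}

// words drives the state machine and collects every word token.
func words(src string) []string {
	t := parser.NewStringTokeniser(src)
	t.TokeniserState(wordState)

	var out []string

	for {
		tk, err := t.GetToken()
		if err != nil || tk.Type == parser.TokenDone {
			return out
		}

		out = append(out, tk.Data)
	}
}

func main() {
	fmt.Println(words("Hello, World!"))
}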

type TokenType

type TokenType int

TokenType represents the type of token being read.

Negative values are reserved for this package.

const (
	TokenDone TokenType = -1 - iota
	TokenError
)

Constants TokenError (-2) and TokenDone (-1)

type Tokeniser

type Tokeniser struct {
	Err error
	// contains filtered or unexported fields
}

Tokeniser is a state machine to generate tokens from an input.

func NewByteTokeniser

func NewByteTokeniser(data []byte) Tokeniser

NewByteTokeniser returns a Tokeniser which uses a byte slice.

Example
package main

import (
	"fmt"

	"vimagination.zapto.org/parser"
)

func main() {
	p := parser.NewByteTokeniser([]byte("Hello, World!"))
	alphaNum := "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
	p.AcceptRun(alphaNum)
	word := p.Get()
	fmt.Println("got word:", word)

	p.ExceptRun(alphaNum)
	p.Get()

	p.AcceptRun(alphaNum)
	word = p.Get()
	fmt.Println("got word:", word)
}
Output:

got word: Hello
got word: World

func NewReaderTokeniser

func NewReaderTokeniser(reader io.Reader) Tokeniser

NewReaderTokeniser returns a Tokeniser which uses an io.Reader.

Example
package main

import (
	"fmt"
	"strings"

	"vimagination.zapto.org/parser"
)

func main() {
	p := parser.NewReaderTokeniser(strings.NewReader("Hello, World!"))
	alphaNum := "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
	p.AcceptRun(alphaNum)
	word := p.Get()
	fmt.Println("got word:", word)

	p.ExceptRun(alphaNum)
	p.Get()

	p.AcceptRun(alphaNum)
	word = p.Get()
	fmt.Println("got word:", word)
}
Output:

got word: Hello
got word: World

func NewStringTokeniser

func NewStringTokeniser(str string) Tokeniser

NewStringTokeniser returns a Tokeniser which uses a string.

Example
package main

import (
	"fmt"

	"vimagination.zapto.org/parser"
)

func main() {
	p := parser.NewStringTokeniser("Hello, World!")
	alphaNum := "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
	p.AcceptRun(alphaNum)
	word := p.Get()
	fmt.Println("got word:", word)

	p.ExceptRun(alphaNum)
	p.Get()

	p.AcceptRun(alphaNum)
	word = p.Get()
	fmt.Println("got word:", word)
}
Output:

got word: Hello
got word: World

func (*Tokeniser) Accept

func (t *Tokeniser) Accept(chars string) bool

Accept returns true if the next character to be read is contained within the given string.

Upon true, it advances the read position, otherwise the position remains the same.

func (*Tokeniser) AcceptRun

func (t *Tokeniser) AcceptRun(chars string) rune

AcceptRun reads from the string as long as the read character is in the given string.

Returns the rune that stopped the run.

func (*Tokeniser) Done

func (t *Tokeniser) Done() (Token, TokenFunc)

Done is a TokenFunc that is used to indicate that there are no more tokens to parse.

func (*Tokeniser) Error

func (t *Tokeniser) Error() (Token, TokenFunc)

Error represents an error state for the parser.

The error value should be set in Tokeniser.Err and then this func should be called.
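A sketch of that pattern, assuming GetToken surfaces the error stored in Tokeniser.Err once the Error state is entered; the TokenNumber constant and the numberState and lexNumber funcs are inventions of this example.

package main

import (
	"errors"
	"fmt"

	"vimagination.zapto.org/parser"
)

// TokenNumber is a hypothetical token type for this sketch.
const TokenNumber parser.TokenType = iota

// numberState demands at least one digit; on anything else it records
// an error in Tokeniser.Err and transitions to the Error state.
func numberState(t *parser.Tokeniser) (parser.Token, parser.TokenFunc) {
	if !t.Accept("0123456789") {
		t.Err = errors.New("expected digit")

		return t.Error()
	}

	t.AcceptRun("0123456789")

	return parser.Token{Type: TokenNumber, Data: t.Get()}, numberState
}

// lexNumber reads a single run of digits from the start of src.
func lexNumber(src string) (string, error) {
	t := parser.NewStringTokeniser(src)
	t.TokeniserState(numberState)

	tk, err := t.GetToken()
	if err != nil {
		return "", err
	}

	return tk.Data, nil
}

func main() {
	fmt.Println(lexNumber("123abc"))
	fmt.Println(lexNumber("abc"))
}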

func (*Tokeniser) Except

func (t *Tokeniser) Except(chars string) bool

Except returns true if the next character to be read is not contained within the given string. Upon true, it advances the read position, otherwise the position remains the same.

func (*Tokeniser) ExceptRun

func (t *Tokeniser) ExceptRun(chars string) rune

ExceptRun reads from the string as long as the read character is not in the given string.

Returns the rune that stopped the run.

func (*Tokeniser) Get

func (t *Tokeniser) Get() string

Get returns a string of everything that has been read so far and resets the string for the next round of parsing.

func (*Tokeniser) GetError added in v1.0.2

func (t *Tokeniser) GetError() error

GetError returns any error that has been generated by the Tokeniser.

func (*Tokeniser) GetToken

func (t *Tokeniser) GetToken() (Token, error)

GetToken runs the state machine and retrieves a single token and possibly an error.

func (*Tokeniser) Len

func (t *Tokeniser) Len() int

Len returns the number of bytes that have been read since the last Get.

func (*Tokeniser) Peek

func (t *Tokeniser) Peek() rune

Peek returns the next rune without advancing the read position.
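A small sketch of Peek as one-rune lookahead, assuming the documented semantics that it leaves the read position (and so Len) unchanged:

package main

import (
	"fmt"

	"vimagination.zapto.org/parser"
)

func main() {
	t := parser.NewStringTokeniser("abc")

	r := t.Peek()       // looks at 'a' but does not consume it
	ok := t.Accept("a") // 'a' is still next, so this succeeds

	fmt.Println(string(r), ok, t.Len())
}

Peek is typically used inside a TokenFunc to decide which state to transition to before committing to reading anything.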

func (*Tokeniser) TokeniserState

func (t *Tokeniser) TokeniserState(tf TokenFunc)

TokeniserState allows the internal state of the Tokeniser to be set.
