Documentation ¶
Overview ¶
Package participle constructs parsers from definitions in struct tags and parses directly into those structs. The approach is philosophically similar to how other marshallers work in Go, "unmarshalling" an instance of a grammar into a struct.
The supported annotation syntax is:
- `@<expr>` Capture expression into the field.
- `@@` Recursively capture using the field's own type.
- `<identifier>` Match named lexer token.
- `( ... )` Group.
- `"..."` Match the literal (note that the lexer must emit tokens matching this literal exactly).
- `"...":<identifier>` Match the literal, specifying the exact lexer token type to match.
- `<expr> <expr> ...` Match expressions.
- `<expr> | <expr>` Match one of the alternatives.
The following modifiers can be used after any expression:
- `*` Expression can match zero or more times.
- `+` Expression must match one or more times.
- `?` Expression can match zero or one times.
- `!` Require a non-empty match (this is useful with a sequence of optional matches, e.g. `("a"? "b"? "c"?)!`).
Here's an example of an EBNF grammar.
    type Group struct {
        Expression *Expression `"(" @@ ")"`
    }

    type Option struct {
        Expression *Expression `"[" @@ "]"`
    }

    type Repetition struct {
        Expression *Expression `"{" @@ "}"`
    }

    type Literal struct {
        Start string `@String` // lexer.Lexer token "String"
        End   string `("…" @String)?`
    }

    type Term struct {
        Name       string      `  @Ident`
        Literal    *Literal    `| @@`
        Group      *Group      `| @@`
        Option     *Option     `| @@`
        Repetition *Expression `| "(" @@ ")"`
    }

    type Sequence struct {
        Terms []*Term `@@+`
    }

    type Expression struct {
        Alternatives []*Sequence `@@ ("|" @@)*`
    }

    type Expressions []*Expression

    type Production struct {
        Name        string      `@Ident "="`
        Expressions Expressions `@@+ "."`
    }

    type EBNF struct {
        Productions []*Production `@@*`
    }
Index ¶
- Constants
- Variables
- func FormatError(err Error) string
- type Capture
- type Error
- type Mapper
- type Option
- func CaseInsensitive(tokens ...string) Option
- func Elide(types ...string) Option
- func Lexer(def lexer.Definition) Option
- func Map(mapper Mapper, symbols ...string) Option
- func ParseTypeWith[T any](parseFn func(*lexer.PeekingLexer) (T, error)) Option
- func Union[T any](members ...T) Option
- func Unquote(types ...string) Option
- func Upper(types ...string) Option
- func UseLookahead(n int) Option
- type ParseError
- type ParseOption
- type Parseable
- type Parser
- func (p *Parser[G]) Lex(filename string, r io.Reader) ([]lexer.Token, error)
- func (p *Parser[G]) Lexer() lexer.Definition
- func (p *Parser[G]) Parse(filename string, r io.Reader, options ...ParseOption) (v *G, err error)
- func (p *Parser[G]) ParseBytes(filename string, b []byte, options ...ParseOption) (v *G, err error)
- func (p *Parser[G]) ParseFromLexer(lex *lexer.PeekingLexer, options ...ParseOption) (*G, error)
- func (p *Parser[G]) ParseString(filename string, s string, options ...ParseOption) (v *G, err error)
- func (p *Parser[G]) String() string
- type UnexpectedTokenError
Constants ¶
const MaxLookahead = 99999
MaxLookahead can be used with UseLookahead to get pseudo-infinite lookahead without the risk of pathological cases causing a stack overflow.
Variables ¶
    var (
        // MaxIterations limits the number of elements capturable by {}.
        MaxIterations = 1000000

        // NextMatch should be returned by Parseable.Parse() method implementations to indicate
        // that the node did not match and that other matches should be attempted, if appropriate.
        NextMatch = errors.New("no match") // nolint: golint
    )
Functions ¶
func FormatError ¶
FormatError formats an error in the form "[<filename>:][<line>:<pos>:] <message>"
Types ¶
type Capture ¶
Capture can be implemented by fields in order to transform captured tokens into field values.
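As an illustration, a field type might implement Capture as sketched below. This assumes the interface consists of a single method `Capture(values []string) error` that receives the raw values of the matched tokens; the `Boolean` type and the grammar tag in the comment are hypothetical. The sketch calls the method directly, so it runs without the parser.

```go
package main

import "fmt"

// Boolean is a hypothetical field type. In a grammar it might be used as:
//
//	Enabled Boolean `@("true" | "false")`
type Boolean bool

// Capture converts the captured token values into a Boolean.
// Assumption: the parser passes the matched tokens' values as strings.
func (b *Boolean) Capture(values []string) error {
	if len(values) != 1 {
		return fmt.Errorf("expected one token, got %d", len(values))
	}
	switch values[0] {
	case "true":
		*b = true
	case "false":
		*b = false
	default:
		return fmt.Errorf("invalid boolean %q", values[0])
	}
	return nil
}

func main() {
	var b Boolean
	if err := b.Capture([]string{"true"}); err != nil {
		panic(err)
	}
	fmt.Println(b)
}
```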
type Error ¶
    type Error interface {
        error
        // Unadorned message.
        Message() string
        // Closest position to error location.
        Position() lexer.Position
    }
Error represents an error while parsing.
The format of an Error is in the form "[<filename>:][<line>:<pos>:] <message>".
The error will contain positional information if available.
type Option ¶
type Option func(p *parserOptions) error
An Option to modify the behaviour of the Parser.
func CaseInsensitive ¶
CaseInsensitive allows the specified token types to be matched case-insensitively.
Note that the lexer itself will also have to be case-insensitive; this option just controls whether literals in the grammar are matched case insensitively.
func Lexer ¶
func Lexer(def lexer.Definition) Option
Lexer is an Option that sets the lexer to use with the given grammar.
func Map ¶
Map is an Option that configures the Parser to apply a mapping function to each Token from the lexer.
This can be useful to, e.g., upper-case all tokens of a certain type, or dequote strings.
"symbols" specifies the token symbols that the Mapper will be applied to. If empty, all tokens will be mapped.
func ParseTypeWith ¶
func ParseTypeWith[T any](parseFn func(*lexer.PeekingLexer) (T, error)) Option
ParseTypeWith associates a custom parsing function with some interface type T. When the parser encounters a value of type T, it will use the given parse function to parse a value from the input.
The parse function may return anything it wishes as long as that value satisfies the interface T. However, only a single function can be defined for any type T. If you want to have multiple parse functions returning types that satisfy the same interface, you'll need to define new wrapper types for each one.
This can be useful if you want to parse a DSL within the larger grammar, or if you want to implement an optimized parsing scheme for some portion of the grammar.
func Union ¶
Union associates several member productions with some interface type T. Given members X, Y, Z, and W for a union type U, then the EBNF rule is:
U = X | Y | Z | W .
When the parser encounters a field of type T, it will attempt to parse each member in sequence and take the first match. Because of this, the order in which the members are defined is important. You must be careful to order your members appropriately.
An example of a bad parse that can happen if members are out of order:
If the first member matches A, and the second member matches A B, and the source string is "AB", then the parser will only match A, and will not try to parse the second member at all.
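The ordering pitfall can be modelled without the parser at all. The sketch below is not Participle's implementation; it is a hypothetical stand-in in which each member matches a fixed token sequence and the first matching member wins, which is enough to show why the longer member should be listed first.

```go
package main

import "fmt"

// member is a hypothetical stand-in for one union member: it matches a
// fixed sequence of tokens at the start of the input.
type member struct {
	name    string
	pattern []string
}

// match reports how many leading tokens the member consumes, or 0 if none.
func (m member) match(tokens []string) int {
	if len(tokens) < len(m.pattern) {
		return 0
	}
	for i, want := range m.pattern {
		if tokens[i] != want {
			return 0
		}
	}
	return len(m.pattern)
}

// firstMatch tries members in order and returns the name of the first one
// that matches, along with the tokens it left unconsumed.
func firstMatch(members []member, tokens []string) (string, []string) {
	for _, m := range members {
		if n := m.match(tokens); n > 0 {
			return m.name, tokens[n:]
		}
	}
	return "", tokens
}

func main() {
	short := member{name: "Short", pattern: []string{"A"}}
	long := member{name: "Long", pattern: []string{"A", "B"}}
	input := []string{"A", "B"}

	// Short first: it wins and "B" is left over.
	fmt.Println(firstMatch([]member{short, long}, input))
	// Long first: the whole input is consumed.
	fmt.Println(firstMatch([]member{long, short}, input))
}
```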
func Unquote ¶
Unquote applies strconv.Unquote() to tokens of the given types.
Tokens of type "String" will be unquoted if no other types are provided.
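Since the option is described in terms of strconv.Unquote(), the effect on a token's value can be previewed with the standard library alone. The `unquoteToken` helper below is hypothetical and only wraps that call.

```go
package main

import (
	"fmt"
	"strconv"
)

// unquoteToken is a hypothetical helper showing what strconv.Unquote does
// to the raw text of a quoted token.
func unquoteToken(raw string) (string, error) {
	return strconv.Unquote(raw)
}

func main() {
	// A double-quoted token with an escape, a raw string, and a bare word.
	for _, raw := range []string{`"hello\tworld"`, "`raw\\n`", `bare`} {
		value, err := unquoteToken(raw)
		if err != nil {
			fmt.Printf("%s: %v\n", raw, err)
			continue
		}
		fmt.Printf("%s => %q\n", raw, value)
	}
}
```

Escapes inside double quotes are interpreted, back-quoted text is taken literally, and text without surrounding quotes is rejected with an error.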
func Upper ¶
Upper is an Option that upper-cases all tokens of the given type. Useful for case normalisation.
func UseLookahead ¶
UseLookahead allows branch lookahead up to "n" tokens.
If parsing cannot be disambiguated before "n" tokens of lookahead, parsing will fail.
Note that increasing lookahead has a minor performance impact, but also reduces the accuracy of error reporting.
If "n" is negative, it will be treated as "infinite" lookahead. This can have a large impact on performance, and does not provide any protection against stack overflow during parsing. In most cases, using MaxLookahead will achieve the same results in practice, but with a concrete upper bound to prevent pathological behavior in the parser. Using infinite lookahead can be useful for testing, or for parsing especially ambiguous grammars. Use at your own risk!
type ParseError ¶
ParseError is returned when a parse error occurs.
It is useful for differentiating between parse errors and other errors such as lexing and IO errors.
func (*ParseError) Error ¶
func (p *ParseError) Error() string
func (*ParseError) Message ¶
func (p *ParseError) Message() string
func (*ParseError) Position ¶
func (p *ParseError) Position() lexer.Position
type ParseOption ¶
type ParseOption func(p *parseContext)
ParseOption modifies how an individual parse is applied.
func AllowTrailing ¶
func AllowTrailing(ok bool) ParseOption
AllowTrailing tokens without erroring.
That is, do not error if a full parse completes but additional tokens remain.
type Parseable ¶
    type Parseable interface {
        // Parse into the receiver.
        //
        // Should return NextMatch if no tokens matched and parsing should continue.
        // Nil should be returned if parsing was successful.
        Parse(lex *lexer.PeekingLexer) error
    }
The Parseable interface can be implemented by any element in the grammar to provide custom parsing.
type Parser ¶
    type Parser[G any] struct {
        // contains filtered or unexported fields
    }
A Parser for a particular grammar and lexer.
func Build ¶
Build constructs a parser for the given grammar.
If "Lexer()" is not provided as an option, a default lexer based on text/scanner will be used. This scans typical Go-like tokens.
See documentation for details.
func ParserForProduction ¶
ParserForProduction returns a new parser for the given production in grammar G.
func (*Parser[G]) Lex ¶
Lex uses the parser's lexer to tokenise input. Parameter filename is used as an opaque prefix in error messages.
func (*Parser[G]) Lexer ¶
func (p *Parser[G]) Lexer() lexer.Definition
Lexer returns the parser's built-in lexer.
func (*Parser[G]) Parse ¶
Parse from r into grammar v which must be of the same type as the grammar passed to Build(). Parameter filename is used as an opaque prefix in error messages.
This may return an Error.
func (*Parser[G]) ParseBytes ¶
func (p *Parser[G]) ParseBytes(filename string, b []byte, options ...ParseOption) (v *G, err error)
ParseBytes from b into grammar v which must be of the same type as the grammar passed to Build(). Parameter filename is used as an opaque prefix in error messages.
This may return an Error.
func (*Parser[G]) ParseFromLexer ¶
func (p *Parser[G]) ParseFromLexer(lex *lexer.PeekingLexer, options ...ParseOption) (*G, error)
ParseFromLexer into grammar v which must be of the same type as the grammar passed to Build().
This may return an Error.
func (*Parser[G]) ParseString ¶
func (p *Parser[G]) ParseString(filename string, s string, options ...ParseOption) (v *G, err error)
ParseString from s into grammar v which must be of the same type as the grammar passed to Build(). Parameter filename is used as an opaque prefix in error messages.
This may return an Error.
type UnexpectedTokenError ¶
    type UnexpectedTokenError struct {
        Unexpected lexer.Token
        Expect     string
        // contains filtered or unexported fields
    }
UnexpectedTokenError is returned by Parse when an unexpected token is encountered.
This is useful for composing parsers in order to detect when a sub-parser has terminated.
func (*UnexpectedTokenError) Error ¶
func (u *UnexpectedTokenError) Error() string
func (*UnexpectedTokenError) Message ¶
func (u *UnexpectedTokenError) Message() string
func (*UnexpectedTokenError) Position ¶
func (u *UnexpectedTokenError) Position() lexer.Position
Directories ¶
Path | Synopsis |
---|---|
cmd | |
cmd/railroad | Package main generates Railroad Diagrams from Participle grammar EBNF. |
ebnf | Package ebnf contains the AST and parser for parsing the form of EBNF produced by Participle. |
lexer | Package lexer defines interfaces and implementations used by Participle to perform lexing. |
internal | Code generated by Participle. |