Documentation ¶
Overview ¶
Package parsec provides a library of parser combinators. The basic idea is to let programmers take a basic set of terminal parsers (a.k.a. tokenizers) and compose them into a tree of parsers, using combinators such as And, OrdChoice, Kleene, Many, and Maybe.
To begin with, there are four basic types to keep in mind while creating and composing parsers:
Types ¶
Scanner, an interface type that encapsulates the input text. A built-in scanner called SimpleScanner is supplied with this package; developers can also implement their own scanner types. The following example creates a new instance of SimpleScanner from an input text:
var exprText = []byte(`4 + 123 + 23 + 67 +89 + 87 *78`)
s := parsec.NewScanner(exprText)
Nodify, a callback function supplied while combining parser functions. If the underlying parsing logic matches the input text, the callback is dispatched with the list of matching ParsecNodes. The value returned by the callback is then used as a ParsecNode item in the higher-level list of ParsecNodes.
Parser: simple parsers are functions that match the input text against specific patterns. Simple parsers can be combined, using one of the supplied combinators, to construct a higher-level parser. A parser function takes a Scanner object and applies the underlying parsing logic. If that logic succeeds, the Nodify callback is dispatched, and a ParsecNode and a new Scanner object (with its cursor moved forward) are returned. If the parser fails to match, it returns the input scanner object as is, along with a nil ParsecNode.
ParsecNode, an interface type that encapsulates one or more tokens from the input text, as a terminal node or a non-terminal node.
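The parser contract described above can be modelled in a few lines of standard-library Go. This is only a sketch of the idea, not the package's implementation: `scanner`, `parser`, and `literal` are hypothetical names standing in for Scanner, Parser, and a terminal parser.

```go
package main

import (
	"fmt"
	"strings"
)

// scanner is a hypothetical stand-in for the package's Scanner:
// the input text plus a cursor into it.
type scanner struct {
	text   string
	cursor int
}

// parser mirrors the documented shape of Parser:
// func(Scanner) (ParsecNode, Scanner).
type parser func(s scanner) (interface{}, scanner)

// literal returns a parser that matches the exact string lit at the cursor.
func literal(lit string) parser {
	return func(s scanner) (interface{}, scanner) {
		if strings.HasPrefix(s.text[s.cursor:], lit) {
			// Success: a node plus a scanner whose cursor has moved forward.
			return lit, scanner{text: s.text, cursor: s.cursor + len(lit)}
		}
		// Failure: a nil node plus the input scanner, unchanged.
		return nil, s
	}
}

func main() {
	s := scanner{text: "true"}

	node, next := literal("true")(s)
	fmt.Println(node, next.cursor) // true 4

	node, next = literal("false")(s)
	fmt.Println(node, next.cursor) // <nil> 0
}
```

Because a failed parser hands back the scanner it was given, a caller can simply try another parser from the same position.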
Combinators ¶
If the input text is a single token like `10` or `true` or `"some string"`, then all we need is a single Parser function that can tokenize the input text into a terminal node. But applications are seldom that simple. Almost always we need to parse the input text for more than one token, and usually we need to compose those tokens into a tree of terminal and non-terminal nodes.
This is where combinators are useful. The package provides a set of combinators that combine terminal parsers into higher-level parsers. They are:
- And, to combine a sequence of terminal and non-terminal parsers.
- OrdChoice, to choose between specified list of parsers.
- Kleene, to repeat the parser zero or more times.
- Many, to repeat the parser one or more times.
- ManyUntil, to repeat the parser until a specified end matcher.
- Maybe, to apply the parser once or none.
All of the above combinators accept one or more parser functions as arguments, either by value or by reference. Accepting a parser by reference makes it possible to define recursive parsing logic, such as parsing nested arrays:
var Y Parser
var value Parser // circular rats
var opensqrt = Atom("[", "OPENSQRT")
var closesqrt = Atom("]", "CLOSESQRT")
var values = Kleene(nil, &value, Atom(",", "COMMA"))
var array = And(nil, opensqrt, values, closesqrt)
func init() {
    value = parsec.OrdChoice(nil, Int(), Bool(), String(), array)
    Y = parsec.OrdChoice(nil, value)
}
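Why a reference helps can be seen in a small standard-library model. The sketch below is not the package's code: `scanner`, `parser`, `char`, `digit`, `ref`, `list`, and `count` are hypothetical names. The point is that `ref(&value)` looks up `value` only at parse time, so `array` can be built before `value` is assigned.

```go
package main

import "fmt"

type scanner struct {
	text   string
	cursor int
}

type parser func(s scanner) (interface{}, scanner)

// char matches the single byte c.
func char(c byte) parser {
	return func(s scanner) (interface{}, scanner) {
		if s.cursor < len(s.text) && s.text[s.cursor] == c {
			return string(c), scanner{text: s.text, cursor: s.cursor + 1}
		}
		return nil, s
	}
}

// digit matches one decimal digit.
func digit(s scanner) (interface{}, scanner) {
	if s.cursor < len(s.text) && s.text[s.cursor] >= '0' && s.text[s.cursor] <= '9' {
		return string(s.text[s.cursor]), scanner{text: s.text, cursor: s.cursor + 1}
	}
	return nil, s
}

// ref defers the lookup of *p until parse time, so p may be assigned later.
func ref(p *parser) parser {
	return func(s scanner) (interface{}, scanner) { return (*p)(s) }
}

// list matches '[' item (',' item)* ']' and returns the items as a slice.
func list(item parser) parser {
	return func(s scanner) (interface{}, scanner) {
		open, cur := char('[')(s)
		if open == nil {
			return nil, s
		}
		items := []interface{}{}
		for {
			node, next := item(cur)
			if node == nil {
				break
			}
			items = append(items, node)
			cur = next
			sep, afterSep := char(',')(cur)
			if sep == nil {
				break
			}
			cur = afterSep
		}
		closing, end := char(']')(cur)
		if closing == nil {
			return nil, s // fail from the original position
		}
		return items, end
	}
}

var value parser // declared first, assigned in init: the grammar is circular

var array = list(ref(&value))

func init() {
	// A value is either a digit or a nested array.
	value = func(s scanner) (interface{}, scanner) {
		if node, next := digit(s); node != nil {
			return node, next
		}
		return array(s)
	}
}

// count returns the number of digits in a parsed value.
func count(node interface{}) int {
	items, ok := node.([]interface{})
	if !ok {
		return 1
	}
	n := 0
	for _, it := range items {
		n += count(it)
	}
	return n
}

func main() {
	node, next := value(scanner{text: "[1,[2,3],[]]"})
	fmt.Println(count(node), next.cursor) // 3 12
}
```

Passing `value` itself to `list` at declaration time would capture a nil function; passing its address postpones the dereference until the grammar is fully wired up.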
Terminal parsers ¶
Parsers for a standard set of tokens are supplied with this package. Most of these parsers return the Terminal type as ParsecNode.
- Char, match a single character skipping leading whitespace.
- Float, match a float literal skipping leading whitespace.
- Hex, match a hexadecimal literal skipping leading whitespace.
- Int, match a decimal number literal skipping leading whitespace.
- Oct, match an octal number literal skipping leading whitespace.
- String, match a string literal skipping leading whitespace.
- Ident, match an identifier token skipping leading whitespace.
- Atom, match a single atom skipping leading whitespace.
- AtomExact, match a single atom without skipping leading whitespace.
- Token, match a single token skipping leading whitespace.
- TokenExact, match a single token without skipping leading whitespace.
- OrdTokens, match a single token with specified list of alternatives.
- End, match end of text.
- NoEnd, match when end of text has not been reached.
All of the terminal parsers except End, NoEnd, and String return the Terminal type as ParsecNode. End and NoEnd return a boolean as ParsecNode, and String, as noted in its own documentation, returns a string.
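The difference between the whitespace-skipping parsers and their Exact variants can be illustrated with the standard `regexp` package. This is a sketch under stated assumptions: `matchToken` and `wsPattern` are hypothetical names, and the whitespace pattern used here is chosen for illustration, not taken from the package.

```go
package main

import (
	"fmt"
	"regexp"
)

// wsPattern is an assumed whitespace pattern, used by this sketch only.
var wsPattern = regexp.MustCompile(`^[ \t\r\n]+`)

// matchToken tries pattern at the start of text and returns the matched
// token. With skipWS set, leading whitespace is dropped before matching.
func matchToken(pattern, text string, skipWS bool) (string, bool) {
	if skipWS {
		text = text[len(wsPattern.FindString(text)):]
	}
	// Anchor the pattern so it can only match at the current position.
	re := regexp.MustCompile(`^(?:` + pattern + `)`)
	loc := re.FindStringIndex(text)
	if loc == nil {
		return "", false
	}
	return text[:loc[1]], true
}

func main() {
	tok, ok := matchToken(`[0-9]+`, "  42", true)
	fmt.Printf("%q %v\n", tok, ok) // "42" true

	tok, ok = matchToken(`[0-9]+`, "  42", false)
	fmt.Printf("%q %v\n", tok, ok) // "" false
}
```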
AST and Queryable ¶
This is an experimental feature that uses CSS-like selectors for querying an Abstract Syntax Tree (AST). The types, APIs, and methods associated with AST and Queryable are unstable and are expected to change in the future.
While the Scanner, Parser, and ParsecNode types are reused in AST and Queryable, the combinator functions are re-implemented as AST methods. Similarly, the ASTNodify type is to be used instead of the Nodify type. Otherwise, all the parsec techniques mentioned above are equally applicable to AST.
Additionally, the following points are worth noting while using AST:
- Combinator methods supplied via AST can be named.
- All combinators from AST object will create and return NonTerminal as the Queryable type.
- ASTNodify function can interpret its Queryable argument and return a different type implementing Queryable interface.
Index ¶
- type AST
- func (ast *AST) And(name string, callb ASTNodify, parsers ...interface{}) Parser
- func (ast *AST) Dotstring(name string) string
- func (ast *AST) End(name string) Parser
- func (ast *AST) GetValue() string
- func (ast *AST) Kleene(nm string, callb ASTNodify, ps ...interface{}) Parser
- func (ast *AST) Many(nm string, callb ASTNodify, parsers ...interface{}) Parser
- func (ast *AST) ManyUntil(nm string, callb ASTNodify, ps ...interface{}) Parser
- func (ast *AST) Maybe(name string, callb ASTNodify, parser interface{}) Parser
- func (ast *AST) OrdChoice(nm string, cb ASTNodify, ps ...interface{}) Parser
- func (ast *AST) Parsewith(y Parser, s Scanner) (Queryable, Scanner)
- func (ast *AST) Prettyprint()
- func (ast *AST) Query(selectors string, ch chan Queryable)
- func (ast *AST) Reset() *AST
- func (ast *AST) SetDebug() *AST
- type ASTNodify
- type MaybeNone
- func (mn MaybeNone) GetAttribute(attrname string) []string
- func (mn MaybeNone) GetAttributes() map[string][]string
- func (mn MaybeNone) GetChildren() []Queryable
- func (mn MaybeNone) GetName() string
- func (mn MaybeNone) GetPosition() int
- func (mn MaybeNone) GetValue() string
- func (mn MaybeNone) IsTerminal() bool
- func (mn MaybeNone) SetAttribute(attrname, value string) Queryable
- type Nodify
- type NonTerminal
- func (nt *NonTerminal) GetAttribute(attrname string) []string
- func (nt *NonTerminal) GetAttributes() map[string][]string
- func (nt *NonTerminal) GetChildren() []Queryable
- func (nt *NonTerminal) GetName() string
- func (nt *NonTerminal) GetPosition() int
- func (nt *NonTerminal) GetValue() string
- func (nt *NonTerminal) IsTerminal() bool
- func (nt *NonTerminal) SetAttribute(attrname, value string) Queryable
- type ParsecNode
- type Parser
- func And(callb Nodify, parsers ...interface{}) Parser
- func Atom(match string, name string) Parser
- func AtomExact(match string, name string) Parser
- func Char() Parser
- func End() Parser
- func Float() Parser
- func Hex() Parser
- func Ident() Parser
- func Int() Parser
- func Kleene(callb Nodify, parsers ...interface{}) Parser
- func Many(callb Nodify, parsers ...interface{}) Parser
- func ManyUntil(callb Nodify, parsers ...interface{}) Parser
- func Maybe(callb Nodify, parser interface{}) Parser
- func NoEnd() Parser
- func Oct() Parser
- func OrdChoice(callb Nodify, parsers ...interface{}) Parser
- func OrdTokens(patterns []string, names []string) Parser
- func String() Parser
- func Token(pattern string, name string) Parser
- func TokenExact(pattern string, name string) Parser
- type Queryable
- type Scanner
- type SimpleScanner
- func (s *SimpleScanner) Clone() Scanner
- func (s *SimpleScanner) Endof() bool
- func (s *SimpleScanner) GetCursor() int
- func (s *SimpleScanner) Lineno() int
- func (s *SimpleScanner) Match(pattern string) ([]byte, Scanner)
- func (s *SimpleScanner) MatchString(str string) (bool, Scanner)
- func (s *SimpleScanner) SetWSPattern(pattern string) Scanner
- func (s *SimpleScanner) SkipAny(pattern string) ([]byte, Scanner)
- func (s *SimpleScanner) SkipWS() ([]byte, Scanner)
- func (s *SimpleScanner) SkipWSUnicode() ([]byte, Scanner)
- func (s *SimpleScanner) SubmatchAll(patt string) (map[string][]byte, Scanner)
- func (s *SimpleScanner) TrackLineno() Scanner
- type Terminal
- func (t *Terminal) GetAttribute(attrname string) []string
- func (t *Terminal) GetAttributes() map[string][]string
- func (t *Terminal) GetChildren() []Queryable
- func (t *Terminal) GetName() string
- func (t *Terminal) GetPosition() int
- func (t *Terminal) GetValue() string
- func (t *Terminal) IsTerminal() bool
- func (t *Terminal) SetAttribute(attrname, value string) Queryable
Types ¶
type AST ¶
type AST struct {
// contains filtered or unexported fields
}
AST parses and constructs an Abstract Syntax Tree whose nodes conform to the `Queryable` interface, facilitating tree-processing algorithms.
func NewAST ¶
NewAST returns a new instance of AST. maxnodes is the size of the internal buffer pool of nodes; it is directly proportional to the number of nodes that you expect in the syntax tree.
func (*AST) And ¶
And combinator, same as package level And combinator function. `name` identifies the NonTerminal nodes constructed by this combinator.
Example ¶
// parse a configuration line from ini file.
text := []byte(`loglevel = info`)
ast := NewAST("example", 100)
y := ast.And("configline", nil, Ident(), Atom("=", "EQUAL"), Ident())
root, _ := ast.Parsewith(y, NewScanner(text))
nodes := root.GetChildren()
fmt.Println(nodes[0].GetName(), nodes[0].GetValue())
fmt.Println(nodes[1].GetName(), nodes[1].GetValue())
fmt.Println(nodes[2].GetName(), nodes[2].GetValue())
Output:
IDENT loglevel
EQUAL =
IDENT info
func (*AST) Dotstring ¶
Dotstring returns the AST in graphviz dot format. Save this string to a dot file and use the graphviz tool to generate a nice-looking graph.
func (*AST) GetValue ¶
GetValue returns the full text, called the value here, that was parsed to construct this syntax tree.
func (*AST) Kleene ¶
Kleene combinator, same as package level Kleene combinator function. `nm` identifies the NonTerminal nodes constructed by this combinator.
func (*AST) Many ¶
Many combinator, same as package level Many combinator function. `nm` identifies the NonTerminal nodes constructed by this combinator.
Example ¶
// parse comma separated values
text := []byte(`10,30,50 wont parse this`)
ast := NewAST("example", 100)
y := ast.Many("many", nil, Int(), Atom(",", "COMMA"))
root, _ := ast.Parsewith(y, NewScanner(text))
nodes := root.GetChildren()
fmt.Println(nodes[0].GetName(), nodes[0].GetValue())
fmt.Println(nodes[1].GetName(), nodes[1].GetValue())
fmt.Println(nodes[2].GetName(), nodes[2].GetValue())
Output:
INT 10
INT 30
INT 50
func (*AST) ManyUntil ¶
ManyUntil combinator, same as package level ManyUntil combinator function. `nm` identifies the NonTerminal nodes constructed by this combinator.
Example ¶
// make sure to parse the entire text
text := []byte("10,30,50")
ast := NewAST("example", 100)
y := ast.ManyUntil("values", nil, Int(), Atom(",", "COMMA"), ast.End("eof"))
root, _ := ast.Parsewith(y, NewScanner(text))
nodes := root.GetChildren()
fmt.Println(nodes[0].GetName(), nodes[0].GetValue())
fmt.Println(nodes[1].GetName(), nodes[1].GetValue())
fmt.Println(nodes[2].GetName(), nodes[2].GetValue())
Output:
INT 10
INT 30
INT 50
func (*AST) Maybe ¶
Maybe combinator, same as package level Maybe combinator function. `name` identifies the NonTerminal nodes constructed by this combinator.
Example ¶
// parse an optional token
ast := NewAST("example", 100)
equal := Atom("=", "EQUAL")
maybeand := ast.Maybe("maybeand", nil, Atom("&", "AND"))
y := ast.And("assignment", nil, Ident(), equal, maybeand, Ident())
text := []byte("a = &b")
root, _ := ast.Parsewith(y, NewScanner(text))
nodes := root.GetChildren()
fmt.Println(nodes[0].GetName(), nodes[0].GetValue())
fmt.Println(nodes[1].GetName(), nodes[1].GetValue())
fmt.Println(nodes[2].GetName(), nodes[2].GetValue())
fmt.Println(nodes[3].GetName(), nodes[3].GetValue())
text = []byte("a = b")
ast = ast.Reset()
root, _ = ast.Parsewith(y, NewScanner(text))
nodes = root.GetChildren()
fmt.Println(nodes[0].GetName(), nodes[0].GetValue())
fmt.Println(nodes[1].GetName(), nodes[1].GetValue())
fmt.Println(nodes[2].GetName())
fmt.Println(nodes[3].GetName(), nodes[3].GetValue())
Output:
IDENT a
EQUAL =
AND &
IDENT b
IDENT a
EQUAL =
missing
IDENT b
func (*AST) OrdChoice ¶
OrdChoice combinator, same as package level OrdChoice combinator function. `nm` identifies the NonTerminal nodes constructed by this combinator.
Example ¶
// parse a boolean value
text := []byte(`true`)
ast := NewAST("example", 100)
y := ast.OrdChoice("bool", nil, Atom("true", "TRUE"), Atom("false", "FALSE"))
root, _ := ast.Parsewith(y, NewScanner(text))
fmt.Println(root.GetName(), root.GetValue())
Output:
TRUE true
func (*AST) Parsewith ¶
Parsewith executes the root parser, y, with scanner s. AST will remember the root parser and the root node. On success, it returns the root node as Queryable, along with a scanner holding the remaining input.
func (*AST) Prettyprint ¶
func (ast *AST) Prettyprint()
Prettyprint prints the syntax tree to standard output in human-readable plain text.
func (*AST) Query ¶
Query is an experimental method on AST. Developers can use the selector specification to pick one or more nodes from the AST.
type ASTNodify ¶
ASTNodify is a callback function to construct a custom Queryable. Even when combinators like And, OrdChoice, Many, etc. match the input string, it is possible to fail them via the ASTNodify callback function by returning nil. This is useful in cases:
- where lookahead matching is required.
- where a regex pattern has exceptional cases.
Note that some combinators, like Kleene, do not interpret the return value from the ASTNodify callback. `node` will always be of NonTerminal type, although the callback can process it and return a different type, provided it implements the Queryable interface.
Example ¶
text := []byte("10 * 20")
ast := NewAST("example", 100)
y := ast.And(
    "multiply",
    func(name string, s Scanner, node Queryable) Queryable {
        cs := node.GetChildren()
        x, _ := strconv.Atoi(cs[0].(*Terminal).GetValue())
        y, _ := strconv.Atoi(cs[2].(*Terminal).GetValue())
        return &Terminal{Value: fmt.Sprintf("%v", x*y)}
    },
    Int(), Token(`\*`, "MULT"), Int(),
)
node, _ := ast.Parsewith(y, NewScanner(text))
fmt.Println(node.GetValue())
Output:
200
type MaybeNone ¶
type MaybeNone string
MaybeNone is a placeholder type, similar to the Terminal type, used by the Maybe combinator if the parser does not match the input text.
func (MaybeNone) GetAttribute ¶
GetAttribute implements the Queryable interface.
func (MaybeNone) GetAttributes ¶
GetAttributes implements the Queryable interface.
func (MaybeNone) GetChildren ¶
GetChildren implements the Queryable interface.
func (MaybeNone) GetPosition ¶
GetPosition implements the Queryable interface.
func (MaybeNone) IsTerminal ¶
IsTerminal implements the Queryable interface.
func (MaybeNone) SetAttribute ¶
SetAttribute implements the Queryable interface.
type Nodify ¶
type Nodify func([]ParsecNode) ParsecNode
Nodify is a callback function to construct a custom ParsecNode. Even when combinators like And, OrdChoice, Many, etc. match the input string, it is still possible to fail them via the Nodify callback function by returning nil. This is very useful in cases where:
- lookahead matching is required.
- a regex pattern has exceptional cases.
Note that some combinators, like Kleene, do not interpret the return value from the Nodify callback.
Example ¶
text := []byte("10 * 20")
s := NewScanner(text)
y := And(
    func(nodes []ParsecNode) ParsecNode {
        x, _ := strconv.Atoi(nodes[0].(*Terminal).GetValue())
        y, _ := strconv.Atoi(nodes[2].(*Terminal).GetValue())
        return x * y // this is returned as node further down.
    },
    Int(), Token(`\*`, "MULT"), Int(),
)
node, _ := y(s)
fmt.Println(node)
Output:
200
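The "fail by returning nil" behaviour can be sketched with a toy sequence combinator written against the standard library only. Nothing here is the package's code: `scanner`, `parser`, `nodify`, `number`, `literal`, `seq`, and `divide` are hypothetical names, and the division-by-zero rule is just one example of an exceptional case that a pattern alone cannot express.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

type scanner struct {
	text   string
	cursor int
}

type parser func(s scanner) (interface{}, scanner)

// nodify plays the role of the Nodify callback in this sketch.
type nodify func(nodes []interface{}) interface{}

// number matches a run of decimal digits.
func number(s scanner) (interface{}, scanner) {
	end := s.cursor
	for end < len(s.text) && s.text[end] >= '0' && s.text[end] <= '9' {
		end++
	}
	if end == s.cursor {
		return nil, s
	}
	return s.text[s.cursor:end], scanner{text: s.text, cursor: end}
}

// literal matches the exact string lit.
func literal(lit string) parser {
	return func(s scanner) (interface{}, scanner) {
		if strings.HasPrefix(s.text[s.cursor:], lit) {
			return lit, scanner{text: s.text, cursor: s.cursor + len(lit)}
		}
		return nil, s
	}
}

// seq applies parsers in order. If any parser fails, or if the callback
// returns nil, seq fails and returns the original scanner.
func seq(cb nodify, parsers ...parser) parser {
	return func(s scanner) (interface{}, scanner) {
		cur := s
		nodes := make([]interface{}, 0, len(parsers))
		for _, p := range parsers {
			node, next := p(cur)
			if node == nil {
				return nil, s
			}
			nodes = append(nodes, node)
			cur = next
		}
		if cb == nil {
			return nodes, cur
		}
		if out := cb(nodes); out != nil {
			return out, cur
		}
		return nil, s // the callback vetoed an otherwise successful match
	}
}

// divide matches "x/y" but rejects division by zero in the callback.
var divide = seq(func(nodes []interface{}) interface{} {
	x, _ := strconv.Atoi(nodes[0].(string))
	y, _ := strconv.Atoi(nodes[2].(string))
	if y == 0 {
		return nil
	}
	return x / y
}, number, literal("/"), number)

func main() {
	node, next := divide(scanner{text: "8/2"})
	fmt.Println(node, next.cursor) // 4 3

	node, next = divide(scanner{text: "8/0"})
	fmt.Println(node, next.cursor) // <nil> 0
}
```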
type NonTerminal ¶
type NonTerminal struct {
    Name       string      // contains terminal's token type
    Children   []Queryable // list of children to this node.
    Attributes map[string][]string
}
NonTerminal will be used by AST methods to construct intermediate nodes. Note that a user-supplied ASTNodify callback can construct a different type of intermediate node that conforms to the Queryable interface.
func NewNonTerminal ¶
func NewNonTerminal(name string) *NonTerminal
NewNonTerminal creates and returns a new NonTerminal instance.
func (*NonTerminal) GetAttribute ¶
func (nt *NonTerminal) GetAttribute(attrname string) []string
GetAttribute implements the Queryable interface.
func (*NonTerminal) GetAttributes ¶
func (nt *NonTerminal) GetAttributes() map[string][]string
GetAttributes implements the Queryable interface.
func (*NonTerminal) GetChildren ¶
func (nt *NonTerminal) GetChildren() []Queryable
GetChildren implements the Queryable interface.
func (*NonTerminal) GetName ¶
func (nt *NonTerminal) GetName() string
GetName implements the Queryable interface.
func (*NonTerminal) GetPosition ¶
func (nt *NonTerminal) GetPosition() int
GetPosition implements the Queryable interface.
func (*NonTerminal) GetValue ¶
func (nt *NonTerminal) GetValue() string
GetValue implements the Queryable interface.
func (*NonTerminal) IsTerminal ¶
func (nt *NonTerminal) IsTerminal() bool
IsTerminal implements the Queryable interface.
func (*NonTerminal) SetAttribute ¶
func (nt *NonTerminal) SetAttribute(attrname, value string) Queryable
SetAttribute implements the Queryable interface.
type ParsecNode ¶
type ParsecNode interface{}
ParsecNode is the type in which parsers return the input text as parsed nodes.
type Parser ¶
type Parser func(Scanner) (ParsecNode, Scanner)
Parser function parses the input text encapsulated by Scanner. Higher-order parsers are constructed using combinators.
func And ¶
And combinator accepts a list of `Parser`, or references to parsers, all of which must match the input string in sequence, up to the last Parser argument. It returns a parser function that can be used to construct higher-level parsers.
If all parsers match, a list of ParsecNode, where each ParsecNode is constructed by the matching parser, is passed as the argument to the Nodify callback. If even one of the input parser functions fails, And fails without consuming the input.
Example ¶
// parse a configuration line from ini file.
text := []byte(`loglevel = info`)
y := And(nil, Ident(), Atom("=", "EQUAL"), Ident())
root, _ := y(NewScanner(text))
nodes := root.([]ParsecNode)
t := nodes[0].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[1].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[2].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
Output:
IDENT loglevel
EQUAL =
IDENT info
func Atom ¶
Atom is similar to Token, but takes a string to match with the input byte-by-byte. Internally it uses the MatchString() API from Scanner. Skips leading whitespace. For example:
scanner := NewScanner([]byte("cosmos"))
Atom("cos", "ATOM")(scanner) // will match
func AtomExact ¶
AtomExact is similar to Atom(), but the string is matched without skipping leading whitespace.
func Char ¶
func Char() Parser
Char returns a parser function to match a single character in the input stream. Skips leading whitespace.
func End ¶
func End() Parser
End is a parser function that detects the end of the scanner's input. It returns a boolean as ParsecNode, hence it is incompatible with AST{}. Use the AST.End method instead.
func Float ¶
func Float() Parser
Float returns a parser function to match a float literal in the input stream. Skips leading whitespace.
func Hex ¶
func Hex() Parser
Hex returns a parser function to match a hexadecimal literal in the input stream. Skips leading whitespace.
func Ident ¶
func Ident() Parser
Ident returns a parser function to match an identifier token in the input stream. An identifier is matched with the following pattern: `^[A-Za-z][0-9a-zA-Z_]*`. Skips leading whitespace.
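What the quoted identifier pattern accepts can be checked directly with the standard `regexp` package. The sketch below exercises only the pattern itself, not the Ident parser; `identPattern` is a name chosen for this example, and whitespace skipping is left out.

```go
package main

import (
	"fmt"
	"regexp"
)

// identPattern is the pattern quoted in the Ident documentation.
var identPattern = regexp.MustCompile(`^[A-Za-z][0-9a-zA-Z_]*`)

func main() {
	inputs := []string{"loglevel = info", "x_1+y", "_private", "9lives"}
	for _, text := range inputs {
		// FindString returns "" when the pattern does not match.
		fmt.Printf("%q -> %q\n", text, identPattern.FindString(text))
	}
	// "loglevel = info" -> "loglevel"
	// "x_1+y" -> "x_1"
	// "_private" -> ""
	// "9lives" -> ""
}
```

Note that the pattern requires a leading letter, so text starting with an underscore or a digit does not match.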
func Int ¶
func Int() Parser
Int returns a parser function to match an integer literal in the input stream. Skips leading whitespace.
func Kleene ¶
Kleene combinator accepts two parsers, or references to parsers, namely opScan and sepScan. The opScan parser is used to match the input string and construct ParsecNode, and the sepScan parser is used to match the input string and ignore the matched string. If the sepScan parser is not supplied, the opScan parser is applied on the input until it fails.
The process of matching the opScan parser and the sepScan parser continues in a loop until either one of them fails on the input stream.
For every successful match of opScan, the ParsecNode returned by the matching parser is accumulated and passed as the argument to the Nodify callback. If there is not a single match for opScan, a []ParsecNode of zero length is passed as the argument to the Nodify callback. The Kleene combinator never fails.
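The opScan/sepScan loop can be sketched with standard-library Go. This is a toy model rather than the package's implementation: `scanner`, `parser`, `number`, `literal`, and `kleene` are hypothetical names, and the choice not to consume a trailing separator is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

type scanner struct {
	text   string
	cursor int
}

type parser func(s scanner) (interface{}, scanner)

// number matches a run of decimal digits.
func number(s scanner) (interface{}, scanner) {
	end := s.cursor
	for end < len(s.text) && s.text[end] >= '0' && s.text[end] <= '9' {
		end++
	}
	if end == s.cursor {
		return nil, s
	}
	return s.text[s.cursor:end], scanner{text: s.text, cursor: end}
}

// literal matches the exact string lit.
func literal(lit string) parser {
	return func(s scanner) (interface{}, scanner) {
		if strings.HasPrefix(s.text[s.cursor:], lit) {
			return lit, scanner{text: s.text, cursor: s.cursor + len(lit)}
		}
		return nil, s
	}
}

// kleene applies op zero or more times, with an optional sep in between.
// Separator matches are dropped; the result is never a failure.
func kleene(op, sep parser) parser {
	return func(s scanner) (interface{}, scanner) {
		nodes := []interface{}{}
		done := s // position after the last successful op match
		next := s // position where op is tried next
		for {
			node, after := op(next)
			if node == nil {
				break
			}
			nodes = append(nodes, node)
			done, next = after, after
			if sep != nil {
				sepNode, afterSep := sep(after)
				if sepNode == nil {
					break
				}
				next = afterSep
			}
		}
		return nodes, done
	}
}

func main() {
	csv := kleene(number, literal(","))

	nodes, rest := csv(scanner{text: "10,30,50 rest"})
	fmt.Println(nodes, rest.cursor) // [10 30 50] 8

	nodes, rest = csv(scanner{text: "abc"})
	fmt.Println(nodes, rest.cursor) // [] 0
}
```

The second call shows the "never fails" property: with no match at all, the result is an empty list and an unmoved cursor.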
func Many ¶
Many combinator accepts two parsers, or references to parsers, namely opScan and sepScan. The opScan parser is used to match the input string and construct ParsecNode, and the sepScan parser is used to match the input string and ignore the matched string. If the sepScan parser is not supplied, the opScan parser is applied on the input until it fails.
The process of matching the opScan parser and the sepScan parser continues in a loop until either one of them fails on the input stream.
The difference between the `Many` combinator and the `Kleene` combinator is that there must be at least one match of opScan.
For every successful match of opScan, the ParsecNode returned by the matching parser is accumulated and passed as the argument to the Nodify callback. If there is not a single match for opScan, Many fails without consuming the input.
Example ¶
// parse comma separated values
text := []byte(`10,30,50 wont parse this`)
y := Many(nil, Int(), Atom(",", "COMMA"))
root, _ := y(NewScanner(text))
nodes := root.([]ParsecNode)
t := nodes[0].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[1].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[2].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
Output:
INT 10
INT 30
INT 50
func ManyUntil ¶
ManyUntil combinator accepts three parsers, or references to parsers, namely opScan, sepScan and untilScan. The opScan parser is used to match the input string and construct ParsecNode, and the sepScan parser is used to match the input string and ignore the matched string. If the sepScan parser is not supplied, the opScan parser is applied on the input until it fails.
The process of matching opScan parser and sepScan parser will continue in a loop until either one of them fails on the input stream or untilScan matches.
For every successful match of opScan, the returned ParsecNode from matching parser will be accumulated and passed as argument to Nodify callback. If there is not a single match for opScan, then ManyUntil will fail without consuming the input.
Example ¶
// make sure to parse the entire text
text := []byte("10,20,50")
y := ManyUntil(nil, Int(), Atom(",", "COMMA"), End())
root, _ := y(NewScanner(text))
nodes := root.([]ParsecNode)
t := nodes[0].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[1].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[2].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
Output:
INT 10
INT 20
INT 50
func Maybe ¶
Maybe combinator accepts a single parser, or a reference to a parser, and tries to match the input stream with it. If the parser fails to match the input, Maybe returns MaybeNone.
Example ¶
// parse an optional token
equal := Atom("=", "EQUAL")
maybeand := Maybe(nil, Atom("&", "AND"))
y := And(nil, Ident(), equal, maybeand, Ident())
text := []byte("a = &b")
root, _ := y(NewScanner(text))
nodes := root.([]ParsecNode)
t := nodes[0].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[1].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[2].([]ParsecNode)[0].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[3].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
text = []byte("a = b")
root, _ = y(NewScanner(text))
nodes = root.([]ParsecNode)
t = nodes[0].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
t = nodes[1].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
fmt.Println(nodes[2])
t = nodes[3].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
Output:
IDENT a
EQUAL =
AND &
IDENT b
IDENT a
EQUAL =
missing
IDENT b
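The placeholder behaviour can be modelled in isolation with standard-library Go. In this sketch `scanner`, `parser`, `literal`, `none`, and `maybe` are hypothetical names; `none` is only loosely modelled on MaybeNone, and the value "missing" is borrowed from the example output above.

```go
package main

import (
	"fmt"
	"strings"
)

type scanner struct {
	text   string
	cursor int
}

type parser func(s scanner) (interface{}, scanner)

// none is a placeholder node returned when the optional parser is absent.
type none string

// literal matches the exact string lit.
func literal(lit string) parser {
	return func(s scanner) (interface{}, scanner) {
		if strings.HasPrefix(s.text[s.cursor:], lit) {
			return lit, scanner{text: s.text, cursor: s.cursor + len(lit)}
		}
		return nil, s
	}
}

// maybe turns a failure of p into a successful placeholder node,
// leaving the cursor where it was.
func maybe(p parser) parser {
	return func(s scanner) (interface{}, scanner) {
		if node, next := p(s); node != nil {
			return node, next
		}
		return none("missing"), s
	}
}

func main() {
	amp := maybe(literal("&"))

	node, next := amp(scanner{text: "&b"})
	fmt.Println(node, next.cursor) // & 1

	node, next = amp(scanner{text: "b"})
	fmt.Println(node, next.cursor) // missing 0
}
```

Returning a non-nil placeholder is what lets an enclosing sequence keep going: the slot for the optional token is always filled, so later node positions stay stable.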
func NoEnd ¶
func NoEnd() Parser
NoEnd is a parser function that detects that the end of the scanner's input has not been reached. It returns a boolean as ParsecNode, hence it is incompatible with AST{}.
func Oct ¶
func Oct() Parser
Oct returns a parser function to match an octal number literal in the input stream. Skips leading whitespace.
func OrdChoice ¶
OrdChoice combinator accepts a list of `Parser`, or references to parsers, where at least one of the parsers must match the input string. It returns a parser function that can be used to construct higher-level parsers.
The first matching parser function's output is passed as the argument to the Nodify callback; that is, the []ParsecNode argument will have just one element in it. If none of the parsers match the input, OrdChoice fails without consuming any input.
Example ¶
// parse a boolean value
text := []byte(`true`)
y := OrdChoice(nil, Atom("true", "TRUE"), Atom("false", "FALSE"))
root, _ := y(NewScanner(text))
nodes := root.([]ParsecNode)
t := nodes[0].(*Terminal)
fmt.Println(t.GetName(), t.GetValue())
Output:
TRUE true
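Because the first matching alternative wins, the order of the arguments matters. The standard-library sketch below shows the effect with two overlapping alternatives. `scanner`, `parser`, `literal`, and `ordChoice` are hypothetical names, and unlike the package's combinator this toy returns the matching node directly instead of wrapping it in a one-element list.

```go
package main

import (
	"fmt"
	"strings"
)

type scanner struct {
	text   string
	cursor int
}

type parser func(s scanner) (interface{}, scanner)

// literal matches the exact string lit.
func literal(lit string) parser {
	return func(s scanner) (interface{}, scanner) {
		if strings.HasPrefix(s.text[s.cursor:], lit) {
			return lit, scanner{text: s.text, cursor: s.cursor + len(lit)}
		}
		return nil, s
	}
}

// ordChoice tries each parser from the same position, in order, and
// returns the result of the first one that matches.
func ordChoice(parsers ...parser) parser {
	return func(s scanner) (interface{}, scanner) {
		for _, p := range parsers {
			if node, next := p(s); node != nil {
				return node, next
			}
		}
		return nil, s // no alternative matched; nothing is consumed
	}
}

func main() {
	input := scanner{text: "=="}

	// The shorter alternative is listed first, so it shadows the longer one.
	node, next := ordChoice(literal("="), literal("=="))(input)
	fmt.Println(node, next.cursor) // = 1

	// Listing the longer alternative first matches the whole operator.
	node, next = ordChoice(literal("=="), literal("="))(input)
	fmt.Println(node, next.cursor) // == 2
}
```

When alternatives share a prefix, listing the longest one first is the usual way to avoid a premature match.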
func OrdTokens ¶
OrdTokens parses a single token based on one of the specified `patterns`. Skips leading whitespace.
func String ¶
func String() Parser
String parses a double-quoted string in the input text. This parser returns a string type as ParsecNode, hence it is incompatible with AST combinators. Skips leading whitespace.
func Token ¶
Token takes a regular-expression pattern and returns a parser that matches the input stream with the supplied pattern. Skips leading whitespace. `name` will be used as the Terminal's name.
func TokenExact ¶
TokenExact is the same as Token(), but the pattern is matched without skipping leading whitespace. `name` will be used as the Terminal's name.
type Queryable ¶
type Queryable interface {
    // GetName for the node.
    GetName() string
    // IsTerminal return true if node is a leaf node in syntax-tree.
    IsTerminal() bool
    // GetValue return parsed text, if node is NonTerminal it will
    // concat the entire sub-tree for parsed text and return the same.
    GetValue() string
    // GetChildren relevant only for NonTerminal node.
    GetChildren() []Queryable
    // GetPosition of the first terminal value in input.
    GetPosition() int
    // SetAttribute with a value string, can be called multiple times for the
    // same attrname.
    SetAttribute(attrname, value string) Queryable
    // GetAttribute for attrname, since more than one value can be set on the
    // attribute, return a slice of values.
    GetAttribute(attrname string) []string
    // GetAttributes return a map of all attributes set on this node.
    GetAttributes() map[string][]string
}
Queryable interface to be implemented by all nodes, both terminal and non-terminal nodes constructed using AST object.
type Scanner ¶
type Scanner interface {
    // SetWSPattern to configure white space pattern. Typically used as
    // scanner := NewScanner(input).SetWSPattern(" ")
    SetWSPattern(pattern string) Scanner
    // TrackLineno as cursor moves forward, this can slow down parsing.
    // Useful when developing with parsec package.
    TrackLineno() Scanner
    // Clone will return new clone of the underlying scanner structure.
    // This will be used by combinators to _backtrack_.
    Clone() Scanner
    // GetCursor gets the current cursor position inside input text.
    GetCursor() int
    // Match the input stream with `pattern` and return matching string
    // after advancing the scanner's cursor.
    Match(pattern string) ([]byte, Scanner)
    // MatchString matches the input stream with a simple string, rather
    // than a pattern. It should be more efficient. Return a bool indicating
    // if the match was successful after advancing the scanner's cursor.
    MatchString(string) (bool, Scanner)
    // SubmatchAll the input stream with a choice of `patterns`
    // and return matching string and submatches, after advancing the
    // Scanner's cursor.
    SubmatchAll(pattern string) (map[string][]byte, Scanner)
    // SkipWS skips white space characters in the input stream.
    // Return skipped whitespaces as byte-slice after advancing the
    // Scanner's cursor.
    SkipWS() ([]byte, Scanner)
    // SkipAny any occurrence of the elements of the slice.
    // Equivalent to Match(`(b[0]|b[1]|...|b[n])*`)
    // Returns Scanner after advancing its cursor.
    SkipAny(pattern string) ([]byte, Scanner)
    // Lineno return the current line-number of the cursor.
    Lineno() int
    // Endof detects whether end-of-file is reached in the input
    // stream and return a boolean indicating the same.
    Endof() bool
}
Scanner interface defines necessary methods to match the input stream.
func NewScanner ¶
NewScanner creates and returns a new instance of the SimpleScanner object.
type SimpleScanner ¶
type SimpleScanner struct {
// contains filtered or unexported fields
}
SimpleScanner implements the Scanner interface based on Go's regexp package.
func (*SimpleScanner) Clone ¶
func (s *SimpleScanner) Clone() Scanner
Clone implements the Scanner{} interface.
func (*SimpleScanner) Endof ¶
func (s *SimpleScanner) Endof() bool
Endof implements the Scanner{} interface.
func (*SimpleScanner) GetCursor ¶
func (s *SimpleScanner) GetCursor() int
GetCursor implements the Scanner{} interface.
func (*SimpleScanner) Lineno ¶
func (s *SimpleScanner) Lineno() int
Lineno implements the Scanner{} interface.
func (*SimpleScanner) Match ¶
func (s *SimpleScanner) Match(pattern string) ([]byte, Scanner)
Match implements the Scanner{} interface.
func (*SimpleScanner) MatchString ¶
func (s *SimpleScanner) MatchString(str string) (bool, Scanner)
MatchString implements the Scanner{} interface.
func (*SimpleScanner) SetWSPattern ¶
func (s *SimpleScanner) SetWSPattern(pattern string) Scanner
SetWSPattern implements the Scanner{} interface.
func (*SimpleScanner) SkipAny ¶
func (s *SimpleScanner) SkipAny(pattern string) ([]byte, Scanner)
SkipAny implements the Scanner{} interface.
func (*SimpleScanner) SkipWS ¶
func (s *SimpleScanner) SkipWS() ([]byte, Scanner)
SkipWS implements the Scanner{} interface.
func (*SimpleScanner) SkipWSUnicode ¶
func (s *SimpleScanner) SkipWSUnicode() ([]byte, Scanner)
SkipWSUnicode loops through runes, checking for whitespace.
func (*SimpleScanner) SubmatchAll ¶
func (s *SimpleScanner) SubmatchAll(patt string) (map[string][]byte, Scanner)
SubmatchAll implements the Scanner{} interface.
func (*SimpleScanner) TrackLineno ¶
func (s *SimpleScanner) TrackLineno() Scanner
TrackLineno implements the Scanner{} interface.
type Terminal ¶
type Terminal struct {
    Name       string // contains terminal's token type
    Value      string // value of the terminal
    Position   int    // Offset into the text stream where token was identified
    Attributes map[string][]string
}
Terminal type can be used to construct a terminal ParsecNode. It implements the Queryable interface, hence it can be used with the AST object.
func NewTerminal ¶
NewTerminal creates a new Terminal instance. Supply the name of the terminal, its matching text from the input stream as the value, and its position within the input stream.
func (*Terminal) GetAttribute ¶
GetAttribute implements the Queryable interface.
func (*Terminal) GetAttributes ¶
GetAttributes implements the Queryable interface.
func (*Terminal) GetChildren ¶
GetChildren implements the Queryable interface.
func (*Terminal) GetPosition ¶
GetPosition implements the Queryable interface.
func (*Terminal) IsTerminal ¶
IsTerminal implements the Queryable interface.
func (*Terminal) SetAttribute ¶
SetAttribute implements the Queryable interface.