Documentation
Overview
Package parser implements simple, yet expressive mechanisms for combinatorial parsing in Go.
Example
package main

import (
	"fmt"
	"os"
	"strconv"

	"github.com/FollowTheProcess/parser"
)

// RGB represents a colour.
type RGB struct {
	Red   int
	Green int
	Blue  int
}

// fromHex converts a hexadecimal string into its integer value.
func fromHex(s string) (int, error) {
	hx, err := strconv.ParseInt(s, 16, 64)
	return int(hx), err
}

// hexPair is a parser that converts a two-character hex string into its integer value.
func hexPair(colour string) (int, string, error) {
	return parser.Map(
		parser.Take(2),
		fromHex,
	)(colour)
}

func main() {
	// Let's parse this into an RGB
	colour := "#2F14DF"

	// We don't actually care about the #
	_, colour, err := parser.Char('#')(colour)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}

	// We want 3 hex pairs
	pairs, _, err := parser.Count(hexPair, 3)(colour)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}

	// err is nil at this point, so report the length mismatch explicitly
	if len(pairs) != 3 {
		fmt.Fprintf(os.Stderr, "expected 3 hex pairs, got %d\n", len(pairs))
		return
	}

	rgb := RGB{
		Red:   pairs[0],
		Green: pairs[1],
		Blue:  pairs[2],
	}

	fmt.Printf("%#v\n", rgb)
}
Output: parser_test.RGB{Red:47, Green:20, Blue:223}
Index
- type Parser
- func AnyOf(chars string) Parser[string]
- func Chain[T any](parsers ...Parser[T]) Parser[[]T]
- func Char(char rune) Parser[string]
- func Count[T any](parser Parser[T], count int) Parser[[]T]
- func Exact(match string) Parser[string]
- func ExactCaseInsensitive(match string) Parser[string]
- func Map[T1, T2 any](parser Parser[T1], fn func(T1) (T2, error)) Parser[T2]
- func NoneOf(chars string) Parser[string]
- func NotAnyOf(chars string) Parser[string]
- func OneOf(chars string) Parser[string]
- func Optional(match string) Parser[string]
- func Take(n int) Parser[string]
- func TakeTo(match string) Parser[string]
- func TakeUntil(predicate func(r rune) bool) Parser[string]
- func TakeWhile(predicate func(r rune) bool) Parser[string]
- func TakeWhileBetween(lower, upper int, predicate func(r rune) bool) Parser[string]
- func Try[T any](parsers ...Parser[T]) Parser[T]
Constants
This section is empty.
Variables
This section is empty.
Functions
This section is empty.
Types
type Parser
Parser is the core parsing function type that all parser functions in this package return. Parsers can be combined and composed to parse complex grammars.
Each Parser is generic over a type T and returns the parsed value, the remaining unparsed input, and an error.
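The description above, together with the way parsers are called in the examples on this page (for instance hexPair), suggests a function shape along the lines of func(string) (T, string, error). The sketch below uses a locally defined stand-in type, so its exact declaration is an assumption rather than a copy of the package's source. The digits function is a hypothetical hand-written parser, included only to show that any function of this shape can act as a parser.

```go
package main

import (
	"errors"
	"fmt"
)

// Parser is a local stand-in for the package's type. Its declaration is an
// assumption inferred from how parsers are called on this page.
type Parser[T any] func(input string) (value T, remainder string, err error)

// digits is a hypothetical parser that consumes leading ASCII digits.
func digits(input string) (string, string, error) {
	end := 0
	for end < len(input) && input[end] >= '0' && input[end] <= '9' {
		end++
	}
	if end == 0 {
		return "", "", errors.New("digits: no digits at start of input")
	}
	return input[:end], input[end:], nil
}

func main() {
	// Any function with the right shape can be used as a Parser.
	var p Parser[string] = digits

	value, remainder, err := p("2024-01-15")
	fmt.Printf("%q %q %v\n", value, remainder, err) // "2024" "-01-15" <nil>
}
```

Because a parser is just a function, the value and remainder from one call can be fed straight into the next, which is what the combinators documented below build on.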
func AnyOf
AnyOf returns a Parser that continues taking characters so long as they are contained in the passed-in set of chars.
Parsing stops at the first character not contained in chars; that character is not included in the parsed value, but will be in the remainder.
AnyOf is the opposite of NotAnyOf.
If the input or chars is empty, an error will be returned. Likewise if none of the chars are present at the start of the input.
Example
package main

import (
	"fmt"
	"os"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "DEADBEEF and the rest"

	chars := "1234567890ABCDEF" // Any hexadecimal digit

	value, remainder, err := parser.AnyOf(chars)(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %q\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: "DEADBEEF"
Remainder: " and the rest"
func Chain (added in v0.3.0)
Chain returns a Parser that calls a series of sub-parsers, passing the remainder from one as input to the next. It returns a slice of values, one from each parser, along with any input remaining after all the parsers have been applied.
If any of the parsers fail, an error will be returned.
Note: Because Chain takes a variadic argument and returns a slice, it is one of the few parser functions that allocates on the heap.
Example
package main

import (
	"fmt"
	"os"
	"unicode"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "1234abcd\t\n日ð本rest..."

	value, remainder, err := parser.Chain(
		// Can do this in a number of ways, but here's one!
		parser.TakeWhile(unicode.IsDigit),
		parser.Exact("abcd"),
		parser.TakeWhile(unicode.IsSpace),
		parser.Char('日'),
		parser.Char('ð'),
		parser.Char('本'),
	)(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %#v\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: []string{"1234", "abcd", "\t\n", "日", "ð", "本"}
Remainder: "rest..."
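To make the "remainder of one parser becomes the input of the next" behaviour concrete, here is a minimal sketch of how such a combinator could be written. It is an illustrative re-implementation and not the package's code: the Parser type is a local stand-in whose shape is inferred from this page, and exact and chain are hypothetical helper names.

```go
package main

import (
	"fmt"
	"strings"
)

// Parser is a local stand-in; its shape is inferred from this page.
type Parser[T any] func(input string) (T, string, error)

// exact is a hypothetical minimal parser that matches a literal prefix.
func exact(match string) Parser[string] {
	return func(input string) (string, string, error) {
		if match == "" || !strings.HasPrefix(input, match) {
			return "", "", fmt.Errorf("exact: %q not found at start of input", match)
		}
		return match, input[len(match):], nil
	}
}

// chain applies each parser in turn, threading the remainder through.
func chain[T any](parsers ...Parser[T]) Parser[[]T] {
	return func(input string) ([]T, string, error) {
		values := make([]T, 0, len(parsers)) // the slice is why this allocates
		rest := input
		for _, p := range parsers {
			value, remainder, err := p(rest)
			if err != nil {
				return nil, "", err // the first failure aborts the whole chain
			}
			values = append(values, value)
			rest = remainder
		}
		return values, rest, nil
	}
}

func main() {
	values, rest, err := chain(exact("GET"), exact(" "), exact("/index"))("GET /index HTTP/1.1")
	fmt.Printf("%q %q %v\n", values, rest, err) // ["GET" " " "/index"] " HTTP/1.1" <nil>
}
```

All sub-parsers must produce the same type T, which is why the example above collects every piece as a string.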
func Char
Char returns a Parser that consumes a single exact, case-sensitive utf-8 character from the input.
If the first char in the input is not the requested char, an error will be returned.
Example
package main

import (
	"fmt"
	"os"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "X marks the spot!"

	value, remainder, err := parser.Char('X')(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %q\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: "X"
Remainder: " marks the spot!"
func Count
Count returns a Parser that applies another parser a certain number of times, returning the values in a slice along with any remaining input.
If the parser fails or the input is exhausted before the parser has been applied the requested number of times, an error will be returned.
Example
package main

import (
	"fmt"
	"os"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "12345678rest..." // Pairs of digits with a bit on the end

	value, remainder, err := parser.Count(parser.Take(2), 4)(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %#v\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: []string{"12", "34", "56", "78"}
Remainder: "rest..."
func Exact
Exact returns a Parser that consumes an exact, case-sensitive string from the input.
If the string is not present at the beginning of the input, an error will be returned.
An empty match string or empty input (i.e. "") will also return an error.
Exact is case-sensitive; if you need a case-insensitive match, use ExactCaseInsensitive instead.
Example
package main

import (
	"fmt"
	"os"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "General Kenobi! You are a bold one."

	value, remainder, err := parser.Exact("General Kenobi!")(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %q\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: "General Kenobi!"
Remainder: " You are a bold one."
func ExactCaseInsensitive
ExactCaseInsensitive returns a Parser that consumes an exact, case-insensitive string from the input.
If the string is not present at the beginning of the input, an error will be returned.
An empty match string or empty input (i.e. "") will also return an error.
ExactCaseInsensitive is case-insensitive; if you need a case-sensitive match, use Exact instead.
Example
package main

import (
	"fmt"
	"os"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "GENERAL KENOBI! YOU ARE A BOLD ONE."

	value, remainder, err := parser.ExactCaseInsensitive("GEnErAl KeNobI!")(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %q\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: "GENERAL KENOBI!"
Remainder: " YOU ARE A BOLD ONE."
func Map
Map returns a Parser that applies a function to the result of another parser.
It is particularly useful for parsing a section of string input, then converting that captured string to another type.
If either the provided parser or the mapping function fn returns an error, Map passes that error up to the caller.
Example
package main

import (
	"fmt"
	"os"
	"strconv"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "27 <- this is a number" // Let's convert it to an int!

	value, remainder, err := parser.Map(parser.Take(2), strconv.Atoi)(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value %[1]d is type %[1]T\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value 27 is type int
Remainder: " <- this is a number"
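The two failure paths described for Map (the inner parser failing, or the conversion failing) can be sketched as follows. This is an illustrative re-implementation under assumptions, not the package's code: Parser is a local stand-in type, and take2, mapParser and parseTwoDigitInt are hypothetical names.

```go
package main

import (
	"fmt"
	"strconv"
)

// Parser is a local stand-in; its shape is inferred from this page.
type Parser[T any] func(input string) (T, string, error)

// take2 is a hypothetical parser that consumes the first two bytes.
func take2(input string) (string, string, error) {
	if len(input) < 2 {
		return "", "", fmt.Errorf("take2: input %q is too short", input)
	}
	return input[:2], input[2:], nil
}

// mapParser runs p, then converts its value with fn.
func mapParser[T1, T2 any](p Parser[T1], fn func(T1) (T2, error)) Parser[T2] {
	return func(input string) (T2, string, error) {
		var zero T2
		value, remainder, err := p(input)
		if err != nil {
			return zero, "", err // the inner parser failed
		}
		mapped, err := fn(value)
		if err != nil {
			return zero, "", err // the conversion failed
		}
		return mapped, remainder, nil
	}
}

// parseTwoDigitInt composes the two helpers above.
func parseTwoDigitInt(input string) (int, string, error) {
	return mapParser(Parser[string](take2), strconv.Atoi)(input)
}

func main() {
	value, rest, err := parseTwoDigitInt("27 <- this is a number")
	fmt.Printf("%d %q %v\n", value, rest, err) // 27 " <- this is a number" <nil>

	_, _, err = parseTwoDigitInt("zz is not a number")
	fmt.Println(err != nil) // true: strconv.Atoi rejected "zz"
}
```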
func NoneOf
NoneOf returns a Parser that recognises a single char at the start of input, provided it is not one of the given characters.
It can be considered the opposite of OneOf.
If the input or chars is empty, an error will be returned. Likewise if the first char of the input is one of the chars.
Example
package main

import (
	"fmt"
	"os"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "abcdefg"

	chars := "xyz" // Match anything other than 'x', 'y', or 'z' from input

	value, remainder, err := parser.NoneOf(chars)(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %q\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: "a"
Remainder: "bcdefg"
func NotAnyOf
NotAnyOf returns a Parser that continues taking characters so long as they are not contained in the passed in set of chars.
Parsing stops at the first occurrence of a character contained in the argument and the offending character is not included in the parsed value, but will be in the remainder.
NotAnyOf is the opposite of AnyOf.
If the input or chars is empty, an error will be returned. Likewise if any of the chars are present at the start of the input.
Example
package main

import (
	"fmt"
	"os"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "69 is a number"

	chars := "abcdefghijklmnopqrstuvwxyz" // Parse until we hit any lowercase letter

	value, remainder, err := parser.NotAnyOf(chars)(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %q\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: "69 "
Remainder: "is a number"
func OneOf
OneOf returns a Parser that recognises one of the provided characters from the start of input.
If you want to match anything other than the provided char set, use NoneOf.
If the input or chars is empty, an error will be returned. Likewise if none of the chars was recognised.
Example
package main

import (
	"fmt"
	"os"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "abcdefg"

	chars := "abc" // Match any of 'a', 'b', or 'c' from input

	value, remainder, err := parser.OneOf(chars)(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %q\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: "a"
Remainder: "bcdefg"
func Optional (added in v0.2.0)
Optional returns a Parser that recognises an optional exact string from the start of input.
If the match is present, it is returned as the value and the rest of the input is returned as the remainder. If the match is not present, the entire input is returned as the remainder, with no value and no error.
If the input is empty or invalid utf-8, then an error will be returned.
Example
package main

import (
	"fmt"
	"os"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "12.6.7-rc.2" // A semver, but could have an optional v

	// Doesn't matter...
	value, remainder, err := parser.Optional("v")(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %q\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: ""
Remainder: "12.6.7-rc.2"
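The example above only exercises the "match absent" path. The sketch below lays out all three outcomes described for Optional (match present, match absent, and error) using a hypothetical stand-alone helper named optional. It is written from the prose description only and is not the package's implementation.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode/utf8"
)

// optional is a hypothetical helper mirroring the documented behaviour.
func optional(input, match string) (value, remainder string, err error) {
	if input == "" {
		return "", "", errors.New("optional: input is empty")
	}
	if !utf8.ValidString(input) {
		return "", "", errors.New("optional: input is not valid utf-8")
	}
	if strings.HasPrefix(input, match) {
		// Match present: consume it.
		return match, input[len(match):], nil
	}
	// Match absent: not an error, the whole input is handed back.
	return "", input, nil
}

func main() {
	for _, version := range []string{"v12.6.7-rc.2", "12.6.7-rc.2"} {
		value, remainder, err := optional(version, "v")
		fmt.Printf("%q %q %v\n", value, remainder, err)
	}
	// Prints:
	// "v" "12.6.7-rc.2" <nil>
	// "" "12.6.7-rc.2" <nil>
}
```

Either way the remainder is the same, so a following parser does not need to care whether the prefix was there.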
func Take
Take returns a Parser that consumes n utf-8 chars from the input.
If n is less than or equal to 0, or greater than the number of utf-8 chars in the input, an error will be returned.
Example
package main

import (
	"fmt"
	"os"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "Hello I am some input for you to parser"

	value, remainder, err := parser.Take(10)(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %q\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: "Hello I am"
Remainder: " some input for you to parser"
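Take counts utf-8 chars rather than bytes, which matters as soon as the input contains multi-byte characters. The sketch below shows one way to count characters while slicing on byte offsets. The takeChars helper is hypothetical and only illustrates the idea; it is not the package's implementation.

```go
package main

import "fmt"

// takeChars is a hypothetical helper that splits input after n characters.
func takeChars(input string, n int) (string, string, error) {
	if n <= 0 {
		return "", "", fmt.Errorf("takeChars: n must be positive, got %d", n)
	}
	count := 0
	// Ranging over a string yields the byte offset of each character.
	for i := range input {
		if count == n {
			return input[:i], input[i:], nil
		}
		count++
	}
	if count == n {
		return input, "", nil // exactly n characters, nothing left over
	}
	return "", "", fmt.Errorf("takeChars: wanted %d chars, input has only %d", n, count)
}

func main() {
	input := "日本語abc"

	value, remainder, err := takeChars(input, 2)
	fmt.Printf("%q %q %v\n", value, remainder, err) // "日本" "語abc" <nil>

	// Slicing by bytes instead would cut the first character in half.
	fmt.Printf("%q\n", input[:2])
}
```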
func TakeTo
TakeTo returns a Parser that consumes characters until it first hits an exact string.
If the input is empty or the exact string is not in the input, an error will be returned.
The value will contain everything from the start of the input up to the first occurrence of match, and the remainder will contain the match and everything thereafter.
Example
package main

import (
	"fmt"
	"os"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "lots of stuff KEYWORD more stuff"

	value, remainder, err := parser.TakeTo("KEYWORD")(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %q\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: "lots of stuff "
Remainder: "KEYWORD more stuff"
func TakeUntil
TakeUntil returns a Parser that continues taking characters until the predicate returns true. Parsing stops as soon as the predicate returns true for a particular character. The last character for which the predicate returns false is captured; that is, TakeUntil is inclusive.
TakeUntil can be thought of as the inverse of TakeWhile.
If the input is empty or predicate == nil, an error will be returned.
If the predicate never returns true, the entire input will be returned as the value with no remainder.
If the predicate never returns false, an error will be returned.
Example
package main

import (
	"fmt"
	"os"
	"unicode"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "something <- first whitespace is here"

	value, remainder, err := parser.TakeUntil(unicode.IsSpace)(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %q\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: "something"
Remainder: " <- first whitespace is here"
func TakeWhile
TakeWhile returns a Parser that continues consuming characters so long as the predicate returns true. Parsing stops as soon as the predicate returns false for a particular character. The last character for which the predicate returns true is captured; that is, TakeWhile is inclusive.
TakeWhile can be thought of as the inverse of TakeUntil.
If the input is empty or predicate == nil, an error will be returned.
If the predicate doesn't return false for any char in the input, the entire input is returned as the value with no remainder.
If the predicate never returns true, an error will be returned.
Example
package main

import (
	"fmt"
	"os"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "本本本b語ç日ð本Ê語"

	pred := func(r rune) bool { return r == '本' }

	value, remainder, err := parser.TakeWhile(pred)(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %q\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: "本本本"
Remainder: "b語ç日ð本Ê語"
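The "inverse" relationship between TakeWhile and TakeUntil can be made concrete by writing one in terms of the other with a negated predicate. The sketch below does this with hypothetical helpers takeWhile and takeUntil, written from the prose descriptions only; the package itself may implement the two separately.

```go
package main

import (
	"errors"
	"fmt"
	"unicode"
)

// takeWhile is a hypothetical helper mirroring the documented behaviour.
func takeWhile(input string, pred func(rune) bool) (string, string, error) {
	if input == "" {
		return "", "", errors.New("takeWhile: input is empty")
	}
	if pred == nil {
		return "", "", errors.New("takeWhile: predicate is nil")
	}
	end := len(input) // if pred never returns false, take everything
	for i, r := range input {
		if !pred(r) {
			end = i
			break
		}
	}
	if end == 0 {
		return "", "", errors.New("takeWhile: predicate never returned true")
	}
	return input[:end], input[end:], nil
}

// takeUntil is takeWhile with the predicate negated.
func takeUntil(input string, pred func(rune) bool) (string, string, error) {
	if pred == nil {
		return "", "", errors.New("takeUntil: predicate is nil")
	}
	return takeWhile(input, func(r rune) bool { return !pred(r) })
}

func main() {
	word, rest, _ := takeUntil("something <- here", unicode.IsSpace)
	fmt.Printf("%q %q\n", word, rest) // "something" " <- here"

	spaces, rest, _ := takeWhile(rest, unicode.IsSpace)
	fmt.Printf("%q %q\n", spaces, rest) // " " "<- here"
}
```

Under this formulation the documented error cases line up: a takeUntil predicate that is true for the first character consumes nothing and fails, just as a takeWhile predicate that is false for the first character does.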
func TakeWhileBetween
TakeWhileBetween returns a Parser that recognises the longest (lower <= len <= upper) sequence of utf-8 characters for which the predicate returns true.
Any of the following conditions will return an error:
- input is empty
- input is not valid utf-8
- predicate is nil
- lower < 0
- lower > upper
- predicate never returns true
- predicate matched some chars, but fewer than the lower limit
Example
package main

import (
	"fmt"
	"os"
	"strconv"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "2F14DF" // A hex colour (minus the #)

	isHexDigit := func(r rune) bool {
		_, err := strconv.ParseUint(string(r), 16, 64)
		return err == nil
	}

	value, remainder, err := parser.TakeWhileBetween(2, 2, isHexDigit)(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %q\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: "2F"
Remainder: "14DF"
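The error conditions listed for TakeWhileBetween map fairly directly onto a series of guard checks followed by a bounded scan. The sketch below is one possible arrangement, written from that list alone. The takeWhileBetween and asciiDigit names are hypothetical, and the real function's behaviour at edge cases such as upper == 0 is not covered by the description, so it is not modelled faithfully here.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// takeWhileBetween is a hypothetical helper, not the package's implementation.
func takeWhileBetween(input string, lower, upper int, pred func(rune) bool) (string, string, error) {
	switch {
	case input == "":
		return "", "", errors.New("input is empty")
	case !utf8.ValidString(input):
		return "", "", errors.New("input is not valid utf-8")
	case pred == nil:
		return "", "", errors.New("predicate is nil")
	case lower < 0:
		return "", "", errors.New("lower < 0")
	case lower > upper:
		return "", "", errors.New("lower > upper")
	}

	matched := 0      // chars accepted so far
	end := len(input) // byte offset where the match stops
	for i, r := range input {
		if matched == upper || !pred(r) {
			end = i
			break
		}
		matched++
	}

	if matched == 0 {
		return "", "", errors.New("predicate never returned true")
	}
	if matched < lower {
		return "", "", fmt.Errorf("matched %d chars, fewer than lower limit %d", matched, lower)
	}
	return input[:end], input[end:], nil
}

// asciiDigit is a hypothetical predicate used for the demonstration.
func asciiDigit(r rune) bool { return r >= '0' && r <= '9' }

func main() {
	value, remainder, err := takeWhileBetween("123456", 1, 4, asciiDigit)
	fmt.Printf("%q %q %v\n", value, remainder, err) // "1234" "56" <nil>

	_, _, err = takeWhileBetween("12ab", 3, 5, asciiDigit)
	fmt.Println(err) // matched 2 chars, fewer than lower limit 3
}
```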
func Try
Try returns a Parser that attempts a series of sub-parsers, returning the output from the first successful one.
If all parsers fail, an error will be returned.
Note: Because Try takes a variadic argument, it is one of the few parser functions that allocates on the heap.
Example
package main

import (
	"fmt"
	"os"

	"github.com/FollowTheProcess/parser"
)

func main() {
	input := "xyzabc日ð本Ê語"

	value, remainder, err := parser.Try(
		parser.OneOf("abc"),                 // Will fail
		parser.Char('本'),                    // Same
		parser.ExactCaseInsensitive("XyZ"), // Should succeed, this is the output we'll get
	)(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}

	fmt.Printf("Value: %q\n", value)
	fmt.Printf("Remainder: %q\n", remainder)
}
Output:
Value: "xyz"
Remainder: "abc日ð本Ê語"
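A "first successful alternative" combinator like the one described for Try can be sketched in a few lines. This is an illustrative re-implementation, not the package's code: Parser is a local stand-in type whose shape is inferred from this page, and exact and firstOf are hypothetical names. The key point is that every alternative is given the same original input.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Parser is a local stand-in; its shape is inferred from this page.
type Parser[T any] func(input string) (T, string, error)

// exact is a hypothetical minimal parser that matches a literal prefix.
func exact(match string) Parser[string] {
	return func(input string) (string, string, error) {
		if match == "" || !strings.HasPrefix(input, match) {
			return "", "", fmt.Errorf("exact: %q not found at start of input", match)
		}
		return match, input[len(match):], nil
	}
}

// firstOf returns the result of the first parser that succeeds.
func firstOf[T any](parsers ...Parser[T]) Parser[T] {
	return func(input string) (T, string, error) {
		var zero T
		for _, p := range parsers {
			// Each alternative starts from the original input.
			value, remainder, err := p(input)
			if err == nil {
				return value, remainder, nil
			}
		}
		return zero, "", errors.New("firstOf: all parsers failed")
	}
}

func main() {
	boolean := firstOf(exact("true"), exact("false"))

	value, remainder, err := boolean("false, 1")
	fmt.Printf("%q %q %v\n", value, remainder, err) // "false" ", 1" <nil>

	_, _, err = boolean("maybe")
	fmt.Println(err) // firstOf: all parsers failed
}
```

Order matters with this design: if two alternatives could both match, the earlier one wins.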