Documentation ¶
Index ¶
- Constants
- func ExtractHex32n(src Source, max_chars int) (v uint32, overflow bool, n_chars int)
- func ExtractHex64n(src Source, max_chars int) (v uint64, overflow bool, n_chars int)
- func ExtractOct32n(src Source, max_chars int) (v uint32, overflow bool, n_chars int)
- func Static(buf []byte, lc *LineCol) *static_impl
- func Tokenize[T Key](buf []byte, bindings []*Binding[T], on_token func(k T, c *Context, lc LineCol)) error
- type Binding
- type Context
- type ErrAtLineCol
- type ErrCode
- func EOF(src Source, ctx *Context) ErrCode
- func EOL(src Source, ctx *Context) ErrCode
- func HexCodepoint_XXXX(src Source, ctx *Context) ErrCode
- func HexCodepoint_XXXXXXXX(src Source, ctx *Context) ErrCode
- func HexCodeunit_XX(src Source, ctx *Context) ErrCode
- func HexCodeunit_Xn(src Source, ctx *Context) ErrCode
- func OctCodeunit_X3n(src Source, ctx *Context) ErrCode
- type ErrContent
- type Key
- type LineCol
- type Location
- type Source
- type Term
- type TermFunc
- func AnyOf(args ...string) TermFunc
- func Between(prefix, terminator any, content ...any) TermFunc
- func Codepoint(r rune) TermFunc
- func CodepointFunc(m func(rune) bool) TermFunc
- func Escaped(prefix rune, escapers map[rune]any) TermFunc
- func FirstOf(args ...any) TermFunc
- func HexCodeunit_XXXX(first_prefix, second_prefix string) TermFunc
- func HexN[T unsigned](prefix string) TermFunc
- func Literal(s string) TermFunc
- func OneOrMore[T Term](a T) TermFunc
- func Optional[T Term](a T) TermFunc
- func Sequence(args ...any) TermFunc
- func Skip(content ...any) TermFunc
- func Uint[T unsigned | signed](prefix string, base uint, maxval T) TermFunc
- func ZeroOrMore[T Term](a T) TermFunc
Constants ¶
const (
	ErrCodeNone = ErrCode(iota)
	ErrCodeUnexpected
	ErrCodeExpected
	ErrCodeUnterminated
	ErrCodeIncomplete
	ErrCodeUnpaired
	ErrCodeInvalid
	ErrCodeOverflow
	ErrCodeUnmatched = ErrCode(-1)
)
const Unmatched = rune(0x7fffffff)
Functions ¶
func ExtractHex32n ¶
func ExtractHex64n ¶
func ExtractOct32n ¶
Types ¶
type ErrAtLineCol ¶
func (*ErrAtLineCol) Error ¶
func (e *ErrAtLineCol) Error() string
type ErrCode ¶
type ErrCode int
func HexCodepoint_XXXX ¶
HexCodepoint_XXXX captures four hexadecimal digits and interprets them as a UTF-16 codeunit. This codeunit is then converted to a UTF-8 sequence and inserted into the captured string.
Returned values are:
- `ErrCodeUnmatched` if src does not start with a hex digit
- `ErrCodeIncomplete` if src contains fewer than 4 hex digits
- `ErrCodeInvalid` if the decoded value is a surrogate
- `ErrCodeNone` if src contains 4 hex digits that represent a valid codepoint
If src contains more than 4 hex digits, this function consumes only the first 4 of them.
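The return values above can be illustrated with a standalone sketch. It does not use this package: `decodeHex4`, `hexVal`, and the three error variables are hypothetical names, and the number of bytes reported as consumed on failure is an assumption, since the documentation does not state it.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// Hypothetical stand-ins for ErrCodeUnmatched, ErrCodeIncomplete and ErrCodeInvalid.
var (
	errUnmatched  = errors.New("unmatched")
	errIncomplete = errors.New("incomplete")
	errInvalid    = errors.New("invalid")
)

// hexVal returns the numeric value of a hex digit, or -1 if c is not one.
func hexVal(c byte) int {
	switch {
	case '0' <= c && c <= '9':
		return int(c - '0')
	case 'a' <= c && c <= 'f':
		return int(c-'a') + 10
	case 'A' <= c && c <= 'F':
		return int(c-'A') + 10
	}
	return -1
}

// decodeHex4 reads exactly four hex digits from src, rejects surrogates,
// and appends the UTF-8 encoding of the resulting codepoint to dst.
// The second result is the number of bytes examined (an assumption of this sketch).
func decodeHex4(dst []byte, src string) ([]byte, int, error) {
	if len(src) == 0 || hexVal(src[0]) < 0 {
		return dst, 0, errUnmatched
	}
	var v rune
	for i := 0; i < 4; i++ {
		if i >= len(src) || hexVal(src[i]) < 0 {
			return dst, i, errIncomplete
		}
		v = v<<4 | rune(hexVal(src[i]))
	}
	if 0xD800 <= v && v <= 0xDFFF {
		return dst, 4, errInvalid
	}
	return utf8.AppendRune(dst, v), 4, nil
}

func main() {
	for _, in := range []string{"00e9xyz", "d800", "12", "xyz"} {
		out, n, err := decodeHex4(nil, in)
		fmt.Printf("%q: out=%q consumed=%d err=%v\n", in, out, n, err)
	}
}
```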
func HexCodepoint_XXXXXXXX ¶
HexCodepoint_XXXXXXXX captures 8 hexadecimal digits and interprets them as a UTF-32 codeunit. This codeunit is then converted to a UTF-8 sequence and inserted into the captured string. This function performs no validation and does not check for surrogates.
Returned values are:
- `ErrCodeUnmatched` if src does not start with a hex digit
- `ErrCodeIncomplete` if src contains fewer than 8 hex digits
- `ErrCodeNone` if src contains 8 hex digits
If src contains more than 8 hex digits, this function consumes only the first 8 of them.
func HexCodeunit_XX ¶
HexCodeunit_XX reads two hexadecimal digits from src and inserts the corresponding numeric value into the captured string as a UTF-8 codeunit. The codeunit is inserted as-is, without any validation.
Returned values are:
- `ErrCodeUnmatched` if src does not start with a hex digit
- `ErrCodeIncomplete` if src contains only one hex digit
- `ErrCodeNone` if src contains two hex digits
If src contains more than two hex digits, this function consumes only the first two of them.
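To show what "inserted as-is, without any validation" implies, here is a minimal sketch independent of this package. `appendHexByte` and `hexVal` are hypothetical helpers. A lone `ff` yields a byte that is not valid UTF-8 on its own, while `c3` followed by `a9` happens to form a valid two-byte sequence.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// hexVal returns the value of a hex digit and whether c was one.
func hexVal(c byte) (byte, bool) {
	switch {
	case '0' <= c && c <= '9':
		return c - '0', true
	case 'a' <= c && c <= 'f':
		return c - 'a' + 10, true
	case 'A' <= c && c <= 'F':
		return c - 'A' + 10, true
	}
	return 0, false
}

// appendHexByte reads exactly two hex digits from src and appends the byte
// they spell to dst. The byte is not checked against UTF-8 rules.
func appendHexByte(dst []byte, src string) ([]byte, bool) {
	if len(src) < 2 {
		return dst, false
	}
	hi, ok1 := hexVal(src[0])
	lo, ok2 := hexVal(src[1])
	if !ok1 || !ok2 {
		return dst, false
	}
	return append(dst, hi<<4|lo), true
}

func main() {
	buf, _ := appendHexByte(nil, "ff")
	fmt.Println(len(buf), utf8.Valid(buf)) // 1 false

	buf, _ = appendHexByte(nil, "c3")
	buf, _ = appendHexByte(buf, "a9")
	fmt.Println(string(buf), utf8.Valid(buf)) // é true
}
```

The caller is therefore responsible for making sure that consecutive escapes add up to well-formed UTF-8, if that matters for the token being captured.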
func HexCodeunit_Xn ¶
HexCodeunit_Xn reads hexadecimal digits from src and inserts the corresponding numeric value into the captured string as a UTF-8 codeunit. The codeunit is inserted as-is, without any validation.
Returned values are:
- `ErrCodeUnmatched` if src does not start with a hex digit
- `ErrCodeInvalid` if the obtained value exceeds 255
- `ErrCodeNone` if src contains a value in the [0..255] range
This function consumes all the hex digits, regardless of overflow.
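The "consume everything, then report overflow" behaviour can be sketched as follows. This is not the package's code: `scanHexByte` is a hypothetical helper whose result shape merely resembles the `(v, overflow, n_chars)` triple of the `ExtractHex32n` signature in the index.

```go
package main

import "fmt"

// hexVal returns the value of a hex digit and whether c was one.
func hexVal(c byte) (uint32, bool) {
	switch {
	case '0' <= c && c <= '9':
		return uint32(c - '0'), true
	case 'a' <= c && c <= 'f':
		return uint32(c-'a') + 10, true
	case 'A' <= c && c <= 'F':
		return uint32(c-'A') + 10, true
	}
	return 0, false
}

// scanHexByte consumes every leading hex digit of src. It returns the value,
// whether the value exceeded 255, and how many digits were consumed.
// On overflow the returned value is meaningless.
func scanHexByte(src string) (v byte, overflow bool, n int) {
	var acc uint32
	for n < len(src) {
		d, ok := hexVal(src[n])
		if !ok {
			break
		}
		acc = acc<<4 | d
		if acc > 0xFF {
			overflow = true
			acc = 0x100 // clamp so the accumulator cannot wrap around
		}
		n++
	}
	return byte(acc), overflow, n
}

func main() {
	for _, in := range []string{"41z", "1ffz", "z"} {
		v, overflow, n := scanHexByte(in)
		fmt.Printf("%q: v=%#x overflow=%v consumed=%d\n", in, v, overflow, n)
	}
}
```

Consuming the digits even on overflow keeps the cursor past the whole malformed escape, so an error can be reported once for the entire run of digits.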
func OctCodeunit_X3n ¶
OctCodeunit_X3n reads 1 to 3 octal digits from src and inserts the corresponding numeric value into the captured string as a UTF-8 codeunit. The codeunit is inserted as-is, without any validation.
Returned values are:
- `ErrCodeUnmatched` if src does not start with an octal digit
- `ErrCodeInvalid` if the obtained value exceeds 255
- `ErrCodeNone` if src contains a value in the [0..255] range
type ErrContent ¶
func Expected ¶
func Expected(v string) *ErrContent
func Invalid ¶
func Invalid(v string) *ErrContent
func Unexpected ¶
func Unexpected(v string) *ErrContent
func Unpaired ¶
func Unpaired(v string) *ErrContent
func Unterminated ¶
func Unterminated(v string) *ErrContent
func (*ErrContent) Error ¶
func (e *ErrContent) Error() string
type Location ¶
func (*Location) ColumnNumber ¶
type Source ¶
type Source interface {
	// Done indicates that there is no more content available in the input.
	Done() bool

	// Peek previews the codepoint without consuming it. Returns the Unmatched
	// sentinel if the source is at the end of input.
	Peek() rune

	// Hop consumes one codepoint if it matches c.
	Hop(c rune) bool

	// Leap consumes len(seq) bytes only if all the bytes match.
	Leap(seq string) bool

	// Fetch consumes and returns one codepoint if its value is matched by f.
	// Otherwise, it returns the Unmatched sentinel.
	Fetch(f func(rune) bool) rune

	// Skip consumes len(seq) bytes only if all the bytes match and the codepoint
	// that follows matches the term. This is similar to Leap followed by Fetch.
	Skip(seq string, term func(rune) bool) rune
}
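As an illustration of the contract described in the interface comments, here is a minimal in-memory cursor with the same method set. It is a sketch under assumptions: `byteSource`, `unmatched`, and `isHex` are hypothetical names, and the choice that `Skip` also consumes the codepoint matched by `term` is inferred from the "Leap followed by Fetch" remark rather than confirmed by the package.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// unmatched mirrors the documented Unmatched sentinel value.
const unmatched = rune(0x7fffffff)

// byteSource is a hypothetical cursor over an in-memory string.
type byteSource struct {
	buf string
	pos int
}

func (s *byteSource) Done() bool { return s.pos >= len(s.buf) }

func (s *byteSource) Peek() rune {
	if s.Done() {
		return unmatched
	}
	r, _ := utf8.DecodeRuneInString(s.buf[s.pos:])
	return r
}

func (s *byteSource) Hop(c rune) bool {
	if s.Done() {
		return false
	}
	r, size := utf8.DecodeRuneInString(s.buf[s.pos:])
	if r != c {
		return false
	}
	s.pos += size
	return true
}

func (s *byteSource) Leap(seq string) bool {
	if !strings.HasPrefix(s.buf[s.pos:], seq) {
		return false
	}
	s.pos += len(seq)
	return true
}

func (s *byteSource) Fetch(f func(rune) bool) rune {
	if s.Done() {
		return unmatched
	}
	r, size := utf8.DecodeRuneInString(s.buf[s.pos:])
	if !f(r) {
		return unmatched
	}
	s.pos += size
	return r
}

// Skip consumes seq plus the following codepoint, but only if both match.
// Nothing is consumed on failure (an assumption of this sketch).
func (s *byteSource) Skip(seq string, term func(rune) bool) rune {
	rest := s.buf[s.pos:]
	if !strings.HasPrefix(rest, seq) || len(rest) == len(seq) {
		return unmatched
	}
	r, size := utf8.DecodeRuneInString(rest[len(seq):])
	if !term(r) {
		return unmatched
	}
	s.pos += len(seq) + size
	return r
}

// isHex reports whether r is an ASCII hexadecimal digit.
func isHex(r rune) bool {
	return ('0' <= r && r <= '9') || ('a' <= r && r <= 'f') || ('A' <= r && r <= 'F')
}

func main() {
	src := &byteSource{buf: "0x1F rest"}
	first := src.Skip("0x", isHex) // consumes "0x1"
	second := src.Fetch(isHex)     // consumes "F"
	fmt.Printf("%c %c %v\n", first, second, src.Done()) // 1 F false
}
```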
type TermFunc ¶
func CodepointFunc ¶
func Escaped ¶
Escaped creates a matcher for escape sequences (typically found inside string literals).
Supported escaper types (assuming prefix is `\`):
- `struct{}`: self-mapping; `\z` -> `z`
- `byte`: maps to a byte; `\z` -> byte codeunit
- `rune`: maps to a rune; `\z` -> UTF-8 encoded codepoint
- `string`: maps to a string; `\z` -> literal string sequence
- `TermFunc`: uses the TermFunc; `\z...` -> invokes the TermFunc to decode `...`
A key in the supplied map may be specified as `Unmatched`.
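The dispatch over escaper types can be pictured as a type switch on the map value. The sketch below is not the package's implementation: `applyEscape`, `decoder`, and `caret` are hypothetical, and `decoder` stands in for `TermFunc`, whose real signature is not shown in this documentation.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// decoder is a hypothetical stand-in for a TermFunc-style escaper: it decodes
// from rest and reports the decoded bytes and how many input bytes it used.
type decoder func(rest string) (out []byte, n int, ok bool)

// caret decodes caret notation: the byte after the key is masked to a
// control character, so "A" becomes 0x01.
func caret(rest string) ([]byte, int, bool) {
	if len(rest) == 0 {
		return nil, 0, false
	}
	return []byte{rest[0] & 0x1f}, 1, true
}

// applyEscape decodes one escape sequence. key is the character that followed
// the prefix and rest is the input after key. It returns the extended output,
// the number of bytes of rest that were consumed, and whether key was known.
func applyEscape(dst []byte, key rune, rest string, escapers map[rune]any) ([]byte, int, bool) {
	e, found := escapers[key]
	if !found {
		return dst, 0, false
	}
	switch v := e.(type) {
	case struct{}: // self-mapping: \z -> z
		return utf8.AppendRune(dst, key), 0, true
	case byte: // raw code unit
		return append(dst, v), 0, true
	case rune: // codepoint, encoded as UTF-8
		return utf8.AppendRune(dst, v), 0, true
	case string: // literal replacement
		return append(dst, v...), 0, true
	case decoder: // delegate to a sub-decoder
		out, n, ok := v(rest)
		if !ok {
			return dst, 0, false
		}
		return append(dst, out...), n, true
	}
	return dst, 0, false
}

func main() {
	escapers := map[rune]any{
		'"': struct{}{},
		'n': byte('\n'),
		'e': rune(0x20AC),
		'T': "<tab>",
		'^': decoder(caret),
	}
	for _, key := range []rune{'"', 'n', 'e', 'T', '^', 'q'} {
		out, n, ok := applyEscape(nil, key, "A", escapers)
		fmt.Printf("\\%c -> %q (consumed %d, ok=%v)\n", key, out, n, ok)
	}
}
```

Note that a function literal stored in a `map[rune]any` keeps its unnamed function type, so the sketch converts it to `decoder` explicitly for the type switch to match.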
func HexCodeunit_XXXX ¶
HexCodeunit_XXXX is specialized for escape sequences that may decode into a UTF-16 surrogate pair, which in turn needs to be reassembled into a single codepoint. JSON is a good example.
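The reassembly step can be sketched with the standard `unicode/utf16` package. This is an illustration only: `decodeUTF16Escape` and `hex4` are hypothetical helpers, the input is assumed to start right after the first prefix, and the error values are stand-ins for `ErrCodeIncomplete` and `ErrCodeUnpaired`.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
	"unicode/utf16"
)

// Hypothetical stand-ins for ErrCodeIncomplete and ErrCodeUnpaired.
var (
	errIncomplete = errors.New("incomplete")
	errUnpaired   = errors.New("unpaired surrogate")
)

// hex4 parses exactly four hex digits at the start of s.
func hex4(s string) (rune, bool) {
	if len(s) < 4 {
		return 0, false
	}
	v, err := strconv.ParseUint(s[:4], 16, 16)
	if err != nil {
		return 0, false
	}
	return rune(v), true
}

// decodeUTF16Escape decodes "XXXX" or "XXXX<secondPrefix>XXXX" into one
// codepoint and reports how many bytes of src were used.
func decodeUTF16Escape(src, secondPrefix string) (rune, int, error) {
	hi, ok := hex4(src)
	if !ok {
		return 0, 0, errIncomplete
	}
	if !utf16.IsSurrogate(hi) {
		return hi, 4, nil // ordinary BMP codepoint, no pairing needed
	}
	rest := src[4:]
	if !strings.HasPrefix(rest, secondPrefix) {
		return 0, 4, errUnpaired
	}
	lo, ok := hex4(rest[len(secondPrefix):])
	if !ok {
		return 0, 4, errUnpaired
	}
	// DecodeRune returns U+FFFD when (hi, lo) is not a valid surrogate pair.
	r := utf16.DecodeRune(hi, lo)
	if r == '\uFFFD' {
		return 0, 4, errUnpaired
	}
	return r, 4 + len(secondPrefix) + 4, nil
}

func main() {
	// JSON spells U+1F600 as \ud83d\ude00; the leading \u is assumed consumed.
	for _, in := range []string{`d83d\ude00`, `0041`, `d83dxx`} {
		r, n, err := decodeUTF16Escape(in, `\u`)
		fmt.Printf("%s: U+%04X consumed=%d err=%v\n", in, r, n, err)
	}
}
```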