jtree

v0.0.0-...-8c49b5b

This package is not in the latest version of its module.
Published: May 25, 2024 License: BSD-3-Clause Imports: 10 Imported by: 0

README

jtree

This repository defines a Go module that implements a streaming JSON scanner and parser.

The jwcc package implements the JSON With Commas and Comments (JWCC) extension, using the same underlying parsing machinery.

Documentation

Overview

Package jtree implements a JSON scanner and parser.

Scanning

The Scanner type implements a lexical scanner for JSON. Construct a scanner from an io.Reader and call its Next method to iterate over the stream. Next advances to the next input token and returns nil, or reports an error:

s := jtree.NewScanner(input)
for s.Next() == nil {
   log.Printf("Next token: %v", s.Token())
}

Next returns io.EOF when the input has been fully consumed. Any other error indicates an I/O or lexical error in the input.

if s.Err() != io.EOF {
   log.Fatalf("Scanning failed: %v", s.Err())
}

Streaming

The Stream type implements an event-driven stream parser for JSON. The parser works by calling methods on a Handler value to report the structure of the input. In case of a syntax error, parsing stops and an error of concrete type *jtree.SyntaxError is returned.

Construct a Stream from an io.Reader, and call its Parse method. Parse returns nil if the input was fully processed without error. If a Handler method reports an error, parsing stops and that error is returned.

s := jtree.NewStream(input)
if err := s.Parse(handler); err != nil {
   log.Fatalf("Parse failed: %v", err)
}

To parse a single value from the front of the input, call ParseOne. This method returns io.EOF if no further values are available:

if err := s.ParseOne(handler); err == io.EOF {
   log.Print("No more input")
} else if err != nil {
   log.Printf("ParseOne failed: %v", err)
}

Handlers

The Handler interface accepts parser events from a Stream. The methods of a handler correspond to the syntax of JSON values:

JSON type  | Methods                   | Description
---------- | ------------------------- | ---------------------------------
object     | BeginObject, EndObject    | { ... }
array      | BeginArray, EndArray      | [ ... ]
member     | BeginMember, EndMember    | "key": value
value      | Value                     | true, false, null, number, string
--         | EndOfInput                | end of input

Each method is passed an Anchor value that can be used to retrieve location and type information. See the comments on the Handler type for the meaning of each method's anchor value. The Anchor passed to a handler method is only valid for the duration of that method call; the handler must copy any data it needs to retain beyond the lifetime of the call.

The parser ensures that corresponding Begin and End methods are correctly paired, or that a SyntaxError is reported.
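
As a rough illustration of the event model in the table above, the following sketch implements a handler that tracks nesting depth and counts scalar values. To stay self-contained it declares a reduced local stand-in for the Anchor interface rather than importing the module, and the simulate function replays by hand the events a parser would plausibly deliver for {"a": [1, 2]}. The names depthTracker, fakeAnchor, and simulate are hypothetical, and the exact event order is an assumption based on the descriptions above.

```go
package main

import "fmt"

// Anchor is a reduced local stand-in for the documented jtree.Anchor
// interface, so that this sketch compiles without the module.
type Anchor interface {
	Text() []byte
}

// fakeAnchor is a hypothetical anchor carrying only its raw text.
type fakeAnchor string

func (f fakeAnchor) Text() []byte { return []byte(f) }

// depthTracker records the deepest object/array nesting seen so far and
// counts the scalar values reported.
type depthTracker struct {
	depth, maxDepth, values int
}

func (d *depthTracker) push() error {
	d.depth++
	if d.depth > d.maxDepth {
		d.maxDepth = d.depth
	}
	return nil
}

func (d *depthTracker) pop() error { d.depth--; return nil }

// The method set mirrors the documented Handler interface.
func (d *depthTracker) BeginObject(Anchor) error { return d.push() }
func (d *depthTracker) EndObject(Anchor) error   { return d.pop() }
func (d *depthTracker) BeginArray(Anchor) error  { return d.push() }
func (d *depthTracker) EndArray(Anchor) error    { return d.pop() }
func (d *depthTracker) BeginMember(Anchor) error { return nil }
func (d *depthTracker) EndMember(Anchor) error   { return nil }
func (d *depthTracker) Value(Anchor) error       { d.values++; return nil }
func (d *depthTracker) EndOfInput(Anchor)        {}

// simulate replays, by hand, an assumed event sequence for {"a": [1, 2]}.
func simulate() *depthTracker {
	d := new(depthTracker)
	d.BeginObject(fakeAnchor("{"))
	d.BeginMember(fakeAnchor(`"a"`))
	d.BeginArray(fakeAnchor("["))
	d.Value(fakeAnchor("1"))
	d.Value(fakeAnchor("2"))
	d.EndArray(fakeAnchor("]"))
	d.EndMember(fakeAnchor("}")) // the member is terminated by the closing brace
	d.EndObject(fakeAnchor("}"))
	d.EndOfInput(fakeAnchor(""))
	return d
}

func main() {
	d := simulate()
	fmt.Println("max depth:", d.maxDepth, "values:", d.values, "open:", d.depth)
}
```

With the real module, a value like this would be passed to Parse or ParseOne, with method signatures taking jtree.Anchor.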

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func ParseFloat

func ParseFloat(text []byte, bitSize int) (float64, error)

ParseFloat behaves as strconv.ParseFloat, but does not copy its argument.

func ParseInt

func ParseInt(text []byte, base, bitSize int) (int64, error)

ParseInt behaves as strconv.ParseInt, but does not copy its argument.
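
For comparison, the conventional standard-library approach converts the byte slice to a string before parsing, which generally copies the data. The sketch below shows that baseline; the helper names parseFloatCopying and parseIntCopying are hypothetical. Per the descriptions above, jtree.ParseFloat and jtree.ParseInt are meant to yield the same results without that copy.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseFloatCopying converts text to a string (typically a copy) and
// delegates to strconv.ParseFloat.
func parseFloatCopying(text []byte, bitSize int) (float64, error) {
	return strconv.ParseFloat(string(text), bitSize)
}

// parseIntCopying converts text to a string (typically a copy) and
// delegates to strconv.ParseInt.
func parseIntCopying(text []byte, base, bitSize int) (int64, error) {
	return strconv.ParseInt(string(text), base, bitSize)
}

func main() {
	f, err := parseFloatCopying([]byte("3.25"), 64)
	fmt.Println(f, err)

	i, err := parseIntCopying([]byte("-42"), 10, 64)
	fmt.Println(i, err)
}
```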

func Quote

func Quote(src string) string

Quote encodes src as a JSON string value. The contents are escaped and double quotation marks are added.

func Unquote

func Unquote(src []byte) ([]byte, error)

Unquote decodes a JSON string value. Double quotation marks are removed, and escape sequences are replaced with their unescaped equivalents.

Invalid escapes are replaced by the Unicode replacement rune. Unquote reports an error for an incomplete escape sequence.

func UnquoteString

func UnquoteString(src string) ([]byte, error)

UnquoteString decodes a JSON string value. Double quotation marks are removed, and escape sequences are replaced with their unescaped equivalents.

Invalid escapes are replaced by the Unicode replacement rune. UnquoteString reports an error for an incomplete escape sequence.
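
To make the quoting round trip concrete, here is a sketch that uses encoding/json from the standard library as a stand-in. It is not the jtree implementation: the helpers quoteStd and unquoteStd are hypothetical, and details may differ (for example, which characters are escaped, and the fact that encoding/json rejects invalid escapes rather than substituting the replacement rune).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// quoteStd encodes src as a JSON string value using encoding/json.
func quoteStd(src string) string {
	b, err := json.Marshal(src)
	if err != nil {
		panic(err) // not expected when marshaling a plain string
	}
	return string(b)
}

// unquoteStd decodes a JSON string value using encoding/json.
func unquoteStd(src []byte) (string, error) {
	var s string
	err := json.Unmarshal(src, &s)
	return s, err
}

func main() {
	in := "say \"hi\"\n"
	q := quoteStd(in)
	fmt.Println(q)

	out, err := unquoteStd([]byte(q))
	fmt.Println(out == in, err)
}
```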

Types

type Anchor

type Anchor interface {
	Token() Token       // Returns the token type of the anchor
	Text() []byte       // Returns a view of the raw (undecoded) text of the anchor
	Copy() []byte       // Returns a copy of the raw text of the anchor
	Location() Location // Returns the full location of the anchor
}

An Anchor represents a location in source text. The methods of an Anchor will report the location, token type, and contents of the anchor.

type CommentHandler

type CommentHandler interface {
	// Process the line or block comment at the specified location.
	// Line comments include their leading "//" and trailing newline (if present).
	// Block comments include their leading "/*" and trailing "*/".
	Comment(loc Anchor)
}

CommentHandler is an optional interface that a Handler may implement to handle comment tokens. If a handler implements this method and comments are enabled in the scanner, Comment will be called for each comment token that occurs in the input. If the handler does not provide this method, comments will be silently discarded.
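
Optional interfaces of this kind are conventionally detected with a type assertion. The sketch below shows that idiom using local stand-in types; deliverComment, plain, collector, and textAnchor are hypothetical names, and how the parser performs this check internally is an assumption.

```go
package main

import "fmt"

// Anchor and CommentHandler are reduced local stand-ins for the documented
// interfaces, so that this sketch compiles without the module.
type Anchor interface {
	Text() []byte
}

type CommentHandler interface {
	Comment(loc Anchor)
}

// textAnchor is a hypothetical anchor carrying only its raw text.
type textAnchor string

func (t textAnchor) Text() []byte { return []byte(t) }

// plain has no Comment method, so comments sent to it are dropped.
type plain struct{}

// collector implements Comment and keeps a copy of each comment's text.
type collector struct {
	comments []string
}

func (c *collector) Comment(loc Anchor) {
	c.comments = append(c.comments, string(loc.Text()))
}

// deliverComment passes loc to h only if h implements CommentHandler, and
// reports whether the comment was delivered.
func deliverComment(h interface{}, loc Anchor) bool {
	ch, ok := h.(CommentHandler)
	if !ok {
		return false
	}
	ch.Comment(loc)
	return true
}

func main() {
	c := &collector{}
	fmt.Println(deliverComment(plain{}, textAnchor("// dropped\n")))
	fmt.Println(deliverComment(c, textAnchor("/* kept */")))
	fmt.Println(c.comments)
}
```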

type Handler

type Handler interface {
	// Begin a new object, whose open brace is at loc.
	BeginObject(loc Anchor) error

	// End the most-recently-opened object, whose close brace is at loc.
	EndObject(loc Anchor) error

	// Begin a new array, whose open bracket is at loc.
	BeginArray(loc Anchor) error

	// End the most-recently-opened array, whose close bracket is at loc.
	EndArray(loc Anchor) error

	// Begin a new object member, whose key is at loc.  The text of the key is
	// still quoted; the handler is responsible for unescaping key values if the
	// plain string is required (see jtree.Unquote).
	BeginMember(loc Anchor) error

	// End the current object member, giving the location and type of the token
	// that terminated the member (either Comma or RBrace).
	EndMember(loc Anchor) error

	// Report a data value at the given location. The type of the value can be
	// recovered from the token. String tokens are quoted.
	Value(loc Anchor) error

	// EndOfInput reports the end of the input stream.
	EndOfInput(loc Anchor)
}

A Handler handles events from parsing an input stream. If a method reports an error, parsing stops and that error is returned to the caller. The parser ensures objects and arrays are correctly balanced.

The Anchor argument to a Handler method is only valid for the duration of that method call. If the method needs to retain information about the location after it returns, it must copy the relevant data.
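
The sketch below illustrates why retained data must be copied. It simulates an anchor whose Text method returns a view of a buffer that is overwritten when the parser advances. The reusingAnchor type and its load method are hypothetical, and whether jtree reuses its buffer in exactly this way is an assumption; the point is only that a retained view may change while a copy does not.

```go
package main

import "fmt"

// reusingAnchor is a hypothetical anchor whose Text aliases a shared buffer.
type reusingAnchor struct {
	buf []byte
}

// Text returns a view of the shared buffer (no copy).
func (a *reusingAnchor) Text() []byte { return a.buf }

// Copy returns an independent copy of the current contents.
func (a *reusingAnchor) Copy() []byte { return append([]byte(nil), a.buf...) }

// load overwrites the shared buffer in place, as a scanner moving to the
// next token might.
func (a *reusingAnchor) load(s string) { a.buf = append(a.buf[:0], s...) }

// retainDemo keeps both a view and a copy of the first token, then advances.
func retainDemo() (aliased, copied string) {
	a := &reusingAnchor{}
	a.load(`"key1"`)
	view := a.Text() // unsafe to keep beyond the call
	dup := a.Copy()  // safe to keep
	a.load(`"key2"`)
	return string(view), string(dup)
}

func main() {
	aliased, copied := retainDemo()
	fmt.Println("retained view:", aliased)
	fmt.Println("retained copy:", copied)
}
```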

type Interner

type Interner map[string]string

Interner is a deduplicating string interning map.

func (Interner) Intern

func (n Interner) Intern(text []byte) string

Intern returns text as a string, ensuring that only one string is allocated for each unique text.
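
A minimal sketch of the interning technique, written against a local type so it does not depend on the module. This is an illustration of the idea described above, not necessarily how jtree implements it; the names interner and intern are hypothetical.

```go
package main

import "fmt"

// interner maps each distinct string to a single shared instance.
type interner map[string]string

// intern returns text as a string, reusing a previously stored string when
// the same contents have been seen before.
func (n interner) intern(text []byte) string {
	// Looking up with string(text) as the key does not need to retain the
	// converted string, so repeated keys avoid a lasting allocation.
	if s, ok := n[string(text)]; ok {
		return s
	}
	s := string(text)
	n[s] = s
	return s
}

func main() {
	n := interner{}
	a := n.intern([]byte("name"))
	b := n.intern([]byte("name"))
	n.intern([]byte("other"))
	fmt.Println(a == b, len(n))
}
```

Interning of this kind is most useful for object keys, which tend to repeat many times in a large JSON input.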

type LineCol

type LineCol struct {
	Line   int // line number, 1-based
	Column int // byte offset of column in line, 0-based
}

A LineCol describes the line number and column offset of a location in source text.

func (LineCol) String

func (lc LineCol) String() string

type Location

type Location struct {
	Span
	First, Last LineCol
}

A Location describes the complete location of a range of source text, including line and column offsets.
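
To show how the two conventions fit together (1-based lines, 0-based byte columns), the following sketch computes a line and column from a byte offset. It uses a local copy of the LineCol struct; the lineColAt helper is hypothetical, and treating only "\n" as a line terminator is an assumption.

```go
package main

import "fmt"

// LineCol is a local copy of the documented struct.
type LineCol struct {
	Line   int // line number, 1-based
	Column int // byte offset of column in line, 0-based
}

// lineColAt returns the line and column of the byte at offset in src.
// Offsets past the end of src are treated as the end of src.
func lineColAt(src []byte, offset int) LineCol {
	lc := LineCol{Line: 1}
	for i := 0; i < offset && i < len(src); i++ {
		if src[i] == '\n' {
			lc.Line++
			lc.Column = 0
		} else {
			lc.Column++
		}
	}
	return lc
}

func main() {
	src := []byte("{\n  \"a\": 1\n}")
	// Offset 4 is the opening quote of the key on the second line.
	fmt.Printf("%+v\n", lineColAt(src, 4))
}
```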

func (Location) String

func (loc Location) String() string

type Scanner

type Scanner struct {
	// contains filtered or unexported fields
}

A Scanner reads lexical tokens from an input stream. Each call to Next advances the scanner to the next token, or reports an error.

func NewScanner

func NewScanner(r io.Reader) *Scanner

NewScanner constructs a new lexical scanner that consumes input from r.

func (*Scanner) AllowComments

func (s *Scanner) AllowComments(ok bool)

AllowComments configures the scanner to report (true) or reject (false) comment tokens. Comments are a non-standard extension of the JSON spec. If enabled, C++ style block comments (/* ... */) and line comments (// ...) are recognized and emitted as tokens.

func (*Scanner) Copy

func (s *Scanner) Copy() []byte

Copy returns a copy of the undecoded text of the current token.

func (*Scanner) Err

func (s *Scanner) Err() error

Err returns the last error reported by Next.

func (*Scanner) Location

func (s *Scanner) Location() Location

Location returns the complete location of the current token.

func (*Scanner) Next

func (s *Scanner) Next() error

Next advances s to the next token of the input, or reports an error. At the end of the input, Next returns io.EOF.

func (*Scanner) Span

func (s *Scanner) Span() Span

Span returns the location span of the current token.

func (*Scanner) Text

func (s *Scanner) Text() []byte

Text returns the undecoded text of the current token. The return value is only valid until the next call of Next. The caller must copy the contents of the returned slice if it is needed beyond that.

func (*Scanner) Token

func (s *Scanner) Token() Token

Token returns the type of the current token.

type Span

type Span struct {
	Pos int // the start offset, 0-based
	End int // the end offset, 0-based (noninclusive)
}

A Span describes a contiguous span of a source input.
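
Because Pos is inclusive and End is noninclusive, a span maps directly onto a Go slice expression. The sketch below uses a local copy of the struct; the textOf helper is hypothetical and assumes the offsets are byte offsets into the original input.

```go
package main

import "fmt"

// Span is a local copy of the documented struct.
type Span struct {
	Pos int // the start offset, 0-based
	End int // the end offset, 0-based (noninclusive)
}

// textOf returns the bytes of src covered by s. The result aliases src.
func textOf(src []byte, s Span) []byte {
	return src[s.Pos:s.End]
}

func main() {
	src := []byte(`{"a": true}`)
	sp := Span{Pos: 6, End: 10}
	fmt.Printf("%s (%d bytes)\n", textOf(src, sp), sp.End-sp.Pos)
}
```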

func (Span) String

func (s Span) String() string

type Stream

type Stream struct {
	// contains filtered or unexported fields
}

Stream is a stream parser that consumes input and delivers events to a Handler corresponding to the structure of the input.

func NewStream

func NewStream(r io.Reader) *Stream

NewStream constructs a new Stream that consumes input from r.

func (*Stream) AllowComments

func (s *Stream) AllowComments(ok bool)

AllowComments configures the scanner associated with s to report (true) or reject (false) comment tokens.

func (*Stream) AllowTrailingCommas

func (s *Stream) AllowTrailingCommas(ok bool)

AllowTrailingCommas configures the parser to allow (true) or reject (false) trailing commas in objects and arrays.

func (*Stream) Parse

func (s *Stream) Parse(h Handler) (err error)

Parse parses the input stream and delivers events to h until either an error occurs or the input is exhausted. In case of a syntax error, the returned error has type *SyntaxError.

func (*Stream) ParseOne

func (s *Stream) ParseOne(h Handler) (err error)

ParseOne parses a single value from the input stream and delivers events to h until the value is complete or an error occurs. If no further value is available from the input, ParseOne returns io.EOF. In case of a syntax error, the returned error has type *SyntaxError.

type SyntaxError

type SyntaxError struct {
	Location LineCol
	Message  string
	// contains filtered or unexported fields
}

SyntaxError is the concrete type of errors reported by the stream parser.

func (*SyntaxError) Error

func (s *SyntaxError) Error() string

Error satisfies the error interface.

func (*SyntaxError) Unwrap

func (s *SyntaxError) Unwrap() error

Unwrap supports error wrapping.
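
Since the parser reports syntax errors with a concrete pointer type, callers can recover the location with errors.As, including when the error has been wrapped. The sketch below uses a local stand-in with the same exported fields. The unexported err field, the Error message format, and the describe and wrap helpers are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// LineCol and SyntaxError are local stand-ins for the documented types.
type LineCol struct {
	Line, Column int
}

type SyntaxError struct {
	Location LineCol
	Message  string
	err      error // hypothetical underlying error
}

func (s *SyntaxError) Error() string {
	return fmt.Sprintf("at %d:%d: %s", s.Location.Line, s.Location.Column, s.Message)
}

func (s *SyntaxError) Unwrap() error { return s.err }

// wrap adds context to err the way a caller higher up the stack might.
func wrap(err error) error { return fmt.Errorf("load config: %w", err) }

// describe reports the line of a syntax error if err is, or wraps, a
// *SyntaxError, and falls back to the plain message otherwise.
func describe(err error) string {
	var se *SyntaxError
	if errors.As(err, &se) {
		return fmt.Sprintf("syntax error on line %d: %s", se.Location.Line, se.Message)
	}
	return "other error: " + err.Error()
}

func main() {
	err := wrap(&SyntaxError{Location: LineCol{Line: 3, Column: 7}, Message: "unexpected comma"})
	fmt.Println(describe(err))
	fmt.Println(describe(errors.New("disk full")))
}
```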

type Token

type Token byte

Token is the type of a lexical token in the JSON grammar.

const (
	Invalid Token = iota // invalid token
	LBrace               // left brace "{"
	RBrace               // right brace "}"
	LSquare              // left square bracket "["
	RSquare              // right square bracket "]"
	Comma                // comma ","
	Colon                // colon ":"
	Integer              // number: integer with no fraction or exponent
	Number               // number with fraction and/or exponent
	String               // quoted string
	True                 // constant: true
	False                // constant: false
	Null                 // constant: null

	BlockComment // comment: /* ... */
	LineComment  // comment: // ... <LF>

)

Constants defining the valid Token values.

func (Token) String

func (t Token) String() string

Directories

Path              | Synopsis
----------------- | ------------------------------------------------------------
ast               | Package ast defines an abstract syntax tree for JSON values, and a parser that constructs syntax trees from JSON source.
cursor            | Package cursor implements traversal over the AST of a JSON value.
internal/testutil | Package testutil defines support code for unit tests.
jwcc              | Package jwcc implements a parser for JSON With Commas and Comments (JWCC) as defined by https://nigeltao.github.io/blog/2021/json-with-commas-comments.html
tq                | Package tq implements structural traversal queries over JSON values.
