scanner

package
v1.2.120
Published: Oct 12, 2024 License: MIT Imports: 10 Imported by: 0

Documentation

Constants

This section is empty.

Variables

var (
	// ScanBytes is a split function for a Scanner that returns each byte as a token.
	ScanBytes = bufio.ScanBytes

	// ScanRunes is a split function for a Scanner that returns each
	// UTF-8-encoded rune as a token. The sequence of runes returned is
	// equivalent to that from a range loop over the input as a string, which
	// means that erroneous UTF-8 encodings translate to U+FFFD = "\xef\xbf\xbd".
	// Because of the Scan interface, this makes it impossible for the client to
	// distinguish correctly encoded replacement runes from encoding errors.
	ScanRunes = bufio.ScanRunes

	// ScanWords is a split function for a Scanner that returns each
	// space-separated word of text, with surrounding spaces deleted. It will
	// never return an empty string. The definition of space is set by
	// unicode.IsSpace.
	ScanWords = bufio.ScanWords

	// ScanLines is a split function for a Scanner that returns each line of
	// text, stripped of any trailing end-of-line marker. The returned line may
	// be empty. The end-of-line marker is one optional carriage return followed
	// by one mandatory newline. In regular expression notation, it is `\r?\n`.
	// The last non-empty line of input will be returned even if it has no
	// newline.
	ScanLines = bufio.ScanLines
)

Split functions

Functions

func ScanEscapes

func ScanEscapes(quote rune) func(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanEscapes returns a split function for a Scanner that yields each escape sequence in the text, where quote is the accepted escaped quote. The returned token may be empty.

func ScanIdentifier

func ScanIdentifier(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanIdentifier is a split function for a Scanner that returns each identifier in the text. The returned token may be empty.

Identifiers follow the Go specification, https://golang.org/ref/spec#Identifiers:

	identifier = letter { letter | unicode_digit } .
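The identifier grammar can be checked rune by rune. The following is a minimal sketch of that rule, not the package's implementation; identifierPrefix is a hypothetical helper, and it treats the underscore as a letter, as the Go specification does.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// identifierPrefix is a hypothetical helper, not part of the package.
// It returns the longest prefix of data that forms a Go identifier, or
// an empty slice if data does not start with one.
func identifierPrefix(data []byte) []byte {
	n := 0
	for n < len(data) {
		r, size := utf8.DecodeRune(data[n:])
		isLetter := r == '_' || unicode.IsLetter(r)
		// The first rune must be a letter; later runes may also be digits.
		if !isLetter && (n == 0 || !unicode.IsDigit(r)) {
			break
		}
		n += size
	}
	return data[:n]
}

func main() {
	for _, in := range []string{"_foo9 bar", "héllo+1", "9abc"} {
		fmt.Printf("%q -> %q\n", in, identifierPrefix([]byte(in)))
	}
}
```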

func ScanInterpretedStrings

func ScanInterpretedStrings(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanInterpretedStrings is a split function for a Scanner that returns each string of text quoted by ". The returned token may be empty.

Interpreted string literals are character sequences between double quotes, as in "bar". Within the quotes, any character may appear except newline and unescaped double quote. The text between the quotes forms the value of the literal, with backslash escapes interpreted as they are in rune literals (except that \' is illegal and \" is legal), with the same restrictions.

The three-digit octal (\nnn) and two-digit hexadecimal (\xnn) escapes represent individual bytes of the resulting string; all other escapes represent the (possibly multi-byte) UTF-8 encoding of individual characters. Thus inside a string literal \377 and \xFF represent a single byte of value 0xFF=255, while ÿ, \u00FF, \U000000FF and \xc3\xbf represent the two bytes 0xc3 0xbf of the UTF-8 encoding of character U+00FF.

See https://golang.org/ref/spec#String_literals:

	interpreted_string_lit = `"` { unicode_value | byte_value } `"` .
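The byte-versus-character distinction above can be observed with the standard library's strconv.Unquote, which interprets Go string literal syntax. This sketch does not call ScanInterpretedStrings; unquoteBytes is a hypothetical helper used only to show the resulting bytes.

```go
package main

import (
	"fmt"
	"strconv"
)

// unquoteBytes is a hypothetical helper: it interprets a quoted Go
// string literal and returns the bytes of its value.
func unquoteBytes(lit string) ([]byte, error) {
	s, err := strconv.Unquote(lit)
	return []byte(s), err
}

func main() {
	lits := []string{`"\377"`, `"\xFF"`, `"ÿ"`, `"\u00FF"`, `"\U000000FF"`, `"\xc3\xbf"`}
	for _, lit := range lits {
		b, err := unquoteBytes(lit)
		fmt.Printf("%-14s bytes=% x err=%v\n", lit, b, err)
	}
	// \' is not a legal escape inside a double-quoted literal.
	_, err := unquoteBytes(`"\'"`)
	fmt.Println("escaped single quote rejected:", err != nil)
}
```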

func ScanMantissas

func ScanMantissas(base int) func(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanMantissas returns a split function for a Scanner that yields each number in the text written in the given base. The returned token may be empty.

func ScanNumbers

func ScanNumbers(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanNumbers is a split function for a Scanner that returns each integer, floating-point or imaginary literal in the text. The returned token may be empty.

See https://golang.org/ref/spec#Integer_literals, https://golang.org/ref/spec#Floating-point_literals and https://golang.org/ref/spec#Imaginary_literals.

func ScanRawStrings

func ScanRawStrings(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanRawStrings is a split function for a Scanner that returns each string of text quoted by `. The returned token may be empty. Escapes are not allowed.

Raw string literals are character sequences between back quotes, as in `foo`. Within the quotes, any character may appear except back quote. The value of a raw string literal is the string composed of the uninterpreted (implicitly UTF-8-encoded) characters between the quotes; in particular, backslashes have no special meaning and the string may contain newlines. Carriage return characters ('\r') inside raw string literals are discarded from the raw string value.

See https://golang.org/ref/spec#String_literals:

	raw_string_lit = "`" { unicode_char | newline } "`" .
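The raw string rules can likewise be observed with strconv.Unquote from the standard library. This is a sketch of the literal semantics only and does not call ScanRawStrings; rawValue is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strconv"
)

// rawValue is a hypothetical helper: it returns the value of a
// back-quoted Go raw string literal.
func rawValue(lit string) (string, error) {
	return strconv.Unquote(lit)
}

func main() {
	// A backslash has no special meaning: the value keeps both bytes.
	v, err := rawValue("`a\\nb`")
	fmt.Printf("%q %v\n", v, err)

	// Carriage returns are dropped from the value; newlines are kept.
	v, err = rawValue("`x\r\ny`")
	fmt.Printf("%q %v\n", v, err)
}
```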

func ScanRegexp

func ScanRegexp(regs ...*regexp.Regexp) func(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanRegexp returns a split function for a Scanner that yields each string until the regular expressions no longer match. The returned token may be empty.

func ScanRegexpPerl

func ScanRegexpPerl(expectStrs ...string) func(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanRegexpPerl returns a split function for a Scanner that yields each string until the regular expressions no longer match. The returned token may be empty. The expressions are matched leftmost-first, the same semantics that Perl, Python, and other implementations use, although this package implements it without the expense of backtracking. For POSIX leftmost-longest matching, see ScanRegexpPosix.

func ScanRegexpPosix

func ScanRegexpPosix(expectStrs ...string) func(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanRegexpPosix returns a split function for a Scanner that yields each string until the regular expressions no longer match. The returned token may be empty. ScanRegexpPosix is like ScanRegexpPerl but restricts the regular expression to POSIX ERE (egrep) syntax and changes the match semantics to leftmost-longest.

func ScanUntil

func ScanUntil(filter func(r rune) bool) func(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanUntil returns a split function for a Scanner that yields each string up to the first rune for which filter reports true. The returned token may be empty.

func ScanWhile

func ScanWhile(filter func(r rune) bool) func(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanWhile returns a split function for a Scanner that yields each string for as long as filter reports true. The returned token may be empty.

Types

type ErrorHandler

type ErrorHandler func(pos token.Position, msg string)

An ErrorHandler may be provided to Scanner.Init. If a syntax error is encountered and a handler was installed, the handler is called with a position and an error message. The position points to the beginning of the offending token.

type Mode

type Mode uint

A mode value is a set of flags (or 0). They control scanner behavior.

const (
	ModeCaseSensitive Mode = 1 << iota
	ModeRegexpPerl
	ModeRegexpPosix
)
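Since each constant is a distinct bit, modes combine with bitwise OR and are tested with bitwise AND. The sketch below copies the declarations so that it compiles on its own; the has method is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// Mode and its constants are copied from the declaration above.
type Mode uint

const (
	ModeCaseSensitive Mode = 1 << iota
	ModeRegexpPerl
	ModeRegexpPosix
)

// has is a hypothetical helper: it reports whether every flag in want
// is set in m.
func (m Mode) has(want Mode) bool { return m&want == want }

func main() {
	mode := ModeCaseSensitive | ModeRegexpPosix
	fmt.Println(mode.has(ModeCaseSensitive)) // true
	fmt.Println(mode.has(ModeRegexpPerl))    // false
	fmt.Println(Mode(0).has(ModeRegexpPerl)) // false: 0 sets no flags
}
```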

type Scanner

type Scanner struct {

	// public state - ok to modify
	ErrorCount int // number of errors encountered
	// contains filtered or unexported fields
}

A Scanner holds the scanner's internal state while processing a given text. It can be allocated as part of another data structure but must be initialized via Init before use.

func (*Scanner) AtEOF

func (s *Scanner) AtEOF() bool

func (*Scanner) Consume

func (s *Scanner) Consume()

Consume advances the scanner until the current token is consumed.

func (*Scanner) CurrentBytes

func (s *Scanner) CurrentBytes() []byte

func (*Scanner) CurrentLength

func (s *Scanner) CurrentLength() int

func (*Scanner) CurrentRune

func (s *Scanner) CurrentRune() rune

func (*Scanner) CurrentRunes

func (s *Scanner) CurrentRunes() []rune

func (*Scanner) CurrentString

func (s *Scanner) CurrentString() string

func (*Scanner) Init

func (s *Scanner) Init(file *token.File, src []byte, err ErrorHandler, mode Mode)

Init prepares the scanner s to tokenize the text src by setting the scanner at the beginning of src. The scanner uses the file set file for position information and it adds line information for each line. It is ok to re-use the same file when re-scanning the same file as line information which is already present is ignored. Init causes a panic if the file size does not match the src size.

Calls to Scan will invoke the error handler err if they encounter a syntax error and err is not nil. Also, for each error encountered, the Scanner field ErrorCount is incremented by one. The mode parameter controls scanner behavior; see Mode.

Note that Init may call err if there is an error in the first character of the file.

func (*Scanner) NextByte

func (s *Scanner) NextByte()

func (*Scanner) NextBytesN

func (s *Scanner) NextBytesN(n int)

func (*Scanner) NextRegexp

func (s *Scanner) NextRegexp(expectStrs ...string)

NextRegexp reads the next Unicode chars into s.ch. s.ch < 0 means end-of-file.

func (*Scanner) NextRune

func (s *Scanner) NextRune()

NextRune reads the next Unicode char into s.ch. s.AtEOF() == true means end-of-file.

func (*Scanner) NextRunesN

func (s *Scanner) NextRunesN(n int)

NextRunesN reads the next n Unicode chars into s.ch. s.ch < 0 means end-of-file.

func (*Scanner) PeekByte

func (s *Scanner) PeekByte() byte

PeekByte returns the byte following the most recently read character without advancing the scanner. If the scanner is at EOF, PeekByte returns 0.

func (*Scanner) PeekRegexpAny

func (s *Scanner) PeekRegexpAny(expectStrs ...string) string

PeekRegexpAny returns the string following the most recently read character that matches any of the given regular expressions, without advancing the scanner. If the scanner is at EOF or no expression matches, PeekRegexpAny returns the empty string.

func (*Scanner) PeekRune

func (s *Scanner) PeekRune() rune

func (*Scanner) PeekString

func (s *Scanner) PeekString(expectStrs ...string) string

func (*Scanner) ScanEscape

func (s *Scanner) ScanEscape(quote rune) bool

ScanEscape parses an escape sequence where quote is the accepted escaped quote. In case of a syntax error, it stops at the offending character (without consuming it) and returns false. Otherwise it returns true.

func (*Scanner) ScanLine

func (s *Scanner) ScanLine() string

func (*Scanner) ScanRawString

func (s *Scanner) ScanRawString() string

func (*Scanner) ScanRune

func (s *Scanner) ScanRune() string

func (*Scanner) ScanSplits

func (s *Scanner) ScanSplits(splits ...bufio.SplitFunc) ([]byte, bool)

ScanSplits advances the Scanner to the next token using the first of the given split functions that matches, and the token will then be available through the Bytes or Text method. It returns false when the scan stops, either by reaching the end of the input or an error. After ScanSplits returns false, the Err method will return any error that occurred during scanning, except that if it was io.EOF, Err will return nil.

func (*Scanner) ScanString

func (s *Scanner) ScanString() string
