scanner

package

v0.0.0-...-90c9d3a Published: Mar 21, 2010 License: BSD-3-Clause, GooglePatentClause Imports: 6 Imported by: 0

Repository

github.com/ivanwyc/google-go

Documentation ¶

Overview ¶

Package scanner provides a general-purpose scanner for UTF-8 encoded text. A Scanner takes an io.Reader as its source and tokenizes it through repeated calls to the Scan function. For compatibility with existing tools, the NUL character is not allowed in the source (an implementation restriction).

By default, a Scanner skips white space and comments and recognizes literals as defined by the Go language spec. It may be customized to recognize only a subset of those literals and to recognize different white space characters.

Basic usage pattern:

var s scanner.Scanner
s.Init(src)
tok := s.Scan()
for tok != scanner.EOF {
	// do something with tok
	tok = s.Scan()
}
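
As a complete program, the pattern might look like the sketch below. It assumes the present-day standard library package text/scanner, whose API closely mirrors the one documented here (there, Scan returns a rune rather than an int). The helper name tokens and the sample input are illustrative only.

```go
package main

import (
	"fmt"
	"strings"
	"text/scanner"
)

// tokens scans src with the default settings and returns the text
// of every token or character that Scan yields before EOF.
func tokens(src string) []string {
	var s scanner.Scanner
	s.Init(strings.NewReader(src))
	var out []string
	for tok := s.Scan(); tok != scanner.EOF; tok = s.Scan() {
		out = append(out, s.TokenText())
	}
	return out
}

func main() {
	// The trailing comment is skipped under the default mode.
	for _, t := range tokens("x := 42 // answer") {
		fmt.Printf("%q\n", t)
	}
}
```

Operators are not tokens of their own: ":=" arrives as the two characters ':' and '='.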

Index ¶

Constants
func TokenString(tok int) string
type Position
- func (pos *Position) IsValid() bool
- func (pos Position) String() string
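type Scanner
- func (s *Scanner) Init(src io.Reader) *Scanner
- func (s *Scanner) Next() int
- func (s *Scanner) Pos() Position
- func (s *Scanner) Scan() int
- func (s *Scanner) TokenText() string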

Constants ¶

const (
	ScanIdents     = 1 << -Ident
	ScanInts       = 1 << -Int
	ScanFloats     = 1 << -Float // includes Ints
	ScanChars      = 1 << -Char
	ScanStrings    = 1 << -String
	ScanRawStrings = 1 << -RawString
	ScanComments   = 1 << -Comment
	SkipComments   = 1 << -skipComment // if set with ScanComments, comments become white space
	GoTokens       = ScanIdents | ScanFloats | ScanChars | ScanStrings | ScanRawStrings | ScanComments | SkipComments
)

Predefined mode bits to control recognition of tokens. For instance, to configure a Scanner such that it only recognizes (Go) identifiers, integers, and skips comments, set the Scanner's Mode field to:

ScanIdents | ScanInts | SkipComments

const (
	EOF = -(iota + 1)
	Ident
	Int
	Float
	Char
	String
	RawString
	Comment
)

The result of Scan is one of the following tokens or a Unicode character.

const GoWhitespace = 1<<'\t' | 1<<'\n' | 1<<'\r' | 1<<' '

GoWhitespace is the default value for the Scanner's Whitespace field. Its value selects Go's white space characters.

Functions ¶

func TokenString ¶

func TokenString(tok int) string

TokenString returns a (visible) string for a token or Unicode character.

Types ¶

type Position ¶

type Position struct {
	Filename string // filename, if any
	Offset   int    // byte offset, starting at 0
	Line     int    // line number, starting at 1
	Column   int    // column number, starting at 0 (character count per line)
}

A source position is represented by a Position value. A position is valid if Line > 0.

func (*Position) IsValid ¶

func (pos *Position) IsValid() bool

IsValid reports whether the position is valid, that is, whether Line > 0.

func (Position) String ¶

func (pos Position) String() string

type Scanner ¶

type Scanner struct {

	// Error is called for each error encountered. If no Error
	// function is set, the error is reported to os.Stderr.
	Error func(s *Scanner, msg string)

	// ErrorCount is incremented by one for each error encountered.
	ErrorCount int

	// The Mode field controls which tokens are recognized. For instance,
	// to recognize Ints, set the (1<<-Int) bit in Mode. The field may be
	// changed at any time.
	Mode uint

	// The Whitespace field controls which characters are recognized
	// as white space. To recognize a character ch <= ' ' as white space,
	// set the ch'th bit in Whitespace (the Scanner's behavior is undefined
	// for values ch > ' '). The field may be changed at any time.
	Whitespace uint64

	// Current token position. The Offset, Line, and Column fields
	// are set by Scan(); the Filename field is left untouched by the
	// Scanner.
	Position
	// contains filtered or unexported fields
}

A Scanner implements reading of Unicode characters and tokens from an io.Reader.
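
The Error field makes it possible to collect diagnostics instead of letting them go to os.Stderr. The following is a minimal sketch assuming the modern standard library package text/scanner, whose Error field has the same signature; scanErrors is a hypothetical helper, and the unterminated string literal is just one convenient way to provoke an error.

```go
package main

import (
	"fmt"
	"strings"
	"text/scanner"
)

// scanErrors scans all of src and returns the messages passed to the
// Error callback together with the final ErrorCount.
func scanErrors(src string) ([]string, int) {
	var s scanner.Scanner
	var msgs []string
	s.Init(strings.NewReader(src))
	// Assign Error after Init, because Init resets it to nil.
	s.Error = func(s *scanner.Scanner, msg string) {
		msgs = append(msgs, msg)
	}
	for tok := s.Scan(); tok != scanner.EOF; tok = s.Scan() {
		// Tokens are discarded; only the errors are of interest.
	}
	return msgs, s.ErrorCount
}

func main() {
	msgs, n := scanErrors(`x = "abc`) // string literal is never closed
	fmt.Println(n, msgs)
}
```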

func (*Scanner) Init ¶

func (s *Scanner) Init(src io.Reader) *Scanner

Init initializes a Scanner with a new source and returns itself. Error is set to nil, ErrorCount is set to 0, Mode is set to GoTokens, and Whitespace is set to GoWhitespace.

func (*Scanner) Next ¶

func (s *Scanner) Next() int

Next reads and returns the next Unicode character. It returns EOF at the end of the source. It reports a read error by calling s.Error, if set, or else prints an error message to os.Stderr. Next does not update the Scanner's Position field; use Pos() to get the current position.

func (*Scanner) Pos ¶

func (s *Scanner) Pos() Position

Pos returns the current source position. If called before Next() or Scan(), it returns the position of the next Unicode character or token returned by these functions. If called afterwards, it returns the position immediately after the last character of the most recent token or character scanned.
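
Because Next leaves the Position field alone, Pos is the way to track progress when reading character by character. A small sketch, assuming the modern standard library package text/scanner; offsetsAfterNext is a hypothetical helper. It records the byte offset reported by Pos after each call to Next, so a multi-byte character advances the offset by more than one.

```go
package main

import (
	"fmt"
	"strings"
	"text/scanner"
)

// offsetsAfterNext reads src one character at a time with Next and
// returns the byte offset that Pos reports after each character.
func offsetsAfterNext(src string) []int {
	var s scanner.Scanner
	s.Init(strings.NewReader(src))
	var out []int
	for ch := s.Next(); ch != scanner.EOF; ch = s.Next() {
		out = append(out, s.Pos().Offset)
	}
	return out
}

func main() {
	// 'é' occupies two bytes in UTF-8, so the offset jumps from 1 to 3.
	fmt.Println(offsetsAfterNext("héllo"))
}
```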

func (*Scanner) Scan ¶

func (s *Scanner) Scan() int

Scan reads the next token or Unicode character from source and returns it. It only recognizes tokens t for which the respective Mode bit (1<<-t) is set. It returns EOF at the end of the source. It reports scanner errors (read and token errors) by calling s.Error, if set; otherwise it prints an error message to os.Stderr.

func (*Scanner) TokenText ¶

func (s *Scanner) TokenText() string

TokenText returns the string corresponding to the most recently scanned token. It is valid only after a call to Scan().

Source Files ¶

scanner.go
