scanner

package
v0.2.1

This package is not in the latest version of its module.
Published: Nov 30, 2022 License: Apache-2.0 Imports: 10 Imported by: 5

Documentation

Constants

const DefaultTemplate = `
{{- if .Errors -}}
	{{- range .Errors -}}
		error: {{.}}
	{{- end -}}
{{- else -}}
	{{- .Pos -}}
{{- end -}}
`
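
The whitespace-trimming markers in DefaultTemplate make its output more compact than the source suggests. The following is a minimal sketch that renders a copy of the template with text/template. The report struct and the render helper are hypothetical stand-ins, since the value that Report actually passes to the template is not shown here. Errors are assumed to be pre-formatted strings ending in a newline, as DefaultErrFmtFunc would produce.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// defaultTemplate is a copy of the DefaultTemplate constant.
const defaultTemplate = `
{{- if .Errors -}}
	{{- range .Errors -}}
		error: {{.}}
	{{- end -}}
{{- else -}}
	{{- .Pos -}}
{{- end -}}
`

// report is a hypothetical data value used only for this sketch.
type report struct {
	Errors []string // assumed already formatted, each ending in "\n"
	Pos    string   // assumed human-friendly position text
}

// render executes the template against r and returns the result.
func render(r report) string {
	t := template.Must(template.New("report").Parse(defaultTemplate))
	var sb strings.Builder
	if err := t.Execute(&sb, r); err != nil {
		return "template error: " + err.Error()
	}
	return sb.String()
}

func main() {
	// With errors, each one is printed after an "error: " prefix.
	fmt.Printf("%q\n", render(report{Errors: []string{"bad rune\n", "unexpected end\n"}}))
	// Without errors, only the position is printed.
	fmt.Printf("%q\n", render(report{Pos: "U+006E 'n' 1,2-2 (2-2)"}))
}
```

Because every action trims the whitespace around it, the only newlines in the output are those carried by the error strings themselves.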

Variables

var DefaultErrFmtFunc = func(e error) string { return fmt.Sprintf("%v\n", e) }
var DefaultErrorMessage = `failed to scan`
var Template *template.Template

Template is used by Report to log human-friendly scanner state information if the scanner (S) has not set its own Template.

var Trace int

Trace sets the trace for everything that uses this package. Use TraceOn/Off for specific scanner tracing.

var ViewLenDefault = 10 // default length of preview window

Functions

This section is empty.

Types

type Error

type Error struct {
	P   int      // can be left blank if Pos is defined
	Pos Position // can be left blank, Report will populate
	Msg string
}

func (Error) Error

func (e Error) Error() string

type Position

type Position struct {
	Rune    rune // rune at this location
	BufByte int  // byte offset in file
	BufRune int  // rune offset in file
	Line    int  // line offset
	LByte   int  // line column byte offset
	LRune   int  // line column rune offset
}

Position contains the human-friendly information about the position within a given text file. Note that all values begin with 1 and not 0.

func (Position) Log

func (p Position) Log()

Log calls log.Println on the cursor itself in String form. See String.

func (Position) Print

func (p Position) Print()

Print prints the cursor itself in String form. See String.

func (Position) String

func (p Position) String() string

String fulfills the fmt.Stringer interface by printing the Position in a human-friendly way:

U+1F47F '👿' 1,3-5 (3-5)
             | | |  | |
          line | |  | overall byte offset
line rune offset |  overall rune offset
  line byte offset
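
The field order in the diagram can be reproduced with a single fmt.Sprintf call. This is only a sketch of the layout, not the package's implementation. The position type below is a local stand-in that mirrors the Position fields, and the format string is an assumption inferred from the diagram.

```go
package main

import "fmt"

// position mirrors the fields of Position for this sketch only.
type position struct {
	Rune    rune
	BufByte int
	BufRune int
	Line    int
	LByte   int
	LRune   int
}

// String lays the fields out in the order shown in the diagram:
// code point, quoted rune, line, line rune offset, line byte offset,
// then the overall rune and byte offsets in parentheses.
func (p position) String() string {
	return fmt.Sprintf("%U %q %d,%d-%d (%d-%d)",
		p.Rune, p.Rune, p.Line, p.LRune, p.LByte, p.BufRune, p.BufByte)
}

func main() {
	p := position{Rune: 'n', BufByte: 2, BufRune: 2, Line: 1, LByte: 2, LRune: 2}
	fmt.Println(p) // U+006E 'n' 1,2-2 (2-2)
}
```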

type S

type S struct {
	Buf        []byte             // full buffer for lookahead or behind
	R          rune               // last decoded/scanned rune, maybe >1byte
	B          int                // index pointing beginning of R
	E          int                // index pointing to end (after) R
	Template   *template.Template // for Report()
	NewLine    []string           // []string{"\r\n","\n"} by default
	Trace      int                // non-zero activates tracing
	ErrFmtFunc func(e error) string
	// contains filtered or unexported fields
}

S (named to avoid stuttering) implements a buffered, non-linear, rune-centric scanner with regular expression support.

Example (Init)
package main

import (
	"fmt"

	"github.com/rwxrob/pegn/scanner"
)

func main() {

	// * extremely minimal initialization
	// * order guaranteed never to change

	s := scanner.New(`some thing`)
	fmt.Println(s)

}
Output:

Example (Package_Trace)
package main

import (
	"log"
	"os"

	"github.com/rwxrob/pegn/scanner"
)

func main() {

	// take over stderr just for this test
	defer log.SetFlags(log.Flags())
	defer log.SetOutput(os.Stderr)
	defer func() { scanner.Trace = 0 }()
	log.SetOutput(os.Stdout)
	log.SetFlags(0)

	s := scanner.New(`foo`)
	scanner.Trace++
	s.Scan()
	s.Scan()
	s.Scan()

}
Output:

'f' 0-1 "oo"
'o' 1-2 "o"
'o' 2-3 ""

func New

func New(args ...any) *S

New is a high-level scanner constructor and initializer that takes a single optional argument containing any valid Buffer() argument. Invalid arguments will fail (not fatal) with log output.

func (*S) Beginning added in v0.1.0

func (s *S) Beginning() bool

Beginning returns true if and only if the scanner is currently pointing to the beginning of the buffer without anything scanned at all.

func (*S) Buffer

func (s *S) Buffer(b any) error

Buffer sets the internal bytes buffer (Buf) and resets the existing cursor values to their initial state (null, 0,0). This is useful when testing in order to buffer strings as well as content from any io.Reader, []byte, []rune, or string. Fulfills pegn.Scanner.
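
One plausible way to accept those four input kinds is a type switch that normalizes everything to a byte slice. The toBytes helper below is hypothetical and shows only the general technique. How Buffer itself performs the conversion is not documented here.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// toBytes is a hypothetical helper that normalizes the argument kinds
// listed above into a byte slice, returning an error for anything else.
func toBytes(in any) ([]byte, error) {
	switch v := in.(type) {
	case string:
		return []byte(v), nil
	case []byte:
		return v, nil
	case []rune:
		return []byte(string(v)), nil
	case io.Reader:
		return io.ReadAll(v)
	default:
		return nil, fmt.Errorf("unsupported buffer type: %T", in)
	}
}

func main() {
	inputs := []any{"foo", []byte("foo"), []rune("foo"), strings.NewReader("foo"), 42}
	for _, in := range inputs {
		b, err := toBytes(in)
		fmt.Printf("%q %v\n", b, err)
	}
}
```

The first four inputs all yield the bytes of "foo", and the integer yields an error instead of a panic.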

func (*S) Bytes

func (s *S) Bytes() *[]byte

func (*S) CopyBB added in v0.1.0

func (s *S) CopyBB(m curs.R) string

CopyBB returns copy [n,m) fulfilling pegn.Scanner interface.

func (*S) CopyBE added in v0.1.0

func (s *S) CopyBE(m curs.R) string

CopyBE returns copy [n,m] fulfilling pegn.Scanner interface.

func (*S) CopyEB added in v0.1.0

func (s *S) CopyEB(m curs.R) string

CopyEB returns copy (n,m) fulfilling pegn.Scanner interface.

func (*S) CopyEE added in v0.1.0

func (s *S) CopyEE(m curs.R) string

CopyEE returns copy (n,m] fulfilling pegn.Scanner interface.
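
The four interval forms differ only in whether each endpoint rune is included. The sketch below spells them out as slice expressions, assuming each cursor records the beginning (B) and end (E) byte index of its rune and that n comes before m. The cursor type is a hypothetical stand-in for curs.R, whose fields are not shown in this documentation.

```go
package main

import "fmt"

// cursor is a hypothetical stand-in for curs.R.
type cursor struct {
	B int // index of the first byte of the rune
	E int // index just past the last byte of the rune
}

// Each helper assumes n precedes m in the buffer.
func copyBB(buf []byte, n, m cursor) string { return string(buf[n.B:m.B]) } // [n,m)
func copyBE(buf []byte, n, m cursor) string { return string(buf[n.B:m.E]) } // [n,m]
func copyEB(buf []byte, n, m cursor) string { return string(buf[n.E:m.B]) } // (n,m)
func copyEE(buf []byte, n, m cursor) string { return string(buf[n.E:m.E]) } // (n,m]

func main() {
	buf := []byte("abcdef")
	n := cursor{B: 1, E: 2} // rune 'b'
	m := cursor{B: 4, E: 5} // rune 'e'

	fmt.Println(copyBB(buf, n, m)) // bcd
	fmt.Println(copyBE(buf, n, m)) // bcde
	fmt.Println(copyEB(buf, n, m)) // cd
	fmt.Println(copyEE(buf, n, m)) // cde
}
```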

func (*S) ErrPop added in v0.1.0

func (s *S) ErrPop() error

func (*S) ErrPush added in v0.1.0

func (s *S) ErrPush(e error)

func (*S) Error

func (s *S) Error() string

func (*S) Errors

func (s *S) Errors() *[]error

func (*S) Expected added in v0.1.0

func (s *S) Expected(ruleid int) bool

Expected is a shortcut for ErrPush for a new rule.Error at the current position, and returning false (always). It makes shorter code when writing pegn.ScanFuncs.

func (*S) Finished

func (s *S) Finished() bool

Finished returns true if scanner has nothing more to scan.

Example
package main

import (
	"fmt"

	"github.com/rwxrob/pegn/scanner"
)

func main() {

	s := scanner.New(`foo`)

	s.Print()

	s.Scan()
	s.Print()
	fmt.Println(s.Finished())

	s.Scan()
	s.Print()
	fmt.Println(s.Finished())

	s.Scan()
	s.Print()
	fmt.Println(s.Finished())

}
Output:

'\x00' 0-0 "foo"
'f' 0-1 "oo"
false
'o' 1-2 "o"
false
'o' 2-3 ""
true

func (*S) Goto

func (s *S) Goto(c curs.R)

func (*S) Is

func (s *S) Is(a string) bool

Is returns true if the passed string matches the last scanned rune and the runes ahead matching the length of the string. Returns false if the string would go beyond the length of buffer (len(s.Buf)).

Example
package main

import (
	"fmt"

	"github.com/rwxrob/pegn/scanner"
)

func main() {

	s := scanner.New(`foo`)

	s.Scan() // never forget to scan with Is (use Peek otherwise)

	fmt.Println(s.Is("fo"))
	fmt.Println(s.Is("bar"))

}
Output:

true
false
Example (Not)
package main

import (
	"fmt"

	"github.com/rwxrob/pegn/scanner"
)

func main() {

	s := scanner.New("\r\n")

	s.Scan() // never forget to scan with Is (use Peek otherwise)

	fmt.Println(s.Is("\r"))
	fmt.Println(s.Is("\r\n"))
	fmt.Println(s.Is("\n"))

}
Output:

true
true
false

func (S) Log

func (s S) Log()

Log is shorthand for log.Print(s).

func (*S) Mark

func (s *S) Mark() curs.R

func (*S) Match

func (s *S) Match(re *regexp.Regexp) int

Match checks for a regular expression match at the last position in the buffer (s.B), providing a mechanism for positive and negative lookahead expressions. It returns the length of the match. A successful match may have zero length (see regexp.Regexp.FindIndex). A negative value is returned if no match is found. Note that Go regular expressions include the Unicode character classes (ex: \p{L}, \p{Nd}), which should be used over dated alternatives (ex: \w).
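
A match pinned to a given offset can be approximated with the standard regexp package: run FindIndex on the remainder of the buffer and accept the result only if it starts at index 0. This is a sketch of the general technique and may differ from how Match is implemented. The matchAt and matchPattern helpers are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchAt returns the length of the match that begins exactly at offset,
// or -1 if the expression does not match there. A zero-length match
// returns 0, which is why failure is reported as a negative value.
func matchAt(buf []byte, offset int, re *regexp.Regexp) int {
	if offset < 0 || offset > len(buf) {
		return -1
	}
	loc := re.FindIndex(buf[offset:])
	if loc == nil || loc[0] != 0 {
		return -1
	}
	return loc[1]
}

// matchPattern is a convenience wrapper that compiles pattern first.
func matchPattern(s string, offset int, pattern string) int {
	return matchAt([]byte(s), offset, regexp.MustCompile(pattern))
}

func main() {
	fmt.Println(matchPattern("foo", 0, `f`))  // 1
	fmt.Println(matchPattern("foo", 0, `F`))  // -1
	fmt.Println(matchPattern("foo", 0, `o`))  // -1: matches later, not at offset 0
	fmt.Println(matchPattern("foo", 0, `x*`)) // 0: successful zero-length match
	fmt.Println(matchPattern("foo", 1, `o+`)) // 2
}
```

Rejecting matches that start after index 0 keeps the check anchored without requiring every pattern to begin with `^`.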

Example
package main

import (
	"fmt"
	"regexp"

	"github.com/rwxrob/pegn/scanner"
)

func main() {

	s := scanner.New(`foo`)

	s.Scan() // never forget to scan (use PeekMatch otherwise)

	f := regexp.MustCompile(`f`)
	F := regexp.MustCompile(`F`)
	o := regexp.MustCompile(`o`)

	fmt.Println(s.Match(f))
	fmt.Println(s.Match(F))
	fmt.Println(s.Match(o))

}
Output:

1
-1
-1

func (*S) Open added in v0.2.0

func (s *S) Open(path string) error

Open opens the file at path and passes it to Buffer. Fulfills pegn.Scanner.

func (*S) Peek

func (s *S) Peek(a string) bool

Peek returns true if the passed string matches from the current position in the buffer (s.E, the end of the last scanned rune) forward. Returns false if the string would go beyond the length of the buffer (len(s.Buf)). Peek does not advance the scanner.
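
The difference between Is and Peek comes down to where the comparison starts. The sketch below assumes Is compares from the beginning of the last scanned rune and Peek from its end, which is consistent with the example outputs on this page. The cursor type and its methods are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// cursor is a hypothetical, minimal scanner state.
type cursor struct {
	buf  []byte
	b, e int // beginning and end (after) of the last scanned rune
}

// is compares from the beginning of the last scanned rune.
func (c cursor) is(a string) bool { return bytes.HasPrefix(c.buf[c.b:], []byte(a)) }

// peek compares from the end of the last scanned rune (what comes next).
func (c cursor) peek(a string) bool { return bytes.HasPrefix(c.buf[c.e:], []byte(a)) }

func main() {
	fresh := cursor{buf: []byte("foo")} // nothing scanned yet
	fmt.Println(fresh.peek("fo"))       // true

	after := cursor{buf: []byte("foo"), b: 0, e: 1} // state after scanning 'f'
	fmt.Println(after.is("fo"))                     // true
	fmt.Println(after.peek("fo"))                   // false
	fmt.Println(after.peek("oo"))                   // true
	fmt.Println(after.is("fooo"))                   // false: longer than the buffer
}
```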

Example
package main

import (
	"fmt"

	"github.com/rwxrob/pegn/scanner"
)

func main() {

	s := scanner.New(`foo`)

	fmt.Println(s.Peek("fo"))
	s.Scan()
	fmt.Println(s.Peek("fo"))
	fmt.Println(s.Peek("oo"))

}
Output:

true
false
true

func (*S) PeekMatch

func (s *S) PeekMatch(re *regexp.Regexp) int

PeekMatch checks for a regular expression match at the current position in the buffer, providing a mechanism for positive and negative lookahead expressions. It returns the length of the match. A successful match may have zero length (see regexp.Regexp.FindIndex). A negative value is returned if no match is found. Note that Go regular expressions include the Unicode character classes (ex: \p{L}, \p{Nd}), which should be used over dated alternatives (ex: \w).

func (S) Pos

func (s S) Pos() Position
Example
package main

import (
	"github.com/rwxrob/pegn/scanner"
)

func main() {

	//😟 WARNING: uses risky jumps (assigning s.E)

	s := scanner.New("one line\nand another\r\nand yet another")

	s.E = 2
	s.Pos().Print()

	s.E = 0
	s.Scan()
	s.Scan()
	s.Pos().Print()

	s.E = 12
	s.Pos().Print()

	s.E = 27
	s.Pos().Print()

}
Output:

U+006E 'n' 1,2-2 (2-2)
U+006E 'n' 1,2-2 (2-2)
U+0064 'd' 2,3-3 (12-12)
U+0079 'y' 3,5-5 (27-27)

func (S) Positions

func (s S) Positions(p ...int) []Position

Positions returns human-friendly Position information (which can easily be used to populate a text/template) for each raw byte offset (s.E). Because the raw byte position (s.E) is frequently changed directly, lines and runes are counted on demand, and only one pass through the buffer (s.Buf) is required to count them. Therefore, when multiple positions are wanted, consider caching the raw byte positions (s.E) and calling Positions() once for all of them.
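
The single-pass idea can be sketched as follows: walk the buffer rune by rune, keep running line and column counters, and record a snapshot whenever the current rune ends at or past the next requested offset. This is an illustration under stated assumptions and not the package's code. Offsets must be in ascending order, only "\n" is treated as a line terminator, and for multi-byte runes the byte columns point at the rune's first byte. The position type and positions function are hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// position mirrors the Position fields for this sketch only.
type position struct {
	Rune                                rune
	BufByte, BufRune, Line, LByte, LRune int
}

func (p position) String() string {
	return fmt.Sprintf("%U %q %d,%d-%d (%d-%d)",
		p.Rune, p.Rune, p.Line, p.LRune, p.LByte, p.BufRune, p.BufByte)
}

// positions resolves every offset in ends (ascending end-of-rune byte
// offsets) in one pass over buf.
func positions(buf []byte, ends ...int) []position {
	out := make([]position, 0, len(ends))
	p := position{Line: 1}
	lineStart, k := 0, 0
	for i := 0; i < len(buf) && k < len(ends); {
		r, w := utf8.DecodeRune(buf[i:])
		p.Rune = r
		p.BufRune++
		p.LRune++
		p.BufByte = i + 1             // 1-based offset of the rune's first byte
		p.LByte = i - lineStart + 1   // 1-based byte column within the line
		for k < len(ends) && ends[k] <= i+w {
			out = append(out, p) // this rune ends at the requested offset
			k++
		}
		if r == '\n' { // start a new line after the terminator
			p.Line++
			p.LRune = 0
			lineStart = i + w
		}
		i += w
	}
	return out
}

func main() {
	buf := []byte("one line\nand another\r\nand yet another")
	for _, p := range positions(buf, 2, 12, 27) {
		fmt.Println(p)
	}
}
```

For this ASCII input the sketch prints the same three lines as the documented example for offsets 2, 12, and 27.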

Example
package main

import (
	"github.com/rwxrob/pegn/scanner"
)

func main() {

	s := scanner.New("one line\nand another\r\nand yet another")

	for _, p := range s.Positions(2, 12, 27) {
		p.Print()
	}

}
Output:

U+006E 'n' 1,2-2 (2-2)
U+0064 'd' 2,3-3 (12-12)
U+0079 'y' 3,5-5 (27-27)

func (S) Print

func (s S) Print()

Print is shorthand for fmt.Println(s).

func (S) Report

func (s S) Report()

Report will fill in the s.Template (or scanner.Template if not set) and log it to standard error. See the log package for removing prefixes and such. The DefaultTemplate is compiled at init() and assigned to the scanner.Template global package variable. To silence reports, developers may use the log package or simply ensure that both s.Template and scanner.Template are nil.

func (*S) Revert added in v0.1.0

func (s *S) Revert(m curs.R, ruleid int) bool

Revert is a shortcut for Expected + Goto.
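
The mark, expect, and revert pattern keeps scan functions short: take a mark, try to consume input, and on failure record an error and jump back in one return statement. The sketch below is self-contained and uses hypothetical types (scanner, mark, ruleAB). It assumes ASCII input and a simple error message, and does not reproduce the package's rule.Error or curs.R types.

```go
package main

import "fmt"

// mark is a hypothetical saved cursor.
type mark struct{ b, e int }

// scanner is a hypothetical, ASCII-only scanner for this sketch.
type scanner struct {
	buf  []byte
	b, e int
	errs []error
}

func (s *scanner) scan() bool {
	if s.e >= len(s.buf) {
		return false
	}
	s.b, s.e = s.e, s.e+1
	return true
}

func (s *scanner) mark() mark  { return mark{s.b, s.e} }
func (s *scanner) goTo(m mark) { s.b, s.e = m.b, m.e }

// expected records an error at the current position and always returns false.
func (s *scanner) expected(ruleid int) bool {
	s.errs = append(s.errs, fmt.Errorf("expected rule %d at byte %d", ruleid, s.e))
	return false
}

// revert records the error first, then jumps back to the mark.
func (s *scanner) revert(m mark, ruleid int) bool {
	s.expected(ruleid)
	s.goTo(m)
	return false
}

const ruleAB = 1 // hypothetical rule id

// scanAB consumes "ab" or leaves the scanner where it started.
func scanAB(s *scanner) bool {
	m := s.mark()
	if !s.scan() || s.buf[s.b] != 'a' {
		return s.revert(m, ruleAB)
	}
	if !s.scan() || s.buf[s.b] != 'b' {
		return s.revert(m, ruleAB)
	}
	return true
}

func main() {
	ok := &scanner{buf: []byte("ab")}
	fmt.Println(scanAB(ok), ok.e, len(ok.errs))

	bad := &scanner{buf: []byte("ax")}
	fmt.Println(scanAB(bad), bad.e, len(bad.errs))
}
```

On success the scanner stays past the consumed input. On failure it returns to the mark with one error recorded.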

func (*S) Rune

func (s *S) Rune() rune

func (*S) RuneB

func (s *S) RuneB() int

func (*S) RuneE

func (s *S) RuneE() int

func (*S) Scan

func (s *S) Scan() bool

Scan decodes the next rune, setting it to R, and advances the position (E) by the size of the rune (R) in bytes, returning false when there is nothing left to scan. Only runes of utf8.RuneSelf or greater are decoded, since most runes (ASCII) will usually be under this number.
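
The ASCII fast path described above can be sketched with unicode/utf8: a byte below utf8.RuneSelf is already a complete rune, so DecodeRune is called only for larger values. The cursor type here is a hypothetical stand-in, and the field updates are an assumption based on the B and E field comments on S.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// cursor is a hypothetical, minimal scanner state.
type cursor struct {
	buf  []byte
	r    rune // last scanned rune
	b, e int  // beginning and end (after) of r
}

// scan advances by one rune and reports whether anything was scanned.
func (c *cursor) scan() bool {
	if c.e >= len(c.buf) {
		return false // nothing left; state is unchanged
	}
	r, w := rune(c.buf[c.e]), 1
	if r >= utf8.RuneSelf {
		// Not ASCII: decode the full multi-byte sequence.
		r, w = utf8.DecodeRune(c.buf[c.e:])
	}
	c.r = r
	c.b = c.e
	c.e += w
	return true
}

func main() {
	c := &cursor{buf: []byte("aéb")}
	for c.scan() {
		fmt.Printf("%q %d-%d\n", c.r, c.b, c.e)
	}
}
```

With the input "aéb" the middle rune spans two bytes, so the printed ranges are 0-1, 1-3, and 3-4.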

Example
package main

import (
	"fmt"

	"github.com/rwxrob/pegn/scanner"
)

func main() {

	s := scanner.New(`foo`)
	s.Print() // equivalent of a "zero value"

	fmt.Println(s.Scan())
	s.Print()
	fmt.Println(s.Scan())
	s.Print()
	fmt.Println(s.Scan())
	s.Print()
	fmt.Println(s.Scan()) // does not advance
	s.Print()             // same as before

}
Output:

'\x00' 0-0 "foo"
true
'f' 0-1 "oo"
true
'o' 1-2 "o"
true
'o' 2-3 ""
false
'o' 2-3 ""
Example (Loop)
package main

import (
	"fmt"

	"github.com/rwxrob/pegn/scanner"
)

func main() {
	s := scanner.New(`abcdefgh`)
	for s.Scan() {
		fmt.Print(string(s.Rune()))
		if !s.Finished() {
			fmt.Print("-")
		}
	}
}
Output:

a-b-c-d-e-f-g-h
Example (Risky_Jump)
package main

import (
	"fmt"

	"github.com/rwxrob/pegn/scanner"
)

func main() {

	s := scanner.New(`foo1234`)
	fmt.Println(s.Scan())
	s.Print()
	s.E += 2              //😟 WARNING: s.R and s.B, not yet updated!
	fmt.Println(s.Scan()) //😊 s.R and s.B now updated
	s.Print()

}
Output:

true
'f' 0-1 "oo1234"
true
'1' 3-4 "234"

func (*S) SetErrFmtFunc added in v0.2.0

func (s *S) SetErrFmtFunc(fn func(e error) string)

func (*S) SetMaxErr added in v0.1.0

func (s *S) SetMaxErr(i int)

func (*S) SetViewLen added in v0.1.0

func (s *S) SetViewLen(a int)

func (S) String

func (s S) String() string

String implements fmt.Stringer with simply the position (E) and quoted rune (R) along with its Unicode. For printing more human friendly information about the current scanner state use Report.

func (*S) TraceOff added in v0.1.0

func (s *S) TraceOff()

func (*S) TraceOn added in v0.1.0

func (s *S) TraceOn()

func (*S) ViewLen added in v0.1.0

func (s *S) ViewLen() int
