package dict

v0.0.14
Published: Feb 8, 2023 License: GPL-3.0 Imports: 6 Imported by: 0

Documentation

Constants

const (
	// MaxScanTokenSize is the maximum size used to buffer a token
	// unless the user provides an explicit buffer with Dict.Buffer.
	// The actual maximum token size may be smaller as the buffer
	// may need to include, for instance, a newline.
	MaxScanTokenSize = 64 * 1024
)

Variables

var (
	ErrTooLong         = errors.New("bufio.Dict: token too long")
	ErrNegativeAdvance = errors.New("bufio.Dict: SplitFunc returns negative advance count")
	ErrAdvanceTooFar   = errors.New("bufio.Dict: SplitFunc returns advance count beyond input")
	ErrBadReadCount    = errors.New("bufio.Dict: Read returned impossible count")
)

Errors returned by Dict.

var ErrFinalToken = errors.New("final token")

ErrFinalToken is a special sentinel error value. It is intended to be returned by a Split function to indicate that the token being delivered with the error is the last token and scanning should stop after this one. After ErrFinalToken is received by Scan, scanning stops with no error. The value is useful to stop processing early or when it is necessary to deliver a final empty token. One could achieve the same behavior with a custom error value but providing one here is tidier. See the emptyFinalToken example for a use of this value.
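
As a rough illustration of the early-stop use case, the sketch below uses the standard library's bufio.Scanner as a stand-in. That Dict treats ErrFinalToken the same way is an assumption, not something this page confirms; with this package the split function would return dict.ErrFinalToken instead. The stopAtEnd function and the END marker are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// stopAtEnd wraps bufio.ScanWords and ends the scan at the first "END" word.
// The marker itself is delivered as the final token.
// Assumption: dict.ErrFinalToken would play the role of bufio.ErrFinalToken.
func stopAtEnd(data []byte, atEOF bool) (advance int, token []byte, err error) {
	advance, token, err = bufio.ScanWords(data, atEOF)
	if err == nil && string(token) == "END" {
		return advance, token, bufio.ErrFinalToken
	}
	return advance, token, err
}

// collect scans input with stopAtEnd and returns the tokens and the scan error.
func collect(input string) ([]string, error) {
	s := bufio.NewScanner(strings.NewReader(input))
	s.Split(stopAtEnd)
	var out []string
	for s.Scan() {
		out = append(out, s.Text())
	}
	return out, s.Err()
}

func main() {
	tokens, err := collect("alice bob END carol")
	fmt.Println(tokens, err)
}
```

In this sketch the word after the marker is never delivered, and the error reported after the loop is nil because the sentinel is not treated as a failure.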

Functions

func DefaultLines added in v0.0.13

func DefaultLines(data []byte, atEOF bool) (advance int, token []byte, err error)

DefaultLines is a split function for a Dict that returns each line of text, stripped of any trailing end-of-line marker. The returned line may be empty. The end-of-line marker is one optional carriage return followed by one mandatory newline. In regular expression notation, it is `\r?\n`. The last non-empty line of input will be returned even if it has no newline.
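
The rules above can be sketched as a small standalone split function. This is a hypothetical reimplementation written from the description, not the package's source; splitLines and lines are illustrative names.

```go
package main

import (
	"bytes"
	"fmt"
)

// splitLines is a hypothetical stand-in that follows the rules described for
// DefaultLines. It is not the package's actual source.
func splitLines(data []byte, atEOF bool) (advance int, token []byte, err error) {
	if atEOF && len(data) == 0 {
		return 0, nil, nil // nothing left to deliver
	}
	if i := bytes.IndexByte(data, '\n'); i >= 0 {
		// A full line is available: drop the newline and one optional "\r".
		return i + 1, bytes.TrimSuffix(data[:i], []byte("\r")), nil
	}
	if atEOF {
		// Final line without a newline.
		return len(data), bytes.TrimSuffix(data, []byte("\r")), nil
	}
	return 0, nil, nil // no complete line yet: ask for more data
}

// lines feeds the whole input to splitLines, as if it had been read at once.
func lines(input string) []string {
	data := []byte(input)
	var out []string
	for len(data) > 0 {
		advance, token, _ := splitLines(data, true)
		out = append(out, string(token))
		data = data[advance:]
	}
	return out
}

func main() {
	fmt.Printf("%q\n", lines("root\r\nadmin\n\nguest"))
}
```

The sample input mixes both end-of-line forms, contains one empty line, and ends without a newline, which covers each case named in the description.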

func ScanBytes added in v0.0.13

func ScanBytes(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanBytes is a split function for a Dict that returns each byte as a token.

func ScanRunes added in v0.0.13

func ScanRunes(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanRunes is a split function for a Dict that returns each UTF-8-encoded rune as a token. The sequence of runes returned is equivalent to that from a range loop over the input as a string, which means that erroneous UTF-8 encodings translate to U+FFFD = "\xef\xbf\xbd". Because of the Scan interface, this makes it impossible for the client to distinguish correctly encoded replacement runes from encoding errors.
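
The ambiguity can be seen with a plain range loop, which the description names as the reference behavior. The sketch below does not call ScanRunes itself; runeTokens is a hypothetical helper.

```go
package main

import "fmt"

// runeTokens returns the UTF-8 encoding of each rune produced by a range loop
// over s. An invalid byte comes out as the encoding of U+FFFD.
func runeTokens(s string) []string {
	var out []string
	for _, r := range s {
		out = append(out, string(r))
	}
	return out
}

func main() {
	broken := runeTokens("a\xffb")    // invalid byte in the middle
	genuine := runeTokens("a\uFFFDb") // correctly encoded replacement rune
	fmt.Printf("%q\n%q\n", broken, genuine)
	fmt.Println(broken[1] == genuine[1]) // the two cases are indistinguishable
}
```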

func ScanWords added in v0.0.13

func ScanWords(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanWords is a split function for a Dict that returns each space-separated word of text, with surrounding spaces deleted. It will never return an empty string. The definition of space is set by unicode.IsSpace.
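
To show what "space" covers under unicode.IsSpace, the following sketch splits a string with the standard library's strings.FieldsFunc. It illustrates the definition of space only and does not call ScanWords; words is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// words splits s on runs of characters for which unicode.IsSpace is true.
// Like the behavior described for ScanWords, it never yields an empty string.
func words(s string) []string {
	return strings.FieldsFunc(s, unicode.IsSpace)
}

func main() {
	// U+00A0 (no-break space) and U+3000 (ideographic space) count as space.
	fmt.Printf("%q\n", words("  admin\troot\u00a0guest\u3000test \n"))
}
```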

Types

type Dict

type Dict struct {
	// contains filtered or unexported fields
}

Dict provides a convenient interface for reading data such as a file of newline-delimited lines of text. Successive calls to the Scan method will step through the 'tokens' of a file, skipping the bytes between the tokens. The specification of a token is defined by a split function of type SplitFunc; the default split function breaks the input into lines with line termination stripped. Split functions are defined in this package for scanning a file into lines, bytes, UTF-8-encoded runes, and space-delimited words. The client may instead provide a custom split function.

Scanning stops unrecoverably at EOF, the first I/O error, or a token too large to fit in the buffer. When a scan stops, the reader may have advanced arbitrarily far past the last token. Programs that need more control over error handling or large tokens, or must run sequential scans on a reader, should use bufio.Reader instead.

func NewDict

func NewDict(r io.Reader) *Dict

NewDict returns a new Dict to read from r. The split function defaults to DefaultLines.

Example
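package main

import (
	"fmt"
	"log"
	"os"

	"github.com/sechelper/seclib/dict"
)

func main() {
	op, err := os.Open("users.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer op.Close() // release the file handle when main returns
	d := dict.NewDict(op)

	// Print every line that converts to a Line without error.
	for d.Scan() {
		if line, err := d.Line(); err == nil {
			fmt.Println(line)
		}
	}
	// Scan returns false on EOF or on error; Err reports only the latter.
	if err := d.Err(); err != nil {
		log.Println(err)
	}
}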

func NewDictForFile added in v0.0.13

func NewDictForFile(path string) (*Dict, error)
Example
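package main

import (
	"fmt"
	"log"

	"github.com/sechelper/seclib/dict"
)

func main() {
	d, err := dict.NewDictForFile("users.txt")
	if err != nil {
		log.Fatal(err)
	}

	// Print every line that converts to a Line without error.
	for d.Scan() {
		if line, err := d.Line(); err == nil {
			fmt.Println(line)
		}
	}
	// Scan returns false on EOF or on error; Err reports only the latter.
	if err := d.Err(); err != nil {
		log.Fatal(err)
	}
}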

func (*Dict) Buffer added in v0.0.13

func (d *Dict) Buffer(buf []byte, max int)

Buffer sets the initial buffer to use when scanning and the maximum size of buffer that may be allocated during scanning. The maximum token size is the larger of max and cap(buf). If max <= cap(buf), Scan will use this buffer only and do no allocation.

By default, Scan uses an internal buffer and sets the maximum token size to MaxScanTokenSize.

Buffer panics if it is called after scanning has started.
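
A minimal sketch of raising the token limit follows. It uses the standard library's bufio.Scanner as a stand-in, on the assumption that Dict.Buffer behaves the same way; scanLongLine and the chosen sizes are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// scanLongLine scans a single line of n bytes. If max > 0, the scanner gets a
// small initial buffer that may grow up to max bytes.
// Assumption: d.Buffer(buf, max) on a Dict would be used the same way.
func scanLongLine(n, max int) (int, error) {
	s := bufio.NewScanner(strings.NewReader(strings.Repeat("x", n) + "\n"))
	if max > 0 {
		s.Buffer(make([]byte, 0, 4096), max) // must be called before Scan
	}
	length := 0
	for s.Scan() {
		length = len(s.Bytes())
	}
	return length, s.Err()
}

func main() {
	n := 100 * 1024 // larger than the 64 KiB default limit
	fmt.Println(scanLongLine(n, 0))     // default buffer: token too long
	fmt.Println(scanLongLine(n, 1<<20)) // 1 MiB limit: the line fits
}
```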

func (*Dict) Bytes added in v0.0.13

func (d *Dict) Bytes() []byte

Bytes returns the most recent token generated by a call to Scan. The underlying array may point to data that will be overwritten by a subsequent call to Scan. It does no allocation.
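
Because the returned slice may be overwritten, a caller that keeps tokens across Scan calls needs to copy them. The sketch below shows that pattern with bufio.Scanner as a stand-in, on the assumption that Dict.Bytes aliases its buffer in the same way; collectTokens is a hypothetical helper.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// collectTokens keeps every token by copying it before the next Scan call.
func collectTokens(input string) [][]byte {
	s := bufio.NewScanner(strings.NewReader(input))
	var kept [][]byte
	for s.Scan() {
		// s.Bytes() aliases the scanner's buffer; copy it to own the data.
		kept = append(kept, append([]byte(nil), s.Bytes()...))
	}
	return kept
}

func main() {
	for _, tok := range collectTokens("admin\nroot\nguest\n") {
		fmt.Println(string(tok))
	}
}
```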

func (*Dict) Err added in v0.0.11

func (d *Dict) Err() error

Err returns the first non-EOF error that was encountered by the Dict.

func (*Dict) Line added in v0.0.13

func (d *Dict) Line() (Line, error)

Line returns the most recent token generated by a call to Scan, converted to a Line by the current line function (see LineFunc). The error, if any, is the one reported by that function.

func (*Dict) LineFunc added in v0.0.13

func (d *Dict) LineFunc(line LineFunc)

func (*Dict) Scan added in v0.0.13

func (d *Dict) Scan() bool

Scan advances the Dict to the next token, which will then be available through the Bytes or Text method. It returns false when the scan stops, either by reaching the end of the input or an error. After Scan returns false, the Err method will return any error that occurred during scanning, except that if it was io.EOF, Err will return nil. Scan panics if the split function returns too many empty tokens without advancing the input. This is a common error mode for scanners.

func (*Dict) Split added in v0.0.13

func (d *Dict) Split(split SplitFunc)

Split sets the split function for the Dict. The default split function is DefaultLines.

Split panics if it is called after scanning has started.

func (*Dict) Text added in v0.0.13

func (d *Dict) Text() string

Text returns the most recent token generated by a call to Scan as a newly allocated string holding its bytes.

type Line

type Line interface {
}

Line represents the parsed data of a single dictionary line.

func DefaultLine added in v0.0.13

func DefaultLine(data []byte) (Line, error)

func LoginLineFunc added in v0.0.13

func LoginLineFunc(b []byte) (Line, error)

LoginLineFunc parses b into a LoginLine.

Example
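package main

import (
	"fmt"
	"log"

	"github.com/sechelper/seclib/dict"
)

func main() {
	d, err := dict.NewDictForFile("user-pass.txt")
	if err != nil {
		log.Fatal(err)
	}
	d.LineFunc(dict.LoginLineFunc)
	for d.Scan() {
		line, err := d.Line()
		if err != nil {
			continue // skip lines that LoginLineFunc cannot parse
		}
		// Assert once and check the result to avoid a panic.
		if login, ok := line.(dict.LoginLine); ok {
			fmt.Println(login.User, login.Passwd)
		}
	}
	if err := d.Err(); err != nil {
		log.Fatal(err)
	}
}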

type LineFunc added in v0.0.13

type LineFunc func([]byte) (Line, error)
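
A custom line function only has to match this signature. The sketch below declares local copies of Line and LineFunc so it compiles on its own; hostPort, parseHostPort, and the ':' separator are hypothetical and not part of this package.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// Line and LineFunc mirror the declarations on this page so that the sketch
// is self-contained. Real code would use dict.Line and dict.LineFunc.
type Line interface{}
type LineFunc func([]byte) (Line, error)

// hostPort is a hypothetical Line for "host:port" entries.
type hostPort struct {
	Host string
	Port string
}

// parseHostPort is a hypothetical LineFunc; the ':' separator is an assumption.
func parseHostPort(b []byte) (Line, error) {
	i := bytes.IndexByte(b, ':')
	if i < 0 {
		return nil, errors.New("missing ':' separator")
	}
	return hostPort{Host: string(b[:i]), Port: string(b[i+1:])}, nil
}

func main() {
	var f LineFunc = parseHostPort
	line, err := f([]byte("example.com:22"))
	fmt.Println(line, err)
}
```

With the real package, such a function would presumably be installed with d.LineFunc(parseHostPort) before the first call to Scan, in the same way the LoginLineFunc example does.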

type LoginLine added in v0.0.13

type LoginLine struct {
	Line

	User   string
	Passwd string
}

LoginLine is the Line used for user login entries. It holds a user name and a password.

type SplitFunc added in v0.0.13

type SplitFunc func(data []byte, atEOF bool) (advance int, token []byte, err error)

SplitFunc is the signature of the split function used to tokenize the input. The arguments are an initial substring of the remaining unprocessed data and a flag, atEOF, that reports whether the Reader has no more data to give. The return values are the number of bytes to advance the input and the next token to return to the user, if any, plus an error, if any.

Scanning stops if the function returns an error, in which case some of the input may be discarded. If that error is ErrFinalToken, scanning stops with no error.

Otherwise, the Dict advances the input. If the token is not nil, the Dict returns it to the user. If the token is nil, the Dict reads more data and continues scanning; if there is no more data--if atEOF was true--the Dict returns. If the data does not yet hold a complete token, for instance if it has no newline while scanning lines, a SplitFunc can return (0, nil, nil) to signal the Dict to read more data into the slice and try again with a longer slice starting at the same point in the input.

The function is never called with an empty data slice unless atEOF is true. If atEOF is true, however, data may be non-empty and, as always, holds unprocessed text.
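
The contract above can be sketched with a small comma splitter that is called directly, without a Dict. splitComma is a hypothetical function that only shares the SplitFunc signature.

```go
package main

import (
	"bytes"
	"fmt"
)

// splitComma is a hypothetical split function with the SplitFunc signature.
// It returns comma-separated fields.
func splitComma(data []byte, atEOF bool) (advance int, token []byte, err error) {
	if i := bytes.IndexByte(data, ','); i >= 0 {
		return i + 1, data[:i], nil // complete field; advance past the comma
	}
	if atEOF && len(data) > 0 {
		return len(data), data, nil // last field has no trailing comma
	}
	return 0, nil, nil // no complete field yet: request more data
}

func main() {
	// Not at EOF and no comma yet: the function asks for more data.
	adv, tok, err := splitComma([]byte("ali"), false)
	fmt.Println(adv, tok == nil, err)

	// A longer slice starting at the same point now holds a full field.
	adv, tok, _ = splitComma([]byte("alice,bo"), false)
	fmt.Println(adv, string(tok))

	// At EOF the remainder is delivered as the final token.
	adv, tok, _ = splitComma([]byte("bo"), true)
	fmt.Println(adv, string(tok))
}
```

A function of this shape would be passed to Split before scanning starts.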
