Documentation ¶
Overview ¶
Package iterators provides base types that support the other packages in uax29.
Index ¶
- Variables
- func All(src []byte, dest *[][]byte, split bufio.SplitFunc) error
- type Scanner
- type Segmenter
- func (seg *Segmenter) Bytes() []byte
- func (seg *Segmenter) End() int
- func (seg *Segmenter) Err() error
- func (seg *Segmenter) Filter(filter filter.Func)
- func (seg *Segmenter) Next() bool
- func (seg *Segmenter) SetText(data []byte)
- func (seg *Segmenter) Start() int
- func (seg *Segmenter) Text() string
- func (seg *Segmenter) Transform(transformers ...transform.Transformer)
Constants ¶
This section is empty.
Variables ¶
var ErrAdvanceNegative = errors.New("SplitFunc returned a negative advance, this is likely a bug in the SplitFunc")
var ErrAdvanceTooFar = errors.New("SplitFunc advanced beyond the end of the data, this is likely a bug in the SplitFunc")
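As a rough illustration of when errors like these could arise, the sketch below validates the advance reported by a bufio.SplitFunc. It uses only the standard library: errAdvanceNegative, errAdvanceTooFar, checkedSplit and badSplit are hypothetical names for this example and are not the package's own code.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
)

// Local stand-ins for the package's exported errors (hypothetical copies for this sketch).
var (
	errAdvanceNegative = errors.New("SplitFunc returned a negative advance")
	errAdvanceTooFar   = errors.New("SplitFunc advanced beyond the end of the data")
)

// checkedSplit calls split once on data (treating it as the final chunk)
// and validates the advance it reports before trusting the result.
func checkedSplit(data []byte, split bufio.SplitFunc) (int, []byte, error) {
	advance, token, err := split(data, true)
	if err != nil {
		return 0, nil, err
	}
	if advance < 0 {
		return 0, nil, errAdvanceNegative
	}
	if advance > len(data) {
		return 0, nil, errAdvanceTooFar
	}
	return advance, token, nil
}

// badSplit is a deliberately buggy SplitFunc that overshoots the input.
func badSplit(data []byte, atEOF bool) (int, []byte, error) {
	return len(data) + 1, data, nil
}

func main() {
	_, _, err := checkedSplit([]byte("hello world"), badSplit)
	fmt.Println(errors.Is(err, errAdvanceTooFar))

	advance, token, err := checkedSplit([]byte("hello world"), bufio.ScanWords)
	fmt.Println(advance, string(token), err)
}
```

Both conditions point at a defect in the SplitFunc rather than in the input, which matches the wording of the error messages.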
Functions ¶
Types ¶
type Scanner ¶
type Scanner struct {
// contains filtered or unexported fields
}
func NewScanner ¶
NewScanner creates a new Scanner given an io.Reader and bufio.SplitFunc. To use the new scanner, iterate while Scan() is true.
func (*Scanner) Bytes ¶ added in v1.11.0
Bytes returns the current token, which results from calling Scan.
func (*Scanner) Filter ¶
Filter applies one or more filters (predicates) to all tokens, only returning those where all filters evaluate true. Filters are applied after Transformers.
func (*Scanner) Scan ¶
Scan advances to the next token. It returns false at the end of the data or when an error occurs. Use Bytes() to retrieve the token, and check Err() once Scan returns false.
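The Scan, Bytes or Text, then Err loop has the same shape as the standard library's bufio.Scanner. The sketch below uses bufio.Scanner as a stand-in so that it runs without this package; the helper names tokens and words are hypothetical, and bufio.ScanWords is only a placeholder for a uax29 SplitFunc.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// tokens collects every token that split produces over the input string.
func tokens(input string, split bufio.SplitFunc) ([]string, error) {
	sc := bufio.NewScanner(strings.NewReader(input))
	sc.Split(split)

	var out []string
	for sc.Scan() { // false at end of data or on error
		out = append(out, sc.Text())
	}
	// A false Scan does not say whether we hit EOF or a failure; Err does.
	return out, sc.Err()
}

// words is a convenience wrapper that splits on whitespace.
func words(input string) ([]string, error) {
	return tokens(input, bufio.ScanWords)
}

func main() {
	got, err := words("Hello, 世界. Nice dog!")
	if err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	fmt.Printf("%q\n", got)
}
```

Checking Err after the loop matters because a loop that ends early on a read error otherwise looks identical to one that reached the end of the input.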
func (*Scanner) Text ¶ added in v1.11.0
Text returns the current token as a string, which results from calling Scan.
func (*Scanner) Transform ¶ added in v1.9.0
func (sc *Scanner) Transform(transformers ...transform.Transformer)
Transform applies one or more transformers to all tokens, in order. Calling Transform overwrites any previous transformers, so call it once and pass every transformer you need (it is variadic). Transformers are applied before Filters.
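The ordering (transformers first, then filters) affects which tokens survive. The following is a minimal sketch of that ordering with plain function types; transformFunc, filterFunc, apply and keep are hypothetical stand-ins for this example and do not reproduce the transform.Transformer or filter.Func types named in the signatures.

```go
package main

import (
	"bytes"
	"fmt"
)

// Stand-in types for this sketch only.
type transformFunc func([]byte) []byte
type filterFunc func([]byte) bool

// apply runs every transformer in order, then keeps the token only if
// every filter accepts the transformed bytes.
func apply(token []byte, transforms []transformFunc, filters []filterFunc) ([]byte, bool) {
	for _, t := range transforms {
		token = t(token)
	}
	for _, f := range filters {
		if !f(token) {
			return nil, false
		}
	}
	return token, true
}

// keep lowercases each token and retains only those equal to "go".
func keep(tokens []string) []string {
	transforms := []transformFunc{bytes.ToLower}
	filters := []filterFunc{
		func(b []byte) bool { return bytes.Equal(b, []byte("go")) },
	}

	var out []string
	for _, tok := range tokens {
		if b, ok := apply([]byte(tok), transforms, filters); ok {
			out = append(out, string(b))
		}
	}
	return out
}

func main() {
	fmt.Println(keep([]string{"Go", "GO", "rust"}))
}
```

Because the filter sees the lowercased bytes, "Go" and "GO" both pass; with the order reversed, neither would match "go".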
type Segmenter ¶
type Segmenter struct {
// contains filtered or unexported fields
}
Segmenter is an iterator for byte slices, which are segmented into tokens (segments). To use it, you will define a SplitFunc, SetText with the bytes you wish to tokenize, loop over Next until false, call Bytes to retrieve the current token, and check Err after the loop.
Note that Segmenter is designed for use with the SplitFuncs in the various uax29 sub-packages, and relies on assumptions about their behavior. Caveat emptor when bringing your own SplitFunc.
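The usage pattern described above (SetText, loop over Next, read Bytes, check Err) can be sketched with a simplified stand-in. miniSegmenter, segment and words are hypothetical names; this is not the package's implementation, and bufio.ScanWords only stands in for a uax29 SplitFunc.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
)

var errInvalidAdvance = errors.New("split function returned an invalid advance")

// miniSegmenter is a hypothetical, simplified stand-in for Segmenter.
type miniSegmenter struct {
	split bufio.SplitFunc
	data  []byte
	token []byte
	pos   int
	err   error
}

// SetText sets the input and resets all iteration state.
func (s *miniSegmenter) SetText(data []byte) {
	s.data, s.token, s.pos, s.err = data, nil, 0, nil
}

// Next advances to the next non-empty token; it returns false at the end
// of the data or after an error.
func (s *miniSegmenter) Next() bool {
	for s.err == nil && s.pos < len(s.data) {
		advance, token, err := s.split(s.data[s.pos:], true)
		switch {
		case err != nil:
			s.err = err
		case advance <= 0 || advance > len(s.data)-s.pos:
			s.err = errInvalidAdvance
		default:
			s.pos += advance
			if len(token) > 0 {
				s.token = token
				return true
			}
		}
	}
	return false
}

func (s *miniSegmenter) Bytes() []byte { return s.token }
func (s *miniSegmenter) Err() error    { return s.err }

// segment drives the iterator: SetText, loop on Next, check Err afterwards.
func segment(text string, split bufio.SplitFunc) ([]string, error) {
	seg := &miniSegmenter{split: split}
	seg.SetText([]byte(text))

	var out []string
	for seg.Next() {
		out = append(out, string(seg.Bytes()))
	}
	return out, seg.Err()
}

func words(text string) ([]string, error) { return segment(text, bufio.ScanWords) }

func main() {
	got, err := words("  hello   world  ")
	fmt.Printf("%q %v\n", got, err)
}
```

The guard on advance is one example of the kind of assumption the note above warns about: a SplitFunc that reports zero progress on non-empty final data would otherwise loop forever.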
func NewSegmenter ¶
NewSegmenter creates a new segmenter given a SplitFunc. To use the new segmenter, call SetText() and then iterate while Next() is true.
Note that Segmenter is designed for use with the SplitFuncs in the various uax29 sub-packages, and relies on assumptions about their behavior. Caveat emptor when bringing your own SplitFunc.
func (*Segmenter) End ¶ added in v1.10.0
End returns the position (byte index) of the first byte after the current token, in the original text.
In other words, segmenter.Bytes() == original[segmenter.Start():segmenter.End()]
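To make the Start/End relationship concrete, here is a small sketch that records byte offsets while walking the input. It assumes the split function consumes exactly the bytes it returns as a token (token == data[:advance]), which holds for bufio.ScanRunes on valid UTF-8; span, spans and runeSpans are hypothetical names for this example.

```go
package main

import (
	"bufio"
	"fmt"
)

// span records the byte offsets of one token in the original text.
type span struct{ start, end int }

// spans walks text with split and records [start, end) for each token.
// Assumption of this sketch: every consumed byte belongs to the token.
func spans(text []byte, split bufio.SplitFunc) ([]span, error) {
	var out []span
	pos := 0
	for pos < len(text) {
		advance, _, err := split(text[pos:], true)
		if err != nil {
			return out, err
		}
		if advance <= 0 || advance > len(text)-pos {
			return out, fmt.Errorf("invalid advance %d at offset %d", advance, pos)
		}
		out = append(out, span{pos, pos + advance})
		pos += advance
	}
	return out, nil
}

// runeSpans splits text into runes and returns their byte offsets.
func runeSpans(text string) ([]span, error) {
	return spans([]byte(text), bufio.ScanRunes)
}

func main() {
	original := "héllo"
	got, err := runeSpans(original)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, s := range got {
		// Slicing the original by the offsets recovers each token.
		fmt.Printf("%d:%d %q\n", s.start, s.end, original[s.start:s.end])
	}
}
```

Offsets are byte indexes, not rune indexes, so the two-byte "é" occupies positions 1 through 3 of the original text.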
func (*Segmenter) Err ¶
Err indicates an error occurred when calling Next; Next will return false when an error occurs.
func (*Segmenter) Filter ¶
Filter applies a filter (predicate) to all tokens, returning only those for which the filter evaluates true. Calling Filter will overwrite the previous filter.
func (*Segmenter) Next ¶
Next advances Segmenter to the next token (segment). It returns false when there are no remaining segments, or an error occurred.
func (*Segmenter) SetText ¶
SetText sets the text for the segmenter to operate on, and resets all state.
func (*Segmenter) Start ¶ added in v1.10.0
Start returns the position (byte index) of the current token in the original text.
func (*Segmenter) Transform ¶ added in v1.9.0
func (seg *Segmenter) Transform(transformers ...transform.Transformer)
Transform applies one or more transformers to all tokens, in order. Calling Transform will overwrite any previous transformers, so call it once and pass every transformer you need (it is variadic).
Directories ¶
Path | Synopsis |
---|---|
filter | Package filter provides methods for filtering via Scanners and Segmenters. |
transformer | Package transformer provides a few handy transformers, for use with Scanner and Segmenter. |