tokenizers

package
v0.0.0-...-9d40a61
Published: May 2, 2023 License: GPL-3.0 Imports: 7 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

var AllTokenizers = map[string]Tokenizer{
	"words":         &Words{},
	"nonwords":      &RegexpNonWords{},
	"alphaboundary": &AlphaBoundary{},
}
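
A minimal usage sketch: looking up a tokenizer from AllTokenizers by name and calling Tokenize on it. The import path below is a placeholder rather than the package's actual module path, and the exact tokens produced depend on the implementation selected.

package main

import (
	"fmt"

	"example.com/tokenizers" // placeholder import path
)

func main() {
	// Keys are "words", "nonwords", and "alphaboundary".
	tok, ok := tokenizers.AllTokenizers["words"]
	if !ok {
		panic("unknown tokenizer name")
	}
	// Print whatever tokens the chosen implementation produces.
	for _, t := range tok.Tokenize("hello, world 42") {
		fmt.Println(t)
	}
}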

Functions

This section is empty.

Types

type AlphaBoundary

type AlphaBoundary struct {
	// contains filtered or unexported fields
}

func (*AlphaBoundary) Tokenize

func (a *AlphaBoundary) Tokenize(in string) []string

type RegexpNonWords

type RegexpNonWords struct{}

func (*RegexpNonWords) Tokenize

func (r *RegexpNonWords) Tokenize(in string) []string

type Tokenizer

type Tokenizer interface {
	Tokenize(string) []string
}
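
Any type with a Tokenize(string) []string method satisfies this interface. Below is a sketch of a hypothetical wrapper implementation that upper-cases the output of an inner Tokenizer; UpperTokenizer is not part of this package.

import "strings"

// UpperTokenizer is a hypothetical example type, not part of this package.
// It delegates to an inner Tokenizer and upper-cases each token.
type UpperTokenizer struct {
	Inner Tokenizer
}

func (u *UpperTokenizer) Tokenize(in string) []string {
	toks := u.Inner.Tokenize(in)
	for i, t := range toks {
		toks[i] = strings.ToUpper(t)
	}
	return toks
}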

type TokenizerOptions

type TokenizerOptions struct {
	Name *string
}

func (*TokenizerOptions) AddFlags

func (o *TokenizerOptions) AddFlags(fs *flag.FlagSet, prefix string)

func (*TokenizerOptions) Validate

func (o *TokenizerOptions) Validate() error
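
A sketch of wiring TokenizerOptions into a flag.FlagSet. Which flag names AddFlags registers under the given prefix, and what Validate checks (presumably that Name refers to a known tokenizer), are assumptions not confirmed by this documentation; the import path is again a placeholder.

package main

import (
	"flag"
	"log"
	"os"

	"example.com/tokenizers" // placeholder import path
)

func main() {
	fs := flag.NewFlagSet("example", flag.ExitOnError)

	opts := &tokenizers.TokenizerOptions{}
	// The flag name(s) registered under this prefix are package-defined.
	opts.AddFlags(fs, "tokenizer")

	if err := fs.Parse(os.Args[1:]); err != nil {
		log.Fatal(err)
	}
	// Validate is assumed to reject an unknown or missing tokenizer name.
	if err := opts.Validate(); err != nil {
		log.Fatal(err)
	}
}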

type Words

type Words struct{}

func (*Words) Tokenize

func (w *Words) Tokenize(in string) []string
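
Tokenize is declared on a pointer receiver, so a *Words (as used by AllTokenizers via &Words{}) is what satisfies the Tokenizer interface; a short sketch:

var t Tokenizer = &Words{} // a plain Words value would not satisfy Tokenizer
tokens := t.Tokenize("some input text")
_ = tokens // token boundaries are implementation-defined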
