package spell

Version: v1.0.22 | Published: Jan 24, 2023 | License: BSD-3-Clause | Imports: 19 | Imported by: 4

Repository

github.com/goki/pi

Documentation ¶

Overview ¶

Package spell provides functions for spell checking and correction. It wraps https://github.com/sajari/fuzzy as the core spelling engine.

The package manages a single spelling dictionary that is shared globally.

Index ¶

Constants
Variables
func Asset(name string) ([]byte, error)
func AssetDir(name string) ([]string, error)
func AssetInfo(name string) (os.FileInfo, error)
func AssetNames() []string
func CheckIgnore(word string) bool
func CheckLexLine(src []rune, tags lex.Line) lex.Line
func CheckWord(word string) ([]string, bool)
func Complete(s string) []string
func Edits1(word string) []string
func IgnoreWord(word string)
func Initialized() bool
func LearnWord(word string)
func Levenshtein(a, b *string) int
func ModTime(path string) (time.Time, error)
func MustAsset(name string) []byte
func Open(path string) error
func OpenAsset(fname string) error
func OpenCheck() error
func OpenDefault() error
func ResetLearnTime()
func RestoreAsset(dir, name string) error
func RestoreAssets(dir, name string) error
func SampleEnglish() []string
func Save(filename string) error
func SaveIfLearn() error
func Train(file os.File, new bool) (err error)
func UnLearnWord(word string)
type Autos
- func (a Autos) Len() int
- func (a Autos) Less(i, j int) bool
- func (a Autos) Swap(i, j int)
type Counts
type Method
- func (m Method) String() string
type Model
- func FromReader(r io.Reader) (*Model, error)
- func Load(filename string) (*Model, error)
- func NewModel() *Model
- func (md *Model) Autocomplete(input string) ([]string, error)
- func (md *Model) CheckKnown(input string, correct string) bool
- func (md *Model) Delete(term string)
- func (md *Model) EditsMulti(term string, depth int) []string
- func (md *Model) Init() *Model
- func (md *Model) Potentials(input string, exhaustive bool) map[string]*Potential
- func (md *Model) Save(filename string) error
- func (md *Model) SaveLight(filename string) error
- func (md *Model) SetCount(term string, count int, suggest bool)
- func (md *Model) SetDepth(val int)
- func (md *Model) SetDivergenceThreshold(val int)
- func (md *Model) SetThreshold(val int)
- func (md *Model) SetUseAutocomplete(val bool)
- func (md *Model) SpellCheck(input string) string
- func (md *Model) SpellCheckSuggestions(input string, n int) []string
- func (md *Model) Suggestions(input string, exhaustive bool) []string
- func (md *Model) Train(terms []string)
- func (md *Model) TrainQuery(term string)
- func (md *Model) TrainWord(term string)
- func (md *Model) WriteTo(w io.Writer) error
type Pair
type Potential
- func (pot *Potential) String() string

Constants ¶

const (
	SpellDepthDefault              = 2
	SpellThresholdDefault          = 5
	SuffDivergenceThresholdDefault = 100
)

const (
	MethodIsWord                   Method = 0
	MethodSuggestMapsToInput              = 1
	MethodInputDeleteMapsToDict           = 2
	MethodInputDeleteMapsToSuggest        = 3
)

const SaveAfterLearnIntervalSecs = 20

SaveAfterLearnIntervalSecs is the number of seconds since the file was last opened or saved, above which the model is saved after learning.
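
The interval rule can be pictured with a minimal sketch. It is not the package's code: shouldSaveAfterLearn and secs are hypothetical helpers, and the strict "greater than" comparison is an assumption drawn from the phrase "above which".

```go
package main

import (
	"fmt"
	"time"
)

// saveAfterLearnIntervalSecs mirrors the documented constant value.
const saveAfterLearnIntervalSecs = 20

// secs converts a whole number of seconds to a time.Duration.
func secs(n int) time.Duration { return time.Duration(n) * time.Second }

// shouldSaveAfterLearn is a hypothetical helper: it reports whether more than
// saveAfterLearnIntervalSecs seconds have elapsed since the last open or save.
// The strict comparison is an assumption, not confirmed by the package docs.
func shouldSaveAfterLearn(elapsed time.Duration) bool {
	return elapsed > secs(saveAfterLearnIntervalSecs)
}

func main() {
	fmt.Println(shouldSaveAfterLearn(secs(5)))  // false
	fmt.Println(shouldSaveAfterLearn(secs(30))) // true
}
```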

Variables ¶

var (
	Ignore = map[string]struct{}{}
)

Functions ¶

func Asset ¶

func Asset(name string) ([]byte, error)

Asset loads and returns the asset for the given name. It returns an error if the asset could not be found or could not be loaded.

func AssetDir ¶

func AssetDir(name string) ([]string, error)

AssetDir returns the file names below a certain directory embedded in the file by go-bindata. For example if you run go-bindata on data/... and data contains the following hierarchy:

data/
  foo.txt
  img/
    a.png
    b.png

then:

AssetDir("data") would return []string{"foo.txt", "img"}
AssetDir("data/img") would return []string{"a.png", "b.png"}
AssetDir("foo.txt") and AssetDir("notexist") would return an error
AssetDir("") will return []string{"data"}

func AssetInfo ¶

func AssetInfo(name string) (os.FileInfo, error)

AssetInfo loads and returns the asset info for the given name. It returns an error if the asset could not be found or could not be loaded.

func AssetNames ¶

func AssetNames() []string

AssetNames returns the names of the assets.

func CheckIgnore ¶ added in v0.9.14

func CheckIgnore(word string) bool

CheckIgnore returns true if the word is found in the Ignore list

func CheckLexLine ¶ added in v0.9.14

func CheckLexLine(src []rune, tags lex.Line) lex.Line

CheckLexLine returns the Lex regions for any misspelled words within the given line of text, which has existing Lex tags. It automatically excludes any Code token regions (see token.IsCode). Token is set to token.TextSpellErr on the returned Lex elements.

func CheckWord ¶

func CheckWord(word string) ([]string, bool)

CheckWord checks a single word and returns suggestions if the word is unknown.

func Complete ¶

func Complete(s string) []string

Complete finds possible completions based on the prefix s
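
As a rough illustration of prefix completion, the sketch below does a plain linear scan over a word list. This is a swapped-in technique chosen for brevity: the Model documentation mentions a suffix array for autocomplete, which this sketch does not use. The completions function and its dict argument are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// completions is a hypothetical helper (not the package's Complete): it
// returns, in sorted order, every word in dict that starts with prefix.
func completions(dict []string, prefix string) []string {
	var out []string
	for _, w := range dict {
		if strings.HasPrefix(w, prefix) {
			out = append(out, w)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	dict := []string{"spell", "speak", "apple", "spelling"}
	fmt.Println(completions(dict, "spe")) // [speak spell spelling]
}
```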

func Edits1 ¶ added in v1.0.0

func Edits1(word string) []string

Edits1 creates the set of terms that are one character deletion away from the input term.
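
A minimal sketch of single-deletion edits follows. deletes1 is a hypothetical stand-in rather than the package's Edits1; in particular, dropping duplicate candidates and working on runes rather than bytes are assumptions.

```go
package main

import "fmt"

// deletes1 is a hypothetical helper (not the package's Edits1): it returns
// every distinct string formed by deleting exactly one character from word.
func deletes1(word string) []string {
	runes := []rune(word)
	seen := make(map[string]struct{}, len(runes))
	out := make([]string, 0, len(runes))
	for i := range runes {
		// Build the candidate without the rune at position i.
		cand := string(runes[:i]) + string(runes[i+1:])
		if _, dup := seen[cand]; dup {
			continue // e.g. deleting either "l" in "spell" gives "spel"
		}
		seen[cand] = struct{}{}
		out = append(out, cand)
	}
	return out
}

func main() {
	fmt.Println(deletes1("spell")) // [pell sell spll spel]
}
```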

func IgnoreWord ¶

func IgnoreWord(word string)

IgnoreWord adds the word to the Ignore list
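
The Ignore variable is declared as a map[string]struct{}, the usual Go idiom for a set. The sketch below shows how such a set supports add and membership operations. The lowercase names are hypothetical stand-ins, and exact-match lookup (no case folding) is an assumption.

```go
package main

import "fmt"

// ignore is a set of words, using the same map[string]struct{} shape as the
// package's Ignore variable.
var ignore = map[string]struct{}{}

// ignoreWord adds word to the set (hypothetical stand-in for IgnoreWord).
func ignoreWord(word string) {
	ignore[word] = struct{}{}
}

// checkIgnore reports whether word is in the set (hypothetical stand-in for
// CheckIgnore). The lookup is an exact match; case folding is not assumed.
func checkIgnore(word string) bool {
	_, ok := ignore[word]
	return ok
}

func main() {
	fmt.Println(checkIgnore("goki")) // false
	ignoreWord("goki")
	fmt.Println(checkIgnore("goki")) // true
}
```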

func Initialized ¶

func Initialized() bool

Initialized returns true if the model has been loaded or created anew

func LearnWord ¶

func LearnWord(word string)

LearnWord adds a single word to the corpus. This is deterministic: the threshold is set to 1 so that the word is learned immediately.

func Levenshtein ¶ added in v1.0.0

func Levenshtein(a, b *string) int

Levenshtein calculates the Levenshtein distance between two strings.
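
For reference, the Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below is a standard two-row dynamic-programming version, not the package's implementation. It takes plain strings rather than the *string parameters of the documented signature, and it compares runes, which is an assumption.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// levenshtein is an illustrative implementation (not the package's own).
// It keeps only two rows of the edit-distance table at a time.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j // distance from the empty prefix of a: j insertions
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i // distance to the empty prefix of b: i deletions
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

func main() {
	fmt.Println(levenshtein("kitten", "sitting")) // 3
}
```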

func ModTime ¶ added in v0.9.14

func ModTime(path string) (time.Time, error)

ModTime returns the modification time of the given file path.

func MustAsset ¶

func MustAsset(name string) []byte

MustAsset is like Asset but panics when Asset would return an error. It simplifies safe initialization of global variables.

func Open ¶ added in v0.9.14

func Open(path string) error

Open loads a saved model, stored in JSON format, from the given path.

func OpenAsset ¶ added in v0.9.14

func OpenAsset(fname string) error

OpenAsset loads a JSON-formatted model from a compiled-in asset.

func OpenCheck ¶ added in v0.9.14

func OpenCheck() error

OpenCheck checks whether the current file has been modified since it was last opened and, if so, re-opens it. Call this prior to checking.

func OpenDefault ¶ added in v0.9.14

func OpenDefault() error

OpenDefault loads the default spelling file. TODO: support for other languages is needed.

func ResetLearnTime ¶ added in v0.9.14

func ResetLearnTime()

func RestoreAsset ¶

func RestoreAsset(dir, name string) error

RestoreAsset restores an asset under the given directory

func RestoreAssets ¶

func RestoreAssets(dir, name string) error

RestoreAssets restores an asset under the given directory recursively

func SampleEnglish ¶ added in v1.0.0

func SampleEnglish() []string

func Save ¶

func Save(filename string) error

Save saves the spelling model, which includes the data and parameters. Note: this will overwrite any existing file, so be sure to have opened the current file before making any changes.

func SaveIfLearn ¶ added in v0.9.14

func SaveIfLearn() error

SaveIfLearn saves the spelling model to the file path that was used in the last Open command, if learning has occurred since the last save or open. If there are no changes, it also checks whether the file has been modified and re-opens it if so.

func Train ¶

func Train(file os.File, new bool) (err error)

Train trains the model based on a text file

func UnLearnWord ¶ added in v1.0.9

func UnLearnWord(word string)

UnLearnWord removes a word from the dictionary, in case it was added accidentally.

Types ¶

type Autos ¶ added in v1.0.0

type Autos struct {
	Results []string
	Model   *Model
}

Autos is used for sorting autocomplete suggestions so that the most popular come first.

func (Autos) Len ¶ added in v1.0.0

func (a Autos) Len() int

func (Autos) Less ¶ added in v1.0.0

func (a Autos) Less(i, j int) bool

func (Autos) Swap ¶ added in v1.0.0

func (a Autos) Swap(i, j int)
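
Len, Less, and Swap together satisfy sort.Interface. The sketch below shows how a type shaped like Autos could order results by popularity. The autos and counts types are local stand-ins, and ranking by the sum of corpus and query counts is an assumption: the documentation only says that the most popular come first.

```go
package main

import (
	"fmt"
	"sort"
)

// counts is a local stand-in for the package's Counts type.
type counts struct {
	Corpus int
	Query  int
}

// autos is a local stand-in for Autos; it carries a count table instead of
// a *Model so that the sketch stays self-contained.
type autos struct {
	Results []string
	Counts  map[string]counts
}

func (a autos) Len() int      { return len(a.Results) }
func (a autos) Swap(i, j int) { a.Results[i], a.Results[j] = a.Results[j], a.Results[i] }

// Less ranks higher total counts first. Summing Corpus and Query is an
// assumed popularity measure, not confirmed by the package docs.
func (a autos) Less(i, j int) bool {
	ci, cj := a.Counts[a.Results[i]], a.Counts[a.Results[j]]
	return ci.Corpus+ci.Query > cj.Corpus+cj.Query
}

// sortByPopularity returns a sorted copy of results, most popular first.
func sortByPopularity(results []string, table map[string]counts) []string {
	out := append([]string(nil), results...)
	sort.Stable(autos{Results: out, Counts: table})
	return out
}

func main() {
	table := map[string]counts{
		"spell":    {Corpus: 8, Query: 2},
		"spelling": {Corpus: 4, Query: 0},
		"spelt":    {Corpus: 1, Query: 0},
	}
	fmt.Println(sortByPopularity([]string{"spelt", "spell", "spelling"}, table))
	// [spell spelling spelt]
}
```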

type Counts ¶ added in v1.0.0

type Counts struct {
	Corpus int `json:"c"`
	Query  int `json:"q"`
}

Counts has the individual word counts

type Method ¶ added in v1.0.0

type Method int

func (Method) String ¶ added in v1.0.0

func (m Method) String() string

type Model ¶ added in v1.0.0

type Model struct {
	Data                    map[string]*Counts  `json:"data"`
	Maxcount                int                 `json:"maxcount"`
	Suggest                 map[string][]string `json:"suggest"`
	Depth                   int                 `json:"depth"`
	Threshold               int                 `json:"threshold"`
	UseAutocomplete         bool                `json:"autocomplete"`
	SuffDivergence          int                 `json:"-"`
	SuffDivergenceThreshold int                 `json:"suff_threshold"`
	SuffixArr               *suffixarray.Index  `json:"-"`
	SuffixArrConcat         string              `json:"-"`
	sync.RWMutex
}

Model is the full data model

func FromReader ¶ added in v1.0.0

func FromReader(r io.Reader) (*Model, error)

FromReader loads a model from a Reader

func Load ¶

func Load(filename string) (*Model, error)

Load a saved model from disk

func NewModel ¶ added in v1.0.0

func NewModel() *Model

Create and initialise a new model

func (*Model) Autocomplete ¶ added in v1.0.0

func (md *Model) Autocomplete(input string) ([]string, error)

For a given string, autocomplete using the suffix array model

func (*Model) CheckKnown ¶ added in v1.0.0

func (md *Model) CheckKnown(input string, correct string) bool

CheckKnown tests an input against the expected correct term; if the guess is wrong, it looks at why. It returns a bool indicating whether the guess was correct. Typically this function would be used for testing, not for production.

func (*Model) Delete ¶ added in v1.0.8

func (md *Model) Delete(term string)

Delete removes the given word from the dictionary, undoing any learning of it.

func (*Model) EditsMulti ¶ added in v1.0.0

func (md *Model) EditsMulti(term string, depth int) []string

EditsMulti returns the edits at any depth for a given term. The depth of the model is used.

func (*Model) Init ¶ added in v1.0.0

func (md *Model) Init() *Model

func (*Model) Potentials ¶ added in v1.0.0

func (md *Model) Potentials(input string, exhaustive bool) map[string]*Potential

Return the raw potential terms so they can be ranked externally to this package
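
One way to rank such potentials externally is sketched below. The potential type is a local stand-in for the exported fields of Potential, and the ordering (smaller Levenshtein distance first, then higher score, then alphabetical) is a hypothetical choice rather than the package's own ranking.

```go
package main

import (
	"fmt"
	"sort"
)

// potential is a local stand-in for some exported fields of Potential.
type potential struct {
	Term  string
	Score int
	Leven int
}

// rank is a hypothetical external ranking: closest edit distance first,
// then higher score, then alphabetical order to keep the result stable
// despite random map iteration order.
func rank(pots map[string]*potential) []string {
	list := make([]*potential, 0, len(pots))
	for _, p := range pots {
		list = append(list, p)
	}
	sort.Slice(list, func(i, j int) bool {
		if list[i].Leven != list[j].Leven {
			return list[i].Leven < list[j].Leven
		}
		if list[i].Score != list[j].Score {
			return list[i].Score > list[j].Score
		}
		return list[i].Term < list[j].Term
	})
	terms := make([]string, len(list))
	for i, p := range list {
		terms[i] = p.Term
	}
	return terms
}

func main() {
	pots := map[string]*potential{
		"spell": {Term: "spell", Score: 10, Leven: 1},
		"spelt": {Term: "spelt", Score: 3, Leven: 1},
		"smell": {Term: "smell", Score: 50, Leven: 2},
	}
	fmt.Println(rank(pots)) // [spell spelt smell]
}
```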

func (*Model) Save ¶ added in v1.0.0

func (md *Model) Save(filename string) error

Save a spelling model to disk

func (*Model) SaveLight ¶ added in v1.0.0

func (md *Model) SaveLight(filename string) error

SaveLight saves a spelling model to disk, but discards all entries with fewer than the threshold number of occurrences. The result is much smaller and is all that is needed when the model is generated as a one-off, but it is not useful for incremental usage.
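
The pruning step described here can be sketched as a filter over a word-count table. pruneBelow is a hypothetical helper working on a plain map[string]int rather than the Model's Data field, and treating a count equal to the threshold as kept is an assumption based on the wording "less than the threshold".

```go
package main

import "fmt"

// pruneBelow is a hypothetical helper (not the package's SaveLight): it
// returns a copy of table containing only entries whose count is at least
// threshold. The input map is left unchanged.
func pruneBelow(table map[string]int, threshold int) map[string]int {
	kept := make(map[string]int)
	for term, n := range table {
		if n >= threshold {
			kept[term] = n
		}
	}
	return kept
}

func main() {
	table := map[string]int{"spell": 12, "spelling": 5, "spelk": 1}
	fmt.Println(len(pruneBelow(table, 5))) // 2
}
```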

func (*Model) SetCount ¶ added in v1.0.0

func (md *Model) SetCount(term string, count int, suggest bool)

Manually set the count of a word. Optionally trigger the creation of suggestion keys for the term. This function lets you build a model from an existing dictionary with word popularity counts without needing to run "TrainWord" repeatedly

func (*Model) SetDepth ¶ added in v1.0.0

func (md *Model) SetDepth(val int)

Change the default depth value of the model. This sets how many character differences are indexed. The default is 2.

func (*Model) SetDivergenceThreshold ¶ added in v1.0.0

func (md *Model) SetDivergenceThreshold(val int)

SetDivergenceThreshold optionally sets the suffix array divergence threshold. This is the number of query training steps between rebuilds of the suffix array. A low number will be more accurate but will use more resources and create more garbage.

func (*Model) SetThreshold ¶ added in v1.0.0

func (md *Model) SetThreshold(val int)

Change the default threshold of the model. This is how many times a term must be seen before suggestions are created for it
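
The effect of the threshold can be pictured with a toy model in which a term only becomes eligible for suggestions once it has been seen the required number of times. tinyModel and its methods are hypothetical and greatly simplified: a boolean flag stands in for the creation of suggestion keys.

```go
package main

import "fmt"

// tinyModel is a hypothetical, simplified stand-in for Model.
type tinyModel struct {
	threshold   int
	counts      map[string]int
	suggestable map[string]bool
}

func newTinyModel(threshold int) *tinyModel {
	return &tinyModel{
		threshold:   threshold,
		counts:      make(map[string]int),
		suggestable: make(map[string]bool),
	}
}

// trainWord counts one sighting of term. When the count reaches the
// threshold, the term is marked as eligible for suggestions.
func (m *tinyModel) trainWord(term string) {
	m.counts[term]++
	if m.counts[term] == m.threshold {
		m.suggestable[term] = true
	}
}

func main() {
	m := newTinyModel(3)
	m.trainWord("spell")
	m.trainWord("spell")
	fmt.Println(m.suggestable["spell"]) // false
	m.trainWord("spell")
	fmt.Println(m.suggestable["spell"]) // true
}
```

Under this picture, setting the threshold to 1 makes a word eligible on its first sighting, which matches the description of LearnWord.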

func (*Model) SetUseAutocomplete ¶ added in v1.0.0

func (md *Model) SetUseAutocomplete(val bool)

SetUseAutocomplete optionally disables suffix-array-based autocomplete support.

func (*Model) SpellCheck ¶ added in v1.0.0

func (md *Model) SpellCheck(input string) string

Return the most likely correction for the input term

func (*Model) SpellCheckSuggestions ¶ added in v1.0.0

func (md *Model) SpellCheckSuggestions(input string, n int) []string

Return the most likely corrections in order from best to worst

func (*Model) Suggestions ¶ added in v1.0.0

func (md *Model) Suggestions(input string, exhaustive bool) []string

For a given input string, suggests potential replacements

func (*Model) Train ¶ added in v1.0.0

func (md *Model) Train(terms []string)

Add an array of words to train the model in bulk

func (*Model) TrainQuery ¶ added in v1.0.0

func (md *Model) TrainQuery(term string)

TrainQuery using a search query term. This builds a second popularity index of terms used to search, as opposed to generally occurring in corpus text

func (*Model) TrainWord ¶ added in v1.0.0

func (md *Model) TrainWord(term string)

Train the model word by word. This is corpus training as opposed to query training. Word counts from this type of training are not likely to correlate with those of search queries

func (*Model) WriteTo ¶ added in v1.0.0

func (md *Model) WriteTo(w io.Writer) error

WriteTo writes a model to a Writer
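
The json struct tags on Model suggest that models are serialized as JSON. Assuming encoding/json is used (an assumption, not confirmed here), a write and read round trip through an io.Writer and io.Reader could look like the sketch below. miniModel, writeTo, fromReader, and roundTrip are hypothetical and mirror only a few of the documented fields and tags.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// miniModel is a hypothetical, reduced stand-in for Model that reuses some
// of the documented JSON tags.
type miniModel struct {
	Data      map[string]int `json:"data"`
	Depth     int            `json:"depth"`
	Threshold int            `json:"threshold"`
}

// writeTo encodes m as JSON to w.
func writeTo(m *miniModel, w io.Writer) error {
	return json.NewEncoder(w).Encode(m)
}

// fromReader decodes a JSON-encoded miniModel from r.
func fromReader(r io.Reader) (*miniModel, error) {
	m := &miniModel{}
	if err := json.NewDecoder(r).Decode(m); err != nil {
		return nil, err
	}
	return m, nil
}

// roundTrip writes m to an in-memory buffer and reads it back.
func roundTrip(m *miniModel) (*miniModel, error) {
	var buf bytes.Buffer
	if err := writeTo(m, &buf); err != nil {
		return nil, err
	}
	return fromReader(&buf)
}

func main() {
	in := &miniModel{Data: map[string]int{"spell": 3}, Depth: 2, Threshold: 5}
	out, err := roundTrip(in)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out.Depth, out.Threshold, out.Data["spell"]) // 2 5 3
}
```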

type Pair ¶ added in v1.0.0

type Pair struct {
	// contains filtered or unexported fields
}

type Potential ¶ added in v1.0.0

type Potential struct {
	Term   string // Potential term string
	Score  int    // Score
	Leven  int    // Levenshtein distance from the suggestion to the input
	Method Method // How this potential was matched
}

Potential is a potential match

func (*Potential) String ¶ added in v1.0.0

func (pot *Potential) String() string
