spell

package
v1.0.9
Published: Dec 28, 2020 | License: BSD-3-Clause | Imports: 19 | Imported by: 4

Documentation

Overview

Package spell provides functions for spell check and correction. It wraps https://github.com/sajari/fuzzy as the core spelling engine.

A single globally-usable spelling dictionary is managed.

Constants

const (
	SpellDepthDefault              = 2
	SpellThresholdDefault          = 5
	SuffDivergenceThresholdDefault = 100
)
const (
	MethodIsWord                   Method = 0
	MethodSuggestMapsToInput              = 1
	MethodInputDeleteMapsToDict           = 2
	MethodInputDeleteMapsToSuggest        = 3
)
const SaveAfterLearnIntervalSecs = 20

SaveAfterLearnIntervalSecs is the number of seconds since the file was last opened or saved, above which the model is saved after learning.
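
The timing rule described above can be sketched with the standard library alone. The helper name shouldSaveAfterLearn and its elapsed-time parameter are assumptions made for illustration; they are not part of this package, and the package's own bookkeeping may differ.

```go
package main

import (
	"fmt"
	"time"
)

// saveAfterLearnIntervalSecs mirrors the value of the documented constant.
const saveAfterLearnIntervalSecs = 20

// shouldSaveAfterLearn is a hypothetical helper: it reports whether the time
// elapsed since the file was last opened or saved exceeds the interval.
func shouldSaveAfterLearn(elapsed time.Duration) bool {
	return elapsed > saveAfterLearnIntervalSecs*time.Second
}

func main() {
	fmt.Println(shouldSaveAfterLearn(5 * time.Second))  // false
	fmt.Println(shouldSaveAfterLearn(30 * time.Second)) // true
}
```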

Variables

var (
	Ignore = map[string]struct{}{}
)

Functions

func Asset

func Asset(name string) ([]byte, error)

Asset loads and returns the asset for the given name. It returns an error if the asset could not be found or could not be loaded.

func AssetDir

func AssetDir(name string) ([]string, error)

AssetDir returns the file names below a certain directory embedded in the file by go-bindata. For example if you run go-bindata on data/... and data contains the following hierarchy:

data/
  foo.txt
  img/
    a.png
    b.png

then:

AssetDir("data") would return []string{"foo.txt", "img"}
AssetDir("data/img") would return []string{"a.png", "b.png"}
AssetDir("foo.txt") and AssetDir("notexist") would return an error
AssetDir("") would return []string{"data"}

func AssetInfo

func AssetInfo(name string) (os.FileInfo, error)

AssetInfo loads and returns the asset info for the given name. It returns an error if the asset could not be found or could not be loaded.

func AssetNames

func AssetNames() []string

AssetNames returns the names of the assets.

func CheckIgnore added in v0.9.14

func CheckIgnore(word string) bool

CheckIgnore returns true if the word is found in the Ignore list

func CheckLexLine added in v0.9.14

func CheckLexLine(src []rune, tags lex.Line) lex.Line

CheckLexLine returns the Lex regions for any misspelled words within the given line of text, which has existing Lex tags. Code token regions (see token.IsCode) are automatically excluded. Token is set to token.TextSpellErr on each returned Lex.

func CheckWord

func CheckWord(word string) ([]string, bool)

CheckWord checks a single word and returns suggestions if word is unknown

func Complete

func Complete(s string) []string

Complete finds possible completions based on the prefix s

func Edits1 added in v1.0.0

func Edits1(word string) []string

Edits1 creates a set of terms that are each one character deletion away from the input term
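
As a rough illustration of what "one character deletion" means, the sketch below generates every distinct string obtained by removing a single rune from a word. The function deletes1 is a hypothetical stand-in written for this example; the package's Edits1 may return a different set (for instance, it may treat some word endings specially), so the output here should not be read as Edits1's output.

```go
package main

import "fmt"

// deletes1 returns every distinct string obtained by deleting exactly one
// rune from word. Hypothetical helper, not the package's Edits1.
func deletes1(word string) []string {
	runes := []rune(word)
	seen := make(map[string]struct{})
	var out []string
	for i := range runes {
		// Build the word without the rune at position i.
		cand := string(runes[:i]) + string(runes[i+1:])
		if _, dup := seen[cand]; dup {
			continue
		}
		seen[cand] = struct{}{}
		out = append(out, cand)
	}
	return out
}

func main() {
	fmt.Println(deletes1("cat")) // [at ct ca]
}
```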

func IgnoreWord

func IgnoreWord(word string)

IgnoreWord adds the word to the Ignore list
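
The Ignore variable is declared as a map[string]struct{}, which is the usual Go idiom for a set. A minimal sketch of how such a set supports the add and lookup operations described for IgnoreWord and CheckIgnore follows; the lowercase names are local to this example, and whether the package normalizes case before lookup is not stated here, so the sketch does no normalization.

```go
package main

import "fmt"

// ignore plays the role of the package-level Ignore set in this sketch.
var ignore = map[string]struct{}{}

// ignoreWord adds word to the set; the empty struct value uses no memory.
func ignoreWord(word string) {
	ignore[word] = struct{}{}
}

// checkIgnore reports whether word is present in the set.
func checkIgnore(word string) bool {
	_, found := ignore[word]
	return found
}

func main() {
	fmt.Println(checkIgnore("goki")) // false
	ignoreWord("goki")
	fmt.Println(checkIgnore("goki")) // true
}
```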

func Initialized

func Initialized() bool

Initialized returns true if the model has been loaded or created anew

func LearnWord

func LearnWord(word string)

LearnWord adds a single word to the corpus. This is deterministic: the threshold is set to 1 so that the word is learned immediately.

func Levenshtein added in v1.0.0

func Levenshtein(a, b *string) int

Levenshtein calculates the Levenshtein distance between two strings
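
For readers unfamiliar with the metric, here is a textbook dynamic-programming sketch of Levenshtein distance. It is not the package's implementation: the function below takes plain string values (the package function takes *string arguments) and compares runes, which is an assumption of this example.

```go
package main

import "fmt"

// levenshtein returns the minimum number of single-rune insertions,
// deletions, and substitutions needed to turn a into b.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	// prev holds the previous row of the distance table, curr the current one.
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j // distance from the empty prefix of a
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

func main() {
	fmt.Println(levenshtein("kitten", "sitting")) // 3
}
```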

func ModTime added in v0.9.14

func ModTime(path string) (time.Time, error)

ModTime returns the modification time of given file path
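
A function with this signature can be written on top of os.Stat. The sketch below is one plausible shape under that assumption, not a copy of the package source.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// modTime returns the modification time of the file at path, or the zero
// time and an error if the file cannot be inspected.
func modTime(path string) (time.Time, error) {
	info, err := os.Stat(path)
	if err != nil {
		return time.Time{}, err
	}
	return info.ModTime(), nil
}

func main() {
	mt, err := modTime(".")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("last modified:", mt)
}
```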

func MustAsset

func MustAsset(name string) []byte

MustAsset is like Asset but panics when Asset would return an error. It simplifies safe initialization of global variables.

func Open added in v0.9.14

func Open(path string) error

Open loads the saved model stored in json format

func OpenAsset added in v0.9.14

func OpenAsset(fname string) error

OpenAsset loads json-formatted model from compiled-in asset

func OpenCheck added in v0.9.14

func OpenCheck() error

OpenCheck checks if the current file has been modified since last open time and re-opens it if so -- call this prior to checking.

func OpenDefault added in v0.9.14

func OpenDefault() error

OpenDefault loads the default spelling file. TODO: support for different languages is still needed.

func ResetLearnTime added in v0.9.14

func ResetLearnTime()

func RestoreAsset

func RestoreAsset(dir, name string) error

RestoreAsset restores an asset under the given directory

func RestoreAssets

func RestoreAssets(dir, name string) error

RestoreAssets restores an asset under the given directory recursively

func SampleEnglish added in v1.0.0

func SampleEnglish() []string

func Save

func Save(filename string) error

Save saves the spelling model, which includes the data and parameters. Note: this will overwrite any existing file -- be sure to have opened the current file before making any changes.

func SaveIfLearn added in v0.9.14

func SaveIfLearn() error

SaveIfLearn saves the spelling model to the file path that was used in the last Open command, if learning has occurred since the last save or open. If there are no changes, it also checks whether the file has been modified and re-opens it if so.

func Train

func Train(file os.File, new bool) (err error)

Train trains the model based on a text file

func UnLearnWord added in v1.0.9

func UnLearnWord(word string)

UnLearnWord removes word from dictionary -- in case accidentally added

Types

type Autos added in v1.0.0

type Autos struct {
	Results []string
	Model   *Model
}

For sorting autocomplete suggestions to bias the most popular first

func (Autos) Len added in v1.0.0

func (a Autos) Len() int

func (Autos) Less added in v1.0.0

func (a Autos) Less(i, j int) bool

func (Autos) Swap added in v1.0.0

func (a Autos) Swap(i, j int)
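
Len, Less and Swap together satisfy sort.Interface, which is how a type like Autos can be passed to sort.Sort. The sketch below shows the pattern with a simplified, hypothetical type (byPopularity) that orders results by a plain count map; the exact comparison Autos uses against Model data is not documented here, so the tie-breaking rule is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// byPopularity is a hypothetical analogue of Autos: it sorts results so that
// terms with higher counts come first.
type byPopularity struct {
	results []string
	counts  map[string]int
}

func (b byPopularity) Len() int { return len(b.results) }

func (b byPopularity) Swap(i, j int) {
	b.results[i], b.results[j] = b.results[j], b.results[i]
}

func (b byPopularity) Less(i, j int) bool {
	ci, cj := b.counts[b.results[i]], b.counts[b.results[j]]
	if ci != cj {
		return ci > cj // more popular terms sort earlier
	}
	return b.results[i] < b.results[j] // alphabetical tie-break (assumed)
}

// rankByPopularity returns a sorted copy of results.
func rankByPopularity(results []string, counts map[string]int) []string {
	out := append([]string(nil), results...)
	sort.Sort(byPopularity{results: out, counts: counts})
	return out
}

func main() {
	counts := map[string]int{"spell": 12, "spelling": 30, "spelt": 2}
	fmt.Println(rankByPopularity([]string{"spelt", "spell", "spelling"}, counts))
	// [spelling spell spelt]
}
```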

type Counts added in v1.0.0

type Counts struct {
	Corpus int `json:"c"`
	Query  int `json:"q"`
}

Counts has the individual word counts
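
The struct tags shorten the field names in serialized form. A small sketch with a local copy of the struct (named counts here so the example is self-contained) shows the resulting JSON keys:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// counts is a local copy of the documented Counts struct, tags included.
type counts struct {
	Corpus int `json:"c"`
	Query  int `json:"q"`
}

// encodeCounts marshals c to a JSON string.
func encodeCounts(c counts) (string, error) {
	b, err := json.Marshal(c)
	return string(b), err
}

func main() {
	s, err := encodeCounts(counts{Corpus: 3, Query: 1})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s) // {"c":3,"q":1}
}
```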

type Method added in v1.0.0

type Method int

func (Method) String added in v1.0.0

func (m Method) String() string

type Model added in v1.0.0

type Model struct {
	Data                    map[string]*Counts  `json:"data"`
	Maxcount                int                 `json:"maxcount"`
	Suggest                 map[string][]string `json:"suggest"`
	Depth                   int                 `json:"depth"`
	Threshold               int                 `json:"threshold"`
	UseAutocomplete         bool                `json:"autocomplete"`
	SuffDivergence          int                 `json:"-"`
	SuffDivergenceThreshold int                 `json:"suff_threshold"`
	SuffixArr               *suffixarray.Index  `json:"-"`
	SuffixArrConcat         string              `json:"-"`
	sync.RWMutex
}

Model is the full data model

func FromReader added in v1.0.0

func FromReader(r io.Reader) (*Model, error)

FromReader loads a model from a Reader

func Load

func Load(filename string) (*Model, error)

Load a saved model from disk

func NewModel added in v1.0.0

func NewModel() *Model

Create and initialise a new model

func (*Model) Autocomplete added in v1.0.0

func (md *Model) Autocomplete(input string) ([]string, error)

For a given string, autocomplete using the suffix array model
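
To give a feel for suffix-array completion, here is a sketch built on the standard index/suffixarray package. It assumes words are concatenated with NUL separators and matched with an anchored regular expression; that layout, and the helper name complete, are assumptions of this example rather than a description of the Model's internals.

```go
package main

import (
	"fmt"
	"index/suffixarray"
	"regexp"
	"sort"
	"strings"
)

// complete returns, in sorted order, every word that starts with prefix.
// Words are assumed not to contain NUL bytes.
func complete(words []string, prefix string) []string {
	// Surround each word with NUL separators so a match can be anchored
	// at the start of a word.
	concat := "\x00" + strings.Join(words, "\x00") + "\x00"
	idx := suffixarray.New([]byte(concat))
	re := regexp.MustCompile("\x00" + regexp.QuoteMeta(prefix) + "[^\x00]*")
	matches := idx.FindAllIndex(re, -1)
	out := make([]string, 0, len(matches))
	for _, m := range matches {
		out = append(out, concat[m[0]+1:m[1]]) // skip the leading NUL
	}
	sort.Strings(out)
	return out
}

func main() {
	words := []string{"spell", "spelling", "speak", "cat"}
	fmt.Println(complete(words, "spel")) // [spell spelling]
}
```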

func (*Model) CheckKnown added in v1.0.0

func (md *Model) CheckKnown(input string, correct string) bool

CheckKnown tests an input against the expected correct term and, if the model gets it wrong, looks at why. It returns a bool indicating whether the guess was correct. Typically this function would be used for testing, not in production.

func (*Model) Delete added in v1.0.8

func (md *Model) Delete(term string)

Delete removes given word from dictionary -- undoes learning

func (*Model) EditsMulti added in v1.0.0

func (md *Model) EditsMulti(term string, depth int) []string

Edits at any depth for a given term. The depth of the model is used

func (*Model) Init added in v1.0.0

func (md *Model) Init() *Model

func (*Model) Potentials added in v1.0.0

func (md *Model) Potentials(input string, exhaustive bool) map[string]*Potential

Return the raw potential terms so they can be ranked externally to this package
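
External ranking of such a map might look like the following sketch. It uses a local struct with the Term, Score and Leven fields only, and the ordering rule (smaller edit distance first, then higher score, then alphabetical) is a choice made for this example, not the package's own ranking.

```go
package main

import (
	"fmt"
	"sort"
)

// potential is a local stand-in carrying three of the Potential fields.
type potential struct {
	Term  string
	Score int
	Leven int
}

// rankPotentials orders candidate terms by edit distance, then score.
func rankPotentials(pots map[string]*potential) []string {
	list := make([]*potential, 0, len(pots))
	for _, p := range pots {
		list = append(list, p)
	}
	sort.Slice(list, func(i, j int) bool {
		a, b := list[i], list[j]
		if a.Leven != b.Leven {
			return a.Leven < b.Leven // closer matches first
		}
		if a.Score != b.Score {
			return a.Score > b.Score // then higher scores
		}
		return a.Term < b.Term // deterministic tie-break
	})
	terms := make([]string, len(list))
	for i, p := range list {
		terms[i] = p.Term
	}
	return terms
}

func main() {
	pots := map[string]*potential{
		"spell": {Term: "spell", Score: 40, Leven: 1},
		"spelt": {Term: "spelt", Score: 5, Leven: 1},
		"spill": {Term: "spill", Score: 90, Leven: 2},
	}
	fmt.Println(rankPotentials(pots)) // [spell spelt spill]
}
```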

func (*Model) Save added in v1.0.0

func (md *Model) Save(filename string) error

Save a spelling model to disk

func (*Model) SaveLight added in v1.0.0

func (md *Model) SaveLight(filename string) error

Save a spelling model to disk, but discard all entries with fewer than the threshold number of occurrences. The result is much smaller and is all that is needed when the model is generated as a one-off, but it is not useful for incremental usage.

func (*Model) SetCount added in v1.0.0

func (md *Model) SetCount(term string, count int, suggest bool)

Manually set the count of a word. Optionally trigger the creation of suggestion keys for the term. This function lets you build a model from an existing dictionary with word popularity counts without needing to run "TrainWord" repeatedly

func (*Model) SetDepth added in v1.0.0

func (md *Model) SetDepth(val int)

Change the default depth value of the model. This sets how many character differences are indexed. The default is 2.
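
One way to picture the depth setting is as repeated single-character deletion: depth 1 yields all strings one deletion away, depth 2 adds those two deletions away, and so on. The sketch below illustrates that idea with a hypothetical helper, deletesUpTo; it is not the package's indexing code and may not match it in detail.

```go
package main

import "fmt"

// deletesUpTo returns every distinct string reachable from term by deleting
// between 1 and depth runes, in the order they are first discovered.
func deletesUpTo(term string, depth int) []string {
	seen := make(map[string]struct{})
	frontier := []string{term}
	var out []string
	for d := 0; d < depth; d++ {
		var next []string
		for _, w := range frontier {
			r := []rune(w)
			for i := range r {
				cand := string(r[:i]) + string(r[i+1:])
				if _, dup := seen[cand]; dup {
					continue
				}
				seen[cand] = struct{}{}
				out = append(out, cand)
				next = append(next, cand)
			}
		}
		frontier = next // the next level deletes from these shorter strings
	}
	return out
}

func main() {
	fmt.Println(deletesUpTo("abc", 1)) // [bc ac ab]
	fmt.Println(deletesUpTo("abc", 2)) // [bc ac ab c b a]
}
```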

func (*Model) SetDivergenceThreshold added in v1.0.0

func (md *Model) SetDivergenceThreshold(val int)

Optionally set the suffix array divergence threshold. This is the number of query training steps between rebuilds of the suffix array. A low number will be more accurate but will use resources and create more garbage.

func (*Model) SetThreshold added in v1.0.0

func (md *Model) SetThreshold(val int)

Change the default threshold of the model. This is how many times a term must be seen before suggestions are created for it
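
The effect of a threshold can be illustrated with a toy model that counts sightings and marks a term as indexed only once its count reaches the threshold. The type tinyModel and its fields are hypothetical, and the "count reaches threshold" reading is an assumption, chosen because it agrees with the note under LearnWord that a threshold of 1 makes a word learned immediately.

```go
package main

import "fmt"

// tinyModel is a hypothetical, stripped-down model used only for this sketch.
type tinyModel struct {
	threshold int
	counts    map[string]int
	indexed   map[string]bool // terms for which suggestions would exist
}

func newTinyModel(threshold int) *tinyModel {
	return &tinyModel{
		threshold: threshold,
		counts:    map[string]int{},
		indexed:   map[string]bool{},
	}
}

// train records one sighting of term and indexes it once its count
// reaches the threshold (assumed semantics).
func (m *tinyModel) train(term string) {
	m.counts[term]++
	if m.counts[term] >= m.threshold {
		m.indexed[term] = true
	}
}

func main() {
	m := newTinyModel(3)
	for i := 1; i <= 3; i++ {
		m.train("goki")
		fmt.Println(i, m.indexed["goki"])
	}
	// 1 false
	// 2 false
	// 3 true
}
```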

func (*Model) SetUseAutocomplete added in v1.0.0

func (md *Model) SetUseAutocomplete(val bool)

Optionally disable suffix-array-based autocomplete support

func (*Model) SpellCheck added in v1.0.0

func (md *Model) SpellCheck(input string) string

Return the most likely correction for the input term

func (*Model) SpellCheckSuggestions added in v1.0.0

func (md *Model) SpellCheckSuggestions(input string, n int) []string

Return the most likely corrections in order from best to worst

func (*Model) Suggestions added in v1.0.0

func (md *Model) Suggestions(input string, exhaustive bool) []string

For a given input string, suggests potential replacements

func (*Model) Train added in v1.0.0

func (md *Model) Train(terms []string)

Add an array of words to train the model in bulk

func (*Model) TrainQuery added in v1.0.0

func (md *Model) TrainQuery(term string)

TrainQuery using a search query term. This builds a second popularity index of terms used to search, as opposed to generally occurring in corpus text

func (*Model) TrainWord added in v1.0.0

func (md *Model) TrainWord(term string)

Train the model word by word. This is corpus training as opposed to query training. Word counts from this type of training are not likely to correlate with those of search queries

func (*Model) WriteTo added in v1.0.0

func (md *Model) WriteTo(w io.Writer) error

WriteTo writes a model to a Writer
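
Because the Model fields carry JSON tags, writing to an io.Writer and reading back from an io.Reader can be pictured as a JSON round trip. The sketch below uses a reduced, hypothetical struct (miniModel) with three of the tagged fields and the standard encoding/json streaming API; it assumes JSON encoding and is not the package's WriteTo or FromReader.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

type miniCounts struct {
	Corpus int `json:"c"`
	Query  int `json:"q"`
}

// miniModel is a reduced, hypothetical stand-in for Model.
type miniModel struct {
	Data      map[string]*miniCounts `json:"data"`
	Depth     int                    `json:"depth"`
	Threshold int                    `json:"threshold"`
}

// writeTo encodes m as JSON onto w.
func writeTo(m *miniModel, w io.Writer) error {
	return json.NewEncoder(w).Encode(m)
}

// fromReader decodes a JSON model from r.
func fromReader(r io.Reader) (*miniModel, error) {
	m := &miniModel{}
	if err := json.NewDecoder(r).Decode(m); err != nil {
		return nil, err
	}
	return m, nil
}

// roundTrip writes m to an in-memory buffer and reads it back.
func roundTrip(m *miniModel) (*miniModel, error) {
	var buf bytes.Buffer
	if err := writeTo(m, &buf); err != nil {
		return nil, err
	}
	return fromReader(&buf)
}

func main() {
	in := &miniModel{
		Data:      map[string]*miniCounts{"hello": {Corpus: 3, Query: 1}},
		Depth:     2,
		Threshold: 5,
	}
	out, err := roundTrip(in)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out.Depth, out.Threshold, out.Data["hello"].Corpus, out.Data["hello"].Query)
	// 2 5 3 1
}
```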

type Pair added in v1.0.0

type Pair struct {
	// contains filtered or unexported fields
}

type Potential added in v1.0.0

type Potential struct {
	Term   string // Potential term string
	Score  int    // Score
	Leven  int    // Levenshtein distance from the suggestion to the input
	Method Method // How this potential was matched
}

Potential is a potential match

func (*Potential) String added in v1.0.0

func (pot *Potential) String() string
