Overview ¶
Package spell provides fast spelling correction and string segmentation.
Index ¶
- Constants
- func LoadBigrams(filename string) (map[string]int, error)
- type Entry
- type LookupOption
- type Segment
- type SegmentOption
- type SegmentResult
- type Spell
- func (s *Spell) AddEntry(de Entry) (bool, error)
- func (s *Spell) GetEntry(word string) *Entry
- func (s *Spell) GetLongestWord() uint32
- func (s *Spell) Lookup(input string, opts ...LookupOption) (SuggestionList, error)
- func (s *Spell) RemoveEntry(word string) bool
- func (s *Spell) Save(filename string) error
- func (s *Spell) Segment(input string, opts ...SegmentOption) (*SegmentResult, error)
- type Suggestion
- type SuggestionList
- type WordData
Constants ¶
const (
	// LevelBest will yield 'best' suggestion
	LevelBest suggestionLevel = iota
	// LevelClosest will yield closest suggestions
	LevelClosest
	// LevelAll will yield all suggestions
	LevelAll
)
Suggestion Levels used during Lookup.
Functions ¶
Types ¶
type LookupOption ¶
type LookupOption func(*lookupParams) error
LookupOption is a function that controls how a Lookup is performed. An error will be returned if the LookupOption is invalid.
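The option type above follows Go's functional options pattern. The following is a minimal, self-contained sketch of that pattern, not the package's own code: `params`, `option`, `withEditDistance`, `withPrefixLength`, `apply`, and the default values are all hypothetical names and numbers chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// params is a hypothetical stand-in for the unexported lookupParams.
type params struct {
	editDistance uint32
	prefixLength uint32
}

// option has the same shape as LookupOption: it mutates params or fails.
type option func(*params) error

// withEditDistance is a hypothetical option that sets the edit distance.
func withEditDistance(d uint32) option {
	return func(p *params) error {
		p.editDistance = d
		return nil
	}
}

// withPrefixLength is a hypothetical option that rejects a zero length.
func withPrefixLength(n uint32) option {
	return func(p *params) error {
		if n == 0 {
			return errors.New("prefix length must be greater than zero")
		}
		p.prefixLength = n
		return nil
	}
}

// apply starts from illustrative defaults and applies each option in order,
// stopping at the first option that reports an error.
func apply(opts ...option) (params, error) {
	p := params{editDistance: 2, prefixLength: 7}
	for _, opt := range opts {
		if err := opt(&p); err != nil {
			return params{}, err
		}
	}
	return p, nil
}

func main() {
	p, err := apply(withEditDistance(1))
	fmt.Println(p, err)

	_, err = apply(withPrefixLength(0))
	fmt.Println(err)
}
```

Returning an error from the option itself lets a caller learn about an invalid setting at the call that uses it, rather than through a panic or a silently ignored value.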
func DistanceFunc ¶
func DistanceFunc(df func(string, string, int) int) LookupOption
DistanceFunc accepts a function, f(str1, str2, maxDist), which calculates the distance between two strings. It should return -1 if the distance between the strings is greater than maxDist.
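To illustrate the contract described above, here is a rough sketch of a function with the `func(string, string, int) int` shape. It computes plain Levenshtein distance and returns -1 once the result exceeds `maxDist`. The names `boundedLevenshtein` and `minInt` are hypothetical helpers and are not part of package spell.

```go
package main

import "fmt"

// minInt returns the smaller of two ints.
func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// boundedLevenshtein is a hypothetical helper (not part of package spell).
// It returns the Levenshtein distance between a and b, or -1 when that
// distance is greater than maxDist.
func boundedLevenshtein(a, b string, maxDist int) int {
	ra, rb := []rune(a), []rune(b)

	// The difference in length is a lower bound on the distance.
	diff := len(ra) - len(rb)
	if diff < 0 {
		diff = -diff
	}
	if diff > maxDist {
		return -1
	}

	// prev holds the previous row of the edit-distance table, curr the
	// row being filled in.
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}

	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			// Deletion, insertion, or substitution, whichever is cheapest.
			curr[j] = minInt(minInt(prev[j]+1, curr[j-1]+1), prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}

	if d := prev[len(rb)]; d <= maxDist {
		return d
	}
	return -1
}

func main() {
	fmt.Println(boundedLevenshtein("eample", "example", 2)) // 1
	fmt.Println(boundedLevenshtein("kitten", "sitting", 2)) // -1
}
```

A function of this shape could be passed as `DistanceFunc(boundedLevenshtein)`. A production implementation would normally stop early once every cell in a row exceeds `maxDist`; this sketch only checks the final result.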
func EditDistance ¶
func EditDistance(dist uint32) LookupOption
EditDistance allows the max edit distance to be set for the Lookup. Reducing the edit distance will improve lookup performance.
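One way to see why a smaller edit distance is cheaper: the MaxEditDistance field comment on Spell mentions deletes performed on each dictionary word, which suggests a delete-based index. Assuming that design, the sketch below counts how many delete variants a single word produces as the distance grows. The function `deletes` is a hypothetical helper, not the package's implementation.

```go
package main

import "fmt"

// deletes is a hypothetical helper. It returns every distinct string that
// can be reached from word by removing between 1 and maxDist runes.
func deletes(word string, maxDist int) map[string]struct{} {
	seen := map[string]struct{}{}
	frontier := []string{word}

	for d := 0; d < maxDist; d++ {
		var next []string
		for _, w := range frontier {
			r := []rune(w)
			for i := range r {
				// Remove the rune at position i.
				v := string(r[:i]) + string(r[i+1:])
				if _, ok := seen[v]; !ok {
					seen[v] = struct{}{}
					next = append(next, v)
				}
			}
		}
		frontier = next
	}
	return seen
}

func main() {
	for dist := 0; dist <= 3; dist++ {
		fmt.Printf("max distance %d: %d variants\n", dist, len(deletes("example", dist)))
	}
}
```

The variant count grows quickly with the distance, so under this assumption both the index size and the number of candidates checked per lookup shrink when the distance is reduced.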
func PrefixLength ¶
func PrefixLength(prefixLength uint32) LookupOption
PrefixLength defines how much of the input word should be used for the lookup.
func SortFunc ¶
func SortFunc(sf func(SuggestionList)) LookupOption
SortFunc allows the sorting of the SuggestionList to be configured. By default, suggestions will be sorted by their edit distance, then their frequency.
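The default ordering described above can be sketched with a two-key comparison. This is an illustration only: the `suggestion` type and `sortByDistanceThenFrequency` are hypothetical, and the direction of each key (smaller distance first, then higher frequency first) is an assumption rather than something the documentation states.

```go
package main

import (
	"fmt"
	"sort"
)

// suggestion is a hypothetical, simplified stand-in for Suggestion.
type suggestion struct {
	word      string
	distance  int
	frequency float64
}

// sortByDistanceThenFrequency orders by ascending distance and breaks ties
// by descending frequency (both directions are assumptions).
func sortByDistanceThenFrequency(sl []suggestion) {
	sort.SliceStable(sl, func(i, j int) bool {
		if sl[i].distance != sl[j].distance {
			return sl[i].distance < sl[j].distance
		}
		return sl[i].frequency > sl[j].frequency
	})
}

func main() {
	sl := []suggestion{
		{"c", 2, 50},
		{"a", 1, 5},
		{"b", 1, 9},
	}
	sortByDistanceThenFrequency(sl)
	for _, s := range sl {
		fmt.Println(s.word, s.distance, s.frequency)
	}
}
```

A function written against the real SuggestionList type would have the `func(SuggestionList)` shape that SortFunc expects.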
func SuggestionLevel ¶
func SuggestionLevel(level suggestionLevel) LookupOption
SuggestionLevel defines how many results are returned for the lookup. See the package constants for the levels available.
type SegmentOption ¶
type SegmentOption func(*segmentParams) error
SegmentOption is a function that controls how a Segment is performed. An error will be returned if the SegmentOption is invalid.
func SegmentLookupOpts ¶
func SegmentLookupOpts(opt ...LookupOption) SegmentOption
SegmentLookupOpts allows the Lookup() options for the current segmentation to be configured.
type SegmentResult ¶
type SegmentResult struct {
Segments []Segment
}
SegmentResult holds the result of a call to Segment().
func (SegmentResult) GetWords ¶
func (s SegmentResult) GetWords() []string
GetWords returns a string slice of words for the segments.
func (SegmentResult) String ¶
func (s SegmentResult) String() string
String returns a string representation of the SegmentResult.
type Spell ¶
type Spell struct {
	// The max number of deletes that will be performed to each word in the
	// dictionary
	MaxEditDistance uint32

	// The prefix length that will be examined
	PrefixLength uint32
	// contains filtered or unexported fields
}
Spell provides access to functions for spelling correction
func Load ¶
Load reads a dictionary from the file at filename. It returns a new Spell instance on success, or an error if there is a problem reading the file.
func (*Spell) AddEntry ¶
AddEntry adds an entry to the dictionary. If the word already exists, its data is overwritten. It returns true if a new word was added and false otherwise. It returns an error if there was a problem adding the word; for example, the entry's word data must contain a "frequency" field.
Example ¶
// Create a new speller
s := New()

// Add a new word, "example" to the dictionary
s.AddEntry(Entry{
	Word:     "example",
	WordData: WordData{"frequency": 10},
})

// Overwrite the data for word "example"
s.AddEntry(Entry{
	Word:     "example",
	WordData: WordData{"frequency": 100},
})

// Output the frequency for word "example"
entry := s.GetEntry("example")
fmt.Printf("Output for word 'example' is: %v\n", entry.WordData.GetFrequency())
Output: Output for word 'example' is: 100
func (*Spell) GetEntry ¶
GetEntry returns the Entry for word. If the word does not exist, nil is returned.
func (*Spell) GetLongestWord ¶
GetLongestWord returns the length of the longest word in the dictionary.
func (*Spell) Lookup ¶
func (s *Spell) Lookup(input string, opts ...LookupOption) (SuggestionList, error)
Lookup takes an input and returns suggestions from the dictionary for that word. By default it will return the best suggestion for the word if it exists.
Accepts zero or more LookupOption that can be used to configure how lookup occurs.
Example ¶
// Create a new speller
s := New()
s.AddEntry(Entry{
	Word:     "example",
	WordData: WordData{"frequency": 1},
})

// Perform a default lookup for example
suggestions, _ := s.Lookup("eample")
fmt.Printf("Suggestions are: %v\n", suggestions)
Output: Suggestions are: [example]
Example (ConfigureDistanceFunc) ¶
// Create a new speller
s := New()
s.AddEntry(Entry{
	Word:     "example",
	WordData: WordData{"frequency": 1},
})

// Configure the Lookup to use Levenshtein distance rather than the default
// Damerau Levenshtein calculation
s.Lookup("example", DistanceFunc(func(s1, s2 string, maxDist int) int {
	// Call the Levenshtein function from github.com/eskriett/strmet
	return strmet.Levenshtein(s1, s2, maxDist)
}))
Output:
Example (ConfigureEditDistance) ¶
// Create a new speller
s := New()
s.AddEntry(Entry{
	Word:     "example",
	WordData: WordData{"frequency": 1},
})

// Lookup exact matches, i.e. edit distance = 0
suggestions, _ := s.Lookup("eample", EditDistance(0))
fmt.Printf("Suggestions are: %v\n", suggestions)
Output: Suggestions are: []
Example (ConfigureSortFunc) ¶
// Create a new speller
s := New()
s.AddEntry(Entry{
	Word:     "example",
	WordData: WordData{"frequency": 1},
})

// Configure suggestions to be sorted solely by their frequency
s.Lookup("example", SortFunc(func(sl SuggestionList) {
	sort.Slice(sl, func(i, j int) bool {
		s1Freq := sl[i].WordData.GetFrequency()
		s2Freq := sl[j].WordData.GetFrequency()
		return s1Freq < s2Freq
	})
}))
Output:
func (*Spell) RemoveEntry ¶
RemoveEntry removes an entry from the dictionary. Returns true if the entry was removed, false otherwise.
func (*Spell) Segment ¶
func (s *Spell) Segment(input string, opts ...SegmentOption) (*SegmentResult, error)
Segment takes an input string which may have word concatenations, and attempts to divide it into the most likely set of words by adding spaces at the most appropriate positions.
Accepts zero or more SegmentOption that can be used to configure how segmentation occurs.
Example ¶
// Create a new speller
s := New()

wd := WordData{"frequency": 1}
s.AddEntry(Entry{Word: "the", WordData: wd})
s.AddEntry(Entry{Word: "quick", WordData: wd})
s.AddEntry(Entry{Word: "brown", WordData: wd})
s.AddEntry(Entry{Word: "fox", WordData: wd})

// Segment a string with words concatenated together
segmentResult, _ := s.Segment("thequickbrownfox")
fmt.Println(segmentResult)
Output: the quick brown fox
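For intuition about what "most likely set of words" can mean, the following is a simplified dynamic-programming sketch that splits a string into known words while maximising the sum of log relative frequencies. It is not the algorithm used by Segment: the function `segment` and its frequency map are hypothetical, and the sketch only accepts exact dictionary matches with no spelling correction.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// segment is a hypothetical helper, not the package's algorithm. It splits
// input into words found in freq, maximising the sum of log relative
// frequencies. It returns nil when no complete segmentation exists.
func segment(input string, freq map[string]float64) []string {
	var total float64
	for _, f := range freq {
		total += f
	}

	n := len(input)
	// best[i] is the best score for input[:i]; back[i] is the byte offset
	// at which the last word of that best split starts.
	best := make([]float64, n+1)
	back := make([]int, n+1)
	for i := 1; i <= n; i++ {
		best[i] = math.Inf(-1)
		back[i] = -1
	}

	for i := 1; i <= n; i++ {
		for j := 0; j < i; j++ {
			f, ok := freq[input[j:i]]
			if !ok || f <= 0 || math.IsInf(best[j], -1) {
				continue
			}
			if score := best[j] + math.Log(f/total); score > best[i] {
				best[i] = score
				back[i] = j
			}
		}
	}

	if n == 0 || back[n] == -1 {
		return nil
	}

	// Walk the back pointers from the end to recover the words in order.
	var words []string
	for i := n; i > 0; i = back[i] {
		words = append([]string{input[back[i]:i]}, words...)
	}
	return words
}

func main() {
	freq := map[string]float64{"the": 1, "quick": 1, "brown": 1, "fox": 1}
	fmt.Println(strings.Join(segment("thequickbrownfox", freq), " "))
}
```

Scoring with log frequencies means a split into a few common words is preferred over a split into many rare ones, since each additional word adds a negative term to the total.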
type Suggestion ¶
type Suggestion struct {
	// The distance between this suggestion and the input word
	Distance int
	Entry
}
Suggestion is used to represent a suggested word from a lookup.
type SuggestionList ¶
type SuggestionList []Suggestion
SuggestionList is a slice of Suggestion.
func (SuggestionList) GetWords ¶
func (s SuggestionList) GetWords() []string
GetWords returns a string slice of words for the suggestions.
func (SuggestionList) String ¶
func (s SuggestionList) String() string
String returns a string representation of the SuggestionList.