Documentation ¶
Overview ¶
This package defines a Word struct that is used to encapsulate most of the "state" variables we must track when stemming a word. The Word struct also has a few methods common to stemming in a variety of languages.
Index ¶
- type Word
- func (w *Word) DebugString() string
- func (w *Word) FirstPrefix(prefixes ...string) (foundPrefix string, foundPrefixRunes []rune)
- func (w *Word) FirstSuffix(suffixes ...string) (suffix string, suffixRunes []rune)
- func (w *Word) FirstSuffixIfIn(startPos, endPos int, suffixes ...string) (suffix string, suffixRunes []rune)
- func (w *Word) FirstSuffixIn(startPos, endPos int, suffixes ...string) (suffix string, suffixRunes []rune)
- func (w *Word) FitsInR1(x int) bool
- func (w *Word) FitsInR2(x int) bool
- func (w *Word) FitsInRV(x int) bool
- func (w *Word) HasSuffixRunes(suffixRunes []rune) bool
- func (w *Word) HasSuffixRunesIn(startPos, endPos int, suffixRunes []rune) bool
- func (w *Word) R1() []rune
- func (w *Word) R1String() string
- func (w *Word) R2() []rune
- func (w *Word) R2String() string
- func (w *Word) RV() []rune
- func (w *Word) RVString() string
- func (w *Word) RemoveFirstSuffix(suffixes ...string) (suffix string, suffixRunes []rune)
- func (w *Word) RemoveFirstSuffixIfIn(startPos int, suffixes ...string) (suffix string, suffixRunes []rune)
- func (w *Word) RemoveFirstSuffixIn(startPos int, suffixes ...string) (suffix string, suffixRunes []rune)
- func (w *Word) RemoveLastNRunes(n int)
- func (w *Word) ReplaceSuffix(suffix, replacement string, force bool) bool
- func (w *Word) ReplaceSuffixRunes(suffixRunes []rune, replacementRunes []rune, force bool) bool
- func (w *Word) String() string
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Word ¶
type Word struct {
	// A slice of runes
	RS []rune

	// The index in RS where the R1 region begins
	R1start int

	// The index in RS where the R2 region begins
	R2start int

	// The index in RS where the RV region begins
	RVstart int
}
Word represents a word that is going to be stemmed.
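To make the role of the region offsets concrete, here is a minimal sketch. It assumes each region runs from its start index to the end of RS. The `Word` type below is a local copy of the documented fields, and `region` is a hypothetical helper, not part of the package. The offsets are set by hand rather than computed.

```go
package main

import "fmt"

// Word mirrors the fields documented above. It is a local stand-in for
// illustration, not the package's own type.
type Word struct {
	RS      []rune
	R1start int
	R2start int
	RVstart int
}

// region is a hypothetical helper: it returns the runes from start to the
// end of the word, treating an out-of-range start as an empty region.
func region(w *Word, start int) []rune {
	if start < 0 || start > len(w.RS) {
		return nil
	}
	return w.RS[start:]
}

func main() {
	// Offsets set by hand for "beautiful": under the English Snowball
	// definitions R1 is "iful" and R2 is "ul". This sketch leaves RV
	// empty by pointing RVstart at the end of the word.
	w := &Word{RS: []rune("beautiful"), R1start: 5, R2start: 7, RVstart: 9}
	fmt.Println(string(region(w, w.R1start))) // iful
	fmt.Println(string(region(w, w.R2start))) // ul
}
```

Storing indices instead of separate slices means a region stays valid as long as its start index is kept within the current length of RS.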
func (*Word) DebugString ¶
func (*Word) FirstPrefix ¶
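func (w *Word) FirstPrefix(prefixes ...string) (foundPrefix string, foundPrefixRunes []rune)
FirstPrefix returns the first of the provided prefixes that the word begins with, or the empty string if none matches.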
func (*Word) FirstSuffix ¶
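func (w *Word) FirstSuffix(suffixes ...string) (suffix string, suffixRunes []rune)
FirstSuffix returns the first of the provided suffixes that the word ends with, or the empty string if none matches.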
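The sketch below illustrates first-match suffix lookup on a plain rune slice. It assumes "first" means first in argument order. `firstSuffix` is a hypothetical free function written for this example, not the package's method.

```go
package main

import "fmt"

// firstSuffix returns the first candidate, in argument order, that rs ends
// with, together with its runes. It returns "" and nil when none matches.
func firstSuffix(rs []rune, suffixes ...string) (string, []rune) {
	for _, s := range suffixes {
		sr := []rune(s)
		if len(sr) <= len(rs) && string(rs[len(rs)-len(sr):]) == s {
			return s, sr
		}
	}
	return "", nil
}

func main() {
	rs := []rune("hopefully")

	s, _ := firstSuffix(rs, "ly", "fully")
	fmt.Println(s) // ly

	s, _ = firstSuffix(rs, "fully", "ly")
	fmt.Println(s) // fully

	s, _ = firstSuffix(rs, "ness")
	fmt.Printf("%q\n", s) // ""
}
```

Under this reading the order of the candidates matters: to prefer the longest match, list longer suffixes first.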
func (*Word) FirstSuffixIfIn ¶
func (w *Word) FirstSuffixIfIn(startPos, endPos int, suffixes ...string) (suffix string, suffixRunes []rune)
FirstSuffixIfIn finds the first of the provided suffixes that ends at `endPos` in the word, then checks whether that suffix begins after `startPos`. If it does, the suffix is returned; otherwise the empty string and an empty rune slice are returned. The search stops at the first match even when that match fails the position check. This may seem counterintuitive, but it matches what most Snowball stemmer steps require.
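The match-then-check order is easier to see in code. The sketch below is an illustration on a plain rune slice, not the package's implementation. It reads "begins after startPos" as "begins at index startPos or later", which is an assumption. `firstSuffixIfIn` is a hypothetical name.

```go
package main

import "fmt"

// firstSuffixIfIn finds the first candidate that ends at endPos in rs.
// Only that first match is tested against startPos; later candidates are
// not tried if it fails the position check.
func firstSuffixIfIn(rs []rune, startPos, endPos int, suffixes ...string) (string, []rune) {
	if endPos > len(rs) {
		return "", nil
	}
	for _, s := range suffixes {
		sr := []rune(s)
		begin := endPos - len(sr)
		if begin < 0 || string(rs[begin:endPos]) != s {
			continue // not a suffix at endPos, try the next candidate
		}
		if begin >= startPos { // assumption: "after" includes startPos itself
			return s, sr
		}
		return "", nil // first match lies outside the region
	}
	return "", nil
}

func main() {
	rs := []rune("relational")

	// "ational" begins at index 3, so it passes when startPos is 3.
	s, _ := firstSuffixIfIn(rs, 3, len(rs), "ational", "tional")
	fmt.Printf("%q\n", s) // "ational"

	// With startPos 4 the first match ("ational") fails the check, and
	// "tional" is never considered even though it begins at index 4.
	s, _ = firstSuffixIfIn(rs, 4, len(rs), "ational", "tional")
	fmt.Printf("%q\n", s) // ""
}
```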
func (*Word) FirstSuffixIn ¶
func (*Word) HasSuffixRunes ¶
func (w *Word) HasSuffixRunes(suffixRunes []rune) bool
HasSuffixRunes returns true if `w` ends with `suffixRunes`.
func (*Word) HasSuffixRunesIn ¶
func (w *Word) HasSuffixRunesIn(startPos, endPos int, suffixRunes []rune) bool
HasSuffixRunesIn returns true if `w.RS[startPos:endPos]` ends with `suffixRunes`. That is, the slice of runes between `startPos` and `endPos` has `suffixRunes` as its suffix.
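A windowed suffix test of this kind can be sketched as follows. `hasSuffixRunesIn` here is a hypothetical free function over a rune slice. Returning false for out-of-range bounds is a choice made for this example, not documented behavior of the package.

```go
package main

import "fmt"

// hasSuffixRunesIn reports whether rs[startPos:endPos] ends with suffix.
// Invalid bounds yield false in this sketch.
func hasSuffixRunesIn(rs []rune, startPos, endPos int, suffix []rune) bool {
	if startPos < 0 || endPos > len(rs) || startPos > endPos {
		return false
	}
	window := rs[startPos:endPos]
	if len(suffix) > len(window) {
		return false // the suffix cannot fit inside the window
	}
	offset := len(window) - len(suffix)
	for i, r := range suffix {
		if window[offset+i] != r {
			return false
		}
	}
	return true
}

func main() {
	rs := []rune("running")
	fmt.Println(hasSuffixRunesIn(rs, 0, 7, []rune("ing"))) // true
	fmt.Println(hasSuffixRunesIn(rs, 5, 7, []rune("ing"))) // false: window is "ng"
	fmt.Println(hasSuffixRunesIn(rs, 0, 4, []rune("n")))   // true: window is "runn"
}
```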
func (*Word) RemoveFirstSuffix ¶
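func (w *Word) RemoveFirstSuffix(suffixes ...string) (suffix string, suffixRunes []rune)
RemoveFirstSuffix removes the first of the provided suffixes that the word ends with.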
func (*Word) RemoveFirstSuffixIfIn ¶
func (w *Word) RemoveFirstSuffixIfIn(startPos int, suffixes ...string) (suffix string, suffixRunes []rune)
RemoveFirstSuffixIfIn finds the first of the provided suffixes that the word ends with, then checks whether it begins after `startPos`. If it does, the suffix is removed.
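As a rough illustration of conditional removal, the sketch below combines the lookup and the truncation. The `Word` type is a trimmed local stand-in and `removeFirstSuffixIfIn` is a hypothetical free function. Two details are assumptions: "after startPos" is read as "at index startPos or later", and region offsets are clamped to the new word length after removal.

```go
package main

import "fmt"

// Word is a trimmed local stand-in holding only the fields this sketch needs.
type Word struct {
	RS      []rune
	R1start int
	R2start int
}

// removeFirstSuffixIfIn removes the first candidate the word ends with,
// but only if that candidate begins at or after startPos. It returns the
// removed suffix, or "" when nothing was removed.
func removeFirstSuffixIfIn(w *Word, startPos int, suffixes ...string) string {
	for _, s := range suffixes {
		begin := len(w.RS) - len([]rune(s))
		if begin < 0 || string(w.RS[begin:]) != s {
			continue
		}
		if begin < startPos {
			return "" // first match lies outside the region: leave the word alone
		}
		w.RS = w.RS[:begin]
		// Assumption: offsets past the new end are pulled back to the end.
		if w.R1start > len(w.RS) {
			w.R1start = len(w.RS)
		}
		if w.R2start > len(w.RS) {
			w.R2start = len(w.RS)
		}
		return s
	}
	return ""
}

func main() {
	// Offsets set by hand: R1 is "efulness", R2 is "ulness".
	w := &Word{RS: []rune("hopefulness"), R1start: 3, R2start: 5}

	// "fulness" matches first but begins at index 4, before R2.
	fmt.Printf("%q %s\n", removeFirstSuffixIfIn(w, w.R2start, "fulness", "ness"), string(w.RS))
	// "" hopefulness

	// "ness" begins at index 7, inside R2, so it is removed.
	fmt.Printf("%q %s\n", removeFirstSuffixIfIn(w, w.R2start, "ness"), string(w.RS))
	// "ness" hopeful
}
```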
func (*Word) RemoveFirstSuffixIn ¶
func (w *Word) RemoveFirstSuffixIn(startPos int, suffixes ...string) (suffix string, suffixRunes []rune)
RemoveFirstSuffixIn removes the first suffix found that is in `w.RS[startPos:]`.
func (*Word) RemoveLastNRunes ¶
func (w *Word) RemoveLastNRunes(n int)
RemoveLastNRunes removes the last `n` runes from the Word.
func (*Word) ReplaceSuffix ¶
func (w *Word) ReplaceSuffix(suffix, replacement string, force bool) bool
ReplaceSuffix replaces a suffix and adjusts R1start and R2start as needed. If `force` is false, it first checks that the suffix exists.
func (*Word) ReplaceSuffixRunes ¶
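func (w *Word) ReplaceSuffixRunes(suffixRunes []rune, replacementRunes []rune, force bool) bool
ReplaceSuffixRunes replaces a suffix, given as runes, and adjusts R1start and R2start as needed. If `force` is false, it first checks that the suffix exists.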