Documentation ¶
Index ¶
- Constants
- Variables
- func ByteEqualFold(a, b byte) bool
- func ByteToLower(eax byte) byte
- func ByteToUpper(x byte) byte
- func CaseVariations(word string, style WordCase) []string
- func ReadTextFile(filename string) (string, error)
- func RemoveEmail(s string) string
- func RemoveHost(s string) string
- func RemoveNotWords(s string) string
- func RemovePath(s string) string
- func StringEqualFold(s1, s2 string) bool
- func StringHasPrefixFold(s1, s2 string) bool
- func StripURL(s string) string
- type Diff
- type Replacer
- func (r *Replacer) AddRuleList(additions []string)
- func (r *Replacer) Compile()
- func (r *Replacer) RemoveRule(ignore []string)
- func (r *Replacer) Replace(input string) (string, []Diff)
- func (r *Replacer) ReplaceGo(input string) (string, []Diff)
- func (r *Replacer) ReplaceReader(raw io.Reader, w io.Writer, next func(Diff)) error
- type StringReplacer
- type WordCase
Constants ¶
const Legal = `` /* 2042-byte string literal not displayed */
Legal provides licensing info.
Variables ¶
var DictAmerican = []string{}/* 3238 elements not displayed */
DictAmerican converts UK spellings to US spellings
var DictBritish = []string{}/* 2954 elements not displayed */
DictBritish converts US spellings to UK spellings
var DictMain = []string{}/* 56094 elements not displayed */
DictMain is the main rule set, not including locale-specific spellings
Functions ¶
func ByteEqualFold ¶
ByteEqualFold reports whether two bytes are equal under a case-insensitive ASCII comparison.
func ByteToLower ¶
ByteToLower converts an ASCII byte to lower case. It uses a branchless algorithm.
func ByteToUpper ¶
ByteToUpper converts an ASCII byte to upper case. It uses a branchless algorithm.
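As a rough illustration of what "branchless" can mean here, the sketch below converts ASCII case with arithmetic and bit masks instead of an if statement. The helpers lowerASCII and upperASCII are hypothetical names and are not the package's actual implementation, which is not shown in this documentation.

```go
package main

import "fmt"

// lowerASCII is a hypothetical helper, not the package's code.
// It maps 'A'..'Z' to 'a'..'z' and leaves every other byte unchanged.
func lowerASCII(c byte) byte {
	d := c - 'A' // wraps around for c < 'A', so d < 26 only for 'A'..'Z'
	// int(d)-26 is negative exactly when d < 26. The arithmetic shift turns
	// a negative value into all ones (0xFF after truncation) and any
	// non-negative value into 0, so mask is 0x20 or 0.
	mask := byte((int(d)-26)>>8) & 0x20
	return c | mask
}

// upperASCII is the mirror image: it clears bit 0x20 only for 'a'..'z'.
func upperASCII(c byte) byte {
	d := c - 'a'
	mask := byte((int(d)-26)>>8) & 0x20
	return c &^ mask
}

func main() {
	fmt.Printf("%c %c %c\n", lowerASCII('Q'), upperASCII('q'), lowerASCII('['))
	// prints: q Q [
}
```

Bytes outside the two letter ranges, including those with the high bit set, pass through unchanged, which is what makes this safe to apply to arbitrary input.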
func CaseVariations ¶
CaseVariations returns the case variations of word, depending on style:
- If AllUpper, or only the first letter is upcased: add the all-upper-case version.
- If AllLower: add the original, the title, and the upcase forms.
- If Mixed: return the original and the all-upcase form.
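The rules above can be sketched as follows. This is one possible reading, not the package's code: the type wordCase and its constants are hypothetical stand-ins for WordCase (whose values are not listed here), and the sketch assumes that a title-case word yields the original plus the upper-case form.

```go
package main

import (
	"fmt"
	"strings"
)

// wordCase and its constants are hypothetical stand-ins for WordCase.
type wordCase int

const (
	caseLower wordCase = iota // e.g. "color"
	caseUpper                 // e.g. "COLOR"
	caseTitle                 // e.g. "Color"
	caseMixed                 // e.g. "coLor"
)

// caseVariations follows the documented rules under the assumptions above.
func caseVariations(word string, style wordCase) []string {
	upper := strings.ToUpper(word)
	switch style {
	case caseLower:
		title := word
		if word != "" {
			// Assumes the first character is a single ASCII byte.
			title = strings.ToUpper(word[:1]) + word[1:]
		}
		return []string{word, title, upper}
	case caseUpper:
		return []string{upper}
	default: // caseTitle and caseMixed: original plus all upper case
		return []string{word, upper}
	}
}

func main() {
	fmt.Println(caseVariations("color", caseLower)) // [color Color COLOR]
	fmt.Println(caseVariations("Color", caseTitle)) // [Color COLOR]
}
```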
func ReadTextFile ¶
ReadTextFile returns the contents of a file, first testing whether it is a text file. It returns:
- ("", nil) if the file is not a text file
- ("", error) if an error occurs
- (string, nil) if the file is text
Unfortunately, in the worst case, this does:
- 1 stat
- 1 open, read, close of 512 bytes
- 1 more stat, open, read everything, close (via ioutil.ReadAll)
This could be kinder to the filesystem.
This uses some heuristics based on the file's extension (e.g. .zip, .txt) and uses a sniffer to determine whether the file is text. Using file extensions isn't great, but it is probably good enough for real-world use. Go's built-in sniffer is problematic for different reasons: it is optimized for HTML and is very limited in what it detects. It would be good to explicitly add some tests for ELF/DWARF formats to make sure we never corrupt binary files.
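A minimal sketch of the two heuristics described above, assuming the built-in sniffer referred to is net/http's DetectContentType (which inspects at most the first 512 bytes). The function probablyText and the binaryExt table are hypothetical, and the extension list is illustrative only, not the package's.

```go
package main

import (
	"fmt"
	"net/http"
	"path/filepath"
	"strings"
)

// binaryExt is a hypothetical, deliberately tiny deny list.
var binaryExt = map[string]bool{".zip": true, ".gz": true, ".png": true}

// probablyText rejects known binary extensions first, then sniffs the
// leading bytes of the file. head should hold up to the first 512 bytes.
func probablyText(filename string, head []byte) bool {
	ext := strings.ToLower(filepath.Ext(filename))
	if binaryExt[ext] {
		return false
	}
	return strings.HasPrefix(http.DetectContentType(head), "text/")
}

func main() {
	png := []byte("\x89PNG\r\n\x1a\n")
	fmt.Println(probablyText("notes.txt", []byte("hello, world\n"))) // true
	fmt.Println(probablyText("archive.zip", []byte("hello")))        // false
	fmt.Println(probablyText("logo", png))                           // false
}
```

Checking the extension before opening the file is what lets a caller skip the extra read for obvious binaries.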
func RemoveEmail ¶
RemoveEmail removes email-like strings, e.g. "nickg+junk@xfoobar.com", "nickg@xyz.abc123.biz"
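One way such a removal could work is sketched below. It assumes that "remove" means overwriting the match with spaces so that the string keeps its length and column offsets stay valid; the documentation does not confirm this. The pattern emailLike and the function blankEmails are hypothetical and intentionally loose.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// emailLike is a hypothetical, loose pattern for email-like strings.
var emailLike = regexp.MustCompile(`[A-Za-z0-9_.%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`)

// blankEmails replaces every email-like match with spaces of equal length.
func blankEmails(s string) string {
	return emailLike.ReplaceAllStringFunc(s, func(m string) string {
		return strings.Repeat(" ", len(m))
	})
}

func main() {
	in := "mail nickg+junk@xfoobar.com today"
	out := blankEmails(in)
	fmt.Printf("%q\n", out)
	fmt.Println(len(in) == len(out)) // true
}
```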
func RemoveHost ¶
RemoveHost removes host-like strings, e.g. "foobar.com", "abc123.fo1231.biz"
func RemoveNotWords ¶
RemoveNotWords blanks out everything that is not a word.
func RemovePath ¶
RemovePath attempts to strip away embedded file system paths, e.g.
/foo/bar or /static/myimg.png. TODO: Windows-style paths
func StringEqualFold ¶
StringEqualFold performs an ASCII case-insensitive comparison. Go's ToUpper/ToLower for both bytes and strings appear to be Unicode-based, which is very slow. Based on https://codereview.appspot.com/5180044/patch/14007/21002
func StringHasPrefixFold ¶
StringHasPrefixFold is similar to strings.HasPrefix but comparison is done ignoring ASCII case.
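The idea behind both fold functions can be sketched with a byte-wise loop that never touches Unicode tables. The names foldByte, equalFoldASCII, and hasPrefixFoldASCII are hypothetical, and the argument order (string first, prefix second) is assumed to follow strings.HasPrefix.

```go
package main

import "fmt"

// foldByte lowers 'A'..'Z' and leaves all other bytes unchanged.
func foldByte(c byte) byte {
	if 'A' <= c && c <= 'Z' {
		return c + ('a' - 'A')
	}
	return c
}

// equalFoldASCII compares two strings byte by byte, ignoring ASCII case.
func equalFoldASCII(s1, s2 string) bool {
	if len(s1) != len(s2) {
		return false
	}
	for i := 0; i < len(s1); i++ {
		if foldByte(s1[i]) != foldByte(s2[i]) {
			return false
		}
	}
	return true
}

// hasPrefixFoldASCII reports whether s starts with prefix, ignoring ASCII case.
func hasPrefixFoldASCII(s, prefix string) bool {
	return len(s) >= len(prefix) && equalFoldASCII(s[:len(prefix)], prefix)
}

func main() {
	fmt.Println(equalFoldASCII("Hello", "hELLO"))                   // true
	fmt.Println(hasPrefixFoldASCII("HTTP://example.com", "http://")) // true
	fmt.Println(hasPrefixFoldASCII("ht", "http://"))                // false
}
```

Because only the 26 ASCII letters are folded, non-ASCII text compares equal only when the bytes are identical.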
Types ¶
type Diff ¶
type Diff struct {
	Filename  string
	FullLine  string
	Line      int
	Column    int
	Original  string
	Corrected string
}
Diff is a data structure showing what changed in a single line.
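As an illustration of how the fields might be consumed, the sketch below renders a diff in a compiler-style "file:line:column" message. The local diff type only mirrors the fields listed above so the example compiles on its own; formatDiff and the sample values are hypothetical, and the sketch makes no claim about whether Line and Column are zero- or one-based.

```go
package main

import "fmt"

// diff mirrors the fields of Diff so this sketch is self-contained.
type diff struct {
	Filename  string
	FullLine  string
	Line      int
	Column    int
	Original  string
	Corrected string
}

// formatDiff is a hypothetical helper that renders one diff as a message.
func formatDiff(d diff) string {
	return fmt.Sprintf("%s:%d:%d: %q is a misspelling of %q",
		d.Filename, d.Line, d.Column, d.Original, d.Corrected)
}

func main() {
	d := diff{
		Filename:  "README.md",
		FullLine:  "I will recieve it.",
		Line:      3,
		Column:    7,
		Original:  "recieve",
		Corrected: "receive",
	}
	fmt.Println(formatDiff(d))
	// prints: README.md:3:7: "recieve" is a misspelling of "receive"
}
```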
type Replacer ¶
Replacer is the main struct for spelling correction
func (*Replacer) AddRuleList ¶
AddRuleList appends new rules. Input is in the same form as strings.Replacer: [ old1, new1, old2, new2, ....] Note: does not check for duplicates
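The flat old/new pair layout can be seen with the standard library's strings.NewReplacer, which takes its arguments in the same form. The sketch below uses that type as a stand-in, not the package's Replacer; applyRules is a hypothetical helper, and the odd-length check is added only because strings.NewReplacer panics on an odd number of arguments.

```go
package main

import (
	"fmt"
	"strings"
)

// applyRules is a hypothetical helper that applies a flat
// [old1, new1, old2, new2, ...] list using the standard library.
func applyRules(rules []string, s string) (string, error) {
	if len(rules)%2 != 0 {
		return "", fmt.Errorf("rule list has odd length %d", len(rules))
	}
	return strings.NewReplacer(rules...).Replace(s), nil
}

func main() {
	rules := []string{
		"teh", "the", // old1, new1
		"recieve", "receive", // old2, new2
	}
	out, err := applyRules(rules, "teh package will recieve updates")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // the package will receive updates
}
```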
func (*Replacer) Compile ¶
func (r *Replacer) Compile()
Compile compiles the rules. Required before using the Replace functions
func (*Replacer) RemoveRule ¶
RemoveRule deletes existing rules. TODO: make in-place to save memory
func (*Replacer) Replace ¶
Replace corrects misspellings in input, returning the corrected version
along with a list of diffs.
type StringReplacer ¶
type StringReplacer struct {
// contains filtered or unexported fields
}
StringReplacer replaces a list of strings with replacements. It is safe for concurrent use by multiple goroutines.
func NewStringReplacer ¶
func NewStringReplacer(oldnew ...string) *StringReplacer
NewStringReplacer returns a new StringReplacer from a list of old, new string pairs. Replacements are performed in order, without overlapping matches.
func (*StringReplacer) Replace ¶
func (r *StringReplacer) Replace(s string) string
Replace returns a copy of s with all replacements performed.
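The wording for StringReplacer closely matches the standard library's strings.Replacer. Assuming the two behave alike (an assumption, since the internals are unexported), the sketch below uses strings.NewReplacer as a stand-in to show what "in order, without overlapping matches" means in practice. The helper replaceAll is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// replaceAll is a hypothetical wrapper around the standard library replacer,
// used here only as a stand-in for StringReplacer.
func replaceAll(s string, oldnew ...string) string {
	return strings.NewReplacer(oldnew...).Replace(s)
}

func main() {
	// No overlapping matches: once "ab" is consumed, the "b" cannot also
	// start a "bc" match.
	fmt.Println(replaceAll("abc", "ab", "X", "bc", "Y")) // Xc

	// Argument order decides ties: "a" is listed before "aa", so it wins
	// at every position.
	fmt.Println(replaceAll("aaa", "a", "1", "aa", "2")) // 111
}
```

If longer patterns should take precedence, they would need to be listed before their shorter prefixes.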
func (*StringReplacer) WriteString ¶
WriteString writes s to w with all replacements performed.