gospell

Version: v1.4.2
Published: Oct 26, 2022 License: MIT Imports: 18 Imported by: 0

README

gospell


Spell checking with Hunspell dictionaries in pure Go

WARNING: I am not an expert in linguistics or spell checking.
What are Hunspell dictionaries?
Where can I download dictionaries?

The best option is to download the dictionaries from LibreOffice; they are already in UTF-8 there.

Russian
English

Other dictionaries for variants of English are here

Spanish

Other dictionaries for variants of Spanish are here

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CaseVariations

func CaseVariations(word string, style WordCase) []string

CaseVariations returns case variations of word, depending on its style. If the word is AllUpper or only its first letter is upper-cased, it adds the all-upper-case version. If AllLower, it adds the original, the title-case, and the upper-case forms. If Mixed, it returns the original and the all-upper-case form.
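The rules above can be sketched in plain Go. This is a hypothetical reimplementation for illustration, not the package's code: `caseVariations` and `upperFirst` are made-up names, the `WordCase` enum is redeclared so the block compiles on its own, and keeping the original word for `Title` input is an assumption about what "add" means.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// WordCase is redeclared here only to keep the sketch self-contained.
type WordCase int

const (
	AllLower WordCase = iota
	AllUpper
	Title
	Mixed
	Camel
)

// upperFirst upper-cases only the first rune of word.
func upperFirst(word string) string {
	r, size := utf8.DecodeRuneInString(word)
	if size == 0 {
		return word
	}
	return string(unicode.ToUpper(r)) + word[size:]
}

// caseVariations follows the documented rules; it is a sketch, not the
// real CaseVariations.
func caseVariations(word string, style WordCase) []string {
	switch style {
	case AllLower:
		// original, title, and upper-case forms
		return []string{word, upperFirst(word), strings.ToUpper(word)}
	case AllUpper:
		return []string{strings.ToUpper(word)}
	default:
		// Title, Mixed, Camel: original plus the all-upper-case form
		// (assumption for Title and Camel).
		return []string{word, strings.ToUpper(word)}
	}
}

func main() {
	fmt.Println(caseVariations("hello", AllLower)) // [hello Hello HELLO]
	fmt.Println(caseVariations("GoSpell", Mixed))  // [GoSpell GOSPELL]
}
```

Decoding the first rune instead of slicing the first byte keeps the title form correct for non-ASCII words such as Russian ones.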

func RemovePath

func RemovePath(s string) string

RemovePath attempts to strip away embedded file system paths, e.g.

/foo/bar or /static/myimg.png

TODO: support Windows-style paths

func RemoveURL

func RemoveURL(s string) string

RemoveURL attempts to strip away obvious URLs
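One way to strip "obvious" URLs is a single regular expression. The sketch below is an assumption about the approach, not the package's actual pattern: `removeURL` and `urlPattern` are hypothetical names, and only `http`, `https`, and `ftp` schemes are recognised.

```go
package main

import (
	"fmt"
	"regexp"
)

// urlPattern matches a scheme followed by any run of non-space characters.
// It is deliberately crude: trailing punctuation is swallowed with the URL.
var urlPattern = regexp.MustCompile(`(?i)\b(?:https?|ftp)://\S+`)

// removeURL deletes every match of urlPattern from s.
func removeURL(s string) string {
	return urlPattern.ReplaceAllString(s, "")
}

func main() {
	fmt.Printf("%q\n", removeURL("see https://example.com/docs for details"))
	// "see  for details"
}
```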

Types

type Affix

type Affix struct {
	Type         AffixType // either PFX or SFX
	CrossProduct bool
	Rules        []Rule
}

Affix is a rule for affixation (adding prefixes or suffixes).

func (Affix) Expand

func (a Affix) Expand(word string, out []string) []string

Expand provides all variations of a given word based on this affix rule
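To illustrate what expanding a word with an affix rule involves, here is a minimal sketch under stated assumptions. The `rule` and `affixKind` types are simplified stand-ins for `Rule` and `AffixType`, `expand` is a hypothetical free function, and conditions are treated as Go regular expressions anchored to the relevant end of the word. The real `Expand` may differ, for example in how it handles `CrossProduct`.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

type affixKind int

const (
	prefix affixKind = iota
	suffix
)

// rule is a simplified stand-in for the documented Rule type.
type rule struct {
	strip     string // text removed from the word ("" removes nothing)
	affixText string // text added in its place
	pattern   string // condition the word must satisfy
}

// expand appends to out every form produced by a rule whose condition matches.
func expand(word string, kind affixKind, rules []rule, out []string) ([]string, error) {
	for _, r := range rules {
		// Anchor the condition to the start (prefix) or end (suffix).
		expr := "^" + r.pattern
		if kind == suffix {
			expr = r.pattern + "$"
		}
		re, err := regexp.Compile(expr)
		if err != nil {
			return out, err
		}
		if !re.MatchString(word) {
			continue
		}
		switch {
		case kind == suffix && strings.HasSuffix(word, r.strip):
			out = append(out, strings.TrimSuffix(word, r.strip)+r.affixText)
		case kind == prefix && strings.HasPrefix(word, r.strip):
			out = append(out, r.affixText+strings.TrimPrefix(word, r.strip))
		}
	}
	return out, nil
}

func main() {
	// Two illustrative English plural rules, not taken from a real AFF file.
	plural := []rule{
		{strip: "y", affixText: "ies", pattern: "[^aeiou]y"},
		{strip: "", affixText: "s", pattern: "[aeiou]y"},
	}
	for _, w := range []string{"try", "play"} {
		forms, err := expand(w, suffix, plural, nil)
		if err != nil {
			panic(err)
		}
		fmt.Println(w, "->", forms)
	}
	// try -> [tries]
	// play -> [plays]
}
```

Passing `out` in and returning it mirrors the documented signature and lets a caller reuse one slice across many words.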

type AffixType

type AffixType int

AffixType is either an affix prefix or suffix

const (
	Prefix AffixType = iota
	Suffix
)

Specific affix types.

type DictConfig

type DictConfig struct {
	Flag              string            `json:"flag,omitempty"`
	TryChars          string            `json:"try_chars,omitempty"`
	WordChars         string            `json:"word_chars,omitempty"`
	NoSuggestFlag     rune              `json:"no_suggest_flag,omitempty"`
	IconvReplacements []string          `json:"iconv_replacements,omitempty"`
	Replacements      [][2]string       `json:"replacements,omitempty"`
	AffixMap          map[rune]Affix    `json:"affix_map,omitempty"`
	CamelCase         int               `json:"camel_case,omitempty"`
	CompoundMin       int               `json:"compound_min,omitempty"`
	CompoundOnly      string            `json:"compound_only,omitempty"`
	CompoundRule      []string          `json:"compound_rule,omitempty"`
	CompoundMap       map[rune][]string `json:"compound_map,omitempty"`
}

DictConfig is a partial representation of a Hunspell AFF (Affix) file.

func NewDictConfig

func NewDictConfig(file io.Reader) (*DictConfig, error)

NewDictConfig reads a Hunspell AFF file.

func (DictConfig) Expand

func (a DictConfig) Expand(wordAffix string, out []string) ([]string, error)

Expand expands a word/affix entry using the dictionary's affix rules.

It also supports CompoundRule flags.
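A Hunspell DIC entry has the form `word/FLAGS`, where each flag selects an affix rule set. The sketch below shows only the parsing and lookup step. It is heavily simplified: `expandEntry` is a hypothetical name, the `suffixes` map stands in for `AffixMap` and holds plain suffix strings with no strip text or conditions, flags are assumed to be single runes, and including the base word in the result is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// expandEntry splits a "word/FLAGS" entry and appends one form per suffix
// registered for each flag.
func expandEntry(entry string, suffixes map[rune][]string, out []string) []string {
	word, flags, _ := strings.Cut(entry, "/")
	out = append(out, word) // assumption: the base word is part of the result
	for _, flag := range flags {
		for _, sfx := range suffixes[flag] {
			out = append(out, word+sfx)
		}
	}
	return out
}

func main() {
	suffixes := map[rune][]string{'D': {"ed"}, 'G': {"ing"}}
	fmt.Println(expandEntry("walk/DG", suffixes, nil)) // [walk walked walking]
	fmt.Println(expandEntry("the", suffixes, nil))     // [the]
}
```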

type Diff

type Diff struct {
	Filename string
	Path     string
	Original string
	Line     string
	LineNum  int
}

Diff represents an unknown word in a file.

func SpellFile

func SpellFile(gs *GoSpell, ext plaintext.Extractor, raw []byte) []Diff

SpellFile attempts to spell-check a file. This interface is not very good, so expect changes.

type GoSpell

type GoSpell struct {
	Config DictConfig
	Dict   map[string]struct{} // likely will contain some value later
	DB     *gorm.DB
	// contains filtered or unexported fields
}

GoSpell is the main struct.

func NewGoSpell

func NewGoSpell(affFile, dicFile string) (*GoSpell, error)

NewGoSpell creates a new GoSpell from Hunspell AFF and DIC files.

func NewGoSpellDB added in v1.3.0

func NewGoSpellDB(dbFile string, config *gorm.Config) (*GoSpell, error)

NewGoSpellDB creates a GoSpell using the database at the given path.

func NewGoSpellDBForce added in v1.4.0

func NewGoSpellDBForce(affFile, dicFile, dbFile string, config *gorm.Config) (*GoSpell, error)

NewGoSpellDBForce creates a GoSpell from Hunspell AFF and DIC files and stores everything in the database specified by dbFile.

func NewGoSpellDBReader added in v1.4.0

func NewGoSpellDBReader(db *gorm.DB) (*GoSpell, error)

NewGoSpellDBReader creates a GoSpell using the given database.

func NewGoSpellReader

func NewGoSpellReader(aff, dic io.Reader, db *gorm.DB, lang string) (*GoSpell, error)

NewGoSpellReader creates a GoSpell from Hunspell files passed as io.Reader values. If db is not nil, a table of word forms is built.

func (*GoSpell) AddWordList

func (s *GoSpell) AddWordList(r io.Reader) ([]string, error)

AddWordList adds a basic word list, with one word per line.

The input is assumed to be in UTF-8.

TODO: Hunspell compatibility with the "*" prefix for forbidden words, and affix support.

It returns a list of duplicated words and/or an error.

func (*GoSpell) AddWordListFile

func (s *GoSpell) AddWordListFile(name string) ([]string, error)

AddWordListFile reads in a word list file

func (*GoSpell) AddWordRaw

func (s *GoSpell) AddWordRaw(word string) bool

AddWordRaw adds a single word to the internal dictionary without modification. It returns true if the word was added and false if it already exists.

func (*GoSpell) GetSuggestions added in v1.4.0

func (s *GoSpell) GetSuggestions(word string) []string

GetSuggestions searches for possible replacements.

func (*GoSpell) InputConversion

func (s *GoSpell) InputConversion(raw []byte) string

InputConversion performs any character substitution before checking.

This is based on the ICONV stanza of the AFF file.
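An ICONV stanza lists input substitutions as pairs of strings. Assuming `IconvReplacements` stores them as a flat list of old/new pairs (an assumption about its layout), the conversion could be sketched with `strings.NewReplacer`. The names `newInputConverter` and `inputConversion` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// newInputConverter builds a replacer from a flat old/new pair list.
// strings.NewReplacer panics on an odd number of arguments, so the length
// is checked first.
func newInputConverter(pairs []string) (*strings.Replacer, error) {
	if len(pairs)%2 != 0 {
		return nil, fmt.Errorf("odd number of replacement strings: %d", len(pairs))
	}
	return strings.NewReplacer(pairs...), nil
}

// inputConversion applies the substitutions to raw input before checking.
func inputConversion(conv *strings.Replacer, raw []byte) string {
	return conv.Replace(string(raw))
}

func main() {
	// Map the typographic apostrophe to the ASCII one.
	conv, err := newInputConverter([]string{"’", "'"})
	if err != nil {
		panic(err)
	}
	fmt.Println(inputConversion(conv, []byte("don’t"))) // don't
}
```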

func (*GoSpell) Spell

func (s *GoSpell) Spell(word string) bool

Spell checks whether a given word is in the internal dictionaries.

TODO: add multiple dictionaries

func (*GoSpell) SpellWithSuggestions added in v1.4.0

func (s *GoSpell) SpellWithSuggestions(word string) (suggestions []string)

SpellWithSuggestions checks a word and returns possible replacements for it.

func (*GoSpell) Split

func (s *GoSpell) Split(text string) []string

Split splits a text into words.

type Preferences added in v1.4.0

type Preferences struct {
	ID   uint `gorm:"primaryKey"`
	Dict string
}

Preferences holds settings stored in the database.

type Rule

type Rule struct {
	Strip     string
	AffixText string // suffix or prefix text to add
	Pattern   string // original matching pattern from AFF file
	// contains filtered or unexported fields
}

Rule is an affix rule.

type Splitter

type Splitter struct {
	// contains filtered or unexported fields
}

Splitter splits a text into words. This implementation is highly likely to change, so it is encapsulated.

func NewSplitter

func NewSplitter(chars string) *Splitter

NewSplitter creates a new splitter. The input is a string in UTF-8 encoding. Each rune in the string is considered a valid word character. Runes that are not in the string are deemed word boundaries. The current implementation uses https://golang.org/pkg/strings/#FieldsFunc
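A splitter built on `strings.FieldsFunc` might look like the sketch below. It follows the description literally: only runes listed in `chars` count as word characters. The `splitter` type and `newSplitter` are hypothetical names, and the real implementation may accept additional runes.

```go
package main

import (
	"fmt"
	"strings"
)

// splitter holds the set of runes treated as word characters.
type splitter struct {
	wordChars map[rune]struct{}
}

// newSplitter records every rune of chars as a valid word character.
func newSplitter(chars string) *splitter {
	set := make(map[rune]struct{}, len(chars))
	for _, r := range chars {
		set[r] = struct{}{}
	}
	return &splitter{wordChars: set}
}

// split breaks in at every rune that is not a word character.
func (s *splitter) split(in string) []string {
	return strings.FieldsFunc(in, func(r rune) bool {
		_, isWordChar := s.wordChars[r]
		return !isWordChar
	})
}

func main() {
	s := newSplitter("abcdefghijklmnopqrstuvwxyz'")
	fmt.Println(s.split("don't stop, believe")) // [don't stop believe]
}
```

Because the apostrophe is listed as a word character, "don't" stays in one piece, while the comma and spaces act as boundaries.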

func (*Splitter) Split

func (s *Splitter) Split(in string) []string

Split splits an input into a `[]string`.

type WordCase

type WordCase int

WordCase is an enum of various word casing styles

const (
	AllLower WordCase = iota
	AllUpper
	Title
	Mixed
	Camel
)

Various WordCase types; these are likely not correct.

func CaseStyle

func CaseStyle(word string) WordCase

CaseStyle returns the case style of a word.

type WordForm added in v1.3.0

type WordForm struct {
	ID   uint   `gorm:"primaryKey"`
	Word string `gorm:"index"`
	Lang string
	Case WordCase
}

WordForm is the structure stored in the database.

Directories

Path Synopsis
cmd
