Documentation ¶
Overview ¶
Package dic implements the dictionary of the morphological analyzer.
Index ¶
Constants ¶
const (
// IPADicPath represents the internal IPA dictionary path.
IPADicPath = "dic/ipa"
)
const UserDicColumnSize = 4
UserDicColumnSize is the column size of the user dictionary.
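As a rough illustration of what a fixed column size implies, the sketch below rejects any user dictionary record that does not have exactly four columns. The helper `parseUserDicLine`, the lower-case constant, and the sample record layout (surface form, segmentation, readings, part of speech) are assumptions made for this example, not part of the package.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// userDicColumnSize mirrors UserDicColumnSize for this sketch.
const userDicColumnSize = 4

// parseUserDicLine is a hypothetical helper: it parses one CSV record and
// checks that it has exactly userDicColumnSize columns.
func parseUserDicLine(line string) ([]string, error) {
	r := csv.NewReader(strings.NewReader(line))
	record, err := r.Read()
	if err != nil {
		return nil, err
	}
	if len(record) != userDicColumnSize {
		return nil, fmt.Errorf("want %d columns, got %d", userDicColumnSize, len(record))
	}
	return record, nil
}

func main() {
	// The column meanings here are assumed for illustration only.
	rec, err := parseUserDicLine("日本経済新聞,日本 経済 新聞,ニホン ケイザイ シンブン,カスタム名詞")
	fmt.Println(len(rec), err)

	// A record with too few columns is rejected.
	_, err = parseUserDicLine("日本経済新聞,日本 経済 新聞")
	fmt.Println(err)
}
```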
Variables ¶
This section is empty.
Functions ¶
func NewContents ¶
NewContents creates dictionary contents from a byte slice.
Types ¶
type ConnectionTable ¶
ConnectionTable represents a connection matrix of morphs.
func LoadConnectionTable ¶
func LoadConnectionTable(r io.Reader) (t ConnectionTable, err error)
LoadConnectionTable loads ConnectionTable from io.Reader.
func (*ConnectionTable) At ¶
func (t *ConnectionTable) At(row, col int) int16
At returns the connection cost of matrix[row, col].
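To make the `matrix[row, col]` lookup concrete, here is a minimal sketch of a connection matrix stored as a flat slice. The type `connTable`, its field names, and the row-major layout are assumptions for illustration; the page does not show how ConnectionTable stores its data.

```go
package main

import "fmt"

// connTable is a hypothetical stand-in for a connection matrix. The field
// names and the row-major layout are assumptions made for this sketch.
type connTable struct {
	rows, cols int
	costs      []int16 // rows*cols entries, stored row by row
}

// at returns the cost stored at matrix[row, col].
func (t *connTable) at(row, col int) int16 {
	return t.costs[row*t.cols+col]
}

func main() {
	tbl := &connTable{
		rows: 2,
		cols: 3,
		costs: []int16{
			0, 10, -5, // row 0
			7, 3, 20, // row 1
		},
	}
	fmt.Println(tbl.at(0, 2), tbl.at(1, 1)) // prints "-5 3"
}
```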
type Dic ¶
type Dic struct {
	Morphs       []Morph
	POSTable     POSTable
	Contents     [][]string
	Connection   ConnectionTable
	Index        IndexTable
	CharClass    []string
	CharCategory []byte
	InvokeList   []bool
	GroupList    []bool
	UnkMorphs    []Morph
	UnkIndex     map[int32]int32
	UnkIndexDup  map[int32]int32
	UnkContents  [][]string
}
Dic represents a dictionary of a tokenizer.
func SysDicIPASimple ¶
func SysDicIPASimple() *Dic
SysDicIPASimple returns the IPA system dictionary without contents.
func SysDicSimple ¶
func SysDicSimple() *Dic
SysDicSimple returns the kagome system dictionary without contents.
func (Dic) CharacterCategory ¶
CharacterCategory returns the category of a rune.
type IndexTable ¶
type IndexTable struct {
	Da  da.DoubleArray
	Dup map[int32]int32
}
IndexTable represents a dictionary index.
func BuildIndexTable ¶
func BuildIndexTable(sortedKeywords []string) (IndexTable, error)
BuildIndexTable constructs an index table from sorted keywords.
func ReadIndexTable ¶
func ReadIndexTable(r io.Reader) (IndexTable, error)
ReadIndexTable loads an index table from io.Reader.
func (IndexTable) CommonPrefixSearch ¶
func (idx IndexTable) CommonPrefixSearch(input string) (lens []int, ids [][]int)
CommonPrefixSearch finds the keywords that are prefixes of the input and returns their ids and lengths, if any are found.
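The idea of a common prefix search can be sketched without a double array. The example below uses a plain map as a stand-in index; `prefixIndex`, `buildPrefixIndex`, and `commonPrefixSearch` are hypothetical names, and reporting lengths in bytes is an assumption of this sketch. Duplicate keywords share one entry that carries several ids, which is one plausible reason for the result type `[][]int`.

```go
package main

import "fmt"

// prefixIndex is a hypothetical stand-in for an index table: it maps each
// keyword to the ids (positions in the sorted keyword list) that carry it.
type prefixIndex map[string][]int

// buildPrefixIndex assigns each keyword its position in the list as its id.
func buildPrefixIndex(sortedKeywords []string) prefixIndex {
	idx := prefixIndex{}
	for id, kw := range sortedKeywords {
		idx[kw] = append(idx[kw], id)
	}
	return idx
}

// commonPrefixSearch returns, for every keyword that is a prefix of input,
// its length in bytes and the ids registered for it.
func (idx prefixIndex) commonPrefixSearch(input string) (lens []int, ids [][]int) {
	check := func(prefix string) {
		if v, ok := idx[prefix]; ok {
			lens = append(lens, len(prefix))
			ids = append(ids, v)
		}
	}
	// Ranging over a string yields the byte offset of each rune start,
	// so every proper prefix ends on a rune boundary.
	for end := range input {
		if end > 0 {
			check(input[:end])
		}
	}
	check(input) // the whole input is also a candidate
	return lens, ids
}

func main() {
	idx := buildPrefixIndex([]string{"京都", "東京", "東京", "東京都"})
	lens, ids := idx.commonPrefixSearch("東京都庁")
	fmt.Println(lens, ids) // prints "[6 9] [[1 2] [3]]"
}
```

A real index would replace the map with a trie or double array so that all prefixes are found in a single pass over the input.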
func (IndexTable) CommonPrefixSearchCallback ¶
func (idx IndexTable) CommonPrefixSearchCallback(input string, callback func(id, l int))
CommonPrefixSearchCallback finds the keywords that are prefixes of the input and invokes the callback with the id and length of each match.
func (IndexTable) Search ¶
func (idx IndexTable) Search(input string) []int
Search finds the given keyword and returns its ids if found.
type Morph ¶
type Morph struct {
LeftID, RightID, Weight int16
}
Morph represents the parts of speech and the occurrence cost of a morph.
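One common way such fields are combined in lattice-based analyzers is to add each morph's occurrence cost to the connection cost between neighbouring morphs. The sketch below shows that arithmetic. The lower-case `morph` type, the helper `pathCost`, and the choice to index the matrix by the previous morph's RightID and the next morph's LeftID are assumptions; this page does not state how the tokenizer uses these values.

```go
package main

import "fmt"

// morph mirrors the three fields of Morph for this sketch.
type morph struct {
	LeftID, RightID, Weight int16
}

// pathCost sums the occurrence cost of every morph plus the connection cost
// between each adjacent pair. Indexing conn by the previous morph's RightID
// and the next morph's LeftID is an assumption of this sketch.
func pathCost(path []morph, conn [][]int16) int {
	total := 0
	for i, m := range path {
		total += int(m.Weight)
		if i > 0 {
			total += int(conn[path[i-1].RightID][m.LeftID])
		}
	}
	return total
}

func main() {
	conn := [][]int16{
		{0, 100},
		{-50, 30},
	}
	path := []morph{
		{LeftID: 0, RightID: 1, Weight: 200},
		{LeftID: 0, RightID: 0, Weight: 300},
	}
	// 200 + 300 + conn[1][0] = 500 - 50
	fmt.Println(pathCost(path, conn)) // prints 450
}
```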
type POSMap ¶
POSMap represents a part-of-speech control table.
type POSTable ¶
POSTable represents a table for managing parts of speech.
func ReadPOSTable ¶
ReadPOSTable loads a POS table.
type Trie ¶
type Trie interface {
	Search(input string) []int32
	PrefixSearch(input string) (length int, output []int32)
	CommonPrefixSearch(input string) (lens []int, outputs [][]int32)
	CommonPrefixSearchCallback(input string, callback func(id, l int))
}
Trie is an interface that represents the ability to retrieve keywords.
type UserDic ¶
type UserDic struct {
	Index    IndexTable
	Contents []UserDicContent
}
UserDic represents a user dictionary.
func NewUserDic ¶
NewUserDic builds a user dictionary from a file.
type UserDicContent ¶
UserDicContent represents contents of a word in a user dictionary.