Documentation ¶
Index ¶
- Variables
- type Idf
- type Segment
- type Segments
- type StopWord
- type TagExtracter
- func (t *TagExtracter) ExtractTags(text string, topK int) (tags Segments)
- func (t *TagExtracter) LoadDict(fileName ...string) error
- func (t *TagExtracter) LoadIdf(fileName ...string) error
- func (t *TagExtracter) LoadStopWords(fileName ...string) error
- func (t *TagExtracter) WithGse(segs gse.Segmenter)
- type TextRanker
Constants ¶
This section is empty.
Variables ¶
var StopWordMap = map[string]bool{
	"the": true, "of": true, "is": true, "and": true, "to": true, "in": true,
	"that": true, "we": true, "for": true, "an": true, "are": true, "by": true,
	"be": true, "as": true, "on": true, "with": true, "can": true, "if": true,
	"from": true, "which": true, "you": true, "it": true, "this": true, "then": true,
	"at": true, "have": true, "all": true, "not": true, "one": true, "has": true,
	"or": true,
}
StopWordMap is the default set of stop words.
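A map[string]bool in this shape works as a membership set for filtering tokens. The following is a minimal sketch of that idea, not the package's own code: the local `stopWords` map and the `filterStopWords` helper are hypothetical names introduced for illustration, and lowercasing before lookup is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// stopWords is a hypothetical local set with the same map[string]bool shape
// as StopWordMap; it is not the package's variable.
var stopWords = map[string]bool{"the": true, "of": true, "is": true, "and": true}

// filterStopWords returns the tokens whose lowercased form is not in stop.
// Lowercasing before the lookup is a choice made for this sketch.
func filterStopWords(tokens []string, stop map[string]bool) []string {
	kept := make([]string, 0, len(tokens))
	for _, tok := range tokens {
		if !stop[strings.ToLower(tok)] {
			kept = append(kept, tok)
		}
	}
	return kept
}

func main() {
	tokens := strings.Fields("The weight of the word is high")
	fmt.Println(filterStopWords(tokens, stopWords))
}
```

A missing key in a Go map yields the zero value false, so the lookup needs no separate existence check.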
Functions ¶
This section is empty.
Types ¶
type Idf ¶
type Idf struct {
// contains filtered or unexported fields
}
Idf is a dictionary of all words with their IDF (inverse document frequency) values.
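For readers unfamiliar with the term, the sketch below shows one common way to derive IDF values from a tokenized corpus, using log(N / df). It is illustrative only: `computeIDF` is a hypothetical helper, and this page does not state which formula or file format the package's Idf dictionary actually uses.

```go
package main

import (
	"fmt"
	"math"
)

// computeIDF returns log(N / df) for every word in a tokenized corpus,
// where N is the number of documents and df is the number of documents
// containing the word. This formula is an assumption of the sketch.
func computeIDF(docs [][]string) map[string]float64 {
	df := make(map[string]int)
	for _, doc := range docs {
		seen := make(map[string]bool)
		for _, w := range doc {
			if !seen[w] { // count each word once per document
				seen[w] = true
				df[w]++
			}
		}
	}
	idf := make(map[string]float64, len(df))
	n := float64(len(docs))
	for w, c := range df {
		idf[w] = math.Log(n / float64(c))
	}
	return idf
}

func main() {
	docs := [][]string{
		{"go", "is", "fast"},
		{"go", "is", "simple"},
		{"rust", "is", "fast"},
	}
	idf := computeIDF(docs)
	for _, w := range []string{"is", "go", "rust"} {
		fmt.Printf("%-5s %.3f\n", w, idf[w])
	}
}
```

A word that occurs in every document gets an IDF of 0, while rarer words get larger values, which is why IDF is useful for down-weighting common terms.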
type Segment ¶
type Segment struct {
// contains filtered or unexported fields
}
Segment is a word with its weight.
type StopWord ¶
type StopWord struct {
// contains filtered or unexported fields
}
StopWord is a dictionary for all stop words.
func NewStopWord ¶
func NewStopWord() *StopWord
NewStopWord creates a new StopWord with the default stop words.
func (*StopWord) IsStopWord ¶
IsStopWord checks whether the word is a stop word.
func (*StopWord) RemoveStop ¶
RemoveStop removes a token from the StopWord dictionary.
type TagExtracter ¶
type TagExtracter struct {
	Idf *Idf
	// contains filtered or unexported fields
}
TagExtracter is a struct for extracting tags.
func (*TagExtracter) ExtractTags ¶
func (t *TagExtracter) ExtractTags(text string, topK int) (tags Segments)
ExtractTags extracts the topK keywords from text.
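Keyword extraction backed by an IDF dictionary is commonly done by scoring each token as term frequency times IDF and keeping the highest-scoring entries. The sketch below illustrates that general technique with the standard library only. It is not the implementation behind ExtractTags: the `scored` type, the `topKByTFIDF` helper, and the `defaultIDF` fallback for unknown words are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// scored is a hypothetical (word, weight) pair used only in this sketch.
type scored struct {
	Word   string
	Weight float64
}

// topKByTFIDF scores each distinct token as tf * idf and returns the k
// highest-scoring entries. Tokens missing from idf use defaultIDF.
func topKByTFIDF(tokens []string, idf map[string]float64, defaultIDF float64, k int) []scored {
	if len(tokens) == 0 || k <= 0 {
		return nil
	}
	counts := make(map[string]int)
	for _, t := range tokens {
		counts[t]++
	}
	total := float64(len(tokens))
	out := make([]scored, 0, len(counts))
	for w, c := range counts {
		v, ok := idf[w]
		if !ok {
			v = defaultIDF
		}
		out = append(out, scored{Word: w, Weight: float64(c) / total * v})
	}
	// Highest weight first; ties are broken alphabetically for stable output.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Weight != out[j].Weight {
			return out[i].Weight > out[j].Weight
		}
		return out[i].Word < out[j].Word
	})
	if len(out) > k {
		out = out[:k]
	}
	return out
}

func main() {
	idf := map[string]float64{"go": 1.0, "tags": 3.0, "the": 0.1}
	tokens := []string{"go", "go", "tags", "the"}
	for _, s := range topKByTFIDF(tokens, idf, 2.0, 2) {
		fmt.Printf("%s %.3f\n", s.Word, s.Weight)
	}
}
```

In this toy input, "tags" outranks "go" even though "go" occurs twice, because its higher IDF outweighs the lower term frequency.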
func (*TagExtracter) LoadDict ¶
func (t *TagExtracter) LoadDict(fileName ...string) error
LoadDict loads and creates a new dictionary from the file.
func (*TagExtracter) LoadIdf ¶
func (t *TagExtracter) LoadIdf(fileName ...string) error
LoadIdf loads and creates a new Idf dictionary from the file.
func (*TagExtracter) LoadStopWords ¶
func (t *TagExtracter) LoadStopWords(fileName ...string) error
LoadStopWords loads and creates a new StopWord dictionary from the file.
func (*TagExtracter) WithGse ¶
func (t *TagExtracter) WithGse(segs gse.Segmenter)
WithGse registers the gse segmenter.
type TextRanker ¶
type TextRanker struct {
	HMM bool
	// contains filtered or unexported fields
}
TextRanker is a struct for extracting tags with the TextRank algorithm.
func (*TextRanker) LoadDict ¶
func (t *TextRanker) LoadDict(fileName ...string) error
LoadDict loads and creates a new dictionary from the file for TextRanker.
func (*TextRanker) TextRank ¶
func (t *TextRanker) TextRank(text string, topK int) Segments
TextRank extracts keywords from text using the TextRank algorithm. The topK parameter specifies the maximum number of top keywords to return.
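As background, TextRank ranks words by building a co-occurrence graph and running a PageRank-style iteration over it. The sketch below shows that general idea on pre-tokenized input. It is not this package's implementation: the `ranked` type, the `textRank` function, the window size, the iteration count, and the damping factor are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// ranked is a hypothetical (word, score) pair used only in this sketch.
type ranked struct {
	Word  string
	Score float64
}

// textRank links tokens that occur within `window` positions of each other
// (window=2 links adjacent tokens only), runs `iters` PageRank-style updates
// with damping factor d, and returns the k best-scoring words.
// Tokens that never co-occur with a different token are not in the graph.
func textRank(tokens []string, window, iters int, d float64, k int) []ranked {
	if k <= 0 {
		return nil
	}
	weight := make(map[string]map[string]float64) // weight[a][b]: co-occurrence count
	addEdge := func(a, b string) {
		if weight[a] == nil {
			weight[a] = make(map[string]float64)
		}
		weight[a][b]++
	}
	for i, a := range tokens {
		for j := i + 1; j < len(tokens) && j < i+window; j++ {
			if b := tokens[j]; a != b {
				addEdge(a, b)
				addEdge(b, a)
			}
		}
	}
	outSum := make(map[string]float64) // total edge weight per node
	score := make(map[string]float64)
	for w, nbrs := range weight {
		score[w] = 1.0
		for _, x := range nbrs {
			outSum[w] += x
		}
	}
	for it := 0; it < iters; it++ {
		next := make(map[string]float64, len(score))
		for w, nbrs := range weight {
			sum := 0.0
			for nb, wt := range nbrs {
				sum += wt / outSum[nb] * score[nb]
			}
			next[w] = (1 - d) + d*sum
		}
		score = next
	}
	out := make([]ranked, 0, len(score))
	for w, s := range score {
		out = append(out, ranked{Word: w, Score: s})
	}
	// Highest score first; ties are broken alphabetically.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].Word < out[j].Word
	})
	if len(out) > k {
		out = out[:k]
	}
	return out
}

func main() {
	tokens := []string{"a", "hub", "b", "hub", "c", "hub", "d"}
	for _, r := range textRank(tokens, 2, 20, 0.85, 3) {
		fmt.Printf("%s %.3f\n", r.Word, r.Score)
	}
}
```

In the sample input, "hub" neighbors every other token, so it ends up with the highest score. Unlike TF-IDF scoring, this approach needs no precomputed IDF dictionary.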
func (*TextRanker) TextRankWithPOS ¶
func (t *TextRanker) TextRankWithPOS(text string, topK int, allowPOS []string) Segments
TextRankWithPOS extracts keywords from text using the TextRank algorithm. The allowPOS parameter is a []string list of the part-of-speech (POS) tags to allow.
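Filtering by part of speech typically means keeping only tokens whose tag appears in an allow-list before ranking. The sketch below illustrates that step in isolation and is not the package's code: the `taggedToken` type, the `filterByPOS` helper, the tag values "n", "v", and "a", and the choice to keep every token when the allow-list is empty are all assumptions.

```go
package main

import "fmt"

// taggedToken is a hypothetical token with its part-of-speech tag.
type taggedToken struct {
	Text string
	POS  string
}

// filterByPOS returns the text of every token whose tag is in allowPOS.
// Assumption of this sketch: an empty allow-list keeps every token.
func filterByPOS(tokens []taggedToken, allowPOS []string) []string {
	allowed := make(map[string]bool, len(allowPOS))
	for _, p := range allowPOS {
		allowed[p] = true
	}
	var kept []string
	for _, t := range tokens {
		if len(allowed) == 0 || allowed[t.POS] {
			kept = append(kept, t.Text)
		}
	}
	return kept
}

func main() {
	tokens := []taggedToken{
		{Text: "Go", POS: "n"},
		{Text: "runs", POS: "v"},
		{Text: "fast", POS: "a"},
	}
	fmt.Println(filterByPOS(tokens, []string{"n", "v"}))
}
```

Restricting candidates to a few tags, for example nouns and verbs, is a common way to keep function words out of the ranked keyword list.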
func (*TextRanker) WithGse ¶
func (t *TextRanker) WithGse(segs gse.Segmenter)
WithGse registers the gse segmenter.