Documentation ¶
Index ¶
- Constants
- Variables
- type ConsOpt
- type Model
- type Progress
- type Tagger
- func (p *Tagger) Clone() *Tagger
- func (p *Tagger) Clusters() (map[string]lingo.Cluster, error)
- func (p Tagger) GobDecode(buf []byte) error
- func (p Tagger) GobEncode() ([]byte, error)
- func (p *Tagger) Lemmatize(a string, pt lingo.POSTag) ([]string, error)
- func (p *Tagger) Load(filename string) error
- func (p *Tagger) LoadShortcuts(shortcuts map[string]lingo.POSTag)
- func (p *Tagger) Progress() <-chan Progress
- func (p *Tagger) Run()
- func (p *Tagger) ShowWeights()
- func (p *Tagger) Stem(a string) (string, error)
- func (p *Tagger) Train(sentences []treebank.SentenceTag, iterations int)
Constants ¶
const BUILD_DEBUG = "POS TAGGER: Release Build"
const (
MAXCONTEXTTYPE contextType
)
const (
MAXFEATURETYPE featureType
)
Variables ¶
var TABCOUNT uint32 = 0
Functions ¶
This section is empty.
Types ¶
type ConsOpt ¶
type ConsOpt func(*Tagger)
ConsOpt is a construction option for a Tagger
func WithCluster ¶
WithCluster creates a *Tagger with a brown cluster corpus (a map of strings to the brown clusters). If no brown cluster corpus was passed in, the cluster won't be set, and the POSTagger will be less accurate
func WithCorpus ¶
WithCorpus creates a *Tagger with an existing Corpus
func WithLemmatizer ¶
func WithLemmatizer(l lingo.Lemmatizer) ConsOpt
WithLemmatizer creates a *Tagger with a lemmatizer. If no lemmatizer is passed into the POSTagger, then the lemmatization process will be skipped, and the POSTagger will be less accurate
func WithStemmer ¶
WithStemmer creates a *Tagger with a stemmer. If no stemmer is passed in, then the stemming will be skipped, and the POSTagger will be less accurate
type Model ¶
type Model struct {
// contains filtered or unexported fields
}
Model is the model that the POS Tagger runs on.
func LoadReader ¶
func LoadReader(rd io.ReadCloser) (*Model, error)
func (*Model) SaveWriter ¶
func (m *Model) SaveWriter(f io.WriteCloser) error
type Progress ¶
type Progress struct {
Iter, Correct, Count, ShortCutted int
}
Progress is just a tuple of training progress info
type Tagger ¶
type Tagger struct { *Model Input chan lingo.Lexeme Output chan lingo.AnnotatedSentence lingo.Lemmatizer lingo.Stemmer // contains filtered or unexported fields }
Tagger is the object that tags an incoming channel of lexemes, and outputs a channel of AnnotatedSentence. Each of the Annotation are tagged with the POSTag
The core of the Tagger is the perceptron (unexported).
A large percentage of how this POS Tagger works is inspired by Mathhew Honnibal's work in SpaCy
func (*Tagger) Lemmatize ¶
Lemmatize implements the lingo.Lemmatize interface. It however, defers the actual doing of the job to the Lemmatizer.
func (*Tagger) LoadShortcuts ¶
LoadShortcuts allows for domain specific things to be mapped into the tagger.
func (*Tagger) Progress ¶
Progress creates and returns a channel of progress. By default the progress channel isn't created, and no progress info is sent
func (*Tagger) Run ¶
func (p *Tagger) Run()
Run is used to tag a sentence. Lexemes arrive from the lexer in a channel (*Tagger.Input), and an annotated sentence is sent down the Output channel
func (*Tagger) ShowWeights ¶
func (p *Tagger) ShowWeights()