Documentation ¶
Overview ¶
Package model provides the tagger's data model.
Index ¶
- Constants
- type Bigram
- type ClosedClassSet
- type FrequencyCollector
- type Model
- func (m Model) BigramFreqs() map[Bigram]int
- func (m Model) ClosedClassTags() ClosedClassSet
- func (m *Model) GobDecode(data []byte) error
- func (m Model) GobEncode() ([]byte, error)
- func (m Model) String() string
- func (m Model) TagNumberer() *StringNumberer
- func (m Model) TrigramFreqs() map[Trigram]int
- func (m Model) UnigramFreqs() map[Unigram]int
- func (m Model) WordTagFreqs() map[string]map[Tag]int
- type StringNumberer
- func (l *StringNumberer) GobDecode(data []byte) error
- func (l *StringNumberer) GobEncode() ([]byte, error)
- func (l *StringNumberer) Label(number uint) string
- func (l *StringNumberer) Number(label string) uint
- func (l *StringNumberer) Read(reader io.Reader) error
- func (l *StringNumberer) Size() int
- func (l *StringNumberer) WriteStringStringNumberer(writer io.Writer) error
- type Tag
- type Trigram
- type Unigram
Constants ¶
const EndToken = "<END>"
EndToken is the end-of-sentence marker.
const StartToken = "<START>"
StartToken is the start-of-sentence marker.
Types ¶
type ClosedClassSet ¶
type ClosedClassSet map[string]interface{}
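The declaration above suggests that ClosedClassSet is used as a set keyed by tag name. The following is a minimal sketch of that idiom, assuming the map values carry no meaning; the helpers newClosedClassSet and contains are hypothetical and not part of the package.

```go
package main

import "fmt"

// ClosedClassSet mirrors the declaration shown above.
type ClosedClassSet map[string]interface{}

// newClosedClassSet is a hypothetical helper that builds a set from tag names.
func newClosedClassSet(tags ...string) ClosedClassSet {
	set := make(ClosedClassSet, len(tags))
	for _, tag := range tags {
		set[tag] = nil // only the key matters; the value is a placeholder (assumption)
	}
	return set
}

// contains is a hypothetical helper that reports whether tag is in the set.
func contains(set ClosedClassSet, tag string) bool {
	_, ok := set[tag]
	return ok
}

func main() {
	closed := newClosedClassSet("DT", "IN", "CC")
	fmt.Println(contains(closed, "DT"), contains(closed, "NN"))
}
```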
type FrequencyCollector ¶
type FrequencyCollector struct {
// contains filtered or unexported fields
}
A FrequencyCollector collects frequencies from the training corpus that are relevant to a trigram HMM tagger.
func NewFrequencyCollector ¶
func NewFrequencyCollector() FrequencyCollector
NewFrequencyCollector constructs a FrequencyCollector instance.
func (FrequencyCollector) Model ¶
func (c FrequencyCollector) Model() Model
Model returns the collected frequencies as a model.
func (FrequencyCollector) ModelWithClosedClass ¶
func (c FrequencyCollector) ModelWithClosedClass(closedClassTags ClosedClassSet) Model
ModelWithClosedClass returns the collected frequencies as a model that also carries the given closed class set, which can be used by, for example, word handlers.
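To illustrate what "frequencies relevant to a trigram HMM tagger" can mean, here is a rough sketch that counts word/tag pairs and tag unigrams, bigrams and trigrams for one tagged sentence. The types taggedWord and counts, the array-based n-gram keys, and the padding with two start markers and one end marker are all assumptions made for illustration; the actual FrequencyCollector keeps its fields unexported and may work differently.

```go
package main

import "fmt"

// Sentence markers, matching the constants documented for this package.
const (
	StartToken = "<START>"
	EndToken   = "<END>"
)

// taggedWord is a hypothetical word/tag pair.
type taggedWord struct {
	Word, Tag string
}

// counts is a hypothetical container for the collected frequencies.
type counts struct {
	wordTag  map[string]map[string]int
	unigrams map[string]int
	bigrams  map[[2]string]int
	trigrams map[[3]string]int
}

func newCounts() *counts {
	return &counts{
		wordTag:  make(map[string]map[string]int),
		unigrams: make(map[string]int),
		bigrams:  make(map[[2]string]int),
		trigrams: make(map[[3]string]int),
	}
}

// addSentence pads the sentence with two start markers and one end marker
// (an assumption), then counts word/tag pairs and tag n-grams up to length 3.
func (c *counts) addSentence(sentence []taggedWord) {
	padded := make([]taggedWord, 0, len(sentence)+3)
	padded = append(padded,
		taggedWord{StartToken, StartToken},
		taggedWord{StartToken, StartToken})
	padded = append(padded, sentence...)
	padded = append(padded, taggedWord{EndToken, EndToken})

	for i, tw := range padded {
		if c.wordTag[tw.Word] == nil {
			c.wordTag[tw.Word] = make(map[string]int)
		}
		c.wordTag[tw.Word][tw.Tag]++
		c.unigrams[tw.Tag]++
		if i >= 1 {
			c.bigrams[[2]string{padded[i-1].Tag, tw.Tag}]++
		}
		if i >= 2 {
			c.trigrams[[3]string{padded[i-2].Tag, padded[i-1].Tag, tw.Tag}]++
		}
	}
}

func main() {
	c := newCounts()
	c.addSentence([]taggedWord{{"the", "DT"}, {"dog", "NN"}, {"barks", "VBZ"}})
	fmt.Println(c.unigrams["DT"], c.bigrams[[2]string{"DT", "NN"}],
		c.trigrams[[3]string{"DT", "NN", "VBZ"}])
}
```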
type Model ¶
type Model struct {
// contains filtered or unexported fields
}
Model stores a model of the training data.
func (Model) BigramFreqs ¶
func (m Model) BigramFreqs() map[Bigram]int
BigramFreqs returns the tag bigram frequencies in the training data.
func (Model) ClosedClassTags ¶
func (m Model) ClosedClassTags() ClosedClassSet
func (Model) TagNumberer ¶
func (m Model) TagNumberer() *StringNumberer
TagNumberer returns the tag <-> number bijection.
func (Model) TrigramFreqs ¶
func (m Model) TrigramFreqs() map[Trigram]int
TrigramFreqs returns the tag trigram frequencies in the training data.
func (Model) UnigramFreqs ¶
func (m Model) UnigramFreqs() map[Unigram]int
UnigramFreqs returns the tag unigram frequencies in the training data.
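As one possible use of these frequency maps, the sketch below computes an unsmoothed maximum-likelihood estimate of a tag trigram probability, count(t1, t2, t3) / count(t1, t2). The bigram and trigram key types here are hypothetical stand-ins, since the definitions of Bigram and Trigram are not shown in this section, and the documentation does not say how the tagger itself combines the counts; practical taggers usually smooth such estimates.

```go
package main

import "fmt"

// bigram and trigram are hypothetical key types for this sketch.
type bigram [2]string
type trigram [3]string

// trigramMLE returns count(t1,t2,t3)/count(t1,t2),
// or 0 when the context bigram was never observed.
func trigramMLE(bigrams map[bigram]int, trigrams map[trigram]int, t1, t2, t3 string) float64 {
	context := bigrams[bigram{t1, t2}]
	if context == 0 {
		return 0
	}
	return float64(trigrams[trigram{t1, t2, t3}]) / float64(context)
}

func main() {
	bigrams := map[bigram]int{{"DT", "NN"}: 4}
	trigrams := map[trigram]int{
		{"DT", "NN", "VBZ"}: 3,
		{"DT", "NN", "IN"}:  1,
	}
	fmt.Println(trigramMLE(bigrams, trigrams, "DT", "NN", "VBZ"))
}
```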
type StringNumberer ¶
type StringNumberer struct {
// contains filtered or unexported fields
}
A StringNumberer creates a bijection between (string-based) labels and numbers.
func NewStringStringNumberer ¶
func NewStringStringNumberer() *StringNumberer
NewStringStringNumberer creates a new StringNumberer that is empty (it has no mappings yet).
func (*StringNumberer) GobDecode ¶
func (l *StringNumberer) GobDecode(data []byte) error
GobDecode decodes a StringNumberer from a gob.
func (*StringNumberer) GobEncode ¶
func (l *StringNumberer) GobEncode() ([]byte, error)
GobEncode encodes a StringNumberer as a gob.
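Types whose fields are all unexported need GobEncode and GobDecode methods to be usable with encoding/gob. The sketch below shows that pattern on a hypothetical numberer type with a single unexported slice; it is not the package's implementation, and the roundTrip helper exists only for the example.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// numberer is a hypothetical stand-in for a type with unexported fields.
type numberer struct {
	labels []string
}

// GobEncode serializes the unexported field with a nested gob encoder.
func (n *numberer) GobEncode() ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(n.labels); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// GobDecode restores the unexported field from the bytes produced by GobEncode.
func (n *numberer) GobDecode(data []byte) error {
	return gob.NewDecoder(bytes.NewReader(data)).Decode(&n.labels)
}

// roundTrip encodes n with encoding/gob and decodes it into a fresh value.
func roundTrip(n *numberer) (*numberer, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(n); err != nil {
		return nil, err
	}
	out := &numberer{}
	if err := gob.NewDecoder(&buf).Decode(out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	out, err := roundTrip(&numberer{labels: []string{"DT", "NN"}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out.labels)
}
```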
func (*StringNumberer) Label ¶
func (l *StringNumberer) Label(number uint) string
Label returns the label (string) for a number.
func (*StringNumberer) Number ¶
func (l *StringNumberer) Number(label string) uint
Number returns the (unique) number for a label (string).
func (*StringNumberer) Read ¶
func (l *StringNumberer) Read(reader io.Reader) error
Read reads a label <-> number bijection from a Reader.
func (*StringNumberer) Size ¶
func (l *StringNumberer) Size() int
Size returns the number of labels known in the bijection.
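A label <-> number bijection of this kind is commonly backed by a map for the label-to-number direction and a slice for the reverse. The sketch below uses a hypothetical numberer type; it assumes that numbering starts at 0 and that Number assigns the next free number to an unseen label, neither of which is stated in the documentation above.

```go
package main

import "fmt"

// numberer is a hypothetical bijection between labels and numbers.
type numberer struct {
	numbers map[string]uint // label -> number
	labels  []string        // number -> label
}

func newNumberer() *numberer {
	return &numberer{numbers: make(map[string]uint)}
}

// Number returns the number for label, assigning the next free
// number if the label has not been seen before (assumption).
func (n *numberer) Number(label string) uint {
	if number, ok := n.numbers[label]; ok {
		return number
	}
	number := uint(len(n.labels))
	n.numbers[label] = number
	n.labels = append(n.labels, label)
	return number
}

// Label returns the label for a previously assigned number.
// It panics for a number that was never assigned.
func (n *numberer) Label(number uint) string {
	return n.labels[number]
}

// Size returns the number of labels in the bijection.
func (n *numberer) Size() int {
	return len(n.labels)
}

func main() {
	n := newNumberer()
	dt := n.Number("DT")
	nn := n.Number("NN")
	fmt.Println(dt, nn, n.Number("DT"), n.Label(nn), n.Size())
}
```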
func (*StringNumberer) WriteStringStringNumberer ¶
func (l *StringNumberer) WriteStringStringNumberer(writer io.Writer) error
WriteStringStringNumberer writes the bijection in a StringNumberer to a Writer.
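The on-disk format used by Read and WriteStringStringNumberer is not documented here. Purely as an illustration of writing a bijection to an io.Writer and reading it back from an io.Reader, the sketch below assumes a hypothetical format with one label per line, where the line index is the label's number. The functions writeLabels, readLabels and roundTripLabels are made up for this example.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// writeLabels writes one label per line; the line index doubles as the
// label's number. This line-based format is an assumption.
func writeLabels(w io.Writer, labels []string) error {
	bw := bufio.NewWriter(w)
	for _, label := range labels {
		if _, err := fmt.Fprintln(bw, label); err != nil {
			return err
		}
	}
	return bw.Flush()
}

// readLabels reads the labels back in order, one per line.
func readLabels(r io.Reader) ([]string, error) {
	var labels []string
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		labels = append(labels, scanner.Text())
	}
	return labels, scanner.Err()
}

// roundTripLabels writes labels to an in-memory buffer and reads them back.
func roundTripLabels(labels []string) ([]string, error) {
	var buf bytes.Buffer
	if err := writeLabels(&buf, labels); err != nil {
		return nil, err
	}
	return readLabels(&buf)
}

func main() {
	labels, err := roundTripLabels([]string{"DT", "NN", "VBZ"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(labels)
}
```

Because both functions take interfaces, the same code works with files, network connections or in-memory buffers.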