pos

package
v0.0.0-...-491e816 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Sep 18, 2020 License: MIT Imports: 11 Imported by: 5

Documentation

Index

Constants

View Source
const BUILD_DEBUG = "POS TAGGER: Release Build"
View Source
const (
	MAXCONTEXTTYPE contextType
)
View Source
const (
	MAXFEATURETYPE featureType
)

Variables

View Source
var TABCOUNT uint32 = 0

Functions

This section is empty.

Types

type ConsOpt

type ConsOpt func(*Tagger)

ConsOpt is a construction option for a Tagger

func WithCluster

func WithCluster(c map[string]lingo.Cluster) ConsOpt

WithCluster creates a *Tagger with a brown cluster corpus (a map of strings to the brown clusters). If no brown cluster corpus was passed in, the cluster won't be set, and the POSTagger will be less accurate

func WithCorpus

func WithCorpus(c *corpus.Corpus) ConsOpt

WithCorpus creates a *Tagger with an existing Corpus

func WithLemmatizer

func WithLemmatizer(l lingo.Lemmatizer) ConsOpt

WithLemmatizer creates a *Tagger with a lemmatizer. If no lemmatizer is passed into the POSTagger, then the lemmatization process will be skipped, and the POSTagger will be less accurate

func WithModel

func WithModel(m *Model) ConsOpt

WithModel creates a *Tagger with the specified model

func WithStemmer

func WithStemmer(s lingo.Stemmer) ConsOpt

WithStemmer creates a *Tagger with a stemmer. If no stemmer is passed in, then the stemming will be skipped, and the POSTagger will be less accurate

type Model

type Model struct {
	// contains filtered or unexported fields
}

Model is the model that the POS Tagger runs on.

func Load

func Load(filename string) (*Model, error)

func LoadReader

func LoadReader(rd io.ReadCloser) (*Model, error)

func (Model) GobDecode

func (p Model) GobDecode(buf []byte) error

func (Model) GobEncode

func (p Model) GobEncode() ([]byte, error)

func (*Model) Save

func (m *Model) Save(filename string) error

Save saves the model

func (*Model) SaveWriter

func (m *Model) SaveWriter(f io.WriteCloser) error

type Progress

type Progress struct {
	Iter, Correct, Count, ShortCutted int
}

Progress is just a tuple of training progress info

type Tagger

type Tagger struct {
	*Model

	Input  chan lingo.Lexeme
	Output chan lingo.AnnotatedSentence

	lingo.Lemmatizer
	lingo.Stemmer
	// contains filtered or unexported fields
}

Tagger is the object that tags an incoming channel of lexemes, and outputs a channel of AnnotatedSentence. Each of the Annotation are tagged with the POSTag

The core of the Tagger is the perceptron (unexported).

A large percentage of how this POS Tagger works is inspired by Mathhew Honnibal's work in SpaCy

func New

func New(opts ...ConsOpt) *Tagger

New creates a new *Tagger

func (*Tagger) Clone

func (p *Tagger) Clone() *Tagger

Clone() makes a copy of a POSTagger

func (*Tagger) Clusters

func (p *Tagger) Clusters() (map[string]lingo.Cluster, error)

Clusters implements the lingo.AnnotationFixer interface.

func (Tagger) GobDecode

func (p Tagger) GobDecode(buf []byte) error

func (Tagger) GobEncode

func (p Tagger) GobEncode() ([]byte, error)

func (*Tagger) Lemmatize

func (p *Tagger) Lemmatize(a string, pt lingo.POSTag) ([]string, error)

Lemmatize implements the lingo.Lemmatize interface. It however, defers the actual doing of the job to the Lemmatizer.

func (*Tagger) Load

func (p *Tagger) Load(filename string) error

func (*Tagger) LoadShortcuts

func (p *Tagger) LoadShortcuts(shortcuts map[string]lingo.POSTag)

LoadShortcuts allows for domain specific things to be mapped into the tagger.

func (*Tagger) Progress

func (p *Tagger) Progress() <-chan Progress

Progress creates and returns a channel of progress. By default the progress channel isn't created, and no progress info is sent

func (*Tagger) Run

func (p *Tagger) Run()

Run is used to tag a sentence. Lexemes arrive from the lexer in a channel (*Tagger.Input), and an annotated sentence is sent down the Output channel

func (*Tagger) ShowWeights

func (p *Tagger) ShowWeights()

func (*Tagger) Stem

func (p *Tagger) Stem(a string) (string, error)

Stem implements the lingo.Stemmer interface. It however, defers the actual stemming to the stemmer passed in.

func (*Tagger) Train

func (p *Tagger) Train(sentences []treebank.SentenceTag, iterations int)

Train trains a POSTagger, given a bunch of SentenceTags

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL