Documentation ¶
Types ¶
type CBOW ¶
type CBOW struct {
	*State
	// contains filtered or unexported fields
}
CBOW is the continuous bag-of-words Word2Vec model.
type Embedding ¶
type Embedding struct {
// contains filtered or unexported fields
}
Embedding represents a word embedding. It holds a Tensor and pre-slices it for additional performance gains.
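To illustrate what pre-slicing can look like, here is a minimal sketch that uses a plain `[]float64` in place of a Tensor. The `presliced` type and `newPresliced` function are hypothetical names invented for this example; the package's unexported fields may be organized differently.

```go
package main

import "fmt"

// presliced is a hypothetical stand-in for an embedding table: one flat
// backing array plus a row view per word, computed once up front.
type presliced struct {
	data []float64   // vocab*dim values stored contiguously
	rows [][]float64 // rows[i] aliases data[i*dim : (i+1)*dim]
}

// newPresliced allocates the backing array and slices it into rows once,
// so later lookups do not have to recompute offsets.
func newPresliced(vocab, dim int) *presliced {
	p := &presliced{
		data: make([]float64, vocab*dim),
		rows: make([][]float64, vocab),
	}
	for i := range p.rows {
		// The three-index slice caps each row so that an append
		// cannot spill into the neighbouring row.
		p.rows[i] = p.data[i*dim : (i+1)*dim : (i+1)*dim]
	}
	return p
}

func main() {
	p := newPresliced(3, 4)
	p.rows[1][2] = 0.5 // writes through to the shared backing array
	fmt.Println(p.data[1*4+2], len(p.rows[1])) // prints "0.5 4"
}
```

Because every row shares the backing array, an update through `rows[i]` is visible in `data` without copying.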
type HierarchicalSoftmax ¶
type HierarchicalSoftmax struct {
	MaxDepth int
	// contains filtered or unexported fields
}
HierarchicalSoftmax is one of the Word2Vec optimizers.
func NewHierarchicalSoftmax ¶
func NewHierarchicalSoftmax(maxDepth int) *HierarchicalSoftmax
NewHierarchicalSoftmax creates a *HierarchicalSoftmax. The Huffman tree is NOT built yet.
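For readers unfamiliar with the Huffman tree mentioned above, the following self-contained sketch builds one from word frequencies with `container/heap`. The names `node`, `nodeHeap`, `buildHuffman`, and `codeLengths` are hypothetical and not part of this package, and the sketch ignores MaxDepth, whose exact effect is not described here.

```go
package main

import (
	"container/heap"
	"fmt"
)

// node is a hypothetical Huffman tree node; only leaves carry a word ID.
type node struct {
	freq        int
	wordID      int
	left, right *node
}

// nodeHeap is a min-heap of nodes ordered by frequency.
type nodeHeap []*node

func (h nodeHeap) Len() int            { return len(h) }
func (h nodeHeap) Less(i, j int) bool  { return h[i].freq < h[j].freq }
func (h nodeHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *nodeHeap) Push(x interface{}) { *h = append(*h, x.(*node)) }
func (h *nodeHeap) Pop() interface{} {
	old := *h
	n := old[len(old)-1]
	*h = old[:len(old)-1]
	return n
}

// buildHuffman merges the two least frequent nodes until one root remains.
func buildHuffman(freqs []int) *node {
	if len(freqs) == 0 {
		return nil
	}
	h := make(nodeHeap, 0, len(freqs))
	for id, f := range freqs {
		h = append(h, &node{freq: f, wordID: id})
	}
	heap.Init(&h)
	for h.Len() > 1 {
		a := heap.Pop(&h).(*node)
		b := heap.Pop(&h).(*node)
		heap.Push(&h, &node{freq: a.freq + b.freq, left: a, right: b})
	}
	return heap.Pop(&h).(*node)
}

// codeLengths records the depth of every leaf, indexed by word ID.
func codeLengths(n *node, depth int, out []int) {
	if n == nil {
		return
	}
	if n.left == nil && n.right == nil {
		out[n.wordID] = depth
		return
	}
	codeLengths(n.left, depth+1, out)
	codeLengths(n.right, depth+1, out)
}

func main() {
	freqs := []int{5, 2, 1, 1} // index = word ID, value = count
	lengths := make([]int, len(freqs))
	codeLengths(buildHuffman(freqs), 0, lengths)
	fmt.Println(lengths) // prints "[1 2 3 3]"
}
```

Frequent words end up near the root, so their paths (and the number of binary decisions per update) are short.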
type NegativeSampling ¶
type NegativeSampling struct {
	NegativeSampleSize int
	// contains filtered or unexported fields
}
NegativeSampling is one of the Word2Vec optimizers.
func NewNegativeSampling ¶
func NewNegativeSampling(negativeSampleSize int) *NegativeSampling
NewNegativeSampling creates a *NegativeSampling. The negative vector is NOT built yet.
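As background on negative sampling, the sketch below draws negative word IDs with probability proportional to frequency raised to the power 0.75, the weighting commonly used for Word2Vec. It is an assumption that this package uses the same weighting; the `sampler` type and its `pick` method are hypothetical names for this example only.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
)

// sampler is a hypothetical helper that maps a uniform value to a word ID
// with probability proportional to freq^0.75.
type sampler struct {
	cdf []float64 // cumulative weights, normalised so the last entry is 1
}

func newSampler(freqs []int) *sampler {
	cdf := make([]float64, len(freqs))
	var total float64
	for i, f := range freqs {
		total += math.Pow(float64(f), 0.75)
		cdf[i] = total
	}
	for i := range cdf {
		cdf[i] /= total
	}
	return &sampler{cdf: cdf}
}

// pick returns the first word ID whose cumulative weight exceeds u,
// where u is expected to lie in [0, 1).
func (s *sampler) pick(u float64) int {
	return sort.Search(len(s.cdf), func(i int) bool { return s.cdf[i] > u })
}

func main() {
	s := newSampler([]int{100, 10, 1}) // index = word ID, value = count
	r := rand.New(rand.NewSource(1))

	const negativeSampleSize = 5
	target := 0
	negatives := make([]int, 0, negativeSampleSize)
	for len(negatives) < negativeSampleSize {
		// Skip draws that hit the positive (target) word.
		if id := s.pick(r.Float64()); id != target {
			negatives = append(negatives, id)
		}
	}
	fmt.Println(negatives)
}
```

The 0.75 exponent flattens the distribution, so rare words are drawn more often than their raw counts would suggest.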
type Optimizer ¶
type Optimizer interface {
	Init(c *corpus.Corpus, dimension int) error
	Update(targetID int, contextVector, poolVector tensor.Tensor, learningRate float64) error
}
Optimizer is the interface for initializing an optimizer after the corpus has been scanned once (Init) and for updating the word vectors (Update).
type SkipGram ¶
type SkipGram struct {
	*State
	// contains filtered or unexported fields
}
SkipGram is the skip-gram Word2Vec model.
type State ¶
State stores the configuration shared by all Word2Vec models.
func NewState ¶
func NewState(config *model.Config, opt Optimizer, subsampleThreshold, theta float64, batchSize int) *State
NewState creates a *State.
func (*State) Preprocess ¶
func (s *State) Preprocess(f io.ReadSeeker) (io.ReadCloser, error)
Preprocess scans the corpus once, before Train, to count word frequencies.
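The `io.ReadSeeker` parameter suggests a two-pass pattern: read the input once to count words, then rewind so it can be read again for training. The sketch below shows that pattern in isolation. `countWords` and `demoCorpus` are hypothetical helpers, and whitespace tokenization is an assumption, not a statement about how Preprocess itself parses the corpus.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// countWords is a hypothetical helper: it reads f to the end, counting
// whitespace-separated tokens, then seeks back to the start so the
// caller can read the same data a second time.
func countWords(f io.ReadSeeker) (map[string]int, error) {
	counts := make(map[string]int)
	sc := bufio.NewScanner(f)
	sc.Split(bufio.ScanWords)
	for sc.Scan() {
		counts[sc.Text()]++
	}
	if err := sc.Err(); err != nil {
		return nil, err
	}
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		return nil, err
	}
	return counts, nil
}

// demoCorpus returns a tiny in-memory corpus for the example.
func demoCorpus() io.ReadSeeker {
	return strings.NewReader("a b a c a b")
}

func main() {
	f := demoCorpus()
	counts, err := countWords(f)
	if err != nil {
		panic(err)
	}
	fmt.Println(counts["a"], counts["b"], counts["c"]) // prints "3 2 1"

	// The reader was rewound, so a second pass sees the full input.
	rest, err := io.ReadAll(f)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(rest)) // prints "11"
}
```

An `*os.File` also satisfies `io.ReadSeeker`, so the same pattern applies to a corpus stored on disk.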