Word Embedding in Golang
This is an implementation of word embedding (also referred to as word representation) models in Golang.
Details
Word embedding maps the meaning, structure, and concepts of words into a low-dimensional vector space. A representative example:
Vector("King") - Vector("Man") + Vector("Woman") = Vector("Queen")
As this example shows, word meanings can be combined and compared through arithmetic operations on their vectors.
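To make the arithmetic concrete, here is a minimal sketch in Go of the analogy computation with plain float64 slices and cosine similarity. It is purely illustrative; the toy vectors and helper names are not part of this library:

```go
package main

import (
	"fmt"
	"math"
)

// add and sub combine vectors element-wise.
func add(a, b []float64) []float64 {
	out := make([]float64, len(a))
	for i := range a {
		out[i] = a[i] + b[i]
	}
	return out
}

func sub(a, b []float64) []float64 {
	out := make([]float64, len(a))
	for i := range a {
		out[i] = a[i] - b[i]
	}
	return out
}

// cosine measures the similarity of two vectors by the angle between them.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	// toy 3-dimensional vectors, chosen only for illustration
	king := []float64{0.8, 0.3, 0.1}
	man := []float64{0.6, 0.1, 0.0}
	woman := []float64{0.2, 0.2, 0.1}
	queen := []float64{0.4, 0.4, 0.2}

	analogy := add(sub(king, man), woman) // King - Man + Woman
	fmt.Printf("cosine(analogy, queen) = %.3f\n", cosine(analogy, queen))
}
```

With these toy values, King - Man + Woman equals Queen exactly, so the program prints a cosine similarity of 1.000.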
Features
The following word embedding models are implemented (a minimal sketch of the skip-gram training step follows the list).
Models
- Word2Vec
  - Distributed Representations of Words and Phrases and their Compositionality [pdf]
- GloVe
  - GloVe: Global Vectors for Word Representation [pdf]
- SPPMI-SVD
  - Neural Word Embedding as Implicit Matrix Factorization [pdf]
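For intuition about what Word2Vec's Skip-Gram with negative sampling (used by the demo below) optimizes, here is a rough sketch of a single SGD step in Go. The function names and in-place update scheme are assumptions made for illustration, not this repository's actual implementation:

```go
package main

import (
	"fmt"
	"math"
)

func sigmoid(x float64) float64 { return 1 / (1 + math.Exp(-x)) }

// sgnsStep performs one stochastic update for an observed (center, context)
// pair plus k negative samples, following the SGNS objective
// log σ(w·c) + Σ log σ(-w·c_neg). Illustrative sketch only.
func sgnsStep(center, context []float64, negatives [][]float64, lr float64) {
	gradPair(center, context, 1, lr) // observed pair: label 1
	for _, neg := range negatives {
		gradPair(center, neg, 0, lr) // sampled noise pair: label 0
	}
}

// gradPair nudges both vectors toward (label 1) or away from (label 0)
// each other, proportionally to the prediction error.
func gradPair(w, c []float64, label, lr float64) {
	var dot float64
	for i := range w {
		dot += w[i] * c[i]
	}
	g := lr * (label - sigmoid(dot))
	for i := range w {
		// tuple assignment evaluates both right-hand sides with old values
		w[i], c[i] = w[i]+g*c[i], c[i]+g*w[i]
	}
}

func main() {
	center := []float64{0.1, -0.2, 0.05}
	context := []float64{0.3, 0.1, -0.1}
	negatives := [][]float64{{-0.2, 0.4, 0.0}}
	sgnsStep(center, context, negatives, 0.025)
	fmt.Println("center after update:", center)
}
```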
Installation
$ go get -u github.com/roscopecoltran/word-embedding
$ bin/word-embedding -h
Demo
This downloads the text8 corpus and trains on it using Skip-Gram with negative sampling:
$ sh demo.sh
Usage
The tool embeds words into vector space.

Usage:
  word-embedding [flags]
  word-embedding [command]

Available Commands:
  sim         Estimate the similarity between words
  word2vec    Embed words using word2vec
File I/O
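The I/O details are not reproduced here. A common output convention for word embeddings, and an assumption for this sketch, is a text file with one word per line followed by its vector components (e.g. "king 0.8 0.3 0.1"). A minimal Go loader for that format might look like this (the vectors.txt path is hypothetical):

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
	"strconv"
	"strings"
)

// loadVectors reads a text file in which each line holds a word followed
// by its vector components. This line format is a common word-embedding
// convention and an assumption here, not a documented guarantee of this
// repository.
func loadVectors(path string) (map[string][]float64, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	vectors := make(map[string][]float64)
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) < 2 {
			continue // skip blank or malformed lines
		}
		vec := make([]float64, len(fields)-1)
		for i, s := range fields[1:] {
			v, err := strconv.ParseFloat(s, 64)
			if err != nil {
				return nil, err
			}
			vec[i] = v
		}
		vectors[fields[0]] = vec
	}
	return vectors, scanner.Err()
}

func main() {
	vectors, err := loadVectors("vectors.txt") // hypothetical output path
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("loaded %d word vectors\n", len(vectors))
}
```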
References
- For a deeper understanding, see:
  - Improving Distributional Similarity with Lessons Learned from Word Embeddings [pdf]
  - Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors [pdf]