Documentation ¶
Index ¶
- func BOW(doc Document) []int
- func CalculateBM25Scores(query string, documents []string, avgdl float64, k1 float64, b float64) []float64
- func Cosine(a []float64, b []float64) float64
- func MakeCorpus(a []string) (map[string]int, []string)
- func TF(doc Document) []float64
- type Doc
- type DocScore
- type DocScores
- type Document
- type ScoreFn
- type TFIDF
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func BOW ¶
func BOW(doc Document) []int
BOW turns a document into a bag of words: duplicate words are removed, and the resulting list of unique word IDs is returned.
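The deduplication described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the definition of Document is not shown on this page, so the sketch uses a hypothetical `wordIDs` slice type in its place, and the helper name `bagOfWords` is likewise made up. Sorting the result is an assumption added only to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// wordIDs is a hypothetical stand-in for the package's Document type,
// whose definition is not shown on this page.
type wordIDs []int

// bagOfWords returns each word ID in doc exactly once.
// The result is sorted ascending (an assumption of this sketch).
func bagOfWords(doc wordIDs) []int {
	seen := make(map[int]struct{}, len(doc))
	unique := make([]int, 0, len(doc))
	for _, id := range doc {
		if _, ok := seen[id]; ok {
			continue // already recorded this word ID
		}
		seen[id] = struct{}{}
		unique = append(unique, id)
	}
	sort.Ints(unique)
	return unique
}

func main() {
	fmt.Println(bagOfWords(wordIDs{3, 1, 3, 2, 1})) // prints [1 2 3]
}
```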
Types ¶
type DocScores ¶
type DocScores []DocScore
DocScores is a list of DocScore values.
type TFIDF ¶
type TFIDF struct {
	// Term Frequency
	TF map[int]float64

	// Inverse Document Frequency
	IDF map[int]float64

	// Docs is the count of documents
	Docs int

	// Len is the total length of docs
	Len int

	sync.Mutex
}
TFIDF is a structure holding the relevant state information about TF/IDF.
func (*TFIDF) CalculateIDF ¶
func (tf *TFIDF) CalculateIDF()
CalculateIDF calculates the inverse document frequency.