db

package
v0.0.0-...-0e4228c Latest
Warning

This package is not in the latest version of its module.

Published: Sep 17, 2023 License: Apache-2.0 Imports: 2 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func CosineDistance

func CosineDistance(a, b []float64) float64

CosineDistance calculates the cosine distance between two vectors.
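The documentation does not spell out the formula, so the following is only a minimal sketch of the usual definition, 1 minus the cosine similarity. The name `cosineDistance` is a local stand-in, and the handling of mismatched lengths and zero vectors (returning 1) is an assumption, not the package's documented behavior.

```go
package main

import (
	"fmt"
	"math"
)

// cosineDistance is an illustrative stand-in, not the package's code.
// It returns 1 - (a·b)/(|a||b|).
// Assumption: mismatched lengths, empty input, or a zero vector yield 1.
func cosineDistance(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 1
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 1
	}
	return 1 - dot/(math.Sqrt(normA)*math.Sqrt(normB))
}

func main() {
	// Orthogonal vectors are maximally dissimilar for non-negative inputs.
	fmt.Println(cosineDistance([]float64{1, 0}, []float64{0, 1}))
	// Vectors pointing in the same direction have a distance close to 0.
	fmt.Println(cosineDistance([]float64{1, 2}, []float64{2, 4}))
}
```

Because the measure depends only on direction, two texts with the same word proportions but different lengths would compare as near-identical under this definition.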

func FowlerNollVo32

func FowlerNollVo32(s string) uint32

FowlerNollVo32 hashes a string to a 32-bit value using the FNV-1a algorithm. See https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function
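For reference, 32-bit FNV-1a can be sketched in a few lines: start from the offset basis, then for each byte XOR it into the hash and multiply by the FNV prime. The helper names `fnv1a32` and `stdlibFNV1a32` below are hypothetical; the second one uses the standard library's `hash/fnv` so the two results can be compared. Whether the package hashes bytes exactly this way is an assumption.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const (
	offsetBasis32 uint32 = 2166136261 // FNV 32-bit offset basis
	prime32       uint32 = 16777619   // FNV 32-bit prime
)

// fnv1a32 is a hand-written 32-bit FNV-1a over the bytes of s.
// Illustrative only; not taken from the package source.
func fnv1a32(s string) uint32 {
	h := offsetBasis32
	for i := 0; i < len(s); i++ {
		h ^= uint32(s[i]) // XOR first (the "1a" variant)
		h *= prime32      // then multiply; overflow wraps modulo 2^32
	}
	return h
}

// stdlibFNV1a32 computes the same hash with the standard library.
func stdlibFNV1a32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s)) // Write on a hash.Hash never returns an error
	return h.Sum32()
}

func main() {
	for _, w := range []string{"", "a", "hello"} {
		fmt.Printf("%q manual=%#x stdlib=%#x\n", w, fnv1a32(w), stdlibFNV1a32(w))
	}
}
```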

func MakeWordFreq

func MakeWordFreq(text string) map[string]int

MakeWordFreq takes a text and returns a map of words to their frequencies.
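A word-frequency map of this kind might be built as follows. This is a sketch under stated assumptions: `makeWordFreq` is a hypothetical local function, words are split on whitespace with `strings.Fields` as a stand-in for the package's splitter, and no case folding is applied.

```go
package main

import (
	"fmt"
	"strings"
)

// makeWordFreq counts how often each whitespace-separated word occurs.
// Assumptions: whitespace splitting, case-sensitive comparison.
func makeWordFreq(text string) map[string]int {
	freq := make(map[string]int)
	for _, w := range strings.Fields(text) {
		freq[w]++ // a missing key reads as 0, so this also inserts
	}
	return freq
}

func main() {
	freq := makeWordFreq("the cat and the hat")
	fmt.Println(freq["the"], freq["cat"], len(freq))
}
```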

func SplitWords

func SplitWords(text string) (words []string)

SplitWords splits a text into words. The delimiters used to split the text are defined inside the function.
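Since the delimiter set is hard-coded in the function and not listed here, the sketch below uses an assumed set (whitespace plus common punctuation) purely for illustration. `splitWords` and `delimiters` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// delimiters is an ASSUMED set; the package's actual delimiters are not documented.
const delimiters = " \t\n\r.,;:!?\"()"

// splitWords breaks text at any delimiter rune and drops empty pieces.
func splitWords(text string) (words []string) {
	return strings.FieldsFunc(text, func(r rune) bool {
		return strings.ContainsRune(delimiters, r)
	})
}

func main() {
	fmt.Println(splitWords("Hello, world! Hello again."))
}
```

`strings.FieldsFunc` treats runs of consecutive delimiters as a single separator, so ", " does not produce an empty word.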

func Text2Vec

func Text2Vec(text string, dimension uint64) []float64

Text2Vec converts a text to a vector of floats. The vector is initialized with zeros, and each word's count is then added at a pseudo-random index obtained by hashing the word. The resulting vector is returned. The dimension should be larger than the number of unique words in the text, so that fewer words collide on the same index.
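The approach described above resembles feature hashing (the "hashing trick"). A minimal sketch, assuming whitespace splitting, the standard library's 32-bit FNV-1a, and a plain modulo to pick the index; `text2Vec` is a hypothetical local function, not the package's implementation:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// text2Vec maps text to a fixed-size vector of hashed word counts.
// Assumptions: whitespace splitting, FNV-1a hashing, index = hash % dimension.
func text2Vec(text string, dimension uint64) []float64 {
	vec := make([]float64, dimension) // starts as all zeros
	if dimension == 0 {
		return vec // avoid a modulo by zero
	}
	for _, w := range strings.Fields(text) {
		h := fnv.New32a()
		h.Write([]byte(w))
		// Adding 1 per occurrence equals adding the word's total count.
		vec[uint64(h.Sum32())%dimension]++
	}
	return vec
}

func main() {
	fmt.Println(text2Vec("the cat and the hat", 16))
}
```

A larger dimension lowers the chance that two different words share an index, but it cannot rule collisions out; colliding words simply have their counts summed.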

Types

type TextVec

type TextVec struct {
	RawText  string
	WordFreq map[string]int
	Vector   []float64
}

func (*TextVec) CalculateVector

func (t *TextVec) CalculateVector(dimension uint64)

CalculateVector takes the text and calculates its vector representation. The text is split into words using the SplitWords function, and the word frequencies are then used to calculate the vector representation of the text.
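One way the method could fill in the struct is sketched below. The type `textVec` mirrors the documented fields but is a local stand-in, and it is an assumption that the method stores both the word frequencies and the vector on the receiver. Splitting and hashing use whitespace and `hash/fnv`, which may differ from the package.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// textVec mirrors the documented TextVec fields (illustrative copy).
type textVec struct {
	RawText  string
	WordFreq map[string]int
	Vector   []float64
}

// calculateVector fills WordFreq from RawText, then hashes each word's
// count into Vector. Assumed behavior, not the package's source.
func (t *textVec) calculateVector(dimension uint64) {
	t.WordFreq = make(map[string]int)
	for _, w := range strings.Fields(t.RawText) {
		t.WordFreq[w]++
	}
	t.Vector = make([]float64, dimension)
	if dimension == 0 {
		return // nothing to index into
	}
	for w, n := range t.WordFreq {
		h := fnv.New32a()
		h.Write([]byte(w))
		t.Vector[uint64(h.Sum32())%dimension] += float64(n)
	}
}

func main() {
	tv := &textVec{RawText: "a b a"}
	tv.calculateVector(8)
	fmt.Println(tv.WordFreq, tv.Vector)
}
```

The pointer receiver matters here: with a value receiver the computed fields would be written to a copy and lost.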
