Documentation ¶
Index ¶
- func CosineSimilarity(str1, str2 string, splitLength int) float32
- func DamerauLevenshteinDistance(str1, str2 string) int
- func FuzzySearch(str string, strList []string, algo Algorithm) (string, error)
- func FuzzySearchSet(str string, strList []string, quantity int, algo Algorithm) ([]string, error)
- func FuzzySearchSetThreshold(str string, strList []string, quantity int, minSim float32, algo Algorithm) ([]string, error)
- func FuzzySearchThreshold(str string, strList []string, minSim float32, algo Algorithm) (string, error)
- func HammingDistance(str1, str2 string) (int, error)
- func JaccardSimilarity(str1, str2 string, splitLength int) float32
- func JaroSimilarity(str1, str2 string) float32
- func JaroWinklerSimilarity(str1, str2 string) float32
- func LCS(str1, str2 string) int
- func LCSBacktrack(str1, str2 string) (string, error)
- func LCSBacktrackAll(str1, str2 string) ([]string, error)
- func LCSDiff(str1, str2 string) ([]string, error)
- func LCSEditDistance(str1, str2 string) int
- func LevenshteinDistance(str1, str2 string) int
- func OSADamerauLevenshteinDistance(str1, str2 string) int
- func QgramDistance(str1, str2 string, splitLength int) int
- func QgramDistanceCustomNgram(splittedStr1, splittedStr2 map[string]int) int
- func QgramSimilarity(str1, str2 string, splitLength int) float32
- func Shingle(s string, k int) map[string]int
- func ShingleSlice(s string, k int) []string
- func SorensenDiceCoefficient(str1, str2 string, splitLength int) float32
- func StringsSimilarity(str1 string, str2 string, algo Algorithm) (float32, error)
- type Algorithm
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func CosineSimilarity ¶ added in v1.3.0
CosineSimilarity uses the cosine algorithm to return a similarity index between two strings treated as term vectors. It takes two strings and a split length that defines the length of each k-gram shingle (if zero, the strings are split on whitespace) and returns the similarity index.
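As an illustration of the idea described above, here is a minimal standalone sketch of cosine similarity over term counts. It is not the package's implementation: the helper names `termCounts` and `cosine` are hypothetical, and returning 0 when either string yields no terms is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// termCounts is a hypothetical helper, not part of the package API.
// With k <= 0 it counts whitespace-separated words; otherwise it counts
// every substring of k runes.
func termCounts(s string, k int) map[string]int {
	counts := make(map[string]int)
	if k <= 0 {
		for _, w := range strings.Fields(s) {
			counts[w]++
		}
		return counts
	}
	r := []rune(s)
	for i := 0; i+k <= len(r); i++ {
		counts[string(r[i:i+k])]++
	}
	return counts
}

// cosine returns dot(a, b) / (|a| * |b|) over the two term-count vectors.
// Returning 0 for an empty vector is an assumption of this sketch.
func cosine(str1, str2 string, k int) float32 {
	a, b := termCounts(str1, k), termCounts(str2, k)
	var dot, normA, normB float64
	for term, ca := range a {
		dot += float64(ca * b[term]) // missing terms count as 0
		normA += float64(ca * ca)
	}
	for _, cb := range b {
		normB += float64(cb * cb)
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return float32(dot / math.Sqrt(normA*normB))
}

func main() {
	fmt.Println(cosine("night", "night", 2))                // identical strings give 1
	fmt.Println(cosine("the quick fox", "the lazy fox", 0)) // two of three words shared, about 0.667
}
```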
func DamerauLevenshteinDistance ¶
DamerauLevenshteinDistance calculates the distance between two strings. This algorithm computes the true Damerau–Levenshtein distance with adjacent transpositions, allowing insertions, deletions, substitutions and transpositions to change one string into the other. Compatible with non-ASCII characters.
func FuzzySearch ¶ added in v1.2.0
FuzzySearch performs an approximate search on a list of strings and returns the one closest to the input string.
func FuzzySearchSet ¶ added in v1.2.0
FuzzySearchSet performs an approximate search on a list of strings and returns a set of the closest matches to the input string, sorted by similarity to it. The quantity parameter sets the number of strings to return (for example, 3 for the word suggestions of the Google Keyboard).
func FuzzySearchSetThreshold ¶ added in v1.2.0
func FuzzySearchSetThreshold(str string, strList []string, quantity int, minSim float32, algo Algorithm) ([]string, error)
FuzzySearchSetThreshold performs an approximate search on a list of strings and returns a set of the closest matches to the input string, sorted by similarity to it. The quantity parameter sets the number of strings to return (for example, 3 for the word suggestions of the Google Keyboard). The minSim parameter sets the minimum similarity a candidate must have with the input string.
func FuzzySearchThreshold ¶ added in v1.2.0
func FuzzySearchThreshold(str string, strList []string, minSim float32, algo Algorithm) (string, error)
FuzzySearchThreshold performs an approximate search on a list of strings and returns the one closest to the input string. The minSim parameter sets the minimum similarity a candidate must have with the input string.
func HammingDistance ¶
HammingDistance calculates the edit distance between two given strings using only substitutions. It returns the edit distance as an integer, and an error.
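A minimal sketch of a substitution-only distance is shown below. It is not the package's implementation; the name `hamming` is hypothetical, and the sketch assumes that the error is reported when the two strings differ in length, since the Hamming distance is only defined for equal-length inputs.

```go
package main

import (
	"errors"
	"fmt"
)

// hamming counts the positions at which two equal-length strings differ.
// Strings are compared rune by rune so non-ASCII text is handled.
func hamming(str1, str2 string) (int, error) {
	r1, r2 := []rune(str1), []rune(str2)
	if len(r1) != len(r2) {
		// Assumption of this sketch: unequal lengths are an error.
		return 0, errors.New("hamming: strings must have the same length")
	}
	dist := 0
	for i := range r1 {
		if r1[i] != r2[i] {
			dist++
		}
	}
	return dist, nil
}

func main() {
	fmt.Println(hamming("karolin", "kathrin")) // 3 <nil>
	_, err := hamming("abc", "ab")
	fmt.Println(err != nil) // true
}
```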
func JaccardSimilarity ¶ added in v1.4.0
JaccardSimilarity computes the Jaccard similarity coefficient between two strings. It takes two strings and a split length that defines the length of each k-gram shingle (if zero, the strings are split on whitespace) and returns the similarity index.
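For reference, the Jaccard coefficient is the size of the intersection of two sets divided by the size of their union. The sketch below applies it to k-gram sets. It is an illustration only: `gramSet` and `jaccard` are hypothetical names, and returning 0 when both sets are empty is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// gramSet is a hypothetical helper: the set of distinct k-rune substrings
// of s, or the set of whitespace-separated words when k <= 0.
func gramSet(s string, k int) map[string]struct{} {
	set := make(map[string]struct{})
	if k <= 0 {
		for _, w := range strings.Fields(s) {
			set[w] = struct{}{}
		}
		return set
	}
	r := []rune(s)
	for i := 0; i+k <= len(r); i++ {
		set[string(r[i:i+k])] = struct{}{}
	}
	return set
}

// jaccard returns |A ∩ B| / |A ∪ B| for the two k-gram sets.
func jaccard(str1, str2 string, k int) float32 {
	a, b := gramSet(str1, k), gramSet(str2, k)
	inter := 0
	for g := range a {
		if _, ok := b[g]; ok {
			inter++
		}
	}
	union := len(a) + len(b) - inter
	if union == 0 {
		return 0 // assumption: two empty sets score 0
	}
	return float32(inter) / float32(union)
}

func main() {
	// "night" -> {ni, ig, gh, ht}, "nacht" -> {na, ac, ch, ht}: 1 shared of 7.
	fmt.Println(jaccard("night", "nacht", 2))
}
```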
func JaroSimilarity ¶ added in v1.1.0
JaroSimilarity returns a similarity index (between 0 and 1). It uses the Jaro distance algorithm and allows only transposition operations.
func JaroWinklerSimilarity ¶ added in v1.1.0
JaroWinklerSimilarity returns a similarity index (between 0 and 1). It computes the Jaro similarity, then looks for a common prefix (length <= 4).
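The two steps above (Jaro similarity, then a bonus for a shared prefix of at most 4 characters) can be sketched as follows. This is a textbook formulation, not the package's code: the names `jaro` and `jaroWinkler` are hypothetical, the prefix scaling factor of 0.1 is the conventional value and an assumption here, and so is scoring two empty strings as 1.

```go
package main

import "fmt"

// jaro computes the classic Jaro similarity over runes.
func jaro(str1, str2 string) float64 {
	a, b := []rune(str1), []rune(str2)
	if len(a) == 0 && len(b) == 0 {
		return 1 // assumption of this sketch
	}
	if len(a) == 0 || len(b) == 0 {
		return 0
	}
	longest := len(a)
	if len(b) > longest {
		longest = len(b)
	}
	window := longest/2 - 1 // how far apart two matching runes may be
	if window < 0 {
		window = 0
	}
	matchedA, matchedB := make([]bool, len(a)), make([]bool, len(b))
	matches := 0
	for i := range a {
		lo, hi := i-window, i+window+1
		if lo < 0 {
			lo = 0
		}
		if hi > len(b) {
			hi = len(b)
		}
		for j := lo; j < hi; j++ {
			if !matchedB[j] && a[i] == b[j] {
				matchedA[i], matchedB[j] = true, true
				matches++
				break
			}
		}
	}
	if matches == 0 {
		return 0
	}
	// Count matched runes that appear in a different order in the two strings.
	outOfOrder, j := 0, 0
	for i := range a {
		if !matchedA[i] {
			continue
		}
		for !matchedB[j] {
			j++
		}
		if a[i] != b[j] {
			outOfOrder++
		}
		j++
	}
	m := float64(matches)
	t := float64(outOfOrder) / 2 // number of transpositions
	return (m/float64(len(a)) + m/float64(len(b)) + (m-t)/m) / 3
}

// jaroWinkler boosts the Jaro score for a common prefix of up to 4 runes.
func jaroWinkler(str1, str2 string) float64 {
	sim := jaro(str1, str2)
	a, b := []rune(str1), []rune(str2)
	prefix := 0
	for prefix < 4 && prefix < len(a) && prefix < len(b) && a[prefix] == b[prefix] {
		prefix++
	}
	return sim + float64(prefix)*0.1*(1-sim) // 0.1 is the conventional scaling factor
}

func main() {
	fmt.Printf("%.4f\n", jaro("MARTHA", "MARHTA"))        // 0.9444
	fmt.Printf("%.4f\n", jaroWinkler("MARTHA", "MARHTA")) // 0.9611
}
```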
func LCSBacktrack ¶ added in v1.1.0
LCSBacktrack returns all choices taken during the LCS process.
func LCSBacktrackAll ¶ added in v1.1.0
LCSBacktrackAll returns a slice containing all common substrings between str1 and str2.
func LCSDiff ¶ added in v1.1.0
LCSDiff backtracks through the LCS matrix and returns the diff between the two sequences.
func LCSEditDistance ¶
LCSEditDistance determines the edit distance between two strings using the LCS function (only insert and delete operations are allowed).
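When only insertions and deletions are allowed, the edit distance follows directly from the length of the longest common subsequence: every rune outside the LCS is deleted from one string or inserted from the other, so the distance is len(str1) + len(str2) - 2*LCS. The sketch below illustrates this relationship; `lcsLen` and `lcsEditDistance` are hypothetical names and this is not the package's implementation.

```go
package main

import "fmt"

// lcsLen returns the length of the longest common subsequence of two
// strings, using a two-row dynamic programming table over runes.
func lcsLen(str1, str2 string) int {
	a, b := []rune(str1), []rune(str2)
	prev := make([]int, len(b)+1)
	cur := make([]int, len(b)+1)
	for i := 1; i <= len(a); i++ {
		for j := 1; j <= len(b); j++ {
			switch {
			case a[i-1] == b[j-1]:
				cur[j] = prev[j-1] + 1
			case prev[j] >= cur[j-1]:
				cur[j] = prev[j]
			default:
				cur[j] = cur[j-1]
			}
		}
		prev, cur = cur, prev // the finished row becomes the previous row
	}
	return prev[len(b)]
}

// lcsEditDistance counts the insertions and deletions needed to turn
// str1 into str2: every rune outside the LCS costs one operation.
func lcsEditDistance(str1, str2 string) int {
	return len([]rune(str1)) + len([]rune(str2)) - 2*lcsLen(str1, str2)
}

func main() {
	fmt.Println(lcsLen("ABCBDAB", "BDCABA"))          // 4
	fmt.Println(lcsEditDistance("ABCBDAB", "BDCABA")) // 5
}
```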
func LevenshteinDistance ¶
LevenshteinDistance calculates the distance between two strings. This algorithm allows insertions, deletions and substitutions to change one string into the other. Compatible with non-ASCII characters.
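A compact way to picture the algorithm is the standard two-row dynamic programming formulation shown below. It is a sketch for illustration, not the package's code, and the names `levenshtein` and `min3` are hypothetical. Converting to runes first is what makes non-ASCII input count one edit per character.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the minimum number of insertions, deletions and
// substitutions needed to turn str1 into str2.
func levenshtein(str1, str2 string) int {
	a, b := []rune(str1), []rune(str2)
	prev := make([]int, len(b)+1)
	cur := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j // distance from the empty prefix of str1
	}
	for i := 1; i <= len(a); i++ {
		cur[0] = i
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			cur[j] = min3(
				prev[j]+1,      // deletion
				cur[j-1]+1,     // insertion
				prev[j-1]+cost, // substitution or match
			)
		}
		prev, cur = cur, prev
	}
	return prev[len(b)]
}

func main() {
	fmt.Println(levenshtein("kitten", "sitting")) // 3
	fmt.Println(levenshtein("résumé", "resume"))  // 2
}
```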
func OSADamerauLevenshteinDistance ¶
OSADamerauLevenshteinDistance calculates the distance between two strings. This is the optimal string alignment (OSA) distance variant, which uses an extension of the Wagner-Fischer dynamic programming algorithm. It does not allow multiple transformations on the same substring. It allows insertions, deletions, substitutions and transpositions to change one string into the other. Compatible with non-ASCII characters.
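The restriction on editing the same substring twice is easiest to see on a small case. The sketch below is a standard optimal string alignment formulation, not the package's code, and the name `osa` is hypothetical. For "ca" and "abc" it yields 3, whereas the true Damerau–Levenshtein distance is 2 (transpose to "ac", then insert "b" between the two letters), because OSA cannot touch a transposed pair again.

```go
package main

import "fmt"

// osa returns the optimal string alignment distance: Levenshtein plus
// transposition of two adjacent runes, with no substring edited twice.
func osa(str1, str2 string) int {
	a, b := []rune(str1), []rune(str2)
	d := make([][]int, len(a)+1)
	for i := range d {
		d[i] = make([]int, len(b)+1)
		d[i][0] = i
	}
	for j := range d[0] {
		d[0][j] = j
	}
	for i := 1; i <= len(a); i++ {
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			best := d[i-1][j] + 1 // deletion
			if v := d[i][j-1] + 1; v < best { // insertion
				best = v
			}
			if v := d[i-1][j-1] + cost; v < best { // substitution or match
				best = v
			}
			// Transposition of two adjacent runes.
			if i > 1 && j > 1 && a[i-1] == b[j-2] && a[i-2] == b[j-1] {
				if v := d[i-2][j-2] + 1; v < best {
					best = v
				}
			}
			d[i][j] = best
		}
	}
	return d[len(a)][len(b)]
}

func main() {
	fmt.Println(osa("ab", "ba"))  // 1
	fmt.Println(osa("ca", "abc")) // 3
}
```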
func QgramDistance ¶ added in v1.6.0
QgramDistance computes the q-gram distance between two strings. It takes two strings and a split length that defines the k-gram shingle length.
func QgramDistanceCustomNgram ¶ added in v1.6.0
QgramDistanceCustomNgram computes the q-gram distance between two custom sets of n-grams. It takes two n-gram maps as parameters.
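In its usual definition, the q-gram distance is the sum, over every q-gram, of the absolute difference between its occurrence counts in the two inputs. The sketch below applies that definition to two count maps. The name `qgramDistance` is hypothetical, the sketch assumes non-negative counts, and the package may compute its distance differently.

```go
package main

import "fmt"

// qgramDistance sums |count1 - count2| over every q-gram present in
// either profile. A q-gram missing from a map counts as 0.
func qgramDistance(profile1, profile2 map[string]int) int {
	dist := 0
	for gram, c1 := range profile1 {
		diff := c1 - profile2[gram]
		if diff < 0 {
			diff = -diff
		}
		dist += diff
	}
	// q-grams that only occur in the second profile.
	for gram, c2 := range profile2 {
		if _, seen := profile1[gram]; !seen {
			dist += c2
		}
	}
	return dist
}

func main() {
	// Bigram profiles of "abcd" and "abce".
	p1 := map[string]int{"ab": 1, "bc": 1, "cd": 1}
	p2 := map[string]int{"ab": 1, "bc": 1, "ce": 1}
	fmt.Println(qgramDistance(p1, p2)) // 2
}
```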
func QgramSimilarity ¶ added in v1.6.0
QgramSimilarity computes a similarity index (between 0 and 1) between two strings from a q-gram distance. It takes two strings and a split length that defines the k-gram shingle length.
func Shingle ¶ added in v1.5.0
Shingle finds the k-grams of a string for a given k. It takes a string and an integer as parameters and returns a map. Returns an empty map if the string is empty or if k is 0.
func ShingleSlice ¶ added in v1.5.0
ShingleSlice finds the k-grams of a string for a given k. It takes a string and an integer as parameters and returns a slice. Returns an empty slice if the string is empty or if k is 0.
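To make the two return shapes concrete, here is a small standalone sketch of k-gram extraction. The names `shingleCounts` and `shingleList` are hypothetical, and the sketch makes two assumptions the text above does not confirm: that the map values are occurrence counts, and that a k larger than the string length yields an empty result.

```go
package main

import "fmt"

// shingleList returns every substring of k runes, in order of appearance.
// It returns an empty slice when s is empty or k <= 0.
func shingleList(s string, k int) []string {
	out := []string{}
	r := []rune(s)
	if k <= 0 {
		return out
	}
	for i := 0; i+k <= len(r); i++ {
		out = append(out, string(r[i:i+k]))
	}
	return out
}

// shingleCounts maps each k-gram to the number of times it occurs.
func shingleCounts(s string, k int) map[string]int {
	counts := make(map[string]int)
	for _, gram := range shingleList(s, k) {
		counts[gram]++
	}
	return counts
}

func main() {
	fmt.Println(shingleList("banana", 2))   // [ba an na an na]
	fmt.Println(shingleCounts("banana", 2)) // map[an:2 ba:1 na:2]
}
```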
func SorensenDiceCoefficient ¶ added in v1.6.0
SorensenDiceCoefficient computes the Sorensen-Dice coefficient between two strings. It takes two strings and a split length that defines the k-gram shingle length.
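The Sorensen-Dice coefficient is twice the number of shared elements divided by the total number of elements in both sets. The sketch below applies that formula to sets of distinct k-grams. The names `kgrams` and `dice` are hypothetical, the sketch assumes k > 0, and whether the package works on sets or on counted multisets is not stated above.

```go
package main

import "fmt"

// kgrams is a hypothetical helper returning the distinct substrings of
// k runes in s. It returns an empty set when k <= 0.
func kgrams(s string, k int) map[string]bool {
	set := make(map[string]bool)
	r := []rune(s)
	if k <= 0 {
		return set
	}
	for i := 0; i+k <= len(r); i++ {
		set[string(r[i:i+k])] = true
	}
	return set
}

// dice returns 2*|A ∩ B| / (|A| + |B|) for the two k-gram sets.
func dice(str1, str2 string, k int) float32 {
	a, b := kgrams(str1, k), kgrams(str2, k)
	if len(a)+len(b) == 0 {
		return 0 // assumption: nothing to compare scores 0
	}
	shared := 0
	for gram := range a {
		if b[gram] {
			shared++
		}
	}
	return float32(2*shared) / float32(len(a)+len(b))
}

func main() {
	// "night" -> {ni, ig, gh, ht}, "nacht" -> {na, ac, ch, ht}: 2*1/8.
	fmt.Println(dice("night", "nacht", 2)) // 0.25
}
```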
func StringsSimilarity ¶ added in v1.1.0
StringsSimilarity returns a similarity index [0..1] between two strings, based on the edit distance algorithm given as a parameter. Use the defined Algorithm type. Through this function, the Cosine and Jaccard algorithms use the Shingle split method with a length of 2.
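One common way to turn an edit distance into an index between 0 and 1 is to divide it by the length of the longer string and subtract the result from 1. The sketch below shows that normalization only. The name `similarityFromDistance` is hypothetical, the formula is an assumption and not confirmed to be what this function does, and it only stays within [0, 1] for distances that cannot exceed the longer length (Levenshtein or Hamming, for example, but not the insert/delete-only LCS distance).

```go
package main

import "fmt"

// similarityFromDistance maps an edit distance to a similarity index,
// where 1 means identical and 0 means nothing in common.
// len1 and len2 are the lengths of the two strings in runes.
func similarityFromDistance(dist, len1, len2 int) float32 {
	longest := len1
	if len2 > longest {
		longest = len2
	}
	if longest == 0 {
		return 1 // two empty strings are treated as identical
	}
	return 1 - float32(dist)/float32(longest)
}

func main() {
	// "kitten" (6 runes) and "sitting" (7 runes) are at Levenshtein distance 3.
	fmt.Printf("%.4f\n", similarityFromDistance(3, 6, 7)) // 0.5714
}
```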