Documentation ¶
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
Types ¶
type SentenceSimilarity ¶
SentenceSimilarity is a wrapper object that exposes a sentence database for counting similar words with specific edit distances. Duplicates is an exported int holding the number of edit distance 0 words in the input. HashTable is the mapping of words to the number of times they appear.
func New ¶
func New(size int, hash func(gram *ngram.Ngram) [32]byte) *SentenceSimilarity
New creates a new SentenceSimilarity object with a hashtable of size `size` that uses `hash` as its hashing function.
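As a rough illustration of the kind of function that could be passed as `hash`, the sketch below hashes a sequence of words with SHA-256, whose digest is exactly 32 bytes. The `Ngram` type and its `Words` field are hypothetical stand-ins, since the fields of the real `ngram.Ngram` are not shown here, and `hashNgram` is an illustrative name rather than part of this package.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"strings"
)

// Ngram is a hypothetical stand-in for ngram.Ngram; the real type's
// fields are not documented on this page.
type Ngram struct {
	Words []string
}

// hashNgram joins the words with a NUL separator (assumed never to occur
// inside a word) and returns the SHA-256 digest of the result.
func hashNgram(gram *Ngram) [32]byte {
	return sha256.Sum256([]byte(strings.Join(gram.Words, "\x00")))
}

func main() {
	a := hashNgram(&Ngram{Words: []string{"the", "quick", "fox"}})
	b := hashNgram(&Ngram{Words: []string{"the", "quick", "fox"}})
	c := hashNgram(&Ngram{Words: []string{"the", "quick"}})
	// Arrays are comparable in Go, so digests can be compared with ==.
	fmt.Println(a == b, a == c) // prints "true false"
}
```

Under these assumptions, a function with the same shape but taking `*ngram.Ngram` would match the `hash` parameter of New.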
func (*SentenceSimilarity) CountDupes ¶
func (ss *SentenceSimilarity) CountDupes() int
CountDupes returns the number of perfect duplicates (i.e. edit distance of 0) present within the hashtable
func (*SentenceSimilarity) CountSimilar ¶
func (ss *SentenceSimilarity) CountSimilar() int
CountSimilar treats a sentence as similar to another if deleting or adding a single word turns it into a duplicate of that sentence (i.e. edit distance of 1). It returns the number of sentences that are within edit distance 1 of another sentence.
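The idea of word-level edit distance 1 can be sketched independently of this package. The example below is not the package's implementation; `countSimilar` is a hypothetical helper that assumes sentences are whitespace-separated words, and it only considers single-word deletions (an addition to one sentence is a deletion from the other).

```go
package main

import (
	"fmt"
	"strings"
)

// countSimilar returns how many distinct sentences are one word deletion
// or one word addition away from another distinct sentence in the input.
func countSimilar(sentences []string) int {
	// Collapse the input to unique, whitespace-normalised sentences.
	unique := make(map[string]bool)
	for _, s := range sentences {
		key := strings.Join(strings.Fields(s), " ")
		if key != "" {
			unique[key] = true
		}
	}

	similar := make(map[string]bool)
	for s := range unique {
		words := strings.Fields(s)
		for i := range words {
			// Build the sentence with word i removed.
			shorter := make([]string, 0, len(words)-1)
			shorter = append(shorter, words[:i]...)
			shorter = append(shorter, words[i+1:]...)
			key := strings.Join(shorter, " ")
			if unique[key] {
				// Both the longer and the shorter sentence have a neighbour.
				similar[s] = true
				similar[key] = true
			}
		}
	}
	return len(similar)
}

func main() {
	sentences := []string{
		"the cat sat",
		"the cat",
		"a dog barked",
		"the big cat sat",
	}
	fmt.Println(countSimilar(sentences)) // prints 3
}
```

Generating every one-word deletion and looking it up in a hash map avoids comparing all pairs of sentences; the cost grows with the total number of words rather than with the square of the number of sentences.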
func (*SentenceSimilarity) LoadFile ¶
func (ss *SentenceSimilarity) LoadFile(fname string)
LoadFile takes a filename as the argument fname and populates the SentenceSimilarity data structure with the unique sentences. It also counts duplicates along the way, in linear time.