sentences

package
v0.0.0-...-45b5b11 Latest
Published: Nov 25, 2022 License: GPL-3.0 Imports: 7 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func EditDistance

func EditDistance(s1, s2 string) int

EditDistance is the exported wrapper around the unexported minDistance function. It returns the edit distance between s1 and s2.
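
The body of minDistance is not shown on this page, so the following is only a sketch of what such a function might compute. It assumes the classic character-level Levenshtein distance (insert, delete, substitute, each costing 1). The names `levenshtein` and `min3` are hypothetical and not part of this package.

```go
package main

import "fmt"

// levenshtein is a hypothetical stand-in for the unexported minDistance.
// Assumption: it computes the classic character-level edit distance.
func levenshtein(s1, s2 string) int {
	a, b := []rune(s1), []rune(s2)

	// prev[j] is the distance between the previous prefix of a and b[:j].
	prev := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j // turning "" into b[:j] takes j insertions
	}

	for i := 1; i <= len(a); i++ {
		curr := make([]int, len(b)+1)
		curr[0] = i // turning a[:i] into "" takes i deletions
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			// deletion, insertion, substitution (or match)
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev = curr
	}
	return prev[len(b)]
}

// min3 returns the smallest of three ints.
func min3(x, y, z int) int {
	m := x
	if y < m {
		m = y
	}
	if z < m {
		m = z
	}
	return m
}

func main() {
	fmt.Println(levenshtein("kitten", "sitting")) // prints 3
}
```

The sketch keeps only two rows of the dynamic-programming table, so memory use grows with the length of s2 rather than with the product of both lengths.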

Types

type SentenceSimilarity

type SentenceSimilarity struct {
	Duplicates int
	HashTable  *hashtable.Hashtable
}

SentenceSimilarity is a wrapper object that exposes a sentence database for counting similar words at specific edit distances. Duplicates is an exported int holding the number of edit-distance-0 words in the input. HashTable maps each word to the number of times it appears.

func New

func New(size int, hash func(gram *ngram.Ngram) [32]byte) *SentenceSimilarity

New creates a new SentenceSimilarity object backed by a hashtable of size `size` that uses `hash` as its hashing function.

func (*SentenceSimilarity) CountDupes

func (ss *SentenceSimilarity) CountDupes() int

CountDupes returns the number of perfect duplicates (i.e. edit distance of 0) present within the hashtable

func (*SentenceSimilarity) CountSimilar

func (ss *SentenceSimilarity) CountSimilar() int

CountSimilar treats a sentence as similar to another if deleting or adding a single word produces a duplicate (i.e. an edit distance of 1). It returns the number of sentences that are within edit distance 1 of another sentence.
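
One way to picture the "one deletion or one addition" rule is sketched below. This is not the package's implementation: it uses a plain Go map instead of the package's hashtable, counts unique sentences only, and the name `countSimilar` is hypothetical. Deleting each word from a sentence in turn and looking up the result covers both directions, since an addition to the shorter sentence is a deletion from the longer one.

```go
package main

import (
	"fmt"
	"strings"
)

// countSimilar is a hypothetical sketch. It counts the unique sentences that
// differ from some other unique sentence by exactly one added or deleted word.
func countSimilar(sentences []string) int {
	// Normalise whitespace and collect the unique sentences in a set.
	seen := make(map[string]bool)
	for _, s := range sentences {
		seen[strings.Join(strings.Fields(s), " ")] = true
	}

	similar := make(map[string]bool)
	for s := range seen {
		words := strings.Fields(s)
		for i := range words {
			// Drop word i and check whether the shorter sentence exists.
			shorter := make([]string, 0, len(words)-1)
			shorter = append(shorter, words[:i]...)
			shorter = append(shorter, words[i+1:]...)
			key := strings.Join(shorter, " ")
			if seen[key] {
				// Both the longer and the shorter sentence have a neighbour.
				similar[s] = true
				similar[key] = true
			}
		}
	}
	return len(similar)
}

func main() {
	input := []string{
		"the cat sat",
		"the cat",
		"a dog barked",
		"the cat sat down",
	}
	fmt.Println(countSimilar(input)) // prints 3
}
```

Each sentence of w words needs w lookups, so the work is proportional to the total number of words times the cost of building each shortened sentence, rather than to the number of sentence pairs.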

func (*SentenceSimilarity) LoadFile

func (ss *SentenceSimilarity) LoadFile(fname string)

LoadFile takes a filename as the argument fname and populates the SentenceSimilarity data structure with the unique sentences. It also counts duplicates along the way, in linear time.
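
The linear-time duplicate count described above can be sketched with a single pass over the input. The sketch assumes one sentence per line and uses a built-in map in place of the package's hashtable. The names `loadSentences` and `loadFromString` are hypothetical, and the reader-based signature is chosen only so the example runs without a file on disk.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// loadSentences is a hypothetical sketch of the loading step.
// Assumption: the input holds one sentence per line.
// It returns how often each line occurs and how many lines were repeats.
func loadSentences(r io.Reader) (map[string]int, int, error) {
	counts := make(map[string]int)
	duplicates := 0

	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		line := scanner.Text()
		// A line seen before is an exact duplicate (edit distance 0).
		if counts[line] > 0 {
			duplicates++
		}
		counts[line]++
	}
	return counts, duplicates, scanner.Err()
}

// loadFromString is a small helper for running the sketch on in-memory text.
func loadFromString(text string) (map[string]int, int, error) {
	return loadSentences(strings.NewReader(text))
}

func main() {
	counts, dupes, err := loadFromString("the cat sat\nthe dog ran\nthe cat sat\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(counts), dupes) // prints 2 1
}
```

Because each line costs one map lookup and one map update, the pass stays linear in the number of lines on average. To read a real file, the result of `os.Open(fname)` could be passed to `loadSentences`.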
