clustering

package
v0.1.13

Warning: This package is not in the latest version of its module.
Published: Jul 28, 2022 License: MIT Imports: 7 Imported by: 0

Documentation

Overview

Package clustering provides basic clustering functions.

Constants

const (
	AggloMin     = iota // Minimal distance between any pair of elements.
	AggloMax            // Maximal distance between any pair of elements.
	AggloAverage        // Average distance between any pair of elements.
)

These constants select how agglomerative clustering calculates the distance between two clusters.

Functions

func AdjustedRandIndex

func AdjustedRandIndex(tags1, tags2 []int) float64

AdjustedRandIndex compares two taggings of the data for similarity. A score of 1 means the taggings are identical, a score of 0 means they agree no better than random, and a negative score means they agree worse than random.
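For readers who want to see what such a score measures, the following sketch implements the standard adjusted Rand index formula from a contingency table of the two taggings. It is an independent illustration, not this package's source; `adjustedRand` and `choose2` are hypothetical names, and the sketch does not handle the degenerate case where the denominator is zero (for example, when both taggings put every element in a single cluster).

```go
package main

import "fmt"

// choose2 returns the number of unordered pairs among n items.
func choose2(n int) float64 { return float64(n) * float64(n-1) / 2 }

// adjustedRand is a hypothetical stand-in that follows the textbook formula.
// tags1 and tags2 must have the same length.
func adjustedRand(tags1, tags2 []int) float64 {
	type pair struct{ a, b int }
	joint := map[pair]int{} // elements sharing a tag pair
	rows := map[int]int{}   // cluster sizes in tags1
	cols := map[int]int{}   // cluster sizes in tags2
	for i := range tags1 {
		joint[pair{tags1[i], tags2[i]}]++
		rows[tags1[i]]++
		cols[tags2[i]]++
	}

	var sumJoint, sumRows, sumCols float64
	for _, c := range joint {
		sumJoint += choose2(c)
	}
	for _, c := range rows {
		sumRows += choose2(c)
	}
	for _, c := range cols {
		sumCols += choose2(c)
	}

	expected := sumRows * sumCols / choose2(len(tags1))
	maxIndex := (sumRows + sumCols) / 2
	return (sumJoint - expected) / (maxIndex - expected)
}

func main() {
	// Same partition with the labels swapped: the score is 1.
	fmt.Println(adjustedRand([]int{0, 0, 1, 1}, []int{1, 1, 0, 0}))
	// Every pair grouped by one tagging is split by the other: negative score.
	fmt.Println(adjustedRand([]int{0, 0, 1, 1}, []int{0, 1, 0, 1}))
}
```

Note that the score depends only on which elements share a tag, not on the tag values themselves.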

func Kmeans

func Kmeans(vecs [][]float64, k int) (means [][]float64, tags []int)

Kmeans performs k-means clustering on the given data. Each vector is an element in the clustering. Returns the generated means, and the tag each element was given.
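As an illustration of the input and output shapes, here is a minimal sketch of Lloyd's algorithm, the usual k-means iteration. It is not this package's implementation: `lloyd` and `sqDist` are hypothetical names, and seeding the means with the first k vectors is an assumption made only to keep the example deterministic. The documentation above does not say how Kmeans initializes its means.

```go
package main

import "fmt"

// sqDist returns the squared Euclidean distance between a and b.
func sqDist(a, b []float64) float64 {
	sum := 0.0
	for i := range a {
		diff := a[i] - b[i]
		sum += diff * diff
	}
	return sum
}

// lloyd is a hypothetical sketch of k-means. It requires 1 <= k <= len(vecs).
func lloyd(vecs [][]float64, k int) (means [][]float64, tags []int) {
	// Assumption: seed the means with copies of the first k vectors.
	means = make([][]float64, k)
	for i := range means {
		means[i] = append([]float64(nil), vecs[i]...)
	}
	tags = make([]int, len(vecs))

	for changed := true; changed; {
		changed = false
		// Assignment step: tag each vector with its nearest mean.
		for i, v := range vecs {
			best := 0
			for j := 1; j < k; j++ {
				if sqDist(v, means[j]) < sqDist(v, means[best]) {
					best = j
				}
			}
			if tags[i] != best {
				tags[i] = best
				changed = true
			}
		}
		// Update step: move each mean to the average of its vectors.
		sums := make([][]float64, k)
		counts := make([]int, k)
		for j := range sums {
			sums[j] = make([]float64, len(vecs[0]))
		}
		for i, v := range vecs {
			counts[tags[i]]++
			for x := range v {
				sums[tags[i]][x] += v[x]
			}
		}
		for j := range means {
			if counts[j] == 0 {
				continue // an empty cluster keeps its previous mean
			}
			for x := range means[j] {
				means[j][x] = sums[j][x] / float64(counts[j])
			}
		}
	}
	return means, tags
}

func main() {
	vecs := [][]float64{{0, 0}, {10, 10}, {0, 1}, {10, 11}}
	means, tags := lloyd(vecs, 2)
	fmt.Println(means, tags) // prints "[[0 0.5] [10 10.5]] [0 1 0 1]"
}
```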

func MeanSquaredError

func MeanSquaredError(vecs, means [][]float64, tags []int) float64

MeanSquaredError calculates the average squared-distance of elements from their assigned means.
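The quantity described above can be sketched in a few lines. This version assumes that "squared-distance" means squared Euclidean distance, which the documentation does not state explicitly; `meanSquaredError` here is a hypothetical reimplementation for illustration, not the package's code.

```go
package main

import "fmt"

// meanSquaredError is a hypothetical sketch: it averages, over all vectors,
// the squared Euclidean distance between a vector and the mean it is tagged
// with.
func meanSquaredError(vecs, means [][]float64, tags []int) float64 {
	total := 0.0
	for i, v := range vecs {
		m := means[tags[i]]
		for x := range v {
			diff := v[x] - m[x]
			total += diff * diff
		}
	}
	return total / float64(len(vecs))
}

func main() {
	vecs := [][]float64{{0, 0}, {10, 10}, {0, 1}, {10, 11}}
	means := [][]float64{{0, 0.5}, {10, 10.5}}
	tags := []int{0, 1, 0, 1}
	// Every vector lies 0.5 away from its mean, so each squared distance
	// is 0.25 and so is their average.
	fmt.Println(meanSquaredError(vecs, means, tags)) // prints "0.25"
}
```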

Types

type AggloResult

type AggloResult struct {
	// contains filtered or unexported fields
}

AggloResult is an interactive agglomerative-clustering result.

func Agglo

func Agglo(n int, clusterDist int, d func(int, int) float64) *AggloResult

Agglo performs agglomerative clustering on the indexes 0 to n-1. d should return the distance between the i'th and j'th element, such that d(i,j)=d(j,i) and d(i,i)=0.

clusterDist should be one of AggloMin or AggloMax.

Works in O(n^2) time and makes O(n^2) calls to d.

func (*AggloResult) Dict

func (r *AggloResult) Dict() []string

Dict returns the string representations of elements in the clustering.

func (*AggloResult) Len

func (r *AggloResult) Len() int

Len returns the number of steps in this clustering, which equals the number of elements minus one.

func (*AggloResult) SetDict

func (r *AggloResult) SetDict(dict []string) *AggloResult

SetDict sets the string representation of each element, for the String() function. Returns itself for chaining.

func (*AggloResult) Step

func (r *AggloResult) Step(i int) AggloStep

Step returns the i'th step in the clustering.

func (*AggloResult) String

func (r *AggloResult) String() string

String returns a representation of the clustering. If SetDict was not called, element numbers are used.

type AggloStep

type AggloStep struct {
	C1 int     // Index of the first merged cluster.
	C2 int     // Index of the second merged cluster.
	D  float64 // Distance between the clusters when merging.
}

An AggloStep is a single step in the clustering process. The index of a cluster is the greatest element index in it. C2 is always greater than C1.
