kmeans

package
v0.0.0-...-d359a71 Latest
Warning

This package is not in the latest version of its module.

Published: Dec 31, 2024 License: BSD-3-Clause Imports: 1 Imported by: 2

Documentation

Overview

kmeans implements a generic k-means clustering algorithm.

To use this code, create types that implement Clusterable and Centroid, and a function of type CalculateCentroid. In many cases the same type can serve as both a Clusterable and a Centroid.

See the unit tests for examples.
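
As a rough illustration of the setup described in the overview, the sketch below uses one type as both a Clusterable and a Centroid. The interface declarations are local copies of the ones documented on this page so that the snippet compiles on its own; the point type and the meanPoint function are hypothetical names introduced for this example and are not part of the package.

```go
package main

import (
	"fmt"
	"math"
)

// Local stand-ins that mirror the declarations documented on this page.
// With the real package these would be imported instead.
type Clusterable interface{}

type Centroid interface {
	AsClusterable() Clusterable
	Distance(c Clusterable) float64
}

type CalculateCentroid func([]Clusterable) Centroid

// point is a hypothetical 2-D value used as both observation and centroid.
type point struct{ x, y float64 }

// AsClusterable returns the point itself, since a point is also a valid observation.
func (p point) AsClusterable() Clusterable { return p }

// Distance returns the Euclidean distance between p and c.
// Values that are not points are treated as infinitely far away.
func (p point) Distance(c Clusterable) float64 {
	o, ok := c.(point)
	if !ok {
		return math.Inf(1)
	}
	return math.Hypot(p.x-o.x, p.y-o.y)
}

// meanPoint averages the coordinates of the members. It assumes every member
// is a point and returns the zero point for an empty cluster.
func meanPoint(members []Clusterable) Centroid {
	if len(members) == 0 {
		return point{}
	}
	var sum point
	for _, m := range members {
		q := m.(point)
		sum.x += q.x
		sum.y += q.y
	}
	n := float64(len(members))
	return point{sum.x / n, sum.y / n}
}

// Compile-time checks that the example types satisfy the documented shapes.
var _ Centroid = point{}
var _ CalculateCentroid = meanPoint

func main() {
	obs := []Clusterable{point{0, 0}, point{2, 0}, point{2, 2}, point{0, 2}}
	c := meanPoint(obs)
	fmt.Println(c, c.Distance(point{4, 5}))
}
```

Using the mean as the new centroid is the conventional choice for Euclidean distance; other distance measures may call for a different CalculateCentroid.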

Constants

This section is empty.

Variables

This section is empty.

Functions

func GetClusters

func GetClusters(observations []Clusterable, centroids []Centroid) ([][]Clusterable, float64)

GetClusters returns the observations grouped into the clusters they belong to. The returned clusters are sorted by number of members. The first element of each cluster is the centroid; the remaining elements are the observations in that cluster.
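
The layout of each returned cluster can be unpacked as sketched below. This does not call GetClusters; the clusters value is hand-built to match the layout described above, and splitCluster is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// Local stand-in for the Clusterable type documented on this page.
type Clusterable interface{}

// splitCluster separates a cluster into its leading centroid element and the
// observations that follow it. An empty cluster yields nil for both.
func splitCluster(cluster []Clusterable) (centroid Clusterable, members []Clusterable) {
	if len(cluster) == 0 {
		return nil, nil
	}
	return cluster[0], cluster[1:]
}

func main() {
	// Hypothetical data shaped like the first return value of GetClusters:
	// element 0 of each inner slice is the centroid.
	clusters := [][]Clusterable{
		{1.0, 0.9, 1.1, 1.0},
		{5.0, 5.0},
	}
	for i, cl := range clusters {
		c, members := splitCluster(cl)
		fmt.Printf("cluster %d: centroid=%v members=%v\n", i, c, members)
	}
}
```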

func TotalError

func TotalError(observations []Clusterable, centroids []Centroid) float64

TotalError calculates the total error between the centroids and the observations.
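
The exact error measure is not specified here. One plausible reading, shown purely as an assumption, is the sum over all observations of the distance to the nearest centroid. The sketch uses local copies of the documented interfaces; scalar and sumNearestDistances are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
)

// Local stand-ins that mirror the declarations documented on this page.
type Clusterable interface{}

type Centroid interface {
	AsClusterable() Clusterable
	Distance(c Clusterable) float64
}

// scalar is a hypothetical 1-D value used as both observation and centroid.
type scalar float64

func (s scalar) AsClusterable() Clusterable { return s }

// Distance is the absolute difference; it assumes c is a scalar.
func (s scalar) Distance(c Clusterable) float64 {
	return math.Abs(float64(s - c.(scalar)))
}

// sumNearestDistances adds up, for every observation, the distance to its
// closest centroid. ASSUMPTION: this is one possible error measure and may
// differ from what TotalError computes. With no centroids, each observation
// contributes +Inf.
func sumNearestDistances(observations []Clusterable, centroids []Centroid) float64 {
	total := 0.0
	for _, o := range observations {
		nearest := math.Inf(1)
		for _, c := range centroids {
			nearest = math.Min(nearest, c.Distance(o))
		}
		total += nearest
	}
	return total
}

func main() {
	obs := []Clusterable{scalar(1), scalar(2), scalar(3), scalar(10), scalar(12)}
	centroids := []Centroid{scalar(2), scalar(11)}
	fmt.Println(sumNearestDistances(obs, centroids))
}
```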

Types

type CalculateCentroid

type CalculateCentroid func([]Clusterable) Centroid

CalculateCentroid calculates a new centroid from a list of Clusterables.

type Centroid

type Centroid interface {
	// AsClusterable converts this Centroid to a Clusterable, or returns nil if
	// the conversion isn't possible.
	AsClusterable() Clusterable

	// Distance returns the distance from the given Clusterable to this Centroid.
	Distance(c Clusterable) float64
}

Centroid is the interface that Centroids must support to do k-means clustering.

func Do

func Do(observations []Clusterable, centroids []Centroid, f CalculateCentroid) []Centroid

Do performs a single iteration of Lloyd's algorithm. It takes a slice of observations, a slice of centroids, and a function that calculates the new centroid for a cluster, and returns the updated slice of centroids. Note that the centroids slice passed in is modified, so the best way to call the function is:

centroids = Do(observations, centroids, f)
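
For readers unfamiliar with Lloyd's algorithm, one iteration consists of an assignment step (each observation goes to its nearest centroid) followed by an update step (each centroid is recomputed from its members). The sketch below illustrates that idea with local copies of the documented types. It is not the package's implementation: lloydStep, scalar, and mean are hypothetical names, and unlike Do this version returns a fresh slice and keeps a centroid unchanged when no observation is assigned to it.

```go
package main

import (
	"fmt"
	"math"
)

// Local stand-ins that mirror the declarations documented on this page.
type Clusterable interface{}

type Centroid interface {
	AsClusterable() Clusterable
	Distance(c Clusterable) float64
}

type CalculateCentroid func([]Clusterable) Centroid

// scalar is a hypothetical 1-D value used as both observation and centroid.
type scalar float64

func (s scalar) AsClusterable() Clusterable { return s }

// Distance is the absolute difference; it assumes c is a scalar.
func (s scalar) Distance(c Clusterable) float64 {
	return math.Abs(float64(s - c.(scalar)))
}

// mean averages a non-empty cluster of scalars.
func mean(members []Clusterable) Centroid {
	sum := 0.0
	for _, m := range members {
		sum += float64(m.(scalar))
	}
	return scalar(sum / float64(len(members)))
}

// lloydStep runs one assignment step and one update step.
func lloydStep(observations []Clusterable, centroids []Centroid, f CalculateCentroid) []Centroid {
	if len(centroids) == 0 {
		return centroids
	}
	// Assignment: group each observation under its nearest centroid.
	groups := make([][]Clusterable, len(centroids))
	for _, o := range observations {
		best, bestDist := 0, math.Inf(1)
		for i, c := range centroids {
			if d := c.Distance(o); d < bestDist {
				best, bestDist = i, d
			}
		}
		groups[best] = append(groups[best], o)
	}
	// Update: recompute each centroid from its group; empty groups keep
	// their previous centroid (a choice made for this sketch).
	next := make([]Centroid, len(centroids))
	for i, g := range groups {
		if len(g) == 0 {
			next[i] = centroids[i]
			continue
		}
		next[i] = f(g)
	}
	return next
}

func main() {
	obs := []Clusterable{scalar(1), scalar(2), scalar(3), scalar(10), scalar(12)}
	centroids := []Centroid{scalar(0), scalar(8)}
	centroids = lloydStep(obs, centroids, mean)
	fmt.Println(centroids)
}
```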

func KMeans

func KMeans(observations []Clusterable, centroids []Centroid, k, iters int, f CalculateCentroid) ([]Centroid, [][]Clusterable)

KMeans runs the k-means clustering algorithm over a set of observations and returns the centroids and clusters.

TODO(jcgregorio) Should just iterate until total error stops changing.
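
The stopping rule suggested by the TODO could look roughly like the sketch below, which repeats a step until the error changes by no more than a tolerance or an iteration cap is reached. This is an illustration only, not how KMeans behaves. The stepFunc and errFunc types have the same shape as the Do and TotalError signatures above but use local stand-in types; untilStable, tol, and maxIters are hypothetical names, and main replays a fixed error sequence instead of clustering real data.

```go
package main

import (
	"fmt"
	"math"
)

// Local stand-ins that mirror the declarations documented on this page.
type Clusterable interface{}

type Centroid interface {
	AsClusterable() Clusterable
	Distance(c Clusterable) float64
}

type CalculateCentroid func([]Clusterable) Centroid

// stepFunc and errFunc have the same shape as Do and TotalError.
type stepFunc func([]Clusterable, []Centroid, CalculateCentroid) []Centroid
type errFunc func([]Clusterable, []Centroid) float64

// untilStable repeats step until the error changes by at most tol between
// consecutive iterations, or until maxIters iterations have run. It returns
// the final centroids and the number of iterations performed.
func untilStable(obs []Clusterable, cs []Centroid, f CalculateCentroid,
	step stepFunc, errOf errFunc, tol float64, maxIters int) ([]Centroid, int) {
	prev := errOf(obs, cs)
	iters := 0
	for iters < maxIters {
		cs = step(obs, cs, f)
		iters++
		cur := errOf(obs, cs)
		if math.Abs(prev-cur) <= tol {
			break
		}
		prev = cur
	}
	return cs, iters
}

func main() {
	// Hypothetical stand-ins: the step does nothing and the error function
	// replays a fixed sequence, so only the stopping rule is exercised.
	errs := []float64{9, 4, 2, 2, 2}
	n := 0
	step := func(o []Clusterable, c []Centroid, f CalculateCentroid) []Centroid {
		n++
		return c
	}
	errOf := func(o []Clusterable, c []Centroid) float64 { return errs[n] }

	_, iters := untilStable(nil, nil, nil, step, errOf, 1e-9, 4)
	fmt.Println("iterations:", iters)
}
```

Keeping an iteration cap alongside the tolerance guards against inputs where the error oscillates instead of settling.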

type Clusterable

type Clusterable interface{}

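Clusterable is the interface that an object must satisfy to be clustered. Because it is the empty interface, any value qualifies; interpreting the value is left to the Distance method of the Centroid it is compared against.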
