Documentation ¶
Overview ¶
Package clusters provides abstract definitions of clusterers as well as their implementations.
Constants ¶
This section is empty.
Variables ¶
var (
	// EuclideanDistance is the Euclidean (L2) distance, one of the common distance measurements.
	EuclideanDistance = func(a, b []float64) float64 {
		var (
			s, t float64
		)

		for i := range a {
			t = a[i] - b[i]
			s += t * t
		}

		return math.Sqrt(s)
	}

	// EuclideanDistanceSquared is the squared Euclidean distance, one of the common distance measurements.
	EuclideanDistanceSquared = func(a, b []float64) float64 {
		var (
			s, t float64
		)

		for i := range a {
			t = a[i] - b[i]
			s += t * t
		}

		return s
	}
)
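The two variables differ only in the final square root. As a minimal sketch of what that means in practice, the program below uses local copies of both functions (named euclidean and euclideanSquared here so that it runs without importing the package) plus a hypothetical nearest helper that is not part of the package. It shows that both measures pick the same nearest point, which is why the squared form may be preferable when only the ordering of distances matters.

```go
package main

import (
	"fmt"
	"math"
)

// euclideanSquared is a local copy of the logic shown for EuclideanDistanceSquared.
func euclideanSquared(a, b []float64) float64 {
	var s float64
	for i := range a {
		t := a[i] - b[i]
		s += t * t
	}
	return s
}

// euclidean is a local copy of the logic shown for EuclideanDistance.
func euclidean(a, b []float64) float64 {
	return math.Sqrt(euclideanSquared(a, b))
}

// nearest is a hypothetical helper (not part of the package): it returns the
// index of the candidate closest to query under dist, or -1 if there are none.
func nearest(query []float64, candidates [][]float64, dist func(a, b []float64) float64) int {
	best, bestDist := -1, math.Inf(1)
	for i, c := range candidates {
		if d := dist(query, c); d < bestDist {
			best, bestDist = i, d
		}
	}
	return best
}

func main() {
	query := []float64{0, 0}
	candidates := [][]float64{{3, 4}, {1, 1}, {6, 8}}

	fmt.Println(euclidean(query, candidates[0]))        // 5
	fmt.Println(euclideanSquared(query, candidates[0])) // 25

	// Squaring is monotonic for non-negative values, so the ranking is unchanged.
	fmt.Println(nearest(query, candidates, euclidean) == nearest(query, candidates, euclideanSquared)) // true
}
```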
Functions ¶
This section is empty.
Types ¶
type DistanceFunc ¶
DistanceFunc represents a function for measuring distance between n-dimensional vectors.
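The type declaration itself is not reproduced on this page. Judging by the predefined variables above, a distance function presumably takes two vectors and returns a float64. Under that assumption, a custom measure could look like the sketch below, where distanceFunc is a local stand-in for DistanceFunc and manhattan is a hypothetical example metric.

```go
package main

import (
	"fmt"
	"math"
)

// distanceFunc is a local stand-in for DistanceFunc. Its shape is an
// assumption inferred from EuclideanDistance, not copied from the package.
type distanceFunc func(a, b []float64) float64

// manhattan is a hypothetical custom metric: the sum of absolute
// coordinate differences (L1 distance). Like the predefined measures,
// it assumes both vectors have the same length.
var manhattan distanceFunc = func(a, b []float64) float64 {
	var s float64
	for i := range a {
		s += math.Abs(a[i] - b[i])
	}
	return s
}

func main() {
	fmt.Println(manhattan([]float64{1, 2, 3}, []float64{4, 0, 3})) // 5
}
```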
type Estimator ¶
type Estimator interface {
	// Estimate provides an expected number of clusters in the dataset
	Estimate([][]float64) (int, error)
}
Estimator defines a computation used to determine an optimal number of clusters in the dataset.
func KMeansEstimator ¶
func KMeansEstimator(iterations, clusters int, distance DistanceFunc) (Estimator, error)
Implementation of a cluster number estimator based on the gap statistic ("Estimating the number of clusters in a data set via the gap statistic", Tibshirani et al.), with k-means++ as the clustering algorithm.
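For orientation, the cited paper defines the within-cluster dispersion as W_k = sum over clusters r of D_r / (2 n_r), where D_r is the sum of distances over all ordered pairs of points in cluster r and n_r is the cluster size. It defines the gap for a given k as the mean of log(W_k) over reference datasets minus log(W_k) of the actual data. The sketch below computes these two quantities. The names dispersion and gap are hypothetical, and nothing here is taken from the package's internals, which may differ in detail.

```go
package main

import "fmt"

// squaredEuclidean is a local copy of the squared Euclidean distance.
func squaredEuclidean(a, b []float64) float64 {
	var s float64
	for i := range a {
		t := a[i] - b[i]
		s += t * t
	}
	return s
}

// dispersion computes W_k for a labelled dataset. labels[i] is the cluster
// number of data[i] and must lie in 1..k (numbering starts at 1, matching
// the convention documented for Guesses).
func dispersion(data [][]float64, labels []int, k int) float64 {
	sums := make([]float64, k+1) // sums[r] accumulates D_r over ordered pairs
	sizes := make([]int, k+1)    // sizes[r] is n_r
	for i := range data {
		sizes[labels[i]]++
		for j := range data {
			if labels[i] == labels[j] {
				sums[labels[i]] += squaredEuclidean(data[i], data[j])
			}
		}
	}
	var w float64
	for r := 1; r <= k; r++ {
		if sizes[r] > 0 {
			w += sums[r] / float64(2*sizes[r])
		}
	}
	return w
}

// gap returns mean(refLogW) - logW, where refLogW holds log(W_k) values
// obtained from reference datasets and logW is log(W_k) of the real data.
func gap(refLogW []float64, logW float64) float64 {
	var mean float64
	for _, v := range refLogW {
		mean += v
	}
	return mean/float64(len(refLogW)) - logW
}

func main() {
	data := [][]float64{{0, 0}, {0, 2}, {10, 10}}
	labels := []int{1, 1, 2}
	fmt.Println(dispersion(data, labels, 2)) // 2
	fmt.Println(gap([]float64{2, 4}, 1))     // 2
}
```

With squared Euclidean distance, W_k equals the sum of squared distances from each point to its cluster mean, which is the quantity k-means minimises.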
type HCEvent ¶
HCEvent represents an intermediate result of a hard clustering algorithm's computation. Events are transmitted periodically to the caller during online learning.
type HardClusterer ¶
type HardClusterer interface {
	// Sizes returns sizes of respective clusters.
	Sizes() []int

	// Guesses returns a mapping from data point indices to cluster numbers. Cluster numbering begins at 1.
	Guesses() []int

	// Predict returns the number of the cluster to which the observation would be assigned.
	Predict(observation []float64) int

	// IsOnline tells whether the algorithm supports online learning.
	IsOnline() bool

	// WithOnline configures the algorithm for online learning with the given parameters.
	WithOnline(Online) HardClusterer

	// Online begins the process of online training of an algorithm. Observations are sent on the observations channel;
	// once no more are expected, an empty struct needs to be sent on the done channel. The caller receives intermediate
	// results of the computation via the returned channel.
	Online(observations chan []float64, done chan struct{}) chan *HCEvent

	// Embeds the common Clusterer operations.
	Clusterer
}
HardClusterer defines a set of operations for hard clustering algorithms.
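The channel protocol described for Online can be illustrated with a toy stand-in. In the sketch below, event and online are hypothetical: event replaces HCEvent, whose fields are not shown on this page, and online merely counts observations instead of clustering them. The sketch also assumes the returned channel is closed after done is signalled, which the documentation above does not state.

```go
package main

import "fmt"

// event is a hypothetical stand-in for HCEvent.
type event struct {
	Seen int // number of observations processed so far
}

// online mimics the signature shape of HardClusterer.Online: it consumes
// observations until a value arrives on done, emitting one event per
// observation. Closing the returned channel on exit is an assumption.
func online(observations chan []float64, done chan struct{}) chan *event {
	events := make(chan *event)
	go func() {
		defer close(events)
		seen := 0
		for {
			select {
			case <-observations:
				seen++
				events <- &event{Seen: seen}
			case <-done:
				return
			}
		}
	}()
	return events
}

// run feeds data through online and returns the count from the last event.
func run(data [][]float64) int {
	observations := make(chan []float64)
	done := make(chan struct{})
	events := online(observations, done)

	// Produce in a separate goroutine so the caller can drain events meanwhile.
	go func() {
		for _, o := range data {
			observations <- o
		}
		done <- struct{}{} // no more observations are expected
	}()

	last := 0
	for e := range events {
		last = e.Seen
	}
	return last
}

func main() {
	fmt.Println(run([][]float64{{1, 2}, {3, 4}, {5, 6}})) // 3
}
```

Sending and receiving happen in different goroutines here because the channels are unbuffered; a caller that sends every observation before reading any event would block.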
func DBSCAN ¶
func DBSCAN(minpts int, eps float64, workers int, distance DistanceFunc) (HardClusterer, error)
Implementation of the DBSCAN algorithm with concurrent nearest-neighbour computation. The number of goroutines running concurrently is controlled by the workers argument. Passing 0 results in this number being chosen arbitrarily.
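To make the idea of concurrent nearest-neighbour computation concrete, here is a minimal worker-pool sketch. It is not the package's code. The neighbours function is hypothetical, and falling back to runtime.NumCPU() when workers is 0 is an assumption made for this example; the documentation only says the number is chosen arbitrarily.

```go
package main

import (
	"fmt"
	"math"
	"runtime"
	"sync"
)

// euclidean is a local copy of the Euclidean distance.
func euclidean(a, b []float64) float64 {
	var s float64
	for i := range a {
		t := a[i] - b[i]
		s += t * t
	}
	return math.Sqrt(s)
}

// neighbours returns, for every point, the indices of the other points
// lying within eps of it. The work is split across a pool of goroutines.
func neighbours(data [][]float64, eps float64, workers int, dist func(a, b []float64) float64) [][]int {
	if workers <= 0 {
		workers = runtime.NumCPU() // assumption: one possible default
	}

	result := make([][]int, len(data))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each job index i is handled by exactly one worker, so writes
			// to result[i] never overlap and need no lock.
			for i := range jobs {
				for j := range data {
					if i != j && dist(data[i], data[j]) <= eps {
						result[i] = append(result[i], j)
					}
				}
			}
		}()
	}

	for i := range data {
		jobs <- i
	}
	close(jobs)
	wg.Wait()

	return result
}

func main() {
	data := [][]float64{{0, 0}, {0, 1}, {10, 10}}
	fmt.Println(neighbours(data, 1.5, 2, euclidean)) // [[1] [0] []]
}
```

In DBSCAN terms, a point whose neighbourhood holds at least minpts points is a core point; the exact counting convention (for instance, whether the point itself is included) is left open here.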
func KMeans ¶
func KMeans(iterations, clusters int, distance DistanceFunc) (HardClusterer, error)
Implementation of the k-means++ algorithm with support for online learning.
func OPTICS ¶
func OPTICS(minpts int, eps, xi float64, workers int, distance DistanceFunc) (HardClusterer, error)
Implementation of the OPTICS algorithm with concurrent nearest-neighbour computation. The number of goroutines running concurrently is controlled by the workers argument. Passing 0 results in this number being chosen arbitrarily.
type Importer ¶
type Importer interface {
	// Import fetches the data from a file, start and end arguments allow user
	// to specify the span of data columns to be imported (inclusively)
	Import(file string, start, end int) ([][]float64, error)
}
Importer defines an operation of importing the dataset from an external file.
func CsvImporter ¶
func CsvImporter() Importer
func JsonImporter ¶
func JsonImporter() Importer