Documentation ¶
Index ¶
- func ComputeWeights(X *DenseMatrix, imp float64) []float64
- func CreateFrequencyTable(X *DenseMatrix) [][]KV
- func EuclideanDistance(a, b *DenseVector) (float64, error)
- func HammingDistance(a, b *DenseVector) (float64, error)
- func SetWeights(newWeights []float64)
- func WeightedHammingDistance(a, b *DenseVector) (float64, error)
- type DenseMatrix
- func InitCao(X *DenseMatrix, clustersNumber int, distFunc DistanceFunction) (*DenseMatrix, error)
- func InitHuang(X *DenseMatrix, clustersNumber int, distFunc DistanceFunction) (*DenseMatrix, error)
- func InitNum(X *DenseMatrix, clustersNumber int, distFunc DistanceFunction) (*DenseMatrix, error)
- func InitRandom(X *DenseMatrix, clustersNumber int, distFunc DistanceFunction) (*DenseMatrix, error)
- func NewDenseMatrix(r, c int, data []float64) *DenseMatrix
- type DenseVector
- type DistanceFunction
- type InitializationFunction
- type KModes
- type KPrototypes
- type KV
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ComputeWeights ¶
func ComputeWeights(X *DenseMatrix, imp float64) []float64
ComputeWeights derives attribute weights from the frequency of attribute values: the more distinct values an attribute has, the lower its weight.
func CreateFrequencyTable ¶
func CreateFrequencyTable(X *DenseMatrix) [][]KV
CreateFrequencyTable creates a frequency table for the attributes in the given matrix. It returns the entries in descending order of frequency.
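To make the idea of a per-attribute frequency table concrete, here is a minimal standalone sketch over a plain [][]float64. It is not the package's implementation: the kv struct, its field names, and frequencyTable are hypothetical, and the tie-breaking rule is an assumption made only to keep the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// kv is a hypothetical stand-in for the package's KV type:
// an attribute value and the number of rows in which it occurs.
type kv struct {
	Value float64
	Count int
}

// frequencyTable returns, for each column of rows, the distinct values with
// their counts, sorted by count in descending order. Ties are broken by
// ascending value (an assumption). Rows are assumed to have equal length.
func frequencyTable(rows [][]float64) [][]kv {
	if len(rows) == 0 {
		return nil
	}
	table := make([][]kv, len(rows[0]))
	for col := range table {
		counts := make(map[float64]int)
		for _, row := range rows {
			counts[row[col]]++
		}
		entries := make([]kv, 0, len(counts))
		for v, c := range counts {
			entries = append(entries, kv{Value: v, Count: c})
		}
		sort.Slice(entries, func(i, j int) bool {
			if entries[i].Count != entries[j].Count {
				return entries[i].Count > entries[j].Count
			}
			return entries[i].Value < entries[j].Value
		})
		table[col] = entries
	}
	return table
}

func main() {
	rows := [][]float64{
		{1, 7},
		{1, 8},
		{2, 8},
		{1, 8},
	}
	for col, entries := range frequencyTable(rows) {
		fmt.Println("attribute", col, entries)
	}
}
```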
func EuclideanDistance ¶
func EuclideanDistance(a, b *DenseVector) (float64, error)
EuclideanDistance computes the Euclidean distance between two vectors.
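For reference, the textbook Euclidean distance can be sketched on plain slices as follows. The function name euclidean and the length check are assumptions for illustration; whether the package applies the square root or reports errors in the same way is not stated here.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// euclidean returns the square root of the summed squared differences
// between a and b. It reports an error when the lengths differ.
func euclidean(a, b []float64) (float64, error) {
	if len(a) != len(b) {
		return 0, errors.New("vectors must have the same length")
	}
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum), nil
}

func main() {
	d, err := euclidean([]float64{0, 0}, []float64{3, 4})
	fmt.Println(d, err) // prints "5 <nil>"
}
```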
func HammingDistance ¶
func HammingDistance(a, b *DenseVector) (float64, error)
HammingDistance is a basic dissimilarity function for the k-modes algorithm.
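The Hamming distance between two categorical vectors is the number of positions at which they differ. A minimal sketch on plain slices, with the hypothetical name hamming and an assumed length check, could look like this:

```go
package main

import (
	"errors"
	"fmt"
)

// hamming counts the positions at which a and b hold different values.
// It reports an error when the lengths differ.
func hamming(a, b []float64) (float64, error) {
	if len(a) != len(b) {
		return 0, errors.New("vectors must have the same length")
	}
	var d float64
	for i := range a {
		if a[i] != b[i] {
			d++
		}
	}
	return d, nil
}

func main() {
	// The vectors differ at the second and third positions.
	d, err := hamming([]float64{1, 2, 3, 4}, []float64{1, 5, 6, 4})
	fmt.Println(d, err) // prints "2 <nil>"
}
```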
func SetWeights ¶
func SetWeights(newWeights []float64)
SetWeights sets the weight vector used by the WeightedHammingDistance function.
func WeightedHammingDistance ¶
func WeightedHammingDistance(a, b *DenseVector) (float64, error)
WeightedHammingDistance is a dissimilarity function based on the Hamming distance that additionally assigns an importance (weight) to each attribute.
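Since SetWeights takes no receiver, the weights are presumably held in package-level state that WeightedHammingDistance reads. The sketch below mirrors that arrangement on plain slices. The names weights, setWeights, and weightedHamming are hypothetical, and summing the weights of mismatching attributes is the usual definition rather than a confirmed detail of this package.

```go
package main

import (
	"errors"
	"fmt"
)

// weights is package-level state set through setWeights (an assumption
// modeled on the receiver-less SetWeights signature).
var weights []float64

// setWeights replaces the weight vector used by weightedHamming.
func setWeights(w []float64) {
	weights = w
}

// weightedHamming sums the weights of the positions at which a and b differ.
// It reports an error when the vectors and the weights differ in length.
func weightedHamming(a, b []float64) (float64, error) {
	if len(a) != len(b) || len(a) != len(weights) {
		return 0, errors.New("vectors and weights must have the same length")
	}
	var d float64
	for i := range a {
		if a[i] != b[i] {
			d += weights[i]
		}
	}
	return d, nil
}

func main() {
	setWeights([]float64{1, 0.5, 2})
	// The vectors differ at positions 1 and 2, so the result is 0.5 + 2.
	d, err := weightedHamming([]float64{1, 2, 3}, []float64{1, 5, 4})
	fmt.Println(d, err) // prints "2.5 <nil>"
}
```

With shared state of this kind, the weights must be set before the first distance call, and concurrent use with different weight vectors would need extra care.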
Types ¶
type DenseMatrix ¶
DenseMatrix wraps *gonum.org/v1/gonum/mat.Dense type.
func InitCao ¶
func InitCao(X *DenseMatrix, clustersNumber int, distFunc DistanceFunction) (*DenseMatrix, error)
InitCao implements initialization of cluster centroids based on the frequency and density of attributes, as defined in "A new initialization method for categorical data clustering" by F. Cao (2009).
func InitHuang ¶
func InitHuang(X *DenseMatrix, clustersNumber int, distFunc DistanceFunction) (*DenseMatrix, error)
InitHuang implements initialization of cluster centroids based on the frequency of attributes, as defined in the paper by Z. Huang (1998).
func InitNum ¶
func InitNum(X *DenseMatrix, clustersNumber int, distFunc DistanceFunction) (*DenseMatrix, error)
InitNum initializes cluster centers for numerical data using random initialization.
func InitRandom ¶
func InitRandom(X *DenseMatrix, clustersNumber int, distFunc DistanceFunction) (*DenseMatrix, error)
InitRandom initializes cluster centers randomly, using vectors chosen from the data matrix X.
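One simple way to pick initial centers from the data is to draw k distinct rows at random. The sketch below assumes sampling without replacement and an explicit seed. Both choices, along with the name initRandom, are illustrative and not taken from the package.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// initRandom picks k distinct rows at random and returns copies of them
// as the initial cluster centers. The seed makes the choice reproducible.
func initRandom(rows [][]float64, k int, seed int64) ([][]float64, error) {
	if k <= 0 || k > len(rows) {
		return nil, errors.New("k must be between 1 and the number of rows")
	}
	rng := rand.New(rand.NewSource(seed))
	centers := make([][]float64, k)
	// Perm yields a random ordering of row indices; the first k are used.
	for i, idx := range rng.Perm(len(rows))[:k] {
		// Copy the row so later center updates do not modify the data.
		centers[i] = append([]float64(nil), rows[idx]...)
	}
	return centers, nil
}

func main() {
	rows := [][]float64{{1, 1}, {2, 2}, {3, 3}, {4, 4}}
	centers, err := initRandom(rows, 2, 42)
	fmt.Println(centers, err)
}
```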
func NewDenseMatrix ¶
func NewDenseMatrix(r, c int, data []float64) *DenseMatrix
NewDenseMatrix creates a new DenseMatrix.
type DenseVector ¶
DenseVector wraps *gonum.org/v1/gonum/mat.VecDense type.
func NewDenseVector ¶
func NewDenseVector(n int, data []float64) *DenseVector
NewDenseVector creates a new DenseVector.
type DistanceFunction ¶
type DistanceFunction func(a, b *DenseVector) (float64, error)
DistanceFunction computes the distance between two vectors.
type InitializationFunction ¶
type InitializationFunction func(X *DenseMatrix, clustersNumber int, distFunc DistanceFunction) (*DenseMatrix, error)
InitializationFunction computes the initial values for the cluster centroids.
type KModes ¶
type KModes struct {
	DistanceFunc       DistanceFunction
	InitializationFunc InitializationFunction
	ClustersNumber     int
	RunsNumber         int
	MaxIterationNumber int
	WeightVectors      [][]float64
	FrequencyTable     [][]map[float64]float64 // frequency table - list of lists with dictionaries containing frequencies of values per cluster and attribute
	LabelsCounter      []int
	Labels             *DenseVector
	ClusterCentroids   *DenseMatrix
	IsFitted           bool
	ModelPath          string
}
KModes is the basic struct for the k-modes algorithm. It contains all the necessary information, such as the algorithm parameters, labels, and centroids.
func NewKModes ¶
func NewKModes(dist DistanceFunction, init InitializationFunction, clusters int, runs int, iters int, weights [][]float64, modelPath string) *KModes
NewKModes is the constructor for the KModes struct.
func (*KModes) FitModel ¶
func (km *KModes) FitModel(X *DenseMatrix) error
FitModel is the main algorithm function. It finds the best cluster centers for the given dataset X.
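In the k-modes algorithm, fitting alternates between assigning rows to their nearest center and recomputing each center as the per-attribute mode of its rows. The sketch below shows only the second step on plain slices. The name updateModes, the tie-breaking rule, and the handling of empty clusters are assumptions, and the package's FitModel may be organized differently.

```go
package main

import "fmt"

// updateModes sets every attribute of each center to the most frequent value
// among the rows assigned to that cluster. Ties go to the smallest value;
// a cluster with no rows keeps its previous center.
func updateModes(rows [][]float64, labels []int, centers [][]float64) {
	for c := range centers {
		for col := range centers[c] {
			// Count the values of this attribute within cluster c.
			counts := make(map[float64]int)
			for i, row := range rows {
				if labels[i] == c {
					counts[row[col]]++
				}
			}
			best, bestCount := centers[c][col], 0
			for v, n := range counts {
				if n > bestCount || (n == bestCount && v < best) {
					best, bestCount = v, n
				}
			}
			centers[c][col] = best
		}
	}
}

func main() {
	rows := [][]float64{
		{1, 1}, {1, 2}, {1, 2}, // cluster 0
		{3, 4}, {3, 4}, {5, 4}, // cluster 1
	}
	labels := []int{0, 0, 0, 1, 1, 1}
	centers := [][]float64{{0, 0}, {0, 0}}
	updateModes(rows, labels, centers)
	fmt.Println(centers) // prints "[[1 2] [3 4]]"
}
```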
func (*KModes) Predict ¶
func (km *KModes) Predict(X *DenseMatrix) (*DenseVector, error)
Predict assigns labels to a set of new vectors.
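Label assignment in centroid-based clustering usually means giving each vector the index of its nearest center under the chosen distance function. The following sketch illustrates that with a pluggable function type modeled on DistanceFunction. The names distanceFunc, hamming, and predict are hypothetical, and the labels are returned as []int rather than the *DenseVector used by the package.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// distanceFunc mirrors the shape of DistanceFunction on plain slices.
type distanceFunc func(a, b []float64) (float64, error)

// hamming counts the positions at which a and b differ.
func hamming(a, b []float64) (float64, error) {
	if len(a) != len(b) {
		return 0, errors.New("vectors must have the same length")
	}
	var d float64
	for i := range a {
		if a[i] != b[i] {
			d++
		}
	}
	return d, nil
}

// predict returns, for each row, the index of the nearest center.
// When two centers are equally close, the lower index wins.
func predict(rows, centers [][]float64, dist distanceFunc) ([]int, error) {
	if len(centers) == 0 {
		return nil, errors.New("no cluster centers")
	}
	labels := make([]int, len(rows))
	for i, row := range rows {
		best := math.Inf(1)
		for c, center := range centers {
			d, err := dist(row, center)
			if err != nil {
				return nil, err
			}
			if d < best {
				best, labels[i] = d, c
			}
		}
	}
	return labels, nil
}

func main() {
	centers := [][]float64{{1, 2}, {3, 4}}
	rows := [][]float64{{1, 2}, {3, 4}, {1, 4}}
	labels, err := predict(rows, centers, hamming)
	fmt.Println(labels, err) // prints "[0 1 0] <nil>"
}
```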
type KPrototypes ¶
type KPrototypes struct {
	DistanceFunc        DistanceFunction
	InitializationFunc  InitializationFunction
	CategoricalInd      []int
	ClustersNumber      int
	RunsNumber          int
	MaxIterationNumber  int
	WeightVectors       [][]float64
	FrequencyTable      [][]map[float64]float64 // frequency table - list of lists with dictionaries containing frequencies of values per cluster and attribute
	MembershipNumTable  [][]float64             // membership table for numeric attributes - list of labels for each cluster
	LabelsCounter       []int
	Labels              *DenseVector
	ClusterCentroids    *DenseMatrix
	ClusterCentroidsCat *DenseMatrix
	ClusterCentroidsNum *DenseMatrix
	Gamma               float64
	IsFitted            bool
	ModelPath           string
}
KPrototypes is the basic struct for the k-prototypes algorithm. It contains all the necessary information, such as the algorithm parameters, labels, and centroids.
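In Huang's formulation of k-prototypes, the dissimilarity between a vector and a prototype combines the squared Euclidean distance over the numeric attributes with a Hamming-style mismatch count over the categorical ones, scaled by a factor gamma. The sketch below illustrates that combination on plain slices. The name mixedDistance and its parameters are hypothetical, and how this package uses CategoricalInd, Gamma, and DistanceFunc internally is not confirmed here.

```go
package main

import (
	"errors"
	"fmt"
)

// mixedDistance returns the squared Euclidean distance over the numeric
// attributes plus gamma times the number of mismatching categorical
// attributes. categorical lists the indices of the categorical attributes.
func mixedDistance(a, b []float64, categorical []int, gamma float64) (float64, error) {
	if len(a) != len(b) {
		return 0, errors.New("vectors must have the same length")
	}
	isCat := make(map[int]bool, len(categorical))
	for _, idx := range categorical {
		if idx < 0 || idx >= len(a) {
			return 0, errors.New("categorical index out of range")
		}
		isCat[idx] = true
	}
	var num, cat float64
	for i := range a {
		if isCat[i] {
			if a[i] != b[i] {
				cat++
			}
			continue
		}
		d := a[i] - b[i]
		num += d * d
	}
	return num + gamma*cat, nil
}

func main() {
	a := []float64{1, 0, 2, 5}
	b := []float64{2, 0, 4, 5}
	// Attributes 0 and 1 are categorical: one mismatch, weighted by 0.5.
	// Attributes 2 and 3 are numeric: (2-4)^2 + (5-5)^2 = 4.
	d, err := mixedDistance(a, b, []int{0, 1}, 0.5)
	fmt.Println(d, err) // prints "4.5 <nil>"
}
```

A larger gamma gives the categorical attributes more influence relative to the numeric ones.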
func NewKPrototypes ¶
func NewKPrototypes(dist DistanceFunction, init InitializationFunction, categorical []int, clusters int, runs int, iters int, weights [][]float64, g float64, modelPath string) *KPrototypes
NewKPrototypes is the constructor for the KPrototypes struct.
func (*KPrototypes) FitModel ¶
func (km *KPrototypes) FitModel(X *DenseMatrix) error
FitModel is the main algorithm function. It finds the best cluster centers for the given dataset X.
func (*KPrototypes) LoadModel ¶
func (km *KPrototypes) LoadModel() error
LoadModel loads the model (KPrototypes struct) from a file.
func (*KPrototypes) Predict ¶
func (km *KPrototypes) Predict(X *DenseMatrix) (*DenseVector, error)
Predict assigns labels to a set of new vectors.
func (*KPrototypes) SaveModel ¶
func (km *KPrototypes) SaveModel() error
SaveModel saves the computed ML model (KPrototypes struct) to the file specified in the configuration.