clustering2

package
v0.0.0-...-03d6fc4
Warning

This package is not in the latest version of its module.

Published: Jan 23, 2019 License: BSD-3-Clause Imports: 11 Imported by: 0

Documentation

Constants

const (

	// K is the k in k-means.
	K = 50

	// MAX_KMEANS_ITERATIONS is the maximum number of k-means iterations to run.
	MAX_KMEANS_ITERATIONS = 100

	// KMEAN_EPSILON is the smallest change in the k-means total error we will
	// accept per iteration.  If the change in error falls below KMEAN_EPSILON
	// the iteration will terminate.
	KMEAN_EPSILON = 1.0
)
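The two iteration constants describe a stopping rule: stop after MAX_KMEANS_ITERATIONS iterations, or sooner once the total error changes by less than KMEAN_EPSILON. The following is a minimal sketch of such a loop, not the package's actual implementation. The iterate helper and the step callback are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// Values copied from the constants documented above.
const (
	MAX_KMEANS_ITERATIONS = 100
	KMEAN_EPSILON         = 1.0
)

// iterate is a hypothetical helper. step is assumed to run one k-means
// iteration and return the total error afterwards. iterate returns the
// number of iterations run and the last total error seen.
func iterate(step func() float64) (int, float64) {
	lastError := math.MaxFloat64
	for i := 1; i <= MAX_KMEANS_ITERATIONS; i++ {
		totalError := step()
		// Stop once the error changes by less than KMEAN_EPSILON.
		if math.Abs(lastError-totalError) < KMEAN_EPSILON {
			return i, totalError
		}
		lastError = totalError
	}
	return MAX_KMEANS_ITERATIONS, lastError
}

func main() {
	// A stand-in for a real k-means step: the error halves on every call.
	current := 64.0
	n, final := iterate(func() float64 {
		current /= 2
		return current
	})
	fmt.Println(n, final) // prints "7 0.5"
}
```

In this run the error goes 32, 16, 8, 4, 2, 1, 0.5. The change first drops below 1.0 on the seventh step, so the loop stops well before the iteration cap.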

Variables

This section is empty.

Functions

func GetParamSummariesForKeys

func GetParamSummariesForKeys(keys []string) map[string][]ValueWeight

GetParamSummariesForKeys summarizes all the parameters for all observations in a cluster, given the keys of those observations.

The return value is a map holding one []ValueWeight per parameter. The members of each []ValueWeight are sorted by Weight, with higher Weights first.
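To illustrate the shape of the result, here is a rough sketch of how such a summary could be built. It assumes that keys are structured strings of the form ",name=value,name=value,", which this page does not state. The summarize function is a hypothetical stand-in, and it uses raw occurrence counts as the Weight, whereas the real Weight is only described as proportional to the count.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ValueWeight mirrors the type documented on this page (JSON tags omitted).
type ValueWeight struct {
	Value  string
	Weight int
}

// summarize is a hypothetical stand-in for GetParamSummariesForKeys.
// Assumption: each key looks like ",arch=x86,config=8888,".
func summarize(keys []string) map[string][]ValueWeight {
	// counts[param][value] is the number of keys carrying that pair.
	counts := map[string]map[string]int{}
	for _, key := range keys {
		for _, pair := range strings.Split(strings.Trim(key, ","), ",") {
			parts := strings.SplitN(pair, "=", 2)
			if len(parts) != 2 {
				continue // Skip malformed pairs.
			}
			if counts[parts[0]] == nil {
				counts[parts[0]] = map[string]int{}
			}
			counts[parts[0]][parts[1]]++
		}
	}
	ret := map[string][]ValueWeight{}
	for param, values := range counts {
		vw := make([]ValueWeight, 0, len(values))
		for value, n := range values {
			vw = append(vw, ValueWeight{Value: value, Weight: n})
		}
		// Higher weights first; ties broken by value for stable output.
		sort.Slice(vw, func(i, j int) bool {
			if vw[i].Weight != vw[j].Weight {
				return vw[i].Weight > vw[j].Weight
			}
			return vw[i].Value < vw[j].Value
		})
		ret[param] = vw
	}
	return ret
}

func main() {
	keys := []string{
		",arch=x86,config=8888,",
		",arch=x86,config=565,",
		",arch=arm,config=8888,",
	}
	fmt.Println(summarize(keys)["arch"]) // prints "[{x86 2} {arm 1}]"
}
```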

Types

type ClusterSummaries

type ClusterSummaries struct {
	Clusters        []*ClusterSummary
	StdDevThreshold float32
	K               int
}

ClusterSummaries holds one summary for each cluster that the k-means clustering found.

func CalculateClusterSummaries

func CalculateClusterSummaries(df *dataframe.DataFrame, k int, stddevThreshold float32, progress Progress, interesting float32) (*ClusterSummaries, error)

CalculateClusterSummaries runs k-means clustering over the trace shapes.

type ClusterSummary

type ClusterSummary struct {
	// Centroid is the calculated centroid of the cluster.
	Centroid []float32 `json:"centroid"`

	// Keys of all the members of the Cluster.
	//
	// The keys are sorted so that the ones at the beginning of the list are
	// closest to the centroid.
	//
	// Note: This value is not serialized to JSON.
	Keys []string `json:"-"`

	// Shortcut is the id of a shortcut for the above Keys.
	Shortcut string `json:"shortcut"`

	// ParamSummaries is a summary of all the parameters in the cluster.
	ParamSummaries map[string][]ValueWeight `json:"param_summaries"`

	// StepFit is info on the fit of the centroid to a step function.
	StepFit *stepfit.StepFit `json:"step_fit"`

	// StepPoint is the ColumnHeader for the step point.
	StepPoint *dataframe.ColumnHeader `json:"step_point"`

	// Num is the number of observations that are in this cluster.
	Num int `json:"num"`
}

ClusterSummary is a summary of a single cluster of traces.
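The struct tags above determine the JSON shape of a summary, and the `json:"-"` tag keeps Keys out of the output. The sketch below demonstrates this with a local summary type that mirrors only a subset of the fields. ParamSummaries, StepFit and StepPoint are left out so the example needs nothing beyond the standard library, and the encode helper and the sample values are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// summary mirrors a subset of ClusterSummary's fields and JSON tags.
type summary struct {
	Centroid []float32 `json:"centroid"`
	Keys     []string  `json:"-"` // Never serialized.
	Shortcut string    `json:"shortcut"`
	Num      int       `json:"num"`
}

// encode is a hypothetical helper that marshals a summary to a JSON string.
func encode(s summary) (string, error) {
	b, err := json.Marshal(s)
	return string(b), err
}

func main() {
	s := summary{
		Centroid: []float32{0.5, 1.5},
		Keys:     []string{",arch=x86,"},
		Shortcut: "X1",
		Num:      1,
	}
	out, err := encode(s)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints {"centroid":[0.5,1.5],"shortcut":"X1","num":1}
}
```

Since Keys is dropped on serialization, a consumer of the JSON would presumably rely on the Shortcut id to refer to the member keys.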

func NewClusterSummary

func NewClusterSummary() *ClusterSummary

NewClusterSummary returns a new ClusterSummary.

type Progress

type Progress func(totalError float64)
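Progress is undocumented here. Its signature suggests a callback that receives the running total error, perhaps once per k-means iteration, but that is an assumption. The sketch below shows how a caller might pass such a callback. runWithProgress is a hypothetical function that replays a fixed list of error values instead of running real clustering.

```go
package main

import "fmt"

// Progress mirrors the type documented on this page.
type Progress func(totalError float64)

// runWithProgress is hypothetical: it reports each value in totalErrors to
// progress, standing in for a clustering loop that reports its error.
func runWithProgress(totalErrors []float64, progress Progress) {
	for _, e := range totalErrors {
		if progress != nil {
			progress(e)
		}
	}
}

func main() {
	var seen []float64
	runWithProgress([]float64{40, 12, 11.5}, func(totalError float64) {
		seen = append(seen, totalError)
	})
	fmt.Println(seen) // prints "[40 12 11.5]"
}
```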

type SortableClusterable

type SortableClusterable struct {
	Observation kmeans.Clusterable
	Distance    float64
}

SortableClusterable allows for sorting kmeans.Clusterables.
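ClusterSummary.Keys is documented as sorted so that the members closest to the centroid come first, and the Distance field is a natural sort key for that. The following sketch assumes that ordering. It uses a local type with a string Observation in place of kmeans.Clusterable, and sortByDistance is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// sortableClusterable mirrors SortableClusterable, with a string standing in
// for the kmeans.Clusterable observation.
type sortableClusterable struct {
	Observation string
	Distance    float64
}

// sortByDistance orders members in place, closest to the centroid first.
func sortByDistance(members []sortableClusterable) {
	sort.Slice(members, func(i, j int) bool {
		return members[i].Distance < members[j].Distance
	})
}

func main() {
	members := []sortableClusterable{
		{Observation: "b", Distance: 2.5},
		{Observation: "a", Distance: 0.5},
		{Observation: "c", Distance: 9.0},
	}
	sortByDistance(members)
	for _, m := range members {
		fmt.Println(m.Observation, m.Distance)
	}
}
```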

type ValueWeight

type ValueWeight struct {
	Value  string `json:"value"`
	Weight int    `json:"weight"`
}

ValueWeight pairs a parameter Value with a Weight proportional to the number of times that Value appears in a cluster. Used in ClusterSummary.

type ValueWeightSortable

type ValueWeightSortable []ValueWeight

ValueWeightSortable is a utility type for sorting ValueWeights by Weight.

func (ValueWeightSortable) Len

func (p ValueWeightSortable) Len() int

func (ValueWeightSortable) Less

func (p ValueWeightSortable) Less(i, j int) bool

func (ValueWeightSortable) Swap

func (p ValueWeightSortable) Swap(i, j int)
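Len, Less and Swap together satisfy sort.Interface, so a []ValueWeight can be passed to sort.Sort after conversion. The method bodies are not shown on this page. The sketch below is one plausible implementation, assuming that Less puts higher weights first, as the GetParamSummariesForKeys documentation describes. sortByWeight is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// ValueWeight mirrors the type documented on this page.
type ValueWeight struct {
	Value  string `json:"value"`
	Weight int    `json:"weight"`
}

// ValueWeightSortable implements sort.Interface over []ValueWeight.
type ValueWeightSortable []ValueWeight

func (p ValueWeightSortable) Len() int { return len(p) }

// Less reports whether element i should sort before element j.
// Assumption: descending order, so the larger Weight comes first.
func (p ValueWeightSortable) Less(i, j int) bool { return p[i].Weight > p[j].Weight }

func (p ValueWeightSortable) Swap(i, j int) { p[i], p[j] = p[j], p[i] }

// sortByWeight is a hypothetical helper that sorts vw in place.
func sortByWeight(vw []ValueWeight) {
	sort.Sort(ValueWeightSortable(vw))
}

func main() {
	vw := []ValueWeight{
		{Value: "565", Weight: 3},
		{Value: "8888", Weight: 12},
		{Value: "gpu", Weight: 7},
	}
	sortByWeight(vw)
	fmt.Println(vw) // prints "[{8888 12} {gpu 7} {565 3}]"
}
```

The conversion ValueWeightSortable(vw) shares the underlying array with vw, so sorting the converted value reorders the original slice.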
