Documentation ¶
Overview ¶
Package som lets you build and train Self-organizing Maps (SOM) in Go.
You can create and train SOMs of arbitrary sizes using the provided API. The package implements two main SOM training algorithms: sequential and batch. It lets you choose different map and training configuration parameters that can help you tune the output to discover underlying data features. You can also visualize the trained SOM using the U-matrix function. The package also provides a handful of useful functions which can be used separately outside the SOM realm.
Index ¶
- Constants
- func AsIsInit(codebook *mat.Dense, dims []int) (*mat.Dense, error)
- func BMUs(data, codebook *mat.Dense) ([]int, error)
- func Bubble(distance float64, radius float64) float64
- func ClosestNVec(metric string, n int, v []float64, m *mat.Dense) ([]int, error)
- func ClosestVec(metric string, v []float64, m *mat.Dense) (int, error)
- func Distance(metric string, a, b []float64) (float64, error)
- func DistanceMx(metric string, m *mat.Dense) (*mat.Dense, error)
- func Gaussian(distance float64, radius float64) float64
- func GridCoords(uShape string, dims []int) (*mat.Dense, error)
- func GridSize(data *mat.Dense, uShape string) ([]int, error)
- func LRate(iteration, totalIterations int, strategy string, initLRate float64) (float64, error)
- func LinInit(data *mat.Dense, dims []int) (*mat.Dense, error)
- func MakeColors(colorCount int)
- func MexicanHat(distance float64, radius float64) float64
- func QuantError(data, codebook *mat.Dense) (float64, error)
- func Radius(iteration, totalIterations int, strategy string, initRadius float64) (float64, error)
- func RandInit(data *mat.Dense, dims []int) (*mat.Dense, error)
- func TopoError(data, codebook, grid *mat.Dense) (float64, error)
- func TopoProduct(codebook, grid *mat.Dense) (float64, error)
- func UMatrixSVG(codebook *mat.Dense, dims []int, uShape, title string, writer io.Writer, ...) error
- type CbConfig
- type CbInitFunc
- type Grid
- type GridConfig
- type Map
- func (m Map) BMUs(data *mat.Dense) ([]int, error)
- func (m Map) Codebook() mat.Matrix
- func (m Map) Grid() *Grid
- func (m *Map) MarshalTo(format string, w io.Writer) (int, error)
- func (m Map) QuantError(data *mat.Dense) (float64, error)
- func (m Map) TopoError(data *mat.Dense) (float64, error)
- func (m Map) TopoProduct() (float64, error)
- func (m *Map) Train(c *TrainConfig, data *mat.Dense, iters int) error
- func (m Map) UMatrix(w io.Writer, data *mat.Dense, classMap map[int]int, format, title string) error
- func (m Map) UnitClasses(data *mat.Dense, classMap map[int]int) (map[int]int, error)
- func (m Map) UnitDist() (*mat.Dense, error)
- func (m Map) UnitMeanClasses(data *mat.Dense, classMap map[int]int) (map[int]int, error)
- type MapConfig
- type NeighbFunc
- type TrainConfig
Constants ¶
const MinLRate = 0.01
MinLRate is the smallest possible learning rate
const MinRadius = 1.0
MinRadius is the smallest allowed SOM unit Radius
Functions ¶
func BMUs ¶
BMUs returns a slice which contains the indices of the Best Match Unit (BMU) codebook vectors for each vector stored in data rows. Each item in the returned slice corresponds to the index of the BMU for a particular data sample. If a data row has more than one BMU, the index of the first one found is used. It returns an error if either data or codebook is nil or if their dimensions are mismatched.
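The BMU search described above can be sketched with plain slices. This is a minimal standalone illustration, not the package's implementation: the helper name `bmus`, the use of `[][]float64` instead of `*mat.Dense`, and the choice of Euclidean distance are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// bmus is a hypothetical helper: for every row of data it returns the index
// of the nearest codebook row. Squared Euclidean distance is compared, which
// gives the same ordering as Euclidean distance.
func bmus(data, codebook [][]float64) ([]int, error) {
	if len(data) == 0 || len(codebook) == 0 {
		return nil, errors.New("data and codebook must be non-empty")
	}
	out := make([]int, len(data))
	for i, x := range data {
		best, bestDist := -1, math.Inf(1)
		for j, w := range codebook {
			if len(w) != len(x) {
				return nil, errors.New("dimension mismatch")
			}
			var sum float64
			for k := range x {
				d := x[k] - w[k]
				sum += d * d
			}
			// Strict comparison: on a tie the first unit found wins.
			if sum < bestDist {
				best, bestDist = j, sum
			}
		}
		out[i] = best
	}
	return out, nil
}

func main() {
	codebook := [][]float64{{0, 0}, {1, 1}, {1, 1}}
	data := [][]float64{{0.1, 0}, {0.9, 1}}
	idx, err := bmus(data, codebook)
	fmt.Println(idx, err) // [0 1] <nil>
}
```

The second sample is equally close to units 1 and 2, and the sketch keeps unit 1, mirroring the "first one found" rule.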
func ClosestNVec ¶
ClosestNVec finds the N closest vectors to v in the list of vectors stored in m rows using the supplied distance metric. It returns a slice which contains indices to the m rows. The length of the slice equals n, the number of requested closest vectors. ClosestNVec fails in the same way as ClosestVec. If n is higher than the number of rows in m, or if it is not a positive integer, it fails with an error too.
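A rough sketch of an n-closest search follows. It is not the package's code: `closestN` is a hypothetical name, the distance is fixed to Euclidean, and returning the indices ordered from nearest to farthest is an assumption, since the text above does not state the ordering.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sort"
)

// closestN is a hypothetical helper returning the indices of the n rows
// nearest to v, nearest first.
func closestN(n int, v []float64, rows [][]float64) ([]int, error) {
	if n <= 0 || n > len(rows) {
		return nil, errors.New("n must be between 1 and the number of rows")
	}
	dists := make([]float64, len(rows))
	idx := make([]int, len(rows))
	for i, r := range rows {
		if len(r) != len(v) {
			return nil, errors.New("dimension mismatch")
		}
		var sum float64
		for k := range v {
			d := v[k] - r[k]
			sum += d * d
		}
		dists[i] = math.Sqrt(sum)
		idx[i] = i
	}
	// A stable sort keeps the original row order among equal distances.
	sort.SliceStable(idx, func(a, b int) bool { return dists[idx[a]] < dists[idx[b]] })
	return idx[:n], nil
}

func main() {
	rows := [][]float64{{5, 5}, {1, 0}, {0, 0}, {1, 0}}
	idx, err := closestN(2, []float64{0, 0}, rows)
	fmt.Println(idx, err) // [2 1] <nil>
}
```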
func ClosestVec ¶
ClosestVec finds the closest vector to v in the list of vectors stored in m rows using the supplied distance metric. It returns an index into the rows of matrix m. If an unsupported metric is requested, ClosestVec falls back to the Euclidean metric. If several vectors at the same distance are found, it returns the index of the first one found. ClosestVec returns an error if either v or m is nil or if the dimension of v differs from the number of m columns. When ClosestVec fails with an error, the returned index is set to -1.
func Distance ¶
Distance calculates the metric distance between vectors a and b. If an unsupported metric is requested, Distance returns the Euclidean distance. It returns an error if the supplied vectors are either nil or have different dimensions.
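The fallback behaviour can be illustrated with a small standalone function. This is only a sketch: the metric name "manhattan" is a hypothetical placeholder, because the supported metric names are not listed in this documentation, and `distance` is not the package's function.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// distance is a hypothetical helper. Any metric name it does not recognise
// falls back to Euclidean distance.
func distance(metric string, a, b []float64) (float64, error) {
	if a == nil || b == nil {
		return 0, errors.New("nil vector")
	}
	if len(a) != len(b) {
		return 0, errors.New("dimension mismatch")
	}
	var sum float64
	switch metric {
	case "manhattan": // hypothetical metric name, used for illustration only
		for i := range a {
			sum += math.Abs(a[i] - b[i])
		}
		return sum, nil
	default: // everything else is treated as Euclidean
		for i := range a {
			d := a[i] - b[i]
			sum += d * d
		}
		return math.Sqrt(sum), nil
	}
}

func main() {
	a, b := []float64{0, 0}, []float64{3, 4}
	e, _ := distance("euclidean", a, b)
	f, _ := distance("no-such-metric", a, b)
	m, _ := distance("manhattan", a, b)
	fmt.Println(e, f, m) // 5 5 7
}
```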
func DistanceMx ¶
DistanceMx calculates metric distance matrix for the supplied matrix. Distance matrix is also known in literature as dissimilarity matrix. DistanceMx returns a hollow symmetric matrix where an item x_ij contains the distance between vectors stored in rows i and j. If an unknown metric is supplied Euclidean distance is computed. It returns error if the supplied matrix is nil.
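To make "hollow symmetric" concrete, here is a minimal sketch that builds a Euclidean distance matrix from plain slices. The name `distanceMx` and the `[][]float64` representation are assumptions for the example, and rows are assumed to have equal length.

```go
package main

import (
	"fmt"
	"math"
)

// distanceMx is a hypothetical helper returning an n x n matrix where entry
// [i][j] is the Euclidean distance between rows i and j.
func distanceMx(rows [][]float64) [][]float64 {
	n := len(rows)
	out := make([][]float64, n)
	for i := range out {
		out[i] = make([]float64, n) // the diagonal stays zero (hollow)
	}
	for i := 0; i < n; i++ {
		for j := i + 1; j < n; j++ {
			var sum float64
			for k := range rows[i] {
				d := rows[i][k] - rows[j][k]
				sum += d * d
			}
			out[i][j] = math.Sqrt(sum)
			out[j][i] = out[i][j] // mirror the value (symmetric)
		}
	}
	return out
}

func main() {
	rows := [][]float64{{0, 0}, {3, 4}, {0, 4}}
	fmt.Println(distanceMx(rows)) // [[0 5 4] [5 0 3] [4 3 0]]
}
```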
func GridCoords ¶
GridCoords returns a matrix which contains the coordinates of all SOM units stored row by row. dims specifies the size of the grid, so the returned matrix has as many rows as the product of the numbers stored in the dims slice and as many columns as the length of the dims slice. GridCoords fails with an error if the requested unit shape is unsupported or if incorrect dimensions are supplied: the dims slice can't be nil, nor can its length be greater than 3.
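The shape of the returned matrix can be seen in a small sketch for a rectangular grid. This is an illustration under assumptions: `gridCoords` is a hypothetical helper, the row ordering (last dimension varying fastest) is a guess, and the coordinate offsets a hexagonal unit shape would need are not modelled.

```go
package main

import (
	"errors"
	"fmt"
)

// gridCoords is a hypothetical helper returning one row of integer-valued
// coordinates per unit of a rectangular grid with the given dimensions.
func gridCoords(dims []int) ([][]float64, error) {
	if len(dims) == 0 || len(dims) > 3 {
		return nil, errors.New("dims must have between 1 and 3 entries")
	}
	total := 1
	for _, d := range dims {
		if d <= 0 {
			return nil, errors.New("dimensions must be positive")
		}
		total *= d
	}
	coords := make([][]float64, total)
	for i := range coords {
		row := make([]float64, len(dims))
		rem := i
		// Decompose the flat unit index into one coordinate per dimension.
		for k := len(dims) - 1; k >= 0; k-- {
			row[k] = float64(rem % dims[k])
			rem /= dims[k]
		}
		coords[i] = row
	}
	return coords, nil
}

func main() {
	coords, err := gridCoords([]int{2, 3})
	fmt.Println(len(coords), err) // 6 <nil>
	fmt.Println(coords)           // [[0 0] [0 1] [0 2] [1 0] [1 1] [1 2]]
}
```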
func GridSize ¶
GridSize tries to estimate the best dimensions of map from data matrix and given unit shape. It determines the grid size from eigenvectors of input data: the grid dimensions are calculated from the ratio of two highest input eigenvalues. It returns error if the map dimensions could not be calculated.
func LRate ¶
LRate is a decay function for the SOM learning rate parameter. It supports exponential and linear decay strategies denoted as "exp" and "lin". Any other strategy defaults to "exp". At the first iteration the function returns initLRate; at totalIterations-1 it returns MinLRate. It returns an error if initLRate is not a positive number.
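One way to obtain a schedule with these endpoints is sketched below. The formulas are assumptions chosen so that the first iteration yields the initial value and the last one yields the minimum; the package's exact formulas are not given in this documentation, and `decay` is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// minLRate mirrors the documented MinLRate constant.
const minLRate = 0.01

// decay is a hypothetical helper that interpolates from init (iteration 0)
// down to min (iteration total-1), linearly or exponentially.
func decay(iter, total int, strategy string, init, min float64) (float64, error) {
	if init <= 0 {
		return 0, errors.New("initial value must be positive")
	}
	if total < 2 || iter < 0 || iter >= total {
		return 0, errors.New("iteration out of range")
	}
	t := float64(iter) / float64(total-1) // progress in [0, 1]
	switch strategy {
	case "lin":
		return init - (init-min)*t, nil
	default: // "exp" and any unknown strategy
		return init * math.Pow(min/init, t), nil
	}
}

func main() {
	for _, it := range []int{0, 50, 99} {
		lin, _ := decay(it, 100, "lin", 0.5, minLRate)
		exp, _ := decay(it, 100, "exp", 0.5, minLRate)
		fmt.Printf("iter=%d lin=%.4f exp=%.4f\n", it, lin, exp)
	}
}
```

With the same endpoints, the exponential schedule drops faster early on and flattens out near the minimum, while the linear one decreases by a constant step.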
func LinInit ¶
LinInit returns a matrix initialized to values lying in a linear space spanned by principal components of data stored in the data matrix passed in as parameter. It fails with error if the new matrix could not be initialized or if data is nil.
func MakeColors ¶
func MakeColors(colorCount int)
func MexicanHat ¶
MexicanHat calculates the Mexican hat neighbourhood function.
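The index lists three neighbourhood functions with the same signature: Bubble, Gaussian and MexicanHat. The sketch below uses common textbook forms of these functions. The exact formulas and scaling used by the package are not documented here, so treat these definitions as assumptions; the radius is assumed to be positive.

```go
package main

import (
	"fmt"
	"math"
)

// bubble returns 1 inside the radius and 0 outside.
func bubble(distance, radius float64) float64 {
	if distance <= radius {
		return 1
	}
	return 0
}

// gaussian decays smoothly from 1 at distance 0.
func gaussian(distance, radius float64) float64 {
	return math.Exp(-(distance * distance) / (2 * radius * radius))
}

// mexicanHat is positive near the centre, zero at distance == radius and
// negative beyond it, so far-away units are pushed away slightly.
func mexicanHat(distance, radius float64) float64 {
	r := (distance * distance) / (radius * radius)
	return (1 - r) * math.Exp(-r/2)
}

func main() {
	for _, d := range []float64{0, 1, 2} {
		fmt.Printf("d=%.0f bubble=%.3f gaussian=%.3f mexican=%.3f\n",
			d, bubble(d, 1), gaussian(d, 1), mexicanHat(d, 1))
	}
}
```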
func QuantError ¶
QuantError computes the SOM quantization error for the supplied data set and codebook and returns it. It fails with an error if either data or codebook is nil or if the distance between the codebook and data vectors could not be calculated, for example because the dimensions of the supplied data and codebook matrices are not the same. When an error is returned, the quantization error is set to -1.0.
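Quantization error is commonly defined as the mean distance between each data sample and its BMU. The sketch below follows that definition with Euclidean distance; whether the package uses exactly this definition is an assumption, and `quantError` is a hypothetical standalone helper.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// quantError is a hypothetical helper returning the mean Euclidean distance
// from each data row to its nearest codebook row, or -1 on error.
func quantError(data, codebook [][]float64) (float64, error) {
	if len(data) == 0 || len(codebook) == 0 {
		return -1, errors.New("data and codebook must be non-empty")
	}
	var total float64
	for _, x := range data {
		best := math.Inf(1)
		for _, w := range codebook {
			if len(w) != len(x) {
				return -1, errors.New("dimension mismatch")
			}
			var sum float64
			for k := range x {
				d := x[k] - w[k]
				sum += d * d
			}
			if d := math.Sqrt(sum); d < best {
				best = d
			}
		}
		total += best
	}
	return total / float64(len(data)), nil
}

func main() {
	codebook := [][]float64{{0, 0}, {10, 0}}
	data := [][]float64{{3, 4}, {10, 1}}
	fmt.Println(quantError(data, codebook)) // 3 <nil>
}
```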
func Radius ¶
Radius is a decay function for the SOM neighbourhood radius parameter. It supports exponential and linear decay strategies denoted as "exp" and "lin". Any other strategy defaults to "exp". At the first iteration the function returns initRadius; at totalIterations-1 it returns MinRadius. It returns an error if initRadius is not a positive number.
func RandInit ¶
RandInit returns a matrix initialized to uniformly distributed random values in each column in the range [min, max], where min and max are the minimum and maximum values in the corresponding column of the data matrix. The returned matrix has product(dims) rows and as many columns as the matrix passed in as a parameter. It fails with an error if the new matrix could not be initialized or if data is nil.
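A per-column uniform initialization can be sketched as follows. This is not the package's implementation: `randInit` is a hypothetical helper that works on `[][]float64` and takes an explicit seed so that runs are repeatable.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// randInit is a hypothetical helper returning product(dims) rows whose
// column j is drawn uniformly between the minimum and maximum of data
// column j.
func randInit(data [][]float64, dims []int, seed int64) ([][]float64, error) {
	if len(data) == 0 || len(data[0]) == 0 || len(dims) == 0 {
		return nil, errors.New("data and dims must be non-empty")
	}
	rows := 1
	for _, d := range dims {
		if d <= 0 {
			return nil, errors.New("dimensions must be positive")
		}
		rows *= d
	}
	cols := len(data[0])
	mins := append([]float64(nil), data[0]...)
	maxs := append([]float64(nil), data[0]...)
	for _, r := range data[1:] {
		if len(r) != cols {
			return nil, errors.New("ragged data matrix")
		}
		for j, v := range r {
			if v < mins[j] {
				mins[j] = v
			}
			if v > maxs[j] {
				maxs[j] = v
			}
		}
	}
	rng := rand.New(rand.NewSource(seed))
	out := make([][]float64, rows)
	for i := range out {
		out[i] = make([]float64, cols)
		for j := range out[i] {
			// Float64 is in [0, 1), so values land in [min, max).
			out[i][j] = mins[j] + rng.Float64()*(maxs[j]-mins[j])
		}
	}
	return out, nil
}

func main() {
	data := [][]float64{{0, 10}, {1, 20}, {0.5, 15}}
	cb, err := randInit(data, []int{2, 3}, 42)
	fmt.Println(len(cb), len(cb[0]), err) // 6 2 <nil>
}
```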
func TopoError ¶
TopoError calculates the topographic error for the given data set, codebook and grid and returns it. It returns an error if either data, codebook or grid is nil or if their dimensions are mismatched.
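Topographic error is usually defined as the fraction of samples whose best and second-best matching units are not neighbours on the grid. The sketch below follows that definition under stated assumptions: Euclidean distances, a rectangular grid, and "neighbours" meaning a grid distance of at most 1. The package's adjacency rule is not documented here, and `topoError` is a hypothetical helper.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// dist is the Euclidean distance between two equally long vectors.
func dist(a, b []float64) float64 {
	var sum float64
	for k := range a {
		d := a[k] - b[k]
		sum += d * d
	}
	return math.Sqrt(sum)
}

// topoError is a hypothetical helper. codebook[j] is the weight vector of
// unit j and grid[j] holds its grid coordinates.
func topoError(data, codebook, grid [][]float64) (float64, error) {
	if len(data) == 0 || len(codebook) < 2 || len(codebook) != len(grid) {
		return -1, errors.New("invalid or mismatched inputs")
	}
	misses := 0
	for _, x := range data {
		first, second := -1, -1
		d1, d2 := math.Inf(1), math.Inf(1)
		for j, w := range codebook {
			if len(w) != len(x) {
				return -1, errors.New("dimension mismatch")
			}
			d := dist(x, w)
			switch {
			case d < d1: // new best; the old best becomes second best
				second, d2 = first, d1
				first, d1 = j, d
			case d < d2:
				second, d2 = j, d
			}
		}
		// Assumption: units are neighbours when their grid distance is <= 1.
		if dist(grid[first], grid[second]) > 1 {
			misses++
		}
	}
	return float64(misses) / float64(len(data)), nil
}

func main() {
	grid := [][]float64{{0}, {1}, {2}}
	codebook := [][]float64{{0}, {10}, {1}}
	data := [][]float64{{0.4}, {9}}
	fmt.Println(topoError(data, codebook, grid)) // 0.5 <nil>
}
```

In the example the first sample maps to units 0 and 2, which are two grid steps apart, so it counts as an error; the second maps to the adjacent units 1 and 2.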
func TopoProduct ¶
TopoProduct calculates the topographic product for the given codebook and grid. TopoProduct returns an error if either codebook or grid is nil or if the number of codebook rows is not the same as the number of grid rows. If any two codebook vectors turn out to be the same, TopoProduct returns +Inf. This can happen when the map is trained using the batch algorithm.
func UMatrixSVG ¶
func UMatrixSVG(codebook *mat.Dense, dims []int, uShape, title string, writer io.Writer, classes map[int]int) error
UMatrixSVG creates an SVG representation of the U-Matrix of the given codebook. It accepts the following parameters:
- codebook: the codebook we're displaying the U-Matrix for
- dims: the dimensions of the map grid
- uShape: the shape of the map grid
- title: the title of the output SVG
- writer: the io.Writer to write the output SVG to
- classes: if the classes are known (i.e. these are test data) they can be displayed by providing the information in this map. The map is: codebook vector row -> class number. When classes are not known (i.e. running with real data), just provide an empty map.
Types ¶
type CbConfig ¶
type CbConfig struct {
    // Dim defines the number of codebook vector dimensions
    Dim int
    // InitFunc specifies the codebook initialization function
    InitFunc CbInitFunc
}
CbConfig holds SOM codebook configuration
type CbInitFunc ¶
CbInitFunc defines SOM codebook initialization function
type Grid ¶
type Grid struct {
// contains filtered or unexported fields
}
Grid is a SOM grid
func NewGrid ¶
func NewGrid(c *GridConfig) (*Grid, error)
NewGrid creates a new grid and returns it. It fails with an error if the supplied configuration is incorrect.
type GridConfig ¶
type GridConfig struct {
    // Size specifies SOM grid dimensions
    Size []int
    // Type specifies the type of SOM grid: planar
    Type string
    // UShape specifies SOM unit shape: hexagon, rectangle
    UShape string
}
GridConfig holds SOM grid configuration
type Map ¶
type Map struct {
// contains filtered or unexported fields
}
Map is a Self Organizing Map (SOM)
func NewMap ¶
NewMap creates new SOM based on the provided configuration. It creates a map grid and initializes codebook vectors using the provided configuration parameter. NewMap returns error if the provided configuration is not valid or if the data matrix is nil or if the codebook matrix could not be initialized. TODO: Avoid passing in data matrix when creating new map
func (Map) BMUs ¶
BMUs returns a slice which contains indices of Best Match Unit vectors to the map codebook for each vector stored in data rows. It returns error if the data dimension and map codebook dimensions are not the same.
func (*Map) MarshalTo ¶
MarshalTo serializes SOM codebook in a given format to writer w. At the moment only the native gonum binary format is supported. It returns the number of bytes written to w or fails with error.
func (Map) QuantError ¶
QuantError computes the SOM quantization error for the supplied data set. It returns the quantization error or fails with an error if the supplied data is nil or the distance between vectors could not be calculated. When an error is returned, the quantization error is set to -1.0.
func (Map) TopoError ¶
TopoError computes the SOM topographic error for a given data set. It returns a single number or fails with an error if the topographic error could not be computed.
func (Map) TopoProduct ¶
TopoProduct computes the SOM topographic product. It returns a single number or fails with an error if the product could not be computed.
func (*Map) Train ¶
Train runs SOM training for a given data set and training configuration parameters. It modifies the map codebook vectors based on the chosen training algorithm. It returns an error if the supplied training configuration is invalid or if training fails.
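To give a feel for how sequential training modifies the codebook, here is a sketch of a single textbook update step: find the BMU for one sample, then pull every unit towards the sample in proportion to the learning rate and a neighbourhood weight. It is an illustration only. `seqStep` is a hypothetical helper, the Gaussian neighbourhood is an assumption, and the package's actual Train loop (including decay of the rate and radius) is not shown in this documentation.

```go
package main

import (
	"fmt"
	"math"
)

// dist is the Euclidean distance between two equally long vectors.
func dist(a, b []float64) float64 {
	var sum float64
	for k := range a {
		d := a[k] - b[k]
		sum += d * d
	}
	return math.Sqrt(sum)
}

// seqStep is a hypothetical helper that applies one sequential update for
// sample x in place and returns the BMU index. codebook[j] is the weight
// vector of unit j and grid[j] holds its grid coordinates.
func seqStep(x []float64, codebook, grid [][]float64, lrate, radius float64) int {
	bmu := 0
	for j := range codebook {
		if dist(x, codebook[j]) < dist(x, codebook[bmu]) {
			bmu = j
		}
	}
	for j, w := range codebook {
		g := dist(grid[j], grid[bmu])
		// Gaussian neighbourhood weight: 1 at the BMU, smaller further away.
		h := math.Exp(-(g * g) / (2 * radius * radius))
		for k := range w {
			w[k] += lrate * h * (x[k] - w[k])
		}
	}
	return bmu
}

func main() {
	codebook := [][]float64{{0}, {1}}
	grid := [][]float64{{0}, {1}}
	bmu := seqStep([]float64{1}, codebook, grid, 0.5, 1)
	fmt.Println(bmu, codebook)
}
```

Here unit 1 already matches the sample and stays put, while its grid neighbour, unit 0, moves part of the way towards the sample.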
func (Map) UMatrix ¶
func (m Map) UMatrix(w io.Writer, data *mat.Dense, classMap map[int]int, format, title string) error
UMatrix generates SOM u-matrix in a given format and writes the output to w. At the moment only SVG format is supported. It fails with error if the write to w fails.
func (Map) UnitClasses ¶
UnitClasses returns a map that contains, for each map unit, the most frequent class among the data samples for which that unit is the BMU.
type MapConfig ¶
type MapConfig struct {
    // Grid holds SOM grid configuration
    Grid *GridConfig
    // Cb holds SOM codebook configuration
    Cb *CbConfig
}
MapConfig holds SOM configuration
type NeighbFunc ¶
NeighbFunc defines SOM neighbourhood function
type TrainConfig ¶
type TrainConfig struct {
    // Algorithm specifies training method: seq or batch
    Algorithm string
    // Radius specifies initial SOM units radius
    Radius float64
    // RDecay specifies radius decay strategy: lin, exp
    RDecay string
    // NeighbFn specifies SOM neighbourhood function: gaussian, bubble, mexican
    NeighbFn NeighbFunc
    // LRate specifies initial SOM learning rate
    LRate float64
    // LDecay specifies learning rate decay strategy: lin, exp
    LDecay string
}
TrainConfig holds SOM training configuration