som

Version: v0.0.0-...-b9d6bac
Published: Sep 11, 2020
License: Apache-2.0
Imports: 16
Imported by: 0

Documentation

Overview

Package som lets you build and train self-organizing maps (SOMs) in Go.

You can create and train SOMs of arbitrary sizes using the provided API. The package implements two main SOM training algorithms: sequential and batch. It lets you choose different map and training configuration parameters that can help you tune the output to discover underlying data features. You can also visualize the trained SOM using the U-matrix function. The package also provides a handful of useful functions that can be used separately, outside the SOM realm.

Constants

const MinLRate = 0.01

MinLRate is the smallest possible learning rate.

const MinRadius = 1.0

MinRadius is the smallest allowed SOM unit radius.

Functions

func AsIsInit

func AsIsInit(codebook *mat.Dense, dims []int) (*mat.Dense, error)

AsIsInit returns the codebook parameter as is.

func BMUs

func BMUs(data, codebook *mat.Dense) ([]int, error)

BMUs returns a slice which contains the indices of the Best Match Unit (BMU) codebook vectors for each vector stored in data rows. Each item in the returned slice corresponds to the index of the BMU for a particular data sample. If a data row has more than one BMU, the index of the first one found is used. It returns an error if either data or codebook is nil or if their dimensions are mismatched.
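The BMU search described above can be sketched with plain slices instead of *mat.Dense. This is only an illustration under stated assumptions: bmus and sqDist are hypothetical helpers, not part of the package, and the Euclidean metric is assumed.

```go
package main

import (
	"errors"
	"fmt"
)

// sqDist returns the squared Euclidean distance between a and b.
// Hypothetical helper; it assumes len(a) == len(b).
func sqDist(a, b []float64) float64 {
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

// bmus returns, for every data row, the index of the closest codebook row.
// On ties the first codebook row found wins, because only a strictly
// smaller distance replaces the current best.
func bmus(data, codebook [][]float64) ([]int, error) {
	if len(data) == 0 || len(codebook) == 0 {
		return nil, errors.New("data and codebook must not be empty")
	}
	out := make([]int, len(data))
	for i, row := range data {
		best := -1
		var bestDist float64
		for j, cv := range codebook {
			if len(cv) != len(row) {
				return nil, errors.New("dimension mismatch")
			}
			if d := sqDist(row, cv); best == -1 || d < bestDist {
				best, bestDist = j, d
			}
		}
		out[i] = best
	}
	return out, nil
}

func main() {
	// Codebook rows 1 and 2 are identical, so they tie for the second sample.
	codebook := [][]float64{{0, 0}, {1, 1}, {1, 1}}
	data := [][]float64{{0.1, 0.2}, {0.9, 1.0}}
	idx, err := bmus(data, codebook)
	fmt.Println(idx, err)
}
```

Comparing squared distances avoids a square root per pair and does not change which unit wins.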

func Bubble

func Bubble(distance float64, radius float64) float64

Bubble calculates the bubble neighbourhood.

func ClosestNVec

func ClosestNVec(metric string, n int, v []float64, m *mat.Dense) ([]int, error)

ClosestNVec finds the N closest vectors to v in the list of vectors stored in m rows using the supplied distance metric. It returns a slice which contains indices of the m rows. The length of the slice equals n, the number of requested closest vectors. ClosestNVec fails in the same way as ClosestVec. It also fails with an error if n is higher than the number of rows in m or if n is not a positive integer.
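A minimal sketch of an n-closest lookup follows. It is not the package's implementation: closestN is a hypothetical helper, the Euclidean metric is assumed, and returning the indices ordered from nearest to farthest is also an assumption, since the description above does not state the ordering.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sort"
)

// closestN returns the indices of the n rows closest to v, nearest first.
// Hypothetical helper using the Euclidean metric only.
func closestN(n int, v []float64, rows [][]float64) ([]int, error) {
	if n <= 0 || n > len(rows) {
		return nil, errors.New("n must be between 1 and the number of rows")
	}
	dists := make([]float64, len(rows))
	idx := make([]int, len(rows))
	for i, r := range rows {
		if len(r) != len(v) {
			return nil, errors.New("dimension mismatch")
		}
		var sum float64
		for k := range v {
			d := v[k] - r[k]
			sum += d * d
		}
		dists[i] = math.Sqrt(sum)
		idx[i] = i
	}
	// A stable sort keeps the lower row index first when distances tie.
	sort.SliceStable(idx, func(a, b int) bool { return dists[idx[a]] < dists[idx[b]] })
	return idx[:n], nil
}

func main() {
	rows := [][]float64{{5, 5}, {0, 0}, {1, 0}, {0, 1}}
	idx, err := closestN(3, []float64{0, 0}, rows)
	fmt.Println(idx, err)
}
```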

func ClosestVec

func ClosestVec(metric string, v []float64, m *mat.Dense) (int, error)

ClosestVec finds the closest vector to v in the list of vectors stored in m rows using the supplied distance metric. It returns the index of the matching row of m. If an unsupported metric is requested, ClosestVec falls back to the Euclidean metric. If several vectors are at the same distance, it returns the index of the first one found. ClosestVec returns an error if either v or m is nil or if the dimension of v differs from the number of columns of m. When ClosestVec fails, the returned index is set to -1.

func Distance

func Distance(metric string, a, b []float64) (float64, error)

Distance calculates the metric distance between vectors a and b. If an unsupported metric is requested, Distance returns the Euclidean distance. It returns an error if the supplied vectors are either nil or have different dimensions.

func DistanceMx

func DistanceMx(metric string, m *mat.Dense) (*mat.Dense, error)

DistanceMx calculates the metric distance matrix for the supplied matrix. A distance matrix is also known in the literature as a dissimilarity matrix. DistanceMx returns a hollow symmetric matrix where an item x_ij contains the distance between the vectors stored in rows i and j. If an unknown metric is supplied, the Euclidean distance is computed. It returns an error if the supplied matrix is nil.
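To make "hollow symmetric" concrete, here is a small sketch that builds a Euclidean distance matrix over plain slices. distanceMx is a hypothetical name, and the code is an illustration rather than the package's own routine.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// distanceMx returns a matrix whose entry [i][j] is the Euclidean distance
// between rows i and j. The diagonal stays zero (hollow) and the matrix is
// symmetric because each pair is computed once and stored twice.
func distanceMx(rows [][]float64) ([][]float64, error) {
	n := len(rows)
	if n == 0 {
		return nil, errors.New("rows must not be empty")
	}
	out := make([][]float64, n)
	for i := range out {
		out[i] = make([]float64, n)
	}
	for i := 0; i < n; i++ {
		for j := i + 1; j < n; j++ {
			if len(rows[i]) != len(rows[j]) {
				return nil, errors.New("dimension mismatch")
			}
			var sum float64
			for k := range rows[i] {
				d := rows[i][k] - rows[j][k]
				sum += d * d
			}
			d := math.Sqrt(sum)
			out[i][j], out[j][i] = d, d
		}
	}
	return out, nil
}

func main() {
	dm, err := distanceMx([][]float64{{0, 0}, {3, 4}, {6, 8}})
	fmt.Println(dm, err)
}
```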

func Gaussian

func Gaussian(distance float64, radius float64) float64

Gaussian calculates the Gaussian neighbourhood.

func GridCoords

func GridCoords(uShape string, dims []int) (*mat.Dense, error)

GridCoords returns a matrix which contains the coordinates of all SOM units stored row by row. dims specifies the size of the grid, so the returned matrix has as many rows as the product of the numbers stored in the dims slice and as many columns as the length of the dims slice. GridCoords fails with an error if the requested unit shape is unsupported or if incorrect dimensions are supplied: the dims slice can't be nil, nor can its length be bigger than 3.
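The shape rule above (product(dims) rows, len(dims) columns) can be sketched for a rectangular grid as follows. gridCoords is a hypothetical helper; the integer coordinates and the ordering, with the last dimension varying fastest, are assumptions, and hexagonal units are not covered.

```go
package main

import (
	"errors"
	"fmt"
)

// gridCoords enumerates the unit coordinates of a rectangular grid.
// The result has product(dims) rows and len(dims) columns.
func gridCoords(dims []int) ([][]float64, error) {
	if len(dims) == 0 || len(dims) > 3 {
		return nil, errors.New("dims must have between 1 and 3 entries")
	}
	total := 1
	for _, d := range dims {
		if d <= 0 {
			return nil, errors.New("dims entries must be positive")
		}
		total *= d
	}
	coords := make([][]float64, total)
	for i := range coords {
		c := make([]float64, len(dims))
		rem := i
		// Decode the flat index i; the last dimension varies fastest.
		for k := len(dims) - 1; k >= 0; k-- {
			c[k] = float64(rem % dims[k])
			rem /= dims[k]
		}
		coords[i] = c
	}
	return coords, nil
}

func main() {
	coords, err := gridCoords([]int{2, 3})
	fmt.Println(coords, err)
}
```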

func GridSize

func GridSize(data *mat.Dense, uShape string) ([]int, error)

GridSize tries to estimate the best dimensions of map from data matrix and given unit shape. It determines the grid size from eigenvectors of input data: the grid dimensions are calculated from the ratio of two highest input eigenvalues. It returns error if the map dimensions could not be calculated.

func LRate

func LRate(iteration, totalIterations int, strategy string, initLRate float64) (float64, error)

LRate is a decay function for the SOM learning rate parameter. It supports exponential and linear decay strategies, denoted as "exp" and "lin". Any other strategy defaults to "exp". At the first iteration the function returns initLRate; at iteration totalIterations-1 it returns MinLRate. It returns an error if initLRate is not a positive number.
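One way to satisfy the two endpoints described above (the initial value at iteration 0, the minimum at totalIterations-1) is to interpolate on a progress value in [0, 1]. The sketch below is an assumption about how such a decay could look, not the package's formula; decay, start and floor are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// decay interpolates from start (iteration 0) down to floor (iteration total-1).
// "lin" is linear; any other strategy falls back to exponential decay.
func decay(iteration, total int, strategy string, start, floor float64) (float64, error) {
	if start <= 0 || floor <= 0 {
		return 0, errors.New("start and floor must be positive")
	}
	if total < 2 {
		return start, nil
	}
	t := float64(iteration) / float64(total-1) // progress in [0, 1]
	if strategy == "lin" {
		return start - (start-floor)*t, nil
	}
	// Geometric interpolation: start at t=0, floor at t=1.
	return start * math.Pow(floor/start, t), nil
}

func main() {
	for _, it := range []int{0, 25, 50, 75, 99} {
		lin, _ := decay(it, 100, "lin", 0.5, 0.01)
		exp, _ := decay(it, 100, "exp", 0.5, 0.01)
		fmt.Printf("iter=%d lin=%.4f exp=%.4f\n", it, lin, exp)
	}
}
```

The same shape would work for a radius schedule by passing an initial radius and a minimum radius instead.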

func LinInit

func LinInit(data *mat.Dense, dims []int) (*mat.Dense, error)

LinInit returns a matrix initialized to values lying in a linear space spanned by principal components of data stored in the data matrix passed in as parameter. It fails with error if the new matrix could not be initialized or if data is nil.

func MakeColors

func MakeColors(colorCount int)

func MexicanHat

func MexicanHat(distance float64, radius float64) float64

MexicanHat calculates the Mexican hat neighbourhood.
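The documentation does not spell out the formulas behind Bubble, Gaussian and MexicanHat. The sketch below uses common textbook forms with the same (distance, radius) signature; the exact expressions, including the scaling of the Mexican hat, are assumptions and may differ from what the package computes.

```go
package main

import (
	"fmt"
	"math"
)

// bubble is 1 inside the radius and 0 outside (assumed textbook form).
func bubble(distance, radius float64) float64 {
	if distance <= radius {
		return 1
	}
	return 0
}

// gaussian decays smoothly with distance: exp(-d^2 / (2 r^2)).
// It assumes radius > 0.
func gaussian(distance, radius float64) float64 {
	return math.Exp(-(distance * distance) / (2 * radius * radius))
}

// mexicanHat is positive near the centre and negative beyond the radius:
// (1 - d^2/r^2) * exp(-d^2 / (2 r^2)). It assumes radius > 0.
func mexicanHat(distance, radius float64) float64 {
	r := (distance * distance) / (radius * radius)
	return (1 - r) * math.Exp(-r/2)
}

func main() {
	for _, d := range []float64{0, 1, 2, 3} {
		fmt.Printf("d=%.0f bubble=%.0f gaussian=%.3f mexican=%.3f\n",
			d, bubble(d, 2), gaussian(d, 2), mexicanHat(d, 2))
	}
}
```

All three match the NeighbFunc shape func(float64, float64) float64, so any of them could be passed wherever a neighbourhood function is expected.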

func QuantError

func QuantError(data, codebook *mat.Dense) (float64, error)

QuantError computes the SOM quantization error for the supplied data set and codebook and returns it. It fails with an error if either data or codebook is nil or if the distance between the codebook and data vectors could not be calculated, for example because the dimensions of the passed-in data and codebook matrices are not the same. When an error is returned, the quantization error is set to -1.0.
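Quantization error is commonly defined as the mean distance between each data sample and its BMU. The sketch below follows that definition with the Euclidean metric; quantError is a hypothetical helper, and the package may use a different metric or normalisation. It mirrors the convention above of returning -1 together with an error.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// quantError returns the mean Euclidean distance from each data row to its
// closest codebook row. On failure it returns -1 and an error.
func quantError(data, codebook [][]float64) (float64, error) {
	if len(data) == 0 || len(codebook) == 0 {
		return -1, errors.New("data and codebook must not be empty")
	}
	var total float64
	for _, x := range data {
		best := math.Inf(1)
		for _, cv := range codebook {
			if len(cv) != len(x) {
				return -1, errors.New("dimension mismatch")
			}
			var sum float64
			for k := range x {
				d := x[k] - cv[k]
				sum += d * d
			}
			if sum < best {
				best = sum
			}
		}
		total += math.Sqrt(best)
	}
	return total / float64(len(data)), nil
}

func main() {
	qe, err := quantError([][]float64{{0, 0}, {3, 4}}, [][]float64{{0, 0}})
	fmt.Println(qe, err)
}
```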

func Radius

func Radius(iteration, totalIterations int, strategy string, initRadius float64) (float64, error)

Radius is a decay function for the SOM neighbourhood radius parameter. It supports exponential and linear decay strategies, denoted as "exp" and "lin". Any other strategy defaults to "exp". At the first iteration the function returns initRadius; at iteration totalIterations-1 it returns MinRadius. It returns an error if initRadius is not a positive number.

func RandInit

func RandInit(data *mat.Dense, dims []int) (*mat.Dense, error)

RandInit returns a matrix initialized to uniformly distributed random values, where each column is drawn from the range [min, max] and min and max are the minimum and maximum values of the corresponding column of the data matrix. The returned matrix has product(dims) rows and as many columns as the matrix passed in as a parameter. It fails with an error if the new matrix could not be initialized or if data is nil.
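The per-column range idea can be sketched as follows with plain slices. randInit is a hypothetical helper, and the explicit seed parameter is an addition made here so the output is reproducible; it is not part of the package's signature.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// randInit draws each codebook value uniformly from the [min, max] range of
// the matching data column. The result has product(dims) rows.
func randInit(data [][]float64, dims []int, seed int64) ([][]float64, error) {
	if len(data) == 0 || len(data[0]) == 0 || len(dims) == 0 {
		return nil, errors.New("data and dims must not be empty")
	}
	rows := 1
	for _, d := range dims {
		if d <= 0 {
			return nil, errors.New("dims entries must be positive")
		}
		rows *= d
	}
	cols := len(data[0])
	lo := append([]float64(nil), data[0]...)
	hi := append([]float64(nil), data[0]...)
	for _, r := range data[1:] {
		if len(r) != cols {
			return nil, errors.New("ragged data rows")
		}
		for j, v := range r {
			if v < lo[j] {
				lo[j] = v
			}
			if v > hi[j] {
				hi[j] = v
			}
		}
	}
	rng := rand.New(rand.NewSource(seed))
	out := make([][]float64, rows)
	for i := range out {
		out[i] = make([]float64, cols)
		for j := range out[i] {
			// Float64 is in [0, 1), so the value stays within the column range.
			out[i][j] = lo[j] + rng.Float64()*(hi[j]-lo[j])
		}
	}
	return out, nil
}

func main() {
	data := [][]float64{{0, 10}, {2, 30}, {1, 20}}
	cb, err := randInit(data, []int{2, 3}, 1)
	fmt.Println(len(cb), err)
}
```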

func TopoError

func TopoError(data, codebook, grid *mat.Dense) (float64, error)

TopoError calculates the topographic error for the given data set, codebook and grid and returns it. It returns an error if either data, codebook or grid is nil or if their dimensions are mismatched.
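Topographic error is usually defined as the fraction of samples whose best and second-best matching units are not neighbours on the grid. The sketch below follows that definition on a rectangular grid; topoError is a hypothetical helper, and treating units as adjacent when their grid distance is at most 1 is an assumption that may not match the package.

```go
package main

import (
	"errors"
	"fmt"
)

// sqDist returns the squared Euclidean distance; it assumes equal lengths.
func sqDist(a, b []float64) float64 {
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

// topoError returns the share of samples whose two closest codebook
// vectors belong to units that are not adjacent on the grid.
func topoError(data, codebook, grid [][]float64) (float64, error) {
	if len(data) == 0 || len(codebook) < 2 || len(codebook) != len(grid) {
		return 0, errors.New("need data, at least 2 units, and one grid row per unit")
	}
	errs := 0
	for _, x := range data {
		first, second := -1, -1
		var d1, d2 float64
		for j, cv := range codebook {
			if len(cv) != len(x) {
				return 0, errors.New("dimension mismatch")
			}
			d := sqDist(x, cv)
			switch {
			case first == -1 || d < d1:
				second, d2 = first, d1
				first, d1 = j, d
			case second == -1 || d < d2:
				second, d2 = j, d
			}
		}
		// Assumed adjacency rule: grid distance of at most 1.
		if sqDist(grid[first], grid[second]) > 1 {
			errs++
		}
	}
	return float64(errs) / float64(len(data)), nil
}

func main() {
	// Units 0 and 2 are close in data space but two steps apart on the grid.
	codebook := [][]float64{{0}, {10}, {1}}
	grid := [][]float64{{0, 0}, {0, 1}, {0, 2}}
	te, err := topoError([][]float64{{0.4}, {9}}, codebook, grid)
	fmt.Println(te, err)
}
```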

func TopoProduct

func TopoProduct(codebook, grid *mat.Dense) (float64, error)

TopoProduct calculates the topographic product for the given codebook and grid. TopoProduct returns an error if either codebook or grid is nil or if the number of codebook rows is not the same as the number of grid rows. If any two codebook vectors turn out to be the same, TopoProduct returns +Inf; this can happen when the map is trained using the batch algorithm.

func UMatrixSVG

func UMatrixSVG(codebook *mat.Dense, dims []int, uShape, title string, writer io.Writer, classes map[int]int) error

UMatrixSVG creates an SVG representation of the U-Matrix of the given codebook. It accepts the following parameters:

codebook - the codebook we're displaying the U-Matrix for
dims - the dimensions of the map grid
uShape - the shape of the map grid
title - the title of the output SVG
writer - the io.Writer to write the output SVG to
classes - if the classes are known (i.e. these are test data), they can be displayed by providing the information in this map. The map is: codebook vector row -> class number. When classes are not known (i.e. running with real data), just provide an empty map.

Types

type CbConfig

type CbConfig struct {
	// Dim defines number of codebook vector dimension
	Dim int
	// InitFunc specifies codebook initialization function
	InitFunc CbInitFunc
}

CbConfig holds SOM codebook configuration

type CbInitFunc

type CbInitFunc func(*mat.Dense, []int) (*mat.Dense, error)

CbInitFunc defines SOM codebook initialization function

type Grid

type Grid struct {
	// contains filtered or unexported fields
}

Grid is a SOM grid

func NewGrid

func NewGrid(c *GridConfig) (*Grid, error)

NewGrid creates a new grid and returns it. It fails with an error if the supplied configuration is incorrect.

func (*Grid) Coords

func (g *Grid) Coords() mat.Matrix

Coords returns a matrix that contains grid coordinates

func (*Grid) Size

func (g *Grid) Size() []int

Size returns a slice that contains Grid dimensions

func (*Grid) UShape

func (g *Grid) UShape() string

UShape returns grid unit shape

type GridConfig

type GridConfig struct {
	// Size specifies SOM grid dimensions
	Size []int
	// Type specifies the type of SOM grid: planar
	Type string
	// UShape specifies SOM unit shape: hexagon, rectangle
	UShape string
}

GridConfig holds SOM grid configuration

type Map

type Map struct {
	// contains filtered or unexported fields
}

Map is a Self Organizing Map (SOM)

func NewMap

func NewMap(c *MapConfig, data *mat.Dense) (*Map, error)

NewMap creates a new SOM based on the provided configuration. It creates a map grid and initializes codebook vectors using the provided configuration parameter. NewMap returns an error if the provided configuration is not valid, if the data matrix is nil, or if the codebook matrix could not be initialized.

TODO: Avoid passing in data matrix when creating new map

func (Map) BMUs

func (m Map) BMUs(data *mat.Dense) ([]int, error)

BMUs returns a slice which contains indices of Best Match Unit vectors to the map codebook for each vector stored in data rows. It returns error if the data dimension and map codebook dimensions are not the same.

func (Map) Codebook

func (m Map) Codebook() mat.Matrix

Codebook returns a matrix which contains SOM codebook vectors

func (Map) Grid

func (m Map) Grid() *Grid

Grid returns SOM grid

func (*Map) MarshalTo

func (m *Map) MarshalTo(format string, w io.Writer) (int, error)

MarshalTo serializes SOM codebook in a given format to writer w. At the moment only the native gonum binary format is supported. It returns the number of bytes written to w or fails with error.

func (Map) QuantError

func (m Map) QuantError(data *mat.Dense) (float64, error)

QuantError computes the SOM quantization error for the supplied data set. It returns the quantization error or fails with an error if the passed-in data is nil or the distance between vectors could not be calculated. When an error is returned, the quantization error is set to -1.0.

func (Map) TopoError

func (m Map) TopoError(data *mat.Dense) (float64, error)

TopoError computes the SOM topographic error for a given data set. It returns a single number or fails with an error if the topographic error could not be computed.

func (Map) TopoProduct

func (m Map) TopoProduct() (float64, error)

TopoProduct computes the SOM topographic product. It returns a single number or fails with an error if the product could not be computed.

func (*Map) Train

func (m *Map) Train(c *TrainConfig, data *mat.Dense, iters int) error

Train runs SOM training for a given data set and training configuration parameters. It modifies the map codebook vectors based on the chosen training algorithm. It returns an error if the supplied training configuration is invalid or if training fails.
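To give a feel for what one step of sequential training does to the codebook, here is a sketch of the classic Kohonen online update, w_j += lr * h(d(j, bmu), radius) * (x - w_j). It is an illustration only: seqStep is a hypothetical helper, a Gaussian neighbourhood over grid distance is assumed, and the package's actual training loop may differ.

```go
package main

import (
	"fmt"
	"math"
)

// sqDist returns the squared Euclidean distance; it assumes equal lengths.
func sqDist(a, b []float64) float64 {
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

// seqStep applies one sequential update for sample x and returns the BMU index.
// It modifies codebook in place and assumes non-empty, consistent inputs
// with one grid row per codebook row and radius > 0.
func seqStep(codebook, grid [][]float64, x []float64, lr, radius float64) int {
	bmu := 0
	for j := range codebook {
		if sqDist(x, codebook[j]) < sqDist(x, codebook[bmu]) {
			bmu = j
		}
	}
	for j, w := range codebook {
		// Gaussian neighbourhood over the grid distance to the BMU.
		h := math.Exp(-sqDist(grid[j], grid[bmu]) / (2 * radius * radius))
		for k := range w {
			w[k] += lr * h * (x[k] - w[k])
		}
	}
	return bmu
}

func main() {
	codebook := [][]float64{{0, 0}, {10, 10}}
	grid := [][]float64{{0, 0}, {0, 1}}
	bmu := seqStep(codebook, grid, []float64{2, 0}, 0.5, 1)
	fmt.Println(bmu, codebook)
}
```

In a full training run the learning rate and radius passed to each step would shrink over the iterations, which is the role of the LDecay and RDecay settings in TrainConfig.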

func (Map) UMatrix

func (m Map) UMatrix(w io.Writer, data *mat.Dense, classMap map[int]int, format, title string) error

UMatrix generates SOM u-matrix in a given format and writes the output to w. At the moment only SVG format is supported. It fails with error if the write to w fails.

func (Map) UnitClasses

func (m Map) UnitClasses(data *mat.Dense, classMap map[int]int) (map[int]int, error)

UnitClasses returns a map that contains, for each BMU, the most frequent of its classes.

func (Map) UnitDist

func (m Map) UnitDist() (*mat.Dense, error)

UnitDist returns a matrix which contains Euclidean distances between SOM units

func (Map) UnitMeanClasses

func (m Map) UnitMeanClasses(data *mat.Dense, classMap map[int]int) (map[int]int, error)

UnitMeanClasses returns a map that contains, for each BMU, the mean of its classes.

type MapConfig

type MapConfig struct {
	// Grid holds SOM grid configuration
	Grid *GridConfig
	// Cb holds SOM codebook configuration
	Cb *CbConfig
}

MapConfig holds SOM configuration

type NeighbFunc

type NeighbFunc func(float64, float64) float64

NeighbFunc defines SOM neighbourhood function

type TrainConfig

type TrainConfig struct {
	// Algorithm specifies training method: seq or batch
	Algorithm string
	// Radius specifies initial SOM units radius
	Radius float64
	// RDecay specifies radius decay strategy: lin, exp
	RDecay string
	// NeighbFn specifies SOM neighbourhood function: gaussian, bubble, mexican
	NeighbFn NeighbFunc
	// LRate specifies initial SOM learning rate
	LRate float64
	// LDecay specifies learning rate decay strategy: lin, exp
	LDecay string
}

TrainConfig holds SOM training configuration
