featselect

package
v0.0.0-...-8fdc90a Latest
Warning

This package is not in the latest version of its module.

Published: Aug 13, 2020 License: Apache-2.0 Imports: 20 Imported by: 0

Documentation

Constants

const RssTol = 1e-12

RssTol is the minimum residual sum of squares allowed. It is introduced to avoid problems inside math.Log

Variables

This section is empty.

Functions

func Abs

func Abs(x []float64) []float64

Abs calculates the absolute value of the max element

func Aic

func Aic(numFeat int, numData int, logL float64) float64

Aic calculates the value of AIC given the number of selected features (numFeat), the number of data points (numData) and the log of the likelihood function (logL)

func Aicc

func Aicc(numFeat int, numData int, logL float64) float64

Aicc calculates the corrected Aic which is more accurate when the sample size is small

func All

func All(a []int, value int) bool

All checks if all elements in a are equal to value

func Argmax

func Argmax(v []float64) int

Argmax returns the index of the maximum item

func Argmin

func Argmin(a []float64) int

Argmin returns the index of the minimum value in the array

func Argsort

func Argsort(a []float64) []int

Argsort returns the indices that sort the slice in ascending order
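
As an illustration of what an argsort does, here is a minimal sketch built on the standard sort package. The helper name argsort is hypothetical, and the package itself may implement this differently (for example through its ByValue type).

```go
package main

import (
	"fmt"
	"sort"
)

// argsort returns the indices that would sort a in ascending order.
// The input slice is left untouched. This is an illustrative sketch,
// not the package's implementation.
func argsort(a []float64) []int {
	idx := make([]int, len(a))
	for i := range idx {
		idx[i] = i
	}
	// Sort the index slice by looking up the values it points to.
	sort.SliceStable(idx, func(i, j int) bool { return a[idx[i]] < a[idx[j]] })
	return idx
}

func main() {
	fmt.Println(argsort([]float64{0.3, -1.0, 2.5, 0.0})) // [1 3 0 2]
}
```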

func Bic

func Bic(numFeat int, numData int, logL float64) float64

Bic returns the Bayesian information criterion (BIC)
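
For orientation, the textbook definitions of the three criteria can be sketched as below. This assumes the standard formulas with k counted as the number of selected features; the package may count parameters differently (for instance by including a noise term), so treat the lower-case helpers as hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// aic is the textbook Akaike information criterion: 2k - 2 ln(L).
// n is unused and only kept to mirror the documented signature.
func aic(k, n int, logL float64) float64 {
	return 2*float64(k) - 2*logL
}

// aicc adds the small-sample correction 2k(k+1)/(n-k-1) to AIC.
// It is only defined for n > k+1.
func aicc(k, n int, logL float64) float64 {
	kf, nf := float64(k), float64(n)
	return aic(k, n, logL) + 2*kf*(kf+1)/(nf-kf-1)
}

// bic is the textbook Bayesian information criterion: k ln(n) - 2 ln(L).
func bic(k, n int, logL float64) float64 {
	return float64(k)*math.Log(float64(n)) - 2*logL
}

func main() {
	fmt.Println(aic(3, 20, -10.0), aicc(3, 20, -10.0), bic(3, 20, -10.0))
}
```

The correction term vanishes as n grows, which is why AICc and AIC agree for large samples.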

func BoundsAIC

func BoundsAIC(model []bool, start int, X mat.Matrix, y []float64) (float64, float64)

BoundsAIC calculates the lower and upper bounds of AIC for all sub-models of the passed model

func BoundsAICC

func BoundsAICC(model []bool, start int, X mat.Matrix, y []float64) (float64, float64)

BoundsAICC calculates the lower and upper bounds of AICC for all sub-models of the passed model

func CalculateCohenSequence

func CalculateCohenSequence(numSamples int, targets []CohensKappaTarget) map[string]float64

CalculateCohenSequence calculates Cohen's kappa for a collection of values

func CalculateFNormDiff

func CalculateFNormDiff(mat1 *mat.Dense, mat2 *mat.Dense, gridPt int, step float64, res chan<- ValueGridPt)

CalculateFNormDiff calculates the F-norm difference between the thresholded version of mat1 and mat2. The method does not alter mat1, so it can be reused

func CleanQueue

func CleanQueue(q *list.List, threshold float64)

CleanQueue removes all items where the lower bound is lower than the current score

func CohenKappa

func CohenKappa(s1 []int, s2 []int, totNum int) float64

CohenKappa returns Cohen's kappa value for two sets s1 and s2
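
A minimal sketch of how Cohen's kappa can be computed for two selections. It assumes that s1 and s2 hold the selected indices in [0, totNum) and that each item is labelled selected or not selected by each set. The helper cohenKappa and the convention of returning 1 when chance agreement is already perfect are assumptions, not the package's documented behaviour.

```go
package main

import "fmt"

// cohenKappa compares the selected/not-selected labelings induced by s1 and
// s2 over totNum items. Illustrative sketch only.
func cohenKappa(s1, s2 []int, totNum int) float64 {
	in1 := make(map[int]bool, len(s1))
	for _, v := range s1 {
		in1[v] = true
	}
	in2 := make(map[int]bool, len(s2))
	for _, v := range s2 {
		in2[v] = true
	}

	// Count items on which both labelings agree.
	agree := 0
	for i := 0; i < totNum; i++ {
		if in1[i] == in2[i] {
			agree++
		}
	}

	n := float64(totNum)
	po := float64(agree) / n // observed agreement
	p1 := float64(len(in1)) / n
	p2 := float64(len(in2)) / n
	pe := p1*p2 + (1-p1)*(1-p2) // agreement expected by chance
	if pe == 1 {
		// Both labelings are constant and identical; kappa is 0/0.
		// Returning 1 here is a convention chosen for this sketch.
		return 1
	}
	return (po - pe) / (1 - pe)
}

func main() {
	fmt.Println(cohenKappa([]int{0, 1}, []int{0, 2}, 4))
}
```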

func CohensKappaWorker

func CohensKappaWorker(work <-chan CohensKappaWorkload, res chan<- CohensKappaWorkload)

CohensKappaWorker reads from a channel and passes the result to a new channel

func CovarianceMatrix

func CovarianceMatrix(X mat.Matrix) *mat.Dense

CovarianceMatrix returns the covariance matrix of X with itself

func CreateChildNodes

func CreateChildNodes(parentCh <-chan *Node, pruneCh chan<- int, nodeCh chan<- *Node, ready chan<- bool,
	X mat.Matrix, y []float64, cutoff float64, h *Highscore)

CreateChildNodes creates the left child of a parent node

func DenumMatrixGcv

func DenumMatrixGcv(X mat.Matrix) *mat.Dense

DenumMatrixGcv returns the matrix that should be passed as the first argument to Gcv

func EqualInt

func EqualInt(a []int, b []int) bool

EqualInt checks if all items in a and b are equal

func ExistInt

func ExistInt(a []int, value int) bool

ExistInt returns true if value exists in a

func Fit

func Fit(X mat.Matrix, y []float64) []float64

Fit adapts a linear model to a dataset. X is the design matrix, y is the target data.

func FullCoeffVector

func FullCoeffVector(numFeat int, selection []int, coeff []float64) []float64

FullCoeffVector creates a vector containing the coefficients for all features. Features that are not selected will have a coefficient of zero
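
The expansion from a sparse to a full coefficient vector can be sketched as follows, assuming coeff[i] belongs to feature selection[i]. The helper name fullCoeffVector is hypothetical.

```go
package main

import "fmt"

// fullCoeffVector scatters the selected coefficients into a zero-initialised
// vector of length numFeat. Assumes coeff[i] pairs with selection[i].
func fullCoeffVector(numFeat int, selection []int, coeff []float64) []float64 {
	full := make([]float64, numFeat)
	for i, feat := range selection {
		full[feat] = coeff[i]
	}
	return full
}

func main() {
	fmt.Println(fullCoeffVector(5, []int{1, 3}, []float64{0.5, -2})) // [0 0.5 0 -2 0]
}
```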

func Gcs

func Gcs(model []bool, start int) []bool

Gcs returns the greatest common model (gcs). The gcs is the model with the largest number of features, given that all "bits" up to start are not altered

func Gcv

func Gcv(denum mat.Matrix, data []float64, pred []float64) float64

Gcv returns the generalized CV score

func GetCohensKappa

func GetCohensKappa(numSamples int, target CohensKappaTarget) float64

GetCohensKappa calculates the expected kappa value obtained by random partitioning

func GetDesignMatrix

func GetDesignMatrix(model []bool, X mat.Matrix) mat.Matrix

GetDesignMatrix returns the design matrix corresponding to the passed model

func IterProduct

func IterProduct(values []int, repeat int) [][]int

IterProduct mimics the product function in the itertools module of Python
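
For readers unfamiliar with itertools.product, the following sketch shows the Cartesian product of values with itself repeat times, in the same order Python produces. The helper iterProduct is an illustration and not taken from the package source.

```go
package main

import "fmt"

// iterProduct returns all tuples of length repeat drawn from values,
// with the last position varying fastest (the itertools.product order).
func iterProduct(values []int, repeat int) [][]int {
	result := [][]int{{}} // one empty tuple, as in Python for repeat=0
	for r := 0; r < repeat; r++ {
		next := make([][]int, 0, len(result)*len(values))
		for _, prefix := range result {
			for _, v := range values {
				tuple := make([]int, len(prefix)+1)
				copy(tuple, prefix)
				tuple[len(prefix)] = v
				next = append(next, tuple)
			}
		}
		result = next
	}
	return result
}

func main() {
	fmt.Println(iterProduct([]int{0, 1}, 2)) // [[0 0] [0 1] [1 0] [1 1]]
}
```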

func LassoCrdDesc

func LassoCrdDesc(dset *NormalizedData, lamb float64, cov CovMat, x0 []float64, maxIter int, tol float64, corr LassoCorrection) []float64

LassoCrdDesc solves the lasso problem via coordinate descent

func Lcs

func Lcs(model []bool, start int) []bool

Lcs returns the least common model. The lcs is the model with as few features as possible given that all "bits" up to start are not altered
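
One way to read the Gcs and Lcs descriptions is sketched below. It assumes that "bits" with index below start are kept as they are and that every bit from start onwards is switched on (gcs) or off (lcs). Whether the bit at start itself is preserved is an assumption of this sketch.

```go
package main

import "fmt"

// fillFrom copies model and sets every bit from start onwards to value.
func fillFrom(model []bool, start int, value bool) []bool {
	out := make([]bool, len(model))
	copy(out, model)
	for i := start; i < len(out); i++ {
		out[i] = value
	}
	return out
}

// gcs: as many features as possible without touching bits before start.
func gcs(model []bool, start int) []bool { return fillFrom(model, start, true) }

// lcs: as few features as possible without touching bits before start.
func lcs(model []bool, start int) []bool { return fillFrom(model, start, false) }

func main() {
	m := []bool{true, false, true, false}
	fmt.Println(gcs(m, 2), lcs(m, 2))
}
```

Every sub-model reachable from the node lies between these two extremes, which is what makes them useful for bounding a score.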

func Logspace

func Logspace(min float64, max float64, num int) []float64

Logspace returns a set of logspaced values

func MapSelectionByName

func MapSelectionByName(orig *Dataset, names []string, selection []int) []int

MapSelectionByName maps the selected features by their names in the corresponding dataset

func MaxInt

func MaxInt(a []int) int

MaxInt returns the maximum value in an array

func Mean

func Mean(v []float64) float64

Mean calculates the mean of an array

func MinFloat

func MinFloat(a []float64) float64

MinFloat returns the minimum value in a slice of floats

func MinInt

func MinInt(a []int) int

MinInt returns the minimum value in an int array

func MulSlice

func MulSlice(X mat.Matrix, v []float64) []float64

MulSlice multiplies together a matrix and a vector

func NewCLassoCohen

func NewCLassoCohen() *cLassoCohen

NewCLassoCohen returns a new instance of cLassoCohen

func NewLog2Pruned

func NewLog2Pruned(current float64, numPruned int) float64

NewLog2Pruned updates the number of pruned solutions. current is log2 of the current number of pruned solutions, and numPruned is log2 of the new number of pruned solutions
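
Working in log2 space avoids overflow when the number of pruned models grows like 2^M. A minimal sketch of adding two counts that are both stored as log2 values is shown here. The helper addLog2 is hypothetical, takes both arguments as float64, and assumes finite inputs.

```go
package main

import (
	"fmt"
	"math"
)

// addLog2 returns log2(2^a + 2^b) without forming 2^a or 2^b explicitly.
// It factors out the larger exponent so the remaining term is at most 1.
func addLog2(a, b float64) float64 {
	hi, lo := math.Max(a, b), math.Min(a, b)
	return hi + math.Log2(1+math.Exp2(lo-hi))
}

func main() {
	fmt.Println(addLog2(3, 3))      // log2(8 + 8)
	fmt.Println(addLog2(1000, 990)) // far beyond what 2^1000 as an int could hold
}
```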

func NewPureLassoCohen

func NewPureLassoCohen() *pureLassoCohen

NewPureLassoCohen returns a new instance of the lasso cohen

func NodesEqual

func NodesEqual(node1 *Node, node2 *Node) bool

NodesEqual compares two nodes

func NormalizeArray

func NormalizeArray(v []float64)

NormalizeArray normalizes an array to unit variance and zero mean
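
The normalization can be sketched as below. The sketch assumes the population standard deviation (dividing by n) and leaves a constant array merely centred; the package may use the sample standard deviation or handle that edge case differently.

```go
package main

import (
	"fmt"
	"math"
)

// normalize rescales v in place to zero mean and unit variance.
// Assumption: the variance is computed with a 1/n factor.
func normalize(v []float64) {
	if len(v) == 0 {
		return
	}
	n := float64(len(v))
	mean := 0.0
	for _, x := range v {
		mean += x
	}
	mean /= n

	variance := 0.0
	for _, x := range v {
		variance += (x - mean) * (x - mean)
	}
	std := math.Sqrt(variance / n)

	for i := range v {
		v[i] -= mean
		if std > 0 {
			v[i] /= std
		}
	}
}

func main() {
	v := []float64{1, 2, 3}
	normalize(v)
	fmt.Println(v)
}
```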

func NormalizeCols

func NormalizeCols(X *mat.Dense)

NormalizeCols normalizes all columns to unit variance and zero mean

func NormalizeRows

func NormalizeRows(X *mat.Dense)

NormalizeRows normalizes all rows to unit variance and zero mean

func NumFeatures

func NumFeatures(model []bool) int

NumFeatures returns the number of features in a model

func Path2Unnormalized

func Path2Unnormalized(data *NormalizedData, path []*LassoLarsNode)

Path2Unnormalized converts all the coefficients in the path to unnormalized values

func PerformLassoCrd

func PerformLassoCrd(workload <-chan LassoCrdWorkload, res chan<- LassoRes)

PerformLassoCrd listens to the workload channel and passes its result to res

func PerformNormDiffWork

func PerformNormDiffWork(workload <-chan Workload, res chan<- ValueGridPt)

PerformNormDiffWork executes the normalization work

func Predict

func Predict(X mat.Matrix, coeff []float64) []float64

Predict predicts the outcome of many variables. Each row in the matrix X is considered to be one data point

func PredictOne

func PredictOne(x []float64, coeff []float64) float64

PredictOne predicts the value given a set of coefficients (coeff)

func PrintHighscore

func PrintHighscore(path *LassoLarsPath, aicc []float64, bic []float64, num int)

PrintHighscore prints the top models along the path

func PseudoInverse

func PseudoInverse(svd *mat.SVD, tol float64) *mat.Dense

PseudoInverse calculates the pseudo-inverse of a matrix X

func PseudoInverseXTX

func PseudoInverseXTX(svd *mat.SVD, tol float64) *mat.Dense

PseudoInverseXTX returns the inverse of the matrix X^T X. The passed SVD corresponds to the SVD of X

func RandomRowSplit

func RandomRowSplit(X mat.Matrix, num int) (*mat.Dense, *mat.Dense)

RandomRowSplit splits the rows of a matrix into two new matrices. The first matrix will have num rows, and the second will have N - num rows, where N is the number of rows in X

func RearrangeDense

func RearrangeDense(X *mat.Dense, colOrder []int) *mat.Dense

RearrangeDense changes the order of the columns in the matrix X such that they appear in the order dictated by colOrder. If colOrder = [2, 0, 4, ...], the first column in the new matrix is the third column in the original matrix, the second column in the new matrix is the first column in the original etc.
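
The column reordering rule can be illustrated with plain slices. This sketch uses a row-major [][]float64 instead of *mat.Dense to stay dependency-free, and the helper rearrangeCols is hypothetical.

```go
package main

import "fmt"

// rearrangeCols returns a copy of X whose column j is column colOrder[j]
// of the original matrix.
func rearrangeCols(X [][]float64, colOrder []int) [][]float64 {
	out := make([][]float64, len(X))
	for i, row := range X {
		out[i] = make([]float64, len(colOrder))
		for newCol, oldCol := range colOrder {
			out[i][newCol] = row[oldCol]
		}
	}
	return out
}

func main() {
	X := [][]float64{{1, 2, 3}, {4, 5, 6}}
	fmt.Println(rearrangeCols(X, []int{2, 0, 1})) // [[3 1 2] [6 4 5]]
}
```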

func RemoveLeastPromising

func RemoveLeastPromising(q *list.List)

RemoveLeastPromising removes the least promising nodes, judged against the mean Lower bound of the nodes in the queue

func Rss

func Rss(X mat.Matrix, coeff []float64, data []float64) float64

Rss calculates the residual sum of squares. X is the design matrix, coeff is an array with fitted coefficients and data is an array with the target data. The number of rows in the matrix X has to be the same as the length of the data array
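
The residual sum of squares for a linear model can be sketched with plain slices as follows. The helpers predictOne and rss are illustrative, and X is represented as a row-major [][]float64 rather than a mat.Matrix.

```go
package main

import "fmt"

// predictOne returns the dot product of one data point with the coefficients.
func predictOne(x, coeff []float64) float64 {
	sum := 0.0
	for i, c := range coeff {
		sum += x[i] * c
	}
	return sum
}

// rss sums the squared differences between the predictions and the targets.
// len(X) must equal len(data).
func rss(X [][]float64, coeff, data []float64) float64 {
	total := 0.0
	for i, row := range X {
		r := predictOne(row, coeff) - data[i]
		total += r * r
	}
	return total
}

func main() {
	X := [][]float64{{1, 0}, {1, 1}, {1, 2}}
	fmt.Println(rss(X, []float64{1, 2}, []float64{1, 3, 6})) // 1
}
```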

func ScoreWorker

func ScoreWorker(nodeCh <-chan *Node, scoreCh chan<- *Node, X mat.Matrix, y []float64)

ScoreWorker is a function that calculates the score of a node

func SelectModel

func SelectModel(X mat.Matrix, y []float64, highscore *Highscore, sp *SearchProgress, params *SelectModelOptParams)

SelectModel finds the model which minimizes AICC. X is the NxM design matrix, y is a vector of length N, highscore keeps track of the best models, and cutoff (the Cutoff field of params) is a value that is added to the lower bounds when judging whether a node should be added. The check for whether a node will be added is:

lower_bound + cutoff < current_best_score
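
Read as code, the check might look like the sketch below, under the assumption that a node is kept only when the inequality holds and that lower scores are better. The helper shouldExplore is hypothetical.

```go
package main

import "fmt"

// shouldExplore reports whether a node's sub-tree may still contain a model
// that beats the current best score by more than cutoff.
func shouldExplore(lowerBound, cutoff, currentBest float64) bool {
	return lowerBound+cutoff < currentBest
}

func main() {
	fmt.Println(shouldExplore(10, 0, 12)) // true
	fmt.Println(shouldExplore(10, 3, 12)) // false
}
```

Under this reading, a positive cutoff prunes more aggressively, while a negative cutoff keeps nodes that cannot beat the current best and so retains near-optimal models.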

func Selected2Model

func Selected2Model(selected []int, numFeatures int) []bool

Selected2Model converts a list of selected features into a boolean array of true/false indicating whether the feature is selected or not

func SelectedFeatures

func SelectedFeatures(model []bool) []int

SelectedFeatures returns the indices of the selected features
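
The two representations (index list and boolean mask) are inverses of each other, as this small sketch shows. The lower-case helpers are illustrative stand-ins for Selected2Model and SelectedFeatures.

```go
package main

import "fmt"

// selected2Model builds a boolean mask with true at every selected index.
func selected2Model(selected []int, numFeatures int) []bool {
	model := make([]bool, numFeatures)
	for _, idx := range selected {
		model[idx] = true
	}
	return model
}

// selectedFeatures lists the indices at which the mask is true.
func selectedFeatures(model []bool) []int {
	selected := []int{}
	for i, on := range model {
		if on {
			selected = append(selected, i)
		}
	}
	return selected
}

func main() {
	model := selected2Model([]int{0, 3}, 5)
	fmt.Println(model, selectedFeatures(model))
}
```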

func SoftThreshold

func SoftThreshold(x float64, threshold float64) float64

SoftThreshold applies a soft threshold to the value
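
The standard soft-thresholding operator used in lasso coordinate descent is sign(x) * max(|x| - threshold, 0). A minimal sketch, assuming a non-negative threshold:

```go
package main

import "fmt"

// softThreshold shrinks x towards zero by threshold and clamps values
// inside [-threshold, threshold] to exactly zero.
func softThreshold(x, threshold float64) float64 {
	switch {
	case x > threshold:
		return x - threshold
	case x < -threshold:
		return x + threshold
	default:
		return 0
	}
}

func main() {
	fmt.Println(softThreshold(3, 1), softThreshold(-3, 1), softThreshold(0.5, 1)) // 2 -2 0
}
```

Setting small values to exactly zero is what gives lasso solutions their sparsity.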

func Std

func Std(v []float64) float64

Std calculates the standard deviation of an array

func Sum

func Sum(a []int) int

Sum sums all elements in a

func UnionInt

func UnionInt(s1 []int, s2 []int) []int

UnionInt joins two slices such that there are only unique entries. s1 is modified in place

func UnsatisfiedKKTConditions

func UnsatisfiedKKTConditions(Xy mat.Vector, covDotBeta []float64, coeff []float64, lamb float64, correction LassoCorrection) []int

UnsatisfiedKKTConditions returns the indices where KKT conditions are not met

func UpdateCovDotBeta

func UpdateCovDotBeta(cov mat.Matrix, covDotBeta []float64, coeffNo int, oldCoeff float64, newCoeff float64) []float64

UpdateCovDotBeta updates the dot product between a matrix and a vector when one item changes
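
When only one coefficient changes, the product cov * beta can be updated in O(n) instead of being recomputed in O(n^2). The sketch below assumes the update is done in place and uses a row-major [][]float64 for cov; the helper name is hypothetical.

```go
package main

import "fmt"

// updateCovDotBeta adjusts covDotBeta = cov * beta after beta[coeffNo]
// changes from oldCoeff to newCoeff. Only column coeffNo of cov contributes.
func updateCovDotBeta(cov [][]float64, covDotBeta []float64, coeffNo int, oldCoeff, newCoeff float64) []float64 {
	delta := newCoeff - oldCoeff
	for i := range covDotBeta {
		covDotBeta[i] += cov[i][coeffNo] * delta
	}
	return covDotBeta
}

func main() {
	cov := [][]float64{{2, 1}, {1, 3}}
	covDotBeta := []float64{3, 4} // cov * [1, 1]
	// beta[1] changes from 1 to 3, so the result equals cov * [1, 3].
	fmt.Println(updateCovDotBeta(cov, covDotBeta, 1, 1, 3)) // [5 10]
}
```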

func WeightedAveragedCoeff

func WeightedAveragedCoeff(numFeat int, weights []float64, coeffs []SparseCoeff) []float64

WeightedAveragedCoeff computes a weighted average of a sparse representation of the coefficients

func WeightsFromAIC

func WeightsFromAIC(aic []float64) []float64

WeightsFromAIC returns the model weights based on the AIC values
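
Model weights are commonly derived from AIC values as Akaike weights, w_i proportional to exp(-(AIC_i - AIC_min)/2), normalised to sum to one. The sketch below assumes that definition; the source does not state which formula the package uses.

```go
package main

import (
	"fmt"
	"math"
)

// weightsFromAIC computes Akaike weights. Subtracting the minimum AIC first
// keeps the exponentials in a numerically safe range.
func weightsFromAIC(aic []float64) []float64 {
	if len(aic) == 0 {
		return nil
	}
	minAIC := aic[0]
	for _, a := range aic[1:] {
		if a < minAIC {
			minAIC = a
		}
	}
	weights := make([]float64, len(aic))
	sum := 0.0
	for i, a := range aic {
		weights[i] = math.Exp(-0.5 * (a - minAIC))
		sum += weights[i]
	}
	for i := range weights {
		weights[i] /= sum
	}
	return weights
}

func main() {
	fmt.Println(weightsFromAIC([]float64{10, 12, 20}))
}
```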

Types

type AxisRange

type AxisRange struct {
	Min, Max float64
}

AxisRange is used to pass max and min information for an axis

type ByValue

type ByValue []ValueIdx

ByValue implements the sort interface for ValueIdx

func (ByValue) Len

func (b ByValue) Len() int

func (ByValue) Less

func (b ByValue) Less(i, j int) bool

func (ByValue) Swap

func (b ByValue) Swap(i, j int)

type CDParam

type CDParam interface {
	// C calculates the C parameter. The length of the return value is
	// equal to the number of items in the active set
	C(y mat.Vector) *mat.VecDense

	// D calculates the D parameter. The length of the return value is
	// equal to the number of items in the active set. The sign parameter
	// is a view of the signs in the active view. The corresponding column
	// in the full design matrix is given by activeSet
	D(signs mat.Vector) *mat.VecDense

	// SetActiveSet sets a new value for the active set
	SetActiveSet(active []int)

	// SetX sets the full design matrix
	SetX(X mat.Matrix)
}

CDParam provides a generic interface to calculate the c and d parameters in the LASSO algorithm defined in Tibshirani, R.J., 2013. The lasso problem and uniqueness. Electronic Journal of Statistics, 7, pp.1456-1490. In short, C = (X^TX)^{-1}X^Ty and D = (X^TX)^{-1}s, where s is the sign vector. The columns of X should correspond only to the columns in the active set.

type CLasso

type CLasso struct {
	// contains filtered or unexported fields
}

CLasso implements the LassoCorrection interface and tends to promote selection of groups of correlated features

func NewCLasso

func NewCLasso(numFeat int, eta float64) *CLasso

NewCLasso returns a pointer to a new instance of CLasso

func (*CLasso) Deriv

func (cl *CLasso) Deriv(beta []float64, featNo int) float64

Deriv calculates derivative of the correction with respect to the new beta values

func (*CLasso) Update

func (cl *CLasso) Update(beta []float64)

Update updates the feature covariance matrix

type CohensKappaTarget

type CohensKappaTarget interface {
	// GetX returns the full design matrix
	GetX() mat.Matrix

	// GetSelection returns the selected features when using the data points given by indices
	GetSelection(indices []int) []int

	// HyperParameters returns the current hyper parameters
	HyperParameters() map[string]float64

	// StringRep returns a string representation that is used for logging
	StringRep() string
}

CohensKappaTarget is an interface that can be used to sample Cohen's kappa

type CohensKappaWorkload

type CohensKappaWorkload struct {
	// contains filtered or unexported fields
}

CohensKappaWorkload is a struct used to calculate Cohen's kappa

type CorrectableLasso

type CorrectableLasso = func(dset *NormalizedData, lamb float64, cov CovMat, x0 []float64, maxIter int, tol float64, corr LassoCorrection) []float64

CorrectableLasso is a function that implements the lasso method with corrections

type CovMat

type CovMat interface {
	Get(X mat.Matrix) mat.Matrix
}

CovMat is a generic interface for returning covariance matrices

type Dataset

type Dataset struct {
	X         *mat.Dense
	Y         []float64
	Names     []string
	TargetCol int
}

Dataset is a structure that holds fitting data for linear fitting

func ParseCSV

func ParseCSV(handle io.Reader, targetCol int) *Dataset

ParseCSV parses data from a CSV file. It is assumed that the file starts with a header. The values in the column targetCol are placed in Y of the returned struct and the rest of the columns are placed in the matrix X

func ReadCSV

func ReadCSV(fname string, targetCol int) *Dataset

ReadCSV reads a dataset from a CSV file

func (*Dataset) Copy

func (d *Dataset) Copy() *Dataset

Copy returns a deep copy of the dataset

func (*Dataset) FeatNoByName

func (d *Dataset) FeatNoByName(name string) int

FeatNoByName returns the feature number corresponding to the passed name

func (*Dataset) GetFeatName

func (d *Dataset) GetFeatName(i int) string

GetFeatName gets the name of the feature corresponding to the i-th column in X

func (*Dataset) GetSubset

func (d *Dataset) GetSubset(features []int) *Dataset

GetSubset returns a new dataset consisting only of the selected features

func (*Dataset) IsEqual

func (d *Dataset) IsEqual(o *Dataset) bool

IsEqual returns true if the two datasets are equal

func (*Dataset) MarshalJSON

func (dset *Dataset) MarshalJSON() ([]byte, error)

MarshalJSON is implemented to add the Dataset type to a JSON file

func (*Dataset) Save

func (dset *Dataset) Save(fname string)

Save saves the dataset to a CSV file

func (*Dataset) SaveHandle

func (dset *Dataset) SaveHandle(handle io.Writer)

SaveHandle writes the output to a writer

func (*Dataset) UnmarshalJSON

func (dset *Dataset) UnmarshalJSON(data []byte) error

UnmarshalJSON reads a dataset from JSON

type ElasticNet

type ElasticNet struct {
	Lamb float64
}

ElasticNet adds an L2 penalty

func (*ElasticNet) Deriv

func (e *ElasticNet) Deriv(beta []float64, featNo int) float64

Deriv calculates the derivative with respect to beta

func (*ElasticNet) Update

func (e *ElasticNet) Update(beta []float64)

Update does nothing for elastic net

type Empirical

type Empirical struct{}

Empirical returns the empirical covariance matrix

func (*Empirical) Get

func (e *Empirical) Get(X mat.Matrix) mat.Matrix

Get returns the empirical covariance matrix

type Highscore

type Highscore struct {
	Items    *list.List
	MaxItems int
}

Highscore is a structure that holds a list of Nodes sorted by their score

func BruteForceSelect

func BruteForceSelect(X *mat.Dense, y []float64) *Highscore

BruteForceSelect runs through all possible models

func NewHighscore

func NewHighscore(maxItems int) *Highscore

NewHighscore creates a new highscore list with maxItems entries

func (*Highscore) BestScore

func (h *Highscore) BestScore() float64

BestScore returns the best score

func (*Highscore) Equal

func (h *Highscore) Equal(h2 *Highscore) bool

func (*Highscore) Insert

func (h *Highscore) Insert(node *Node)

Insert inserts a new node into the highscore list

func (*Highscore) Len

func (h *Highscore) Len() int

Len returns the number of items in the highscore list

func (*Highscore) MarshalJSON

func (h *Highscore) MarshalJSON() ([]byte, error)

MarshalJSON creates a JSON representation of the highscore list

func (*Highscore) Scores

func (h *Highscore) Scores() []float64

Scores returns all the scores in the highscore list

func (*Highscore) UnmarshalJSON

func (h *Highscore) UnmarshalJSON(data []byte) error

UnmarshalJSON decodes a JSON representation of the highscore list

type Identity

type Identity struct{}

Identity uses the identity matrix as the covariance

func (*Identity) Get

func (i *Identity) Get(X mat.Matrix) mat.Matrix

Get returns the identity matrix

type IndexedColVecView

type IndexedColVecView struct {
	Vector mat.Vector
	Rows   []int
}

IndexedColVecView is a view for column vectors

func NewIndexedColVecView

func NewIndexedColVecView(v mat.Vector, rows []int) *IndexedColVecView

NewIndexedColVecView creates a new column vector view

func (*IndexedColVecView) At

func (v *IndexedColVecView) At(i, j int) float64

At returns the (i, 0) element

func (*IndexedColVecView) AtVec

func (v *IndexedColVecView) AtVec(i int) float64

AtVec returns the element at position i

func (*IndexedColVecView) Dims

func (v *IndexedColVecView) Dims() (int, int)

Dims returns the dimensions

func (*IndexedColVecView) Len

func (v *IndexedColVecView) Len() int

Len returns the length of the vector

func (*IndexedColVecView) T

func (v *IndexedColVecView) T() mat.Matrix

T returns an implicit transpose of the matrix field

type IndexedColView

type IndexedColView struct {
	Matrix     mat.Matrix
	ColIndices []int
}

IndexedColView implements the Matrix interface

func NewIndexedColView

func NewIndexedColView(m mat.Matrix, colInd []int) IndexedColView

NewIndexedColView returns a view of the matrix containing all rows and only the columns given by colInd. If colInd is nil, all columns are included. Thus, NewIndexedColView(m, nil) is the same as the original matrix

func (*IndexedColView) At

func (v *IndexedColView) At(i, j int) float64

At returns the value of the element at row i and column j of the indexed matrix, that is, row i and column ColIndices[j] of the Matrix field.

func (*IndexedColView) Dims

func (v *IndexedColView) Dims() (int, int)

Dims returns the dimensions (number of rows and number of columns) of the indexed matrix.

func (*IndexedColView) T

func (v *IndexedColView) T() mat.Matrix

T returns an implicit transpose of the matrix field

type JSONDataset

type JSONDataset struct {
	X         []float64
	Y         []float64
	TargetCol int
	Names     []string
	Nr, Nc    int
}

JSONDataset is a type defined to be able to read/write dataset in a simple way from JSON files

type LassoCorrection

type LassoCorrection interface {
	// Deriv returns the derivative with respect to beta
	Deriv(beta []float64, featNo int) float64

	// Update is a callback that is called every iteration
	Update(beta []float64)
}

LassoCorrection is an interface to types that can be added to the lasso solver

type LassoCrdWorkload

type LassoCrdWorkload struct {
	// contains filtered or unexported fields
}

LassoCrdWorkload is a struct holding the information needed to carry out a lasso coordinate descent path

type LassoLarsNode

type LassoLarsNode struct {
	Coeff     []float64
	Lamb      float64
	Selection []int
}

LassoLarsNode is a structure holding the result at one point of the lasso path

func LassoCrdDescPath

func LassoCrdDescPath(dset *NormalizedData, cov CovMat, lambs []float64, maxIter int, tol float64, correction LassoCorrection) []*LassoLarsNode

LassoCrdDescPath calculates a set of lasso solutions along an equi-logspaced set of lambda values

func LassoLars

func LassoLars(data *NormalizedData, lambMin float64, estimator CDParam) []*LassoLarsNode

LassoLars computes the LASSO solution with the LARS algorithm

func NewLassoLarsNode

func NewLassoLarsNode(coeff []float64, lamb float64, selection []int) *LassoLarsNode

NewLassoLarsNode creates a new lasso-lars node

type LassoLarsParams

type LassoLarsParams struct {
	// contains filtered or unexported fields
}

LassoLarsParams is a convenience struct defined to hold the variable c and d in Tibshirani, R.J., 2013. The lasso problem and uniqueness. Electronic Journal of Statistics, 7, pp.1456-1490.

type LassoLarsPath

type LassoLarsPath struct {
	Dset           *Dataset
	LassoLarsNodes []*LassoLarsNode
	Aicc           []float64
	Bic            []float64
}

LassoLarsPath is a data type that reads the data from JSON

func LassoLarsPathFromBytes

func LassoLarsPathFromBytes(data []byte) *LassoLarsPath

LassoLarsPathFromBytes loads the lasso lars path from a byte array

func LassoLarsPathFromJSON

func LassoLarsPathFromJSON(fname string) *LassoLarsPath

LassoLarsPathFromJSON loads the lasso lars path from a JSON file

func (*LassoLarsPath) ExtractPath

func (p *LassoLarsPath) ExtractPath(featNo int) plotter.XYs

ExtractPath extracts the path of one coefficient

func (*LassoLarsPath) GetCriteria

func (p *LassoLarsPath) GetCriteria(criteria crit) []float64

GetCriteria returns the value of the passed criteria along the path

func (*LassoLarsPath) MaxMinFeatNo

func (p *LassoLarsPath) MaxMinFeatNo() (int, int)

MaxMinFeatNo returns the minimum and maximum feature in the set

func (*LassoLarsPath) PickMostRelevantFeatures

func (p *LassoLarsPath) PickMostRelevantFeatures(numFeat int) []int

PickMostRelevantFeatures picks out a subset of features based on when they entered the lasso path

func (*LassoLarsPath) PlotDeviations

func (p *LassoLarsPath) PlotDeviations() *plot.Plot

PlotDeviations plots RMSE error and GCV

func (*LassoLarsPath) PlotEntranceTimes

func (p *LassoLarsPath) PlotEntranceTimes() *plot.Plot

PlotEntranceTimes plots the lambda values when a feature is selected

func (*LassoLarsPath) PlotPath

func (p *LassoLarsPath) PlotPath(cr *AxisRange) *plot.Plot

PlotPath plots the LassoLarsPath

func (*LassoLarsPath) PlotQualityScores

func (p *LassoLarsPath) PlotQualityScores() *plot.Plot

PlotQualityScores plots the AICC value of the path

type LassoRes

type LassoRes struct {
	// contains filtered or unexported fields
}

LassoRes is a structure used to return the result

type LastZeroed

type LastZeroed struct {
	// contains filtered or unexported fields
}

LastZeroed holds information about the feature that last left the active set

type LazyPowerMatrix

type LazyPowerMatrix struct {
	X *mat.Dense
	// contains filtered or unexported fields
}

LazyPowerMatrix is a matrix that can have arbitrary additional columns formed by taking powers of the existing columns

func NewLazyMatrix

func NewLazyMatrix(X *mat.Dense, maxPower int) *LazyPowerMatrix

NewLazyMatrix creates a new matrix with all powers up to maxPower. If maxPower is 0, it is equivalent to the original matrix X

func (*LazyPowerMatrix) AddPower

func (m *LazyPowerMatrix) AddPower(power map[int]int)

AddPower adds a set of powers to the matrix

func (*LazyPowerMatrix) AddPowerSequence

func (m *LazyPowerMatrix) AddPowerSequence(cols []int, maxPower int)

AddPowerSequence adds all powers and cross-terms of the columns listed in cols to the matrix

func (*LazyPowerMatrix) FullMatrix

func (m *LazyPowerMatrix) FullMatrix() *mat.Dense

FullMatrix returns the full matrix corresponding to the LazyPowerMatrix

func (*LazyPowerMatrix) GetCol

func (m *LazyPowerMatrix) GetCol(col int) []float64

GetCol returns a column of the matrix

func (*LazyPowerMatrix) NumCols

func (m *LazyPowerMatrix) NumCols() int

NumCols returns the total number of columns in the matrix

type Model

type Model struct {
	// contains filtered or unexported fields
}

Model is a wrapper around a bit array

func NewModel

func NewModel(size int) *Model

NewModel creates a new model of a given size

func (*Model) Flip

func (m *Model) Flip(index int)

Flip flips the bit at position index

func (*Model) Get

func (m *Model) Get(index int) bool

Get returns true if the bit at position index is 1

func (*Model) Set

func (m *Model) Set(index int)

Set sets the bit at position index

func (*Model) ToBools

func (m *Model) ToBools() []bool

ToBools converts the model into an array of booleans
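
A bit-array wrapper of this kind could be sketched as below. The type bitModel and its layout (64 features per uint64 word) are assumptions for illustration; the unexported fields of Model are not shown in the source.

```go
package main

import "fmt"

// bitModel stores one bit per feature, packed into 64-bit words.
type bitModel struct {
	words []uint64
	size  int
}

func newBitModel(size int) *bitModel {
	return &bitModel{words: make([]uint64, (size+63)/64), size: size}
}

// Set switches the bit at index on.
func (m *bitModel) Set(index int) { m.words[index/64] |= uint64(1) << uint(index%64) }

// Flip toggles the bit at index.
func (m *bitModel) Flip(index int) { m.words[index/64] ^= uint64(1) << uint(index%64) }

// Get reports whether the bit at index is 1.
func (m *bitModel) Get(index int) bool {
	return m.words[index/64]&(uint64(1)<<uint(index%64)) != 0
}

// ToBools expands the packed bits into a []bool of length size.
func (m *bitModel) ToBools() []bool {
	out := make([]bool, m.size)
	for i := range out {
		out[i] = m.Get(i)
	}
	return out
}

func main() {
	m := newBitModel(70)
	m.Set(1)
	m.Set(65)
	m.Flip(1)
	fmt.Println(m.Get(1), m.Get(65), len(m.ToBools()))
}
```

Packing the mask this way uses roughly one eighth of the memory of a []bool, which matters when many candidate models are queued.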

type MorsePenroseCD

type MorsePenroseCD struct {
	X mat.Matrix
	// contains filtered or unexported fields
}

MorsePenroseCD implements the CDParam interface and will lead to the exact same algorithm as described in Tibshirani, R.J., 2013. The lasso problem and uniqueness. Electronic Journal of Statistics, 7, pp.1456-1490.

func (*MorsePenroseCD) C

C calculates the c-parameter in Tibshirani 2013

func (*MorsePenroseCD) D

func (m *MorsePenroseCD) D(signs mat.Vector) *mat.VecDense

D calculates the d-parameter in Tibshirani 2013

func (*MorsePenroseCD) SetActiveSet

func (m *MorsePenroseCD) SetActiveSet(active []int)

SetActiveSet sets the active set

func (*MorsePenroseCD) SetX

func (m *MorsePenroseCD) SetX(X mat.Matrix)

SetX sets the full design matrix

type NestedLassoLars

type NestedLassoLars struct {
	Dset  *Dataset
	Paths []PureLassoLarsPathWithCrit
}

NestedLassoLars is a type that holds a lasso path in addition to AICC and BIC along the path

func NestedLasso

func NestedLasso(data *Dataset, lambMin float64, keep float64, estimator CDParam) NestedLassoLars

NestedLasso performs a sequence of LASSO calculations where the least important features are removed on each iteration

type Node

type Node struct {
	Model      []bool
	Coeff      []float64
	Level      int
	Lower      float64
	Upper      float64
	Score      float64
	WasFlipped bool
}

Node is a type that holds an array of bools (Model) indicating which features are active. The Level field tells which level in the tree this node is on. The Lower and Upper fields represent bounds on the score for all models that are children of this node. The Score field holds the score of this model.

func CreateChild

func CreateChild(node *Node, flip bool, X mat.Matrix, y []float64, cutoff float64, h *Highscore) *Node

CreateChild creates a child node of a parent. Returns nil if the number of rows is zero or the lower bound is lower than the current best score

func NewNode

func NewNode(level int, model []bool) *Node

NewNode creates a new node

func (*Node) EstimateMemory

func (n *Node) EstimateMemory() int

EstimateMemory estimates the memory consumption of the node in bytes

func (*Node) GetChildNode

func (n *Node) GetChildNode(flip bool) *Node

GetChildNode creates a child node of the node. If flip is true, the "bit" at parent.Level is flipped
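
The branching step can be sketched as follows. The sketch assumes that the child sits one level below its parent and that WasFlipped records the flip argument; neither detail is stated in the source, and the type node is a reduced, hypothetical stand-in for Node.

```go
package main

import "fmt"

// node keeps only the fields needed to illustrate branching.
type node struct {
	Model      []bool
	Level      int
	WasFlipped bool
}

// child copies the parent's model and optionally flips the bit at the
// parent's level. Assumption: the child's level is parent.Level + 1.
func (n *node) child(flip bool) *node {
	model := make([]bool, len(n.Model))
	copy(model, n.Model)
	if flip {
		model[n.Level] = !model[n.Level]
	}
	return &node{Model: model, Level: n.Level + 1, WasFlipped: flip}
}

func main() {
	root := &node{Model: []bool{false, false, false}, Level: 0}
	left, right := root.child(true), root.child(false)
	fmt.Println(left.Model, right.Model)
}
```

Each parent thus has two children that differ only in the bit at the parent's level, so the tree enumerates every model exactly once.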

func (*Node) MarshalJSON

func (n *Node) MarshalJSON() ([]byte, error)

MarshalJSON converts a node to JSON representation

func (*Node) ToSparseCoeff

func (n *Node) ToSparseCoeff() SparseCoeff

ToSparseCoeff converts a node into a SparseCoeff structure

func (*Node) UnmarshalJSON

func (n *Node) UnmarshalJSON(data []byte) error

UnmarshalJSON decodes a JSON representation of a Node

type NormalizedData

type NormalizedData struct {
	X *mat.Dense

	HasBias bool
	// contains filtered or unexported fields
}

NormalizedData is a structure that is used to normalise columns to zero mean and unit variance

func NewNormalizedData

func NewNormalizedData(X *mat.Dense, y []float64) *NormalizedData

NewNormalizedData initializes a new structure with normalised data. Note that both X and y will be altered by this method.

func (*NormalizedData) LinearNormalizationTransformation

func (n *NormalizedData) LinearNormalizationTransformation(idx int, value float64) float64

LinearNormalizationTransformation computes the difference in coefficient in the expansion y1 = c_0 + c_1*x_1 + c_2*x_2+ ... and y2 = c_0' + c_1'*x_1' + c_2'*x_2', where primed x and y are normalized values

func (*NormalizedData) LinearTransformationBias

func (n *NormalizedData) LinearTransformationBias(selected []int, coeff []float64) float64

LinearTransformationBias calculates the bias coefficient

type OmpResult

type OmpResult struct {
	Coeff []float64
	Order []int
}

OmpResult is a structure that holds the result of the Orthogonal Matching Pursuit. Coeff is the fitted coefficients, and Order is the order in which the coefficients were included. Order[0] is the index of the coefficient that was first included

func NewOmpResult

func NewOmpResult(numFeatures int) *OmpResult

NewOmpResult returns a new instance of OmpResult. numFeatures is the number of features in the dataset

func Omp

func Omp(X mat.Matrix, y []float64, tol float64) *OmpResult

Omp performs Orthogonal Matching Pursuit

type PureLasso

type PureLasso struct{}

PureLasso is a type that does not alter the original lasso method

func (*PureLasso) Deriv

func (p *PureLasso) Deriv(bet []float64, featNo int) float64

Deriv returns 0.0

func (*PureLasso) Update

func (p *PureLasso) Update(beta []float64)

Update does not do anything

type PureLassoLarsPathWithCrit

type PureLassoLarsPathWithCrit struct {
	Nodes []*LassoLarsNode
	Aicc  []float64
	Bic   []float64
}

PureLassoLarsPathWithCrit is a type that holds all the nodes in addition to information on the Aicc and Bic values

type SAItem

type SAItem struct {
	Selection []int
	Coeff     []float64
	Score     float64
}

SAItem is one item in the SA queue

func NewSAItem

func NewSAItem(model []bool) *SAItem

NewSAItem creates a new instance of SAItem

type SARes

type SARes struct {
	Selected []int
	Coeff    []float64
	Scores   *SAScore
}

SARes holds the solution of SA search

func SelectModelSA

func SelectModelSA(X mat.Matrix, y []float64, nSweeps int, cost crit) *SARes

SelectModelSA uses simulated annealing to select the model

type SAScore

type SAScore struct {
	Cap       int
	Items     []*SAItem
	BestItem  *SAItem
	WorstItem *SAItem
}

SAScore is a type that is used to efficiently maintain a highscore list of simulated annealing features

func NewSAScore

func NewSAScore(length int) *SAScore

NewSAScore creates a new item with the scores

func (*SAScore) Exists

func (s *SAScore) Exists(item *SAItem) bool

Exists returns true if the item already exists in the queue

func (*SAScore) Insert

func (s *SAScore) Insert(item *SAItem)

Insert inserts a new item into the queue

type SearchProgress

type SearchProgress struct {
	BestScore     float64
	NumExplored   int
	Log2NumPruned float64
	// contains filtered or unexported fields
}

func (*SearchProgress) Get

func (sp *SearchProgress) Get() (float64, int, float64)

Get returns the current state

func (*SearchProgress) Set

func (sp *SearchProgress) Set(bs float64, ne int, np float64)

Set sets a new state

type SelectModelOptParams

type SelectModelOptParams struct {
	Cutoff       float64
	RootModel    []bool
	MaxQueueSize int
}

SelectModelOptParams is a struct holding optional parameters for the SelectModel function.

func NewSelectModelOptParams

func NewSelectModelOptParams() *SelectModelOptParams

NewSelectModelOptParams initialises the struct with optional parameters with the default values

type SliceVec

type SliceVec struct {
	// contains filtered or unexported fields
}

SliceVec is a wrapper around a slice which satisfies the mat.Vector interface. It does not allocate a new slice for the data (which mat.Vector does)

func NewSliceVec

func NewSliceVec(d []float64) *SliceVec

NewSliceVec returns a new slice vector

func (*SliceVec) At

func (s *SliceVec) At(i, j int) float64

At returns the i-th element of the vector

func (*SliceVec) AtVec

func (s *SliceVec) AtVec(i int) float64

AtVec returns the element at position i

func (*SliceVec) Dims

func (s *SliceVec) Dims() (int, int)

Dims returns the dimensions of the column vector

func (*SliceVec) Len

func (s *SliceVec) Len() int

Len returns the length of the vector

func (*SliceVec) T

func (s *SliceVec) T() mat.Matrix

T returns the transpose of the vector (i.e. a column vector)

type SparseCoeff

type SparseCoeff struct {
	Coeff     []float64
	Selection []int
}

SparseCoeff is a structure for representing a sparse set of coefficients. Each coefficient corresponds to the feature number at the same position in the Selection array

func LassoNodesSlice2SparsCoeff

func LassoNodesSlice2SparsCoeff(nodes []*LassoLarsNode) []SparseCoeff

LassoNodesSlice2SparsCoeff converts a slice with LassoLarsNodes into a SparseCoeff slice (which is simply to transfer the Coeff and Selection slices)

type SparseThresholded

type SparseThresholded struct {
	X mat.Matrix
	// contains filtered or unexported fields
}

SparseThresholded is a type that uses a sparse threshold algorithm to make a sparse approximation of the covariance matrix

func NewSparseThreshold

func NewSparseThreshold(X mat.Matrix) *SparseThresholded

NewSparseThreshold constructs a new instance of the SparseThresholded covariance matrix

func (*SparseThresholded) Get

Get returns the covariance matrix

type ThresholdOperator

type ThresholdOperator struct {
	// contains filtered or unexported fields
}

ThresholdOperator is a type that is used to threshold matrices

func L2ConsistentCovTO

func L2ConsistentCovTO(X mat.Matrix, numSamples int, maxThreshold float64, step float64) *ThresholdOperator

L2ConsistentCovTO returns a threshold operator that yields a sparse approximation to the covariance matrix X^TX when applied. It does so by partitioning the data (rows) into two matrices at random a number of times (numSamples), and then searching on a grid with numGrid points for the optimal threshold. The grid is defined by {j*sqrt((log p)/n): 0 <= j < numGrid}, where p is the number of columns and n is the number of rows
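
The grid in the description can be generated as in the sketch below. It follows the formula in the text literally, with numGrid as an explicit argument; note that the documented signature takes maxThreshold and step instead, so how the package derives the grid from those parameters is not shown here. The helper thresholdGrid is hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// thresholdGrid returns {j*sqrt(log(p)/n) : 0 <= j < numGrid}, where p is
// the number of columns and n the number of rows of the data matrix.
func thresholdGrid(p, n, numGrid int) []float64 {
	unit := math.Sqrt(math.Log(float64(p)) / float64(n))
	grid := make([]float64, numGrid)
	for j := range grid {
		grid[j] = float64(j) * unit
	}
	return grid
}

func main() {
	fmt.Println(thresholdGrid(100, 25, 4))
}
```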

func (*ThresholdOperator) Apply

func (t *ThresholdOperator) Apply(X mat.Mutable)

Apply sets all elements in X that are smaller than the threshold to 0

type ValueGridPt

type ValueGridPt struct {
	// contains filtered or unexported fields
}

ValueGridPt is a type that holds a float value as well as its channel number

type ValueIdx

type ValueIdx struct {
	// contains filtered or unexported fields
}

ValueIdx is a struct for holding a value and its corresponding index

type Workload

type Workload struct {
	// contains filtered or unexported fields
}

Workload is a struct with information to be passed to a go-worker
