Documentation ¶
Index ¶
- func KFoldsSplit(fileRows [][]string, k int) ([][][]string, error)
- func LooSplit(fileRows [][]string, idName string) ([][][]string, error)
- func ShuffleKFoldsSplit(fileRows [][]string, idName string, k int, seed string) ([][][]string, error)
- func ShuffleSplit(fileRows [][]string, idName string, percents int, seed string) ([2][][]string, error)
- func Split(fileRows [][]string, percents int) ([2][][]string, error)
- type BinClassValidation
- type RegressionValidation
- type Splitter
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func KFoldsSplit ¶
func KFoldsSplit(fileRows [][]string, k int) ([][][]string, error)
KFoldsSplit divides the file into `k` parts directly. `k` is the number of parts and can only be 5 or 10. The first row of `fileRows` contains only the feature names, and it is kept in every returned part.
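The behaviour described for KFoldsSplit can be illustrated with a minimal, self-contained sketch. The helper name `kFoldsSketch`, the contiguous partitioning, and the way leftover rows are spread across folds are assumptions made for illustration; the package's actual implementation may differ.

```go
package main

import "fmt"

// kFoldsSketch is a hypothetical helper, not the package's implementation.
// It splits the data rows (everything after the header) into k contiguous
// parts and puts the header row at the start of each part.
func kFoldsSketch(fileRows [][]string, k int) ([][][]string, error) {
	if k != 5 && k != 10 {
		return nil, fmt.Errorf("k must be 5 or 10, got %d", k)
	}
	if len(fileRows) < k+1 {
		return nil, fmt.Errorf("need a header and at least %d data rows", k)
	}
	header, data := fileRows[0], fileRows[1:]
	folds := make([][][]string, 0, k)
	for i := 0; i < k; i++ {
		// Assumption: integer arithmetic spreads any remainder across folds.
		start, end := i*len(data)/k, (i+1)*len(data)/k
		fold := [][]string{header}
		fold = append(fold, data[start:end]...)
		folds = append(folds, fold)
	}
	return folds, nil
}

func main() {
	rows := [][]string{{"id", "x"}}
	for i := 0; i < 12; i++ {
		rows = append(rows, []string{fmt.Sprint(i), fmt.Sprint(i * i)})
	}
	folds, err := kFoldsSketch(rows, 5)
	if err != nil {
		panic(err)
	}
	for i, f := range folds {
		fmt.Printf("fold %d: header %v, %d data rows\n", i, f[0], len(f)-1)
	}
}
```

Every fold starts with the same header row, so each part can be consumed on its own as a well-formed file.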
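func LooSplit ¶
func LooSplit(fileRows [][]string, idName string) ([][][]string, error)
LooSplit sorts the file rows by the IDs extracted from the feature named `idName`, then places each row in its own subset (leave-one-out).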
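A leave-one-out split of this kind might look like the following sketch. The helper `looSketch` is hypothetical, and two details are assumptions: IDs are compared as plain strings, and every subset carries the header row.

```go
package main

import (
	"fmt"
	"sort"
)

// looSketch is a hypothetical helper, not the package's implementation.
// It sorts data rows by the column named idName and returns one fold per
// data row. Assumes every row has as many columns as the header.
func looSketch(fileRows [][]string, idName string) ([][][]string, error) {
	if len(fileRows) < 2 {
		return nil, fmt.Errorf("need a header and at least one data row")
	}
	header := fileRows[0]
	idCol := -1
	for i, name := range header {
		if name == idName {
			idCol = i
			break
		}
	}
	if idCol < 0 {
		return nil, fmt.Errorf("id column %q not found", idName)
	}
	// Copy before sorting so the caller's slice is left untouched.
	data := append([][]string(nil), fileRows[1:]...)
	// Assumption: IDs are ordered lexicographically as strings.
	sort.SliceStable(data, func(a, b int) bool { return data[a][idCol] < data[b][idCol] })

	folds := make([][][]string, len(data))
	for i, row := range data {
		folds[i] = [][]string{header, row}
	}
	return folds, nil
}

func main() {
	rows := [][]string{{"id", "x"}, {"c", "3"}, {"a", "1"}, {"b", "2"}}
	folds, err := looSketch(rows, "id")
	if err != nil {
		panic(err)
	}
	for i, f := range folds {
		fmt.Printf("fold %d holds row %v\n", i, f[1])
	}
}
```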
func ShuffleKFoldsSplit ¶
func ShuffleKFoldsSplit(fileRows [][]string, idName string, k int, seed string) ([][][]string, error)
ShuffleKFoldsSplit sorts the file rows by the IDs extracted from the feature named `idName`, shuffles the sorted rows with `seed`, then divides the file into `k` parts. `k` is the number of parts and can only be 5 or 10.
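Sorting before shuffling makes the resulting order depend only on the set of IDs and the seed, not on the order of the input rows. The sketch below shows that property on a plain list of IDs. The helper `shuffleIDsSketch` is hypothetical, and hashing the string seed with FNV-1a is an assumption; how the package turns `seed` into a random source is not documented here.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/rand"
	"sort"
)

// shuffleIDsSketch is a hypothetical helper, not the package's implementation.
// It sorts a copy of ids, then shuffles it with a generator derived from seed.
func shuffleIDsSketch(ids []string, seed string) []string {
	out := append([]string(nil), ids...)
	sort.Strings(out)

	// Assumption: the string seed is hashed to an integer with FNV-1a.
	h := fnv.New64a()
	h.Write([]byte(seed))
	r := rand.New(rand.NewSource(int64(h.Sum64())))
	r.Shuffle(len(out), func(i, j int) { out[i], out[j] = out[j], out[i] })
	return out
}

func main() {
	a := shuffleIDsSketch([]string{"3", "1", "2", "5", "4"}, "my-seed")
	b := shuffleIDsSketch([]string{"5", "4", "3", "2", "1"}, "my-seed")
	fmt.Println(a)
	fmt.Println(b) // same order as a: same IDs, same seed
}
```

Two callers that hold the same IDs in different orders and agree on a seed therefore obtain identical folds, which is presumably why the ID feature is used for sample alignment.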
func ShuffleSplit ¶
func ShuffleSplit(fileRows [][]string, idName string, percents int, seed string) ([2][][]string, error)
ShuffleSplit sorts the file rows by the IDs extracted from the feature named `idName`, shuffles the sorted rows with `seed`, then divides the file into two parts based on `percents`, which is the percentage of rows assigned to the first returned part.
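The final step, dividing rows into two parts by percentage, can be sketched as follows. The helper `percentSplitSketch` is hypothetical; rounding the cut point down, rejecting 0 and 100, and repeating the header in both parts are assumptions.

```go
package main

import "fmt"

// percentSplitSketch is a hypothetical helper, not the package's implementation.
// The first part receives percents% of the data rows (rounded down) and the
// second part receives the rest. Both parts start with the header row.
func percentSplitSketch(fileRows [][]string, percents int) ([2][][]string, error) {
	var parts [2][][]string
	if percents <= 0 || percents >= 100 {
		return parts, fmt.Errorf("percents must be between 1 and 99, got %d", percents)
	}
	if len(fileRows) < 3 {
		return parts, fmt.Errorf("need a header and at least two data rows")
	}
	header, data := fileRows[0], fileRows[1:]
	cut := len(data) * percents / 100
	parts[0] = append([][]string{header}, data[:cut]...)
	parts[1] = append([][]string{header}, data[cut:]...)
	return parts, nil
}

func main() {
	rows := [][]string{{"id", "x"}}
	for i := 0; i < 10; i++ {
		rows = append(rows, []string{fmt.Sprint(i), "v"})
	}
	parts, err := percentSplitSketch(rows, 80)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(parts[0])-1, len(parts[1])-1) // data rows in each part
}
```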
Types ¶
type BinClassValidation ¶
type BinClassValidation interface {
	// Splitter divides the data set into several subsets using some strategy (such as KFolds or LOO),
	// and holds out one subset as the validation set and the others as the training set.
	Splitter

	// SetPredictOut sets the predicted probabilities for the prediction set to which `idx` refers.
	SetPredictOut(idx int, predProbas []float64) error

	// GetAllPredictOuts returns all prediction results that have been stored.
	GetAllPredictOuts() map[int][]string

	// GetAccuracy returns the classification accuracy.
	// idx is the index of the prediction set (also of the validation set) in the split folds.
	GetAccuracy(idx int) (float64, error)

	// GetAllAccuracy returns the classification accuracy scores over all split folds,
	// and their mean and standard deviation.
	GetAllAccuracy() (map[int]float64, float64, float64, error)

	// GetReport returns JSON bytes containing precision, recall, F1, true positives,
	// false positives, true negatives and false negatives for each class, and accuracy.
	GetReport(idx int) ([]byte, error)

	// GetOverallReport returns JSON bytes containing precision, recall, F1, true positives,
	// false positives, true negatives and false negatives for each class, and accuracy, over all split folds.
	GetOverallReport() (map[int][]byte, error)

	// GetROCAndAUC returns JSON bytes containing the ROC points and the AUC.
	GetROCAndAUC(idx int) ([]byte, error)

	// GetAllROCAndAUC returns a map containing the ROC and AUC JSON bytes of every split fold.
	GetAllROCAndAUC() (map[int][]byte, error)
}
BinClassValidation performs validation for the binary classification case.
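The metrics named in this interface (accuracy, precision, recall, and the confusion counts) can be derived from predicted probabilities as in the sketch below. The type `confusion` and the helper `countSketch` are hypothetical, and treating a probability greater than or equal to the threshold as a positive prediction is an assumption; the package may apply its threshold differently.

```go
package main

import "fmt"

// confusion holds the four confusion-matrix counts for the positive class.
type confusion struct{ TP, FP, TN, FN int }

// countSketch is a hypothetical helper, not the package's implementation.
// Assumption: a sample is predicted positive when its probability is
// greater than or equal to threshold.
func countSketch(probas []float64, truth []string, posClass string, threshold float64) (confusion, error) {
	var c confusion
	if len(probas) != len(truth) {
		return c, fmt.Errorf("length mismatch: %d predictions, %d labels", len(probas), len(truth))
	}
	for i, p := range probas {
		predPos, realPos := p >= threshold, truth[i] == posClass
		switch {
		case predPos && realPos:
			c.TP++
		case predPos && !realPos:
			c.FP++
		case !predPos && realPos:
			c.FN++
		default:
			c.TN++
		}
	}
	return c, nil
}

// ratio returns num/den, or 0 when den is 0.
func ratio(num, den int) float64 {
	if den == 0 {
		return 0
	}
	return float64(num) / float64(den)
}

func (c confusion) accuracy() float64  { return ratio(c.TP+c.TN, c.TP+c.FP+c.TN+c.FN) }
func (c confusion) precision() float64 { return ratio(c.TP, c.TP+c.FP) }
func (c confusion) recall() float64    { return ratio(c.TP, c.TP+c.FN) }

func main() {
	probas := []float64{0.9, 0.4, 0.7, 0.2}
	truth := []string{"yes", "yes", "no", "no"}
	c, err := countSketch(probas, truth, "yes", 0.5)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v accuracy=%.2f precision=%.2f recall=%.2f\n",
		c, c.accuracy(), c.precision(), c.recall())
}
```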
func NewBinClassValidation ¶
func NewBinClassValidation(file [][]string, label string, idName string, posClass string, negClass string, threshold float64) (BinClassValidation, error)
NewBinClassValidation creates a BinClassValidation instance to handle binary classification validation.
`file` contains all rows of a file; its first row contains only the feature names, and the other rows contain the feature values.
`idName` denotes which feature is the ID used in sample alignment.
`label` denotes the name of the label feature.
`posClass` denotes the name of the positive class and must be one feature name in `file`.
`negClass` denotes the name of the negative class and may be set to an empty string.
type RegressionValidation ¶
type RegressionValidation interface {
	// Splitter divides the data set into several subsets using some strategy (such as KFolds or LOO),
	// and holds out one subset as the validation set and the others as the training set.
	Splitter

	// SetPredictOut sets the prediction outcomes for the prediction set to which `idx` refers.
	SetPredictOut(idx int, yPred []float64) error

	// GetAllPredictOuts returns all prediction results that have been stored.
	GetAllPredictOuts() map[int][]float64

	// GetRMSE returns the RMSE over the validation set to which `idx` refers.
	GetRMSE(idx int) (float64, error)

	// GetAllRMSE returns the RMSE scores over all split folds,
	// and their mean and standard deviation.
	GetAllRMSE() (map[int]float64, float64, float64, error)
}
RegressionValidation performs validation for the regression case.
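The quantities returned by GetRMSE and GetAllRMSE can be computed as in the following sketch. The helpers `rmseSketch` and `meanStdSketch` are hypothetical, and the use of the population standard deviation (dividing by n rather than n-1) is an assumption.

```go
package main

import (
	"fmt"
	"math"
)

// rmseSketch is a hypothetical helper: the square root of the mean squared
// difference between true and predicted values.
func rmseSketch(yTrue, yPred []float64) (float64, error) {
	if len(yTrue) == 0 || len(yTrue) != len(yPred) {
		return 0, fmt.Errorf("need two non-empty slices of equal length")
	}
	var sum float64
	for i := range yTrue {
		d := yTrue[i] - yPred[i]
		sum += d * d
	}
	return math.Sqrt(sum / float64(len(yTrue))), nil
}

// meanStdSketch is a hypothetical helper returning the mean and the
// population standard deviation (an assumption) of per-fold scores.
func meanStdSketch(scores map[int]float64) (mean, std float64) {
	n := float64(len(scores))
	if n == 0 {
		return 0, 0
	}
	for _, s := range scores {
		mean += s
	}
	mean /= n
	for _, s := range scores {
		std += (s - mean) * (s - mean)
	}
	return mean, math.Sqrt(std / n)
}

func main() {
	rmse, err := rmseSketch([]float64{0, 0}, []float64{3, 3})
	if err != nil {
		panic(err)
	}
	mean, std := meanStdSketch(map[int]float64{0: 2, 1: 4})
	fmt.Println(rmse, mean, std)
}
```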
func NewRegressionValidation ¶
func NewRegressionValidation(file [][]string, label string, idName string) (RegressionValidation, error)
NewRegressionValidation creates a RegressionValidation instance to handle regression validation.
`file` contains all rows of a file; its first row contains only the feature names, and the other rows contain the feature values.
`idName` denotes which feature is the ID used in sample alignment.
`label` denotes the name of the label feature.
type Splitter ¶
type Splitter interface {
	// Split divides the file into two parts directly,
	// based on `percents`, which denotes the share of the first part.
	Split(percents int) error

	// ShuffleSplit shuffles the rows with `seed`,
	// then divides the file into two parts
	// based on `percents`, which denotes the share of the first part.
	ShuffleSplit(percents int, seed string) error

	// KFoldsSplit divides the file into `k` parts directly.
	// k is the number of parts and can only be 5 or 10.
	KFoldsSplit(k int) error

	// ShuffleKFoldsSplit shuffles the sorted rows with `seed`,
	// then divides the file into `k` parts.
	// k is the number of parts and can only be 5 or 10.
	ShuffleKFoldsSplit(k int, seed string) error

	// LooSplit sorts the file rows by the IDs extracted from the feature named `idName`,
	// then places each row in its own subset.
	LooSplit() error

	// GetAllFolds returns all folds after a split.
	// It can only be called successfully after a split.
	GetAllFolds() ([][][]string, error)

	// GetTrainSet holds out the subset to which `idxHO` refers
	// and returns the remaining subsets as the training set.
	GetTrainSet(idxHO int) ([][]string, error)

	// GetPredictSet returns the subset to which `idx` refers
	// as the prediction set (without the label feature).
	GetPredictSet(idx int) ([][]string, error)

	// GetValidSet returns the subset to which `idx` refers
	// as the validation set.
	GetValidSet(idx int) ([][]string, error)
}
Splitter divides the data set into several subsets using some strategy (such as KFolds or LOO), and holds out one subset as the validation set and the others as the training set.
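The hold-out step can be sketched as follows: given the folds produced by a split, the training set is every fold except the held-out one, merged under a single header. The helper `trainSetSketch` is hypothetical, and it assumes each fold begins with the header row.

```go
package main

import "fmt"

// trainSetSketch is a hypothetical helper, not the package's implementation.
// It merges every fold except folds[idxHO] and keeps a single header row.
// Assumption: each non-empty fold starts with the header row.
func trainSetSketch(folds [][][]string, idxHO int) ([][]string, error) {
	if idxHO < 0 || idxHO >= len(folds) {
		return nil, fmt.Errorf("hold-out index %d out of range [0, %d)", idxHO, len(folds))
	}
	var train [][]string
	for i, fold := range folds {
		if i == idxHO || len(fold) == 0 {
			continue
		}
		if train == nil {
			train = append(train, fold[0]) // header, taken once
		}
		train = append(train, fold[1:]...)
	}
	return train, nil
}

func main() {
	header := []string{"id", "x"}
	folds := [][][]string{
		{header, {"1", "a"}, {"2", "b"}},
		{header, {"3", "c"}, {"4", "d"}},
		{header, {"5", "e"}, {"6", "f"}},
	}
	train, err := trainSetSketch(folds, 1)
	if err != nil {
		panic(err)
	}
	for _, row := range train {
		fmt.Println(row)
	}
}
```

Calling this once per fold index gives the usual cross-validation loop, with the held-out fold serving as the validation set for that round.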
func NewSplitter ¶
NewSplitter creates a Splitter instance.
`file` contains all rows of a file; its first row contains only the feature names, and the other rows contain the feature values.
`idName` denotes which feature is the ID used in sample alignment.
`label` denotes the name of the label feature.