Documentation ¶
Overview ¶
Package classifier provides a machine learning classifier library, including CART, Random Forest, Cascaded Random Forest, and KNN.
Index ¶
- func ComputeAccuracies(tp, fp, tn, fn []int64) (accuracies []float64)
- func ComputeElapsedTimes(start, end []int64) (elaps []int64)
- func ComputeFMeasures(precisions, recalls []float64) (fmeasures []float64)
- type CM
- func (cm *CM) ComputeNumeric(vs, actuals, predictions []int64)
- func (cm *CM) ComputeStrings(valueSpace, targets, predictions []string)
- func (cm *CM) FN() int
- func (cm *CM) FNIndices() []int
- func (cm *CM) FP() int
- func (cm *CM) FPIndices() []int
- func (cm *CM) GetColumnClassError() *tabula.Column
- func (cm *CM) GetFalseRate() float64
- func (cm *CM) GetTrueRate() float64
- func (cm *CM) GroupIndexPredictions(sampleListID []int, actuals, predictions []int64)
- func (cm *CM) GroupIndexPredictionsStrings(sampleListID []int, actuals, predictions []string)
- func (cm *CM) String() (s string)
- func (cm *CM) TN() int
- func (cm *CM) TNIndices() []int
- func (cm *CM) TP() int
- func (cm *CM) TPIndices() []int
- type Runtime
- func (rt *Runtime) AddOOBCM(cm *CM)
- func (rt *Runtime) AddStat(stat *Stat)
- func (rt *Runtime) CloseOOBStatsFile() (e error)
- func (rt *Runtime) ComputeCM(sampleListID []int, vs, actuals, predicts []string) (cm *CM)
- func (rt *Runtime) ComputeStatFromCM(stat *Stat, cm *CM)
- func (rt *Runtime) ComputeStatTotal(stat *Stat)
- func (rt *Runtime) Finalize() (e error)
- func (rt *Runtime) Initialize() error
- func (rt *Runtime) OOBStats() *Stats
- func (rt *Runtime) OpenOOBStatsFile() error
- func (rt *Runtime) Performance(samples tabula.ClasetInterface, predicts []string, probs []float64) (perfs Stats)
- func (rt *Runtime) PrintOobStat(stat *Stat, cm *CM)
- func (rt *Runtime) PrintStat(stat *Stat)
- func (rt *Runtime) PrintStatTotal(st *Stat)
- func (rt *Runtime) StatTotal() *Stat
- func (rt *Runtime) WriteOOBStat(stat *Stat) error
- func (rt *Runtime) WritePerformance() error
- type Stat
- func (stat *Stat) End()
- func (stat *Stat) Recall() float64
- func (stat *Stat) SetAUC(v float64)
- func (stat *Stat) SetFPRate(fp, n int64)
- func (stat *Stat) SetPrecisionFromRate(p, n int64)
- func (stat *Stat) SetTPRate(tp, p int64)
- func (stat *Stat) Start()
- func (stat *Stat) Sum(other *Stat)
- func (stat *Stat) ToRow() (row *tabula.Row)
- func (stat *Stat) Write(file string) (e error)
- type Stats
- func (stats *Stats) Accuracies() (accuracies []float64)
- func (stats *Stats) Add(stat *Stat)
- func (stats *Stats) EndTimes() (times []int64)
- func (stats *Stats) FMeasures() (fmeasures []float64)
- func (stats *Stats) FPRates() (fprates []float64)
- func (stats *Stats) OobErrorMeans() (oobmeans []float64)
- func (stats *Stats) Precisions() (precs []float64)
- func (stats *Stats) Recalls() (recalls []float64)
- func (stats *Stats) StartTimes() (times []int64)
- func (stats *Stats) TNRates() (tnrates []float64)
- func (stats *Stats) TPRates() (tprates []float64)
- func (stats *Stats) Write(file string) (e error)
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ComputeAccuracies ¶
ComputeAccuracies computes and returns the accuracy for each entry in the arrays of true-positive, false-positive, true-negative, and false-negative counts, using the formula,
(tp + tn) / (tp + fp + tn + fn)
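The accuracy of one instance is the share of correct predictions, (tp + tn) / (tp + fp + tn + fn). The following is a minimal sketch of that per-index computation, not the package's own code: the name `accuracies` is hypothetical, and returning 0 for an instance with no samples is an assumption made here to avoid dividing by zero.

```go
package main

import "fmt"

// accuracies is a hypothetical stand-in for ComputeAccuracies. For each
// index i it returns (tp[i] + tn[i]) / (tp[i] + fp[i] + tn[i] + fn[i]).
// Assumptions: all four slices have the same length, and an instance with
// no samples yields 0 instead of NaN.
func accuracies(tp, fp, tn, fn []int64) []float64 {
	out := make([]float64, len(tp))
	for i := range tp {
		total := tp[i] + fp[i] + tn[i] + fn[i]
		if total == 0 {
			continue // leave the zero value in place
		}
		out[i] = float64(tp[i]+tn[i]) / float64(total)
	}
	return out
}

func main() {
	tp := []int64{2, 40}
	fp := []int64{2, 10}
	tn := []int64{1, 45}
	fn := []int64{1, 5}
	fmt.Println(accuracies(tp, fp, tn, fn)) // prints [0.5 0.85]
}
```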
func ComputeElapsedTimes ¶
ComputeElapsedTimes computes and returns the elapsed time between each pair of `start` and `end` timestamps.
func ComputeFMeasures ¶
ComputeFMeasures, given arrays of precisions and recalls, computes the F-measure of each instance and returns the results.
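The F-measure is the harmonic mean of precision and recall, 2pr / (p + r). A minimal sketch of the per-instance computation could look like the following; the name `fMeasures` is hypothetical, and returning 0 when precision and recall are both 0 is an assumption, not documented behavior of the package.

```go
package main

import "fmt"

// fMeasures is a hypothetical stand-in for ComputeFMeasures. For each index
// it returns the harmonic mean of precision and recall: 2*p*r / (p + r).
// Assumptions: both slices have the same length, and an instance whose
// precision and recall are both 0 yields 0 instead of NaN.
func fMeasures(precisions, recalls []float64) []float64 {
	out := make([]float64, len(precisions))
	for i, p := range precisions {
		r := recalls[i]
		if p+r == 0 {
			continue // leave the zero value in place
		}
		out[i] = 2 * p * r / (p + r)
	}
	return out
}

func main() {
	// First instance: p = r = 0.5. Second instance: p = 1.0, r = 0.5.
	fmt.Println(fMeasures([]float64{0.5, 1.0}, []float64{0.5, 0.5}))
}
```

With equal precision and recall the F-measure equals both; with p = 1.0 and r = 0.5 it is 2/3, closer to the smaller of the two values.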
Types ¶
type CM ¶
CM represents the confusion matrix of a classification.
func (*CM) ComputeNumeric ¶
ComputeNumeric calculates the confusion matrix from the value space `vs`, the actual values, and the predicted values.
func (*CM) ComputeStrings ¶
ComputeStrings calculates the confusion matrix from the value space, the target class values, and the predicted class values.
func (*CM) GetColumnClassError ¶
GetColumnClassError returns the last column, which is the column that contains the classification error.
func (*CM) GetFalseRate ¶
GetFalseRate returns the false-positive rate, in terms of,
false-positive / (false-positive + true negative)
func (*CM) GetTrueRate ¶
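GetTrueRate returns the true-positive rate, in terms of
true-positive / (true-positive + false-positive)
Note that this ratio is the one Stat documents as Precision, tp/(tp+fp); Stat.TPRate is documented as tp/(tp+fn).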
func (*CM) GroupIndexPredictions ¶
GroupIndexPredictions, given the indices of samples, groups the samples by their class of prediction. For example, given

    sampleListID: [0, 1, 2, 3, 4, 5]
    actuals:      [1, 1, 0, 0, 1, 0]
    predictions:  [1, 0, 1, 0, 1, 1]

this function groups the indices into true-positive, false-positive, true-negative, and false-negative, which results in

    true-positive indices:  [0, 4]
    false-positive indices: [2, 5]
    true-negative indices:  [3]
    false-negative indices: [1]

This function assumes that the positive value is "1" and the negative value is "0".
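The grouping described above can be sketched in a few lines. This is an illustration under stated assumptions, not the package's implementation: the `groups` type and the function name are hypothetical, and storing the values of `sampleListID` (rather than slice positions) is an assumption; in the documented example the two coincide.

```go
package main

import "fmt"

// groups is a hypothetical container for the sample IDs that fall into each
// cell of the confusion matrix.
type groups struct {
	tp, fp, tn, fn []int
}

// groupIndexPredictions sketches the grouping described for
// (*CM).GroupIndexPredictions. It assumes 1 is the positive class, 0 is the
// negative class, and all three slices have the same length.
func groupIndexPredictions(sampleListID []int, actuals, predictions []int64) groups {
	var g groups
	for i, id := range sampleListID {
		switch {
		case actuals[i] == 1 && predictions[i] == 1:
			g.tp = append(g.tp, id)
		case actuals[i] == 0 && predictions[i] == 1:
			g.fp = append(g.fp, id)
		case actuals[i] == 0 && predictions[i] == 0:
			g.tn = append(g.tn, id)
		case actuals[i] == 1 && predictions[i] == 0:
			g.fn = append(g.fn, id)
		}
	}
	return g
}

func main() {
	g := groupIndexPredictions(
		[]int{0, 1, 2, 3, 4, 5},
		[]int64{1, 1, 0, 0, 1, 0},
		[]int64{1, 0, 1, 0, 1, 1},
	)
	// prints TP: [0 4] FP: [2 5] TN: [3] FN: [1]
	fmt.Println("TP:", g.tp, "FP:", g.fp, "TN:", g.tn, "FN:", g.fn)
}
```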
func (*CM) GroupIndexPredictionsStrings ¶
GroupIndexPredictionsStrings is an alternative to GroupIndexPredictions that works with string classes.
type Runtime ¶
    type Runtime struct {
        // OOBStatsFile is the file where OOB statistic will be written.
        OOBStatsFile string `json:"OOBStatsFile"`

        // PerfFile is the file where statistic of performance will be written.
        PerfFile string `json:"PerfFile"`

        // StatFile is the file where statistic of classifying samples will be
        // written.
        StatFile string `json:"StatFile"`

        // RunOOB if its true the OOB will be computed, default is false.
        RunOOB bool `json:"RunOOB"`
        // contains filtered or unexported fields
    }
Runtime defines a generic type that provides common fields and can be embedded by a concrete classifier (e.g. RandomForest).
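As an illustration of the embedding pattern, the sketch below declares a local `Runtime` that mirrors only the four exported fields and JSON tags listed above, and embeds it in a hypothetical `Forest` classifier. `Forest`, its `NTree` field, `loadForest`, and the file name in the sample input are all assumptions made for this example; the real package type also has unexported fields and methods that are not reproduced here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Runtime mirrors only the exported fields documented for classifier.Runtime.
// It is a local stand-in so that this example is self-contained.
type Runtime struct {
	OOBStatsFile string `json:"OOBStatsFile"`
	PerfFile     string `json:"PerfFile"`
	StatFile     string `json:"StatFile"`
	RunOOB       bool   `json:"RunOOB"`
}

// Forest is a hypothetical classifier that embeds Runtime, so the runtime
// options sit at the same JSON level as the classifier's own options.
type Forest struct {
	Runtime
	NTree int `json:"NTree"` // hypothetical option, for illustration only
}

// loadForest decodes a JSON configuration into a Forest.
func loadForest(raw []byte) (Forest, error) {
	var f Forest
	err := json.Unmarshal(raw, &f)
	return f, err
}

func main() {
	raw := []byte(`{"NTree": 100, "RunOOB": true, "OOBStatsFile": "oob.csv"}`)
	f, err := loadForest(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	// Fields of the embedded Runtime are promoted to Forest.
	fmt.Println(f.NTree, f.RunOOB, f.OOBStatsFile) // prints 100 true oob.csv
}
```

Because encoding/json treats the exported fields of an embedded struct as fields of the outer struct, one flat configuration object can carry both sets of options.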
func (*Runtime) CloseOOBStatsFile ¶
CloseOOBStatsFile closes the OOB statistics file that was opened for writing.
func (*Runtime) ComputeCM ¶
ComputeCM computes the confusion matrix of the samples using the value space, the actual values, and the predicted values.
func (*Runtime) ComputeStatFromCM ¶
ComputeStatFromCM computes the statistic from a confusion matrix.
func (*Runtime) ComputeStatTotal ¶
ComputeStatTotal computes the total statistic.
func (*Runtime) Finalize ¶
Finalize finishes the runtime: it computes the total statistic, writes it to file, and closes the file.
func (*Runtime) Initialize ¶
Initialize starts the runtime for processing by saving the start time and opening the stats file.
func (*Runtime) OpenOOBStatsFile ¶
OpenOOBStatsFile opens the OOB statistics file for output.
func (*Runtime) Performance ¶
func (rt *Runtime) Performance(samples tabula.ClasetInterface, predicts []string, probs []float64) (perfs Stats)
Performance, given the samples with their actual class labels, the predicted labels, and the prediction probabilities, computes the performance statistics of the classifier.
Algorithm:
(1) Sort the probabilities in descending order.
(2) Sort the actuals and predicts using the sorted index from probs.
(3) Compute tpr, fpr, and precision.
(4) Write the performance to file.
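Steps (1) to (3) of the algorithm can be sketched as a threshold sweep over the sorted probabilities. This is a simplified illustration, not the package's code: the names `point`, `ratio`, and `performance` are hypothetical, the function takes plain slices instead of `tabula.ClasetInterface`, it ignores the predicted labels, and it skips step (4), writing to file.

```go
package main

import (
	"fmt"
	"sort"
)

// point is a hypothetical record of the rates obtained when the k samples
// with the highest probability are treated as positive.
type point struct {
	TPRate, FPRate, Precision float64
}

// ratio returns a/b, or 0 when b is 0 (an assumption made for this sketch).
func ratio(a, b int) float64 {
	if b == 0 {
		return 0
	}
	return float64(a) / float64(b)
}

// performance sorts sample indices by probability in descending order, then
// walks them once while accumulating true and false positives.
func performance(actuals []string, probs []float64, positive string) []point {
	// (1) Build the index sorted by probability, highest first.
	idx := make([]int, len(probs))
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(a, b int) bool {
		return probs[idx[a]] > probs[idx[b]]
	})

	// Count positive and negative samples for the rate denominators.
	var p, n int
	for _, a := range actuals {
		if a == positive {
			p++
		} else {
			n++
		}
	}

	// (2) Visit the actuals in sorted order; (3) compute the rates.
	var tp, fp int
	points := make([]point, 0, len(idx))
	for _, i := range idx {
		if actuals[i] == positive {
			tp++
		} else {
			fp++
		}
		points = append(points, point{
			TPRate:    ratio(tp, p),
			FPRate:    ratio(fp, n),
			Precision: ratio(tp, tp+fp),
		})
	}
	return points
}

func main() {
	actuals := []string{"1", "1", "0", "0"}
	probs := []float64{0.4, 0.9, 0.1, 0.8}
	for _, pt := range performance(actuals, probs, "1") {
		fmt.Printf("tpr=%.2f fpr=%.2f precision=%.2f\n", pt.TPRate, pt.FPRate, pt.Precision)
	}
}
```

Each resulting point pairs a true-positive rate with a false-positive rate, which is the kind of data an ROC curve is drawn from.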
func (*Runtime) PrintOobStat ¶
PrintOobStat prints the out-of-bag statistic to standard output.
func (*Runtime) PrintStatTotal ¶
PrintStatTotal prints the total statistic to standard output.
func (*Runtime) WriteOOBStat ¶
WriteOOBStat writes the statistic of the process to file.
func (*Runtime) WritePerformance ¶
WritePerformance writes the performance data to file.
type Stat ¶
    type Stat struct {
        // ID unique id for this statistic (e.g. number of tree).
        ID int64
        // StartTime contain the start time of classifier in unix timestamp.
        StartTime int64
        // EndTime contain the end time of classifier in unix timestamp.
        EndTime int64
        // ElapsedTime contain actual time, in seconds, between end and start
        // time.
        ElapsedTime int64
        // TP contain true-positive value.
        TP int64
        // FP contain false-positive value.
        FP int64
        // TN contain true-negative value.
        TN int64
        // FN contain false-negative value.
        FN int64
        // OobError contain out-of-bag error.
        OobError float64
        // OobErrorMean contain mean of out-of-bag error.
        OobErrorMean float64
        // TPRate contain true-positive rate (recall): tp/(tp+fn)
        TPRate float64
        // FPRate contain false-positive rate: fp/(fp+tn)
        FPRate float64
        // TNRate contain true-negative rate: tn/(tn+fp)
        TNRate float64
        // Precision contain: tp/(tp+fp)
        Precision float64
        // FMeasure contain value of F-measure or the harmonic mean of
        // precision and recall.
        FMeasure float64
        // Accuracy contain the degree of closeness of measurements of a
        // quantity to that quantity's true value.
        Accuracy float64
        // AUC contain the area under curve.
        AUC float64
    }
Stat holds the statistic values of a classifier, including TP rate, FP rate, precision, and recall.
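The rate fields follow directly from the four confusion counts, using the formulas given in the field comments. The sketch below shows that arithmetic on a trimmed-down, hypothetical `rates` struct; `div` and `ratesFromCounts` are illustrative names, and returning 0 for an empty denominator is an assumption rather than documented behavior.

```go
package main

import "fmt"

// rates is a hypothetical, trimmed-down counterpart of Stat that keeps only
// the fields derived from the confusion counts.
type rates struct {
	TPRate, FPRate, TNRate, Precision float64
}

// div returns a/b, or 0 when b is 0 (an assumption made for this sketch).
func div(a, b int64) float64 {
	if b == 0 {
		return 0
	}
	return float64(a) / float64(b)
}

// ratesFromCounts applies the formulas from the Stat field comments.
func ratesFromCounts(tp, fp, tn, fn int64) rates {
	return rates{
		TPRate:    div(tp, tp+fn), // recall: tp/(tp+fn)
		FPRate:    div(fp, fp+tn), // fp/(fp+tn)
		TNRate:    div(tn, tn+fp), // tn/(tn+fp)
		Precision: div(tp, tp+fp), // tp/(tp+fp)
	}
}

func main() {
	// prints {TPRate:0.75 FPRate:0.25 TNRate:0.75 Precision:0.6}
	fmt.Printf("%+v\n", ratesFromCounts(30, 20, 60, 10))
}
```

Note that FPRate and TNRate share a denominator, so they always sum to 1 when there is at least one negative sample.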
func (*Stat) SetPrecisionFromRate ¶
SetPrecisionFromRate sets the Precision value using the TP rate and FP rate. `p` and `n` are the numbers of positive and negative class samples.
func (*Stat) Sum ¶
Sum adds the statistics from the other Stat object to the current one, excluding the start and end times.
type Stats ¶
type Stats []*Stat
Stats defines a list of statistic values.
func (*Stats) Accuracies ¶
Accuracies returns all accuracy values.
func (*Stats) OobErrorMeans ¶
OobErrorMeans returns all out-of-bag error mean values.
func (*Stats) Precisions ¶
Precisions returns all precision values.
func (*Stats) StartTimes ¶
StartTimes returns all start times as unix timestamps.
Directories ¶
Path | Synopsis |
---|---|
cart | Package cart implements the Classification and Regression Tree by Breiman, et al. |
crf | Package crf implements the cascaded random forest algorithm, proposed by Baumann et al. in their paper: |
rf | Package rf implements an ensemble of classifiers using the random forest algorithm by Breiman and Cutler. |