classifier

package

v0.2.0 Published: Jun 16, 2016 License: BSD-3-Clause Imports: 9 Imported by: 0

Repository

github.com/shuLhan/go-mining

Documentation ¶

Index ¶

Variables
func ComputeAccuracies(tp, fp, tn, fn []int64) (accuracies []float64)
func ComputeElapsedTimes(start, end []int64) (elaps []int64)
func ComputeFMeasures(precisions, recalls []float64) (fmeasures []float64)
type CM
type Runtime
type Stat
type Stats

Constants ¶

This section is empty.

Variables ¶

var (
	// DEBUG level, can be set from environment through
	// CONFUSIONMATRIX_DEBUG variable.
	DEBUG = 0
)

var (
	// RuntimeDebug level, can be set from the environment variable
	// "RuntimeDebug".
	RuntimeDebug = 0
)

Functions ¶

func ComputeAccuracies ¶

func ComputeAccuracies(tp, fp, tn, fn []int64) (accuracies []float64)

ComputeAccuracies computes and returns the accuracy for each element of the true-positive, false-positive, true-negative, and false-negative arrays, using the formula,

(tp + tn) / (tp + fp + tn + fn)
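As an illustration, the computation can be sketched with a local stand-in function. The name `computeAccuracies` is hypothetical and only mirrors the documented signature; it is not the package's implementation. The sketch uses the standard accuracy definition (correct predictions divided by the sum of all four counts) and assumes the four slices have equal length and that an empty confusion matrix yields 0.

```go
package main

import "fmt"

// computeAccuracies is a hypothetical stand-in that mirrors the documented
// signature of ComputeAccuracies; it is not the package's own code.
// Assumption: all four slices have the same length.
func computeAccuracies(tp, fp, tn, fn []int64) []float64 {
	accuracies := make([]float64, 0, len(tp))
	for i := range tp {
		total := tp[i] + fp[i] + tn[i] + fn[i]
		if total == 0 {
			// Assumption: report 0 when there are no samples at all.
			accuracies = append(accuracies, 0)
			continue
		}
		correct := tp[i] + tn[i]
		accuracies = append(accuracies, float64(correct)/float64(total))
	}
	return accuracies
}

func main() {
	acc := computeAccuracies(
		[]int64{2, 5}, // true positives
		[]int64{2, 0}, // false positives
		[]int64{1, 5}, // true negatives
		[]int64{1, 0}, // false negatives
	)
	fmt.Println(acc) // prints [0.5 1]
}
```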

func ComputeElapsedTimes ¶

func ComputeElapsedTimes(start, end []int64) (elaps []int64)

ComputeElapsedTimes will compute and return elapsed time between `start` and `end` timestamps.
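A minimal sketch of the idea, assuming the elapsed time is simply the element-wise difference `end[i] - start[i]` and that both slices have the same length. The function name `computeElapsedTimes` is a hypothetical stand-in, not the package's implementation.

```go
package main

import "fmt"

// computeElapsedTimes is a hypothetical stand-in that subtracts each start
// timestamp from the matching end timestamp.
// Assumption: start and end have the same length.
func computeElapsedTimes(start, end []int64) []int64 {
	elaps := make([]int64, len(start))
	for i := range start {
		elaps[i] = end[i] - start[i]
	}
	return elaps
}

func main() {
	start := []int64{1466035200, 1466035260}
	end := []int64{1466035230, 1466035350}
	fmt.Println(computeElapsedTimes(start, end)) // prints [30 90]
}
```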

func ComputeFMeasures ¶

func ComputeFMeasures(precisions, recalls []float64) (fmeasures []float64)

ComputeFMeasures computes the F-measure of each instance from the given arrays of precisions and recalls, and returns the results.
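The Stat type describes the F-measure as the harmonic mean of precision and recall, that is 2 * p * r / (p + r). The sketch below applies that formula element-wise. The name `computeFMeasures` is a hypothetical stand-in, and returning 0 when both inputs are 0 is an assumption made here to avoid a division by zero.

```go
package main

import "fmt"

// computeFMeasures is a hypothetical stand-in that returns the harmonic mean
// of each precision/recall pair.
// Assumption: both slices have the same length.
func computeFMeasures(precisions, recalls []float64) []float64 {
	fmeasures := make([]float64, len(precisions))
	for i := range precisions {
		sum := precisions[i] + recalls[i]
		if sum == 0 {
			// Assumption: leave the F-measure at 0 when p + r is 0.
			continue
		}
		fmeasures[i] = 2 * precisions[i] * recalls[i] / sum
	}
	return fmeasures
}

func main() {
	f := computeFMeasures([]float64{0.5, 1}, []float64{0.5, 0})
	fmt.Println(f) // prints [0.5 0]
}
```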

Types ¶

type CM ¶

type CM struct {
	tabula.Dataset
	// contains filtered or unexported fields
}

CM represents the confusion matrix of a classification.

func (*CM) ComputeNumeric ¶

func (cm *CM) ComputeNumeric(vs, actuals, predictions []int64)

ComputeNumeric calculates the confusion matrix from the value space `vs`, the actual values, and the predicted values.

func (*CM) ComputeStrings ¶

func (cm *CM) ComputeStrings(valueSpace, targets, predictions []string)

ComputeStrings calculates the confusion matrix from the value space, the target class values, and the predicted class values.

func (*CM) FN ¶

func (cm *CM) FN() int

FN returns the number of false negatives.

func (*CM) FNIndices ¶

func (cm *CM) FNIndices() []int

FNIndices returns the indices of all false-negative samples.

func (*CM) FP ¶

func (cm *CM) FP() int

FP returns the number of false positives in the confusion matrix.

func (*CM) FPIndices ¶

func (cm *CM) FPIndices() []int

FPIndices returns the indices of all false-positive samples.

func (*CM) GetColumnClassError ¶

func (cm *CM) GetColumnClassError() *tabula.Column

GetColumnClassError returns the last column, which is the column that contains the classification error.

func (*CM) GetFalseRate ¶

func (cm *CM) GetFalseRate() float64

GetFalseRate returns the false-positive rate, defined as

false-positive / (false-positive + true-negative)

func (*CM) GetTrueRate ¶

func (cm *CM) GetTrueRate() float64

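GetTrueRate returns the true-positive rate, defined here as

true-positive / (true-positive + false-positive)

Note that this ratio is the one Stat documents as Precision, whereas Stat.TPRate is documented as tp/(tp+fn).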

func (*CM) GroupIndexPredictions ¶

func (cm *CM) GroupIndexPredictions(sampleIds []int,
	actuals, predictions []int64,
)

GroupIndexPredictions takes the indices of samples and groups them by the outcome of their prediction. For example, given

sampleIds:   [0, 1, 2, 3, 4, 5]
actuals:     [1, 1, 0, 0, 1, 0]
predictions: [1, 0, 1, 0, 1, 1]

This function groups the indices by true-positive, false-positive, true-negative, and false-negative, which results in,

	true-positive indices:  [0, 4]
	false-positive indices: [2, 5]
	true-negative indices:  [3]
	false-negative indices: [1]

This function assumes that the positive value is "1" and the negative value is "0".
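The worked example above can be reproduced with a small standalone sketch. The `groups` type and the `groupIndexPredictions` function are hypothetical stand-ins for illustration and do not touch the CM type. Values other than 0 and 1 are ignored here, which is an assumption.

```go
package main

import "fmt"

// groups is a hypothetical container for the four index groups.
type groups struct {
	tp, fp, tn, fn []int
}

// groupIndexPredictions is a hypothetical stand-in that groups sample ids by
// comparing actual and predicted classes, with 1 as positive and 0 as
// negative. Assumption: all three slices have the same length.
func groupIndexPredictions(sampleIds []int, actuals, predictions []int64) groups {
	var g groups
	for i, id := range sampleIds {
		switch {
		case actuals[i] == 1 && predictions[i] == 1:
			g.tp = append(g.tp, id)
		case actuals[i] == 0 && predictions[i] == 1:
			g.fp = append(g.fp, id)
		case actuals[i] == 0 && predictions[i] == 0:
			g.tn = append(g.tn, id)
		case actuals[i] == 1 && predictions[i] == 0:
			g.fn = append(g.fn, id)
		}
	}
	return g
}

func main() {
	g := groupIndexPredictions(
		[]int{0, 1, 2, 3, 4, 5},
		[]int64{1, 1, 0, 0, 1, 0},
		[]int64{1, 0, 1, 0, 1, 1},
	)
	fmt.Println(g.tp, g.fp, g.tn, g.fn) // prints [0 4] [2 5] [3] [1]
}
```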

func (*CM) GroupIndexPredictionsStrings ¶

func (cm *CM) GroupIndexPredictionsStrings(sampleIds []int,
	actuals, predictions []string,
)

GroupIndexPredictionsStrings is an alternative to GroupIndexPredictions that works with string classes.

func (*CM) String ¶

func (cm *CM) String() (s string)

String returns the confusion matrix in a table-like format.

func (*CM) TN ¶

func (cm *CM) TN() int

TN returns the number of true negatives.

func (*CM) TNIndices ¶

func (cm *CM) TNIndices() []int

TNIndices returns the indices of all true-negative samples.

func (*CM) TP ¶

func (cm *CM) TP() int

TP returns the number of true positives in the confusion matrix.

func (*CM) TPIndices ¶

func (cm *CM) TPIndices() []int

TPIndices returns the indices of all true-positive samples.

type Runtime ¶

type Runtime struct {
	// RunOOB, if true, enables computing the OOB; the default is false.
	RunOOB bool `json:"RunOOB"`

	// OOBStatsFile is the file where OOB statistic will be written.
	OOBStatsFile string `json:"OOBStatsFile"`

	// PerfFile is the file where statistic of performance will be written.
	PerfFile string `json:"PerfFile"`

	// StatFile is the file where statistic of classifying samples will be
	// written.
	StatFile string `json:"StatFile"`
	// contains filtered or unexported fields
}

Runtime defines a generic type that provides common fields that can be embedded by an actual classifier (e.g. RandomForest).
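The embedding pattern can be sketched as follows. To keep the example self-contained, `Runtime` here is a local stand-in that copies only the exported fields and JSON tags shown above. `Forest`, its `NTree` field, and `loadForest` are hypothetical names invented for illustration. Because encoding/json promotes the fields of an embedded struct, the runtime options can sit at the top level of the classifier's configuration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Runtime is a local stand-in that copies only the exported fields and JSON
// tags documented above; it is not the package's own type.
type Runtime struct {
	RunOOB       bool   `json:"RunOOB"`
	OOBStatsFile string `json:"OOBStatsFile"`
	PerfFile     string `json:"PerfFile"`
	StatFile     string `json:"StatFile"`
}

// Forest is a hypothetical classifier that embeds Runtime to inherit the
// common fields. NTree is an invented field for illustration.
type Forest struct {
	Runtime
	NTree int `json:"NTree"`
}

// loadForest decodes a JSON configuration into a Forest. Fields of the
// embedded Runtime are read from the same JSON object level.
func loadForest(config []byte) (*Forest, error) {
	f := &Forest{}
	if err := json.Unmarshal(config, f); err != nil {
		return nil, err
	}
	return f, nil
}

func main() {
	config := []byte(`{"RunOOB": true, "StatFile": "forest.stat", "NTree": 100}`)
	f, err := loadForest(config)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(f.RunOOB, f.StatFile, f.NTree) // prints true forest.stat 100
}
```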

func (*Runtime) AddOOBCM ¶

func (rt *Runtime) AddOOBCM(cm *CM)

AddOOBCM appends a new confusion matrix.

func (*Runtime) AddStat ¶

func (rt *Runtime) AddStat(stat *Stat)

AddStat appends new classifier statistic data.

func (*Runtime) CloseOOBStatsFile ¶

func (rt *Runtime) CloseOOBStatsFile() (e error)

CloseOOBStatsFile closes the statistics file that was opened for writing.

func (*Runtime) ComputeCM ¶

func (rt *Runtime) ComputeCM(sampleIds []int,
	vs, actuals, predicts []string,
) (
	cm *CM,
)

ComputeCM computes the confusion matrix of the samples using the value space, the actual values, and the predicted values.

func (*Runtime) ComputeStatFromCM ¶

func (rt *Runtime) ComputeStatFromCM(stat *Stat, cm *CM)

ComputeStatFromCM computes the statistics from a confusion matrix.

func (*Runtime) ComputeStatTotal ¶

func (rt *Runtime) ComputeStatTotal(stat *Stat)

ComputeStatTotal computes the total statistic.

func (*Runtime) Finalize ¶

func (rt *Runtime) Finalize() (e error)

Finalize finishes the runtime: it computes the total statistic, writes it to file, and closes the file.

func (*Runtime) Initialize ¶

func (rt *Runtime) Initialize() error

Initialize starts the runtime for processing by saving the start time and opening the stats file.

func (*Runtime) OOBStats ¶

func (rt *Runtime) OOBStats() *Stats

OOBStats returns all statistic objects.

func (*Runtime) OpenOOBStatsFile ¶

func (rt *Runtime) OpenOOBStatsFile() error

OpenOOBStatsFile opens the statistics file for output.

func (*Runtime) Performance ¶

func (rt *Runtime) Performance(samples tabula.ClasetInterface,
	predicts []string, probs []float64,
) (
	perfs Stats,
)

Performance computes the performance statistics of the classifier from the actual class labels of the samples, the predictions, and their probabilities.

Algorithm:

(1) Sort the probabilities in descending order.
(2) Sort the actuals and predictions using the sorted index from probs.
(3) Compute the TP rate, FP rate, and precision.
(4) Write the performance to file.
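Steps (1) to (3) of the algorithm can be sketched as a ranking pass over the samples. Everything in this sketch is an assumption made for illustration: the `point` type, the `rankPerformance` function, the use of a boolean slice for the actual labels, and the choice to emit one point per sample as the threshold moves down the ranking. Step (4), writing to file, is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// point is a hypothetical record of the rates at one ranking position.
type point struct {
	TPRate, FPRate, Precision float64
}

// rankPerformance is a hypothetical sketch. actuals[i] is true when sample i
// is positive; probs[i] is the predicted probability of the positive class.
func rankPerformance(actuals []bool, probs []float64) []point {
	// (1) Build an index sorted by probability, in descending order.
	idx := make([]int, len(probs))
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(a, b int) bool {
		return probs[idx[a]] > probs[idx[b]]
	})

	// Count the positive (p) and negative (n) samples.
	var p, n int
	for _, positive := range actuals {
		if positive {
			p++
		} else {
			n++
		}
	}

	// (2) Walk the actuals in sorted order and (3) compute the rates.
	points := make([]point, 0, len(idx))
	var tp, fp int
	for _, i := range idx {
		if actuals[i] {
			tp++
		} else {
			fp++
		}
		pt := point{Precision: float64(tp) / float64(tp+fp)}
		if p > 0 {
			pt.TPRate = float64(tp) / float64(p)
		}
		if n > 0 {
			pt.FPRate = float64(fp) / float64(n)
		}
		points = append(points, pt)
	}
	return points
}

func main() {
	actuals := []bool{true, true, false, false}
	probs := []float64{0.4, 0.9, 0.1, 0.8}
	for _, pt := range rankPerformance(actuals, probs) {
		fmt.Printf("tpr=%.2f fpr=%.2f precision=%.2f\n", pt.TPRate, pt.FPRate, pt.Precision)
	}
}
```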

func (*Runtime) PrintOobStat ¶

func (rt *Runtime) PrintOobStat(stat *Stat, cm *CM)

PrintOobStat will print the out-of-bag statistic to standard output.

func (*Runtime) PrintStat ¶

func (rt *Runtime) PrintStat(stat *Stat)

PrintStat prints the statistic values to standard output.

func (*Runtime) PrintStatTotal ¶

func (rt *Runtime) PrintStatTotal(st *Stat)

PrintStatTotal prints the total statistic to standard output.

func (*Runtime) StatTotal ¶

func (rt *Runtime) StatTotal() *Stat

StatTotal returns the total statistic.

func (*Runtime) WriteOOBStat ¶

func (rt *Runtime) WriteOOBStat(stat *Stat) error

WriteOOBStat writes the statistic of the process to file.

func (*Runtime) WritePerformance ¶

func (rt *Runtime) WritePerformance() error

WritePerformance will write performance data to file.

type Stat ¶

type Stat struct {
	// ID is the unique id of this statistic (e.g. number of tree).
	ID int64
	// StartTime contains the start time of the classifier as a unix
	// timestamp.
	StartTime int64
	// EndTime contains the end time of the classifier as a unix timestamp.
	EndTime int64
	// ElapsedTime contains the actual time, in seconds, between the end
	// and start times.
	ElapsedTime int64
	// TP contains the true-positive value.
	TP int64
	// FP contains the false-positive value.
	FP int64
	// TN contains the true-negative value.
	TN int64
	// FN contains the false-negative value.
	FN int64
	// OobError contains the out-of-bag error.
	OobError float64
	// OobErrorMean contains the mean of the out-of-bag error.
	OobErrorMean float64
	// TPRate contains the true-positive rate (recall): tp/(tp+fn)
	TPRate float64
	// FPRate contains the false-positive rate: fp/(fp+tn)
	FPRate float64
	// TNRate contains the true-negative rate: tn/(tn+fp)
	TNRate float64
	// Precision contains: tp/(tp+fp)
	Precision float64
	// FMeasure contains the value of the F-measure, the harmonic mean of
	// precision and recall.
	FMeasure float64
	// Accuracy contains the fraction of correctly classified samples:
	// (tp+tn)/(tp+fp+tn+fn)
	Accuracy float64
	// AUC contains the area under the curve.
	AUC float64
}

Stat holds the statistic values of a classifier, including TP rate, FP rate, precision, and recall.
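The rate formulas given in the field comments can be checked with a short sketch. The `rates` type and the `ratio` and `ratesFromCounts` helpers are hypothetical names used only here, and returning 0 for a zero denominator is an assumption.

```go
package main

import "fmt"

// rates is a hypothetical subset of the Stat fields, used for illustration.
type rates struct {
	TPRate, FPRate, TNRate, Precision float64
}

// ratio divides num by den.
// Assumption: a zero denominator yields 0 instead of NaN.
func ratio(num, den int64) float64 {
	if den == 0 {
		return 0
	}
	return float64(num) / float64(den)
}

// ratesFromCounts applies the formulas from the Stat field comments.
func ratesFromCounts(tp, fp, tn, fn int64) rates {
	return rates{
		TPRate:    ratio(tp, tp+fn), // recall
		FPRate:    ratio(fp, fp+tn),
		TNRate:    ratio(tn, tn+fp),
		Precision: ratio(tp, tp+fp),
	}
}

func main() {
	r := ratesFromCounts(3, 1, 3, 1)
	fmt.Println(r.TPRate, r.FPRate, r.TNRate, r.Precision) // prints 0.75 0.25 0.75 0.75
}
```

With these definitions the false-positive rate and true-negative rate always sum to 1 whenever there is at least one negative sample.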

func (*Stat) End ¶

func (stat *Stat) End()

End will stop the timer and compute the elapsed time.

func (*Stat) Recall ¶

func (stat *Stat) Recall() float64

Recall returns the recall value (the true-positive rate).

func (*Stat) SetAUC ¶

func (stat *Stat) SetAUC(v float64)

SetAUC will set the AUC value.

func (*Stat) SetFPRate ¶

func (stat *Stat) SetFPRate(fp, n int64)

SetFPRate sets FP and FPRate using the number of negative samples `n`.

func (*Stat) SetPrecisionFromRate ¶

func (stat *Stat) SetPrecisionFromRate(p, n int64)

SetPrecisionFromRate sets the Precision value using the TP rate and the FP rate. `p` and `n` are the numbers of positive and negative class samples.
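One way to recover precision from the two rates, shown here as an assumption rather than as the package's actual code: since TPRate = tp/p and FPRate = fp/n, the counts are tp = TPRate * p and fp = FPRate * n, and precision is tp / (tp + fp). The function name `precisionFromRates` is hypothetical.

```go
package main

import "fmt"

// precisionFromRates is a hypothetical helper that rebuilds the tp and fp
// counts from the rates and class sizes, then returns tp / (tp + fp).
// Assumption: 0 is returned when nothing was predicted positive.
func precisionFromRates(tpRate, fpRate float64, p, n int64) float64 {
	tp := tpRate * float64(p)
	fp := fpRate * float64(n)
	if tp+fp == 0 {
		return 0
	}
	return tp / (tp + fp)
}

func main() {
	// 4 positives with TP rate 0.5 give tp = 2;
	// 8 negatives with FP rate 0.25 give fp = 2.
	fmt.Println(precisionFromRates(0.5, 0.25, 4, 8)) // prints 0.5
}
```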

func (*Stat) SetTPRate ¶

func (stat *Stat) SetTPRate(tp, p int64)

SetTPRate sets TP and TPRate using the number of positive samples `p`.

func (*Stat) Start ¶

func (stat *Stat) Start()

Start will start the timer.

func (*Stat) Sum ¶

func (stat *Stat) Sum(other *Stat)

Sum adds the statistics from the other stat object to the current stat, not including the start and end times.

func (*Stat) ToRow ¶

func (stat *Stat) ToRow() (row *tabula.Row)

ToRow converts the stat to a tabula.Row, in the order of the Stat fields.

func (*Stat) Write ¶

func (stat *Stat) Write(file string) (e error)

Write will write the content of stat to `file`.

type Stats ¶

type Stats []*Stat

Stats defines a list of statistic values.

func (*Stats) Accuracies ¶

func (stats *Stats) Accuracies() (accuracies []float64)

Accuracies returns all accuracy values.

func (*Stats) Add ¶

func (stats *Stats) Add(stat *Stat)

Add appends another stat object to the slice.

func (*Stats) EndTimes ¶

func (stats *Stats) EndTimes() (times []int64)

EndTimes returns all end times as unix timestamps.

func (*Stats) FMeasures ¶

func (stats *Stats) FMeasures() (fmeasures []float64)

FMeasures returns all F-measure values.

func (*Stats) FPRates ¶

func (stats *Stats) FPRates() (fprates []float64)

FPRates returns all false-positive rate values.

func (*Stats) OobErrorMeans ¶

func (stats *Stats) OobErrorMeans() (oobmeans []float64)

OobErrorMeans returns all out-of-bag error mean values.

func (*Stats) Precisions ¶

func (stats *Stats) Precisions() (precs []float64)

Precisions returns all precision values.

func (*Stats) Recalls ¶

func (stats *Stats) Recalls() (recalls []float64)

Recalls returns all recall values.

func (*Stats) StartTimes ¶

func (stats *Stats) StartTimes() (times []int64)

StartTimes returns all start times as unix timestamps.

func (*Stats) TNRates ¶

func (stats *Stats) TNRates() (tnrates []float64)

TNRates returns all true-negative rate values.

func (*Stats) TPRates ¶

func (stats *Stats) TPRates() (tprates []float64)

TPRates returns all true-positive rate values.

func (*Stats) Write ¶

func (stats *Stats) Write(file string) (e error)

Write will write all statistic data to `file`.

Directories ¶

Path	Synopsis
cart	Package cart implements the Classification and Regression Tree by Breiman, et al.
crf	Package crf implements the cascaded random forest algorithm, proposed by Baumann et al. in their paper: Baumann, Florian, et al.
rf	Package rf implements an ensemble of classifiers using the random forest algorithm by Breiman and Cutler.