cart

package
v0.2.0 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 16, 2016 License: BSD-3-Clause Imports: 9 Imported by: 0

Documentation

Overview

Package cart implements the Classification and Regression Tree (CART) by Breiman, et al. CART is a binary decision tree.

Breiman, Leo, et al. Classification and regression trees. CRC press,
1984.

The implementation is based on the Data Mining book:

Han, Jiawei, Micheline Kamber, and Jian Pei. Data mining: concepts and
techniques. Elsevier, 2011.


Constants

const (
	// ColFlagParent denotes that the column is a parent/split node.
	ColFlagParent = 1
	// ColFlagSkip denotes that the column will be skipped.
	ColFlagSkip = 2
)
const (
	// SplitMethodGini, if set in Runtime, causes the dataset to be split
	// using Gini gain for each possible value or partition.
	//
	// This option is used in Runtime.SplitMethod.
	SplitMethodGini = "gini"
)
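
As a rough illustration of what a Gini-based split criterion measures, the sketch below computes the Gini index of a set of class labels and the gain of one binary partition. The names giniIndex and giniGain are hypothetical helpers written for this example and are not part of package cart; the package's own computation may differ in detail.

```go
package main

import "fmt"

// giniIndex returns 1 - sum(p_i^2), where p_i is the proportion of
// samples belonging to class i.
// Hypothetical helper, not part of package cart.
func giniIndex(classes []string) float64 {
	if len(classes) == 0 {
		return 0
	}
	counts := make(map[string]int)
	for _, c := range classes {
		counts[c]++
	}
	n := float64(len(classes))
	gini := 1.0
	for _, k := range counts {
		p := float64(k) / n
		gini -= p * p
	}
	return gini
}

// giniGain returns the reduction in Gini index obtained by splitting
// parent into left and right, weighting each partition by its size.
// Hypothetical helper, not part of package cart.
func giniGain(parent, left, right []string) float64 {
	n := float64(len(parent))
	if n == 0 {
		return 0
	}
	weighted := float64(len(left))/n*giniIndex(left) +
		float64(len(right))/n*giniIndex(right)
	return giniIndex(parent) - weighted
}

func main() {
	parent := []string{"a", "a", "b", "b"}
	left := []string{"a", "a"}
	right := []string{"b", "b"}
	fmt.Println(giniIndex(parent))             // prints 0.5
	fmt.Println(giniGain(parent, left, right)) // prints 0.5
}
```

A split that separates the classes perfectly, as in main above, recovers the whole parent impurity as gain; a split that leaves both partitions as mixed as the parent yields a gain of zero.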

Variables

var (
	// DEBUG level, set from environment.
	DEBUG = 0
)

Functions

This section is empty.

Types

type NodeValue

type NodeValue struct {
	// Class of leaf node.
	Class string
	// SplitAttrName defines the name of the attribute that caused the split.
	SplitAttrName string
	// IsLeaf defines whether the node is a leaf or not.
	IsLeaf bool
	// IsContinu defines whether the node split is continuous or discrete.
	IsContinu bool
	// Size defines the number of samples that this node holds before splitting.
	Size int
	// SplitAttrIdx defines the index of the attribute that caused the split.
	SplitAttrIdx int
	// SplitV defines the split value.
	SplitV interface{}
}

NodeValue is the value stored in a node of the CART tree.

func (*NodeValue) String

func (nodev *NodeValue) String() (s string)

String returns a printable representation of the node value.

type Runtime

type Runtime struct {
	// SplitMethod defines the criterion used for splitting.
	SplitMethod string `json:"SplitMethod"`
	// NRandomFeature, if less than or equal to zero, computes gain on all
	// features; otherwise n random features are selected and gain is
	// computed only on the selected features.
	NRandomFeature int `json:"NRandomFeature"`
	// OOBErrVal is the last out-of-bag error value in the tree.
	OOBErrVal float64
	// Tree is the classification tree.
	Tree binary.Tree
}

Runtime data for building CART.
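
The JSON tags on SplitMethod and NRandomFeature suggest that these two options can be read from a JSON document. The sketch below shows that decoding with encoding/json. The config type and the parseConfig function are hypothetical stand-ins that mirror only the two tagged fields; they are not part of package cart, and how the package itself loads its settings is not shown in this documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// config mirrors the two JSON-tagged fields of Runtime.
// Hypothetical stand-in, not part of package cart.
type config struct {
	SplitMethod    string `json:"SplitMethod"`
	NRandomFeature int    `json:"NRandomFeature"`
}

// parseConfig decodes raw JSON into a config value.
func parseConfig(raw []byte) (config, error) {
	var c config
	err := json.Unmarshal(raw, &c)
	return c, err
}

func main() {
	raw := []byte(`{"SplitMethod": "gini", "NRandomFeature": 3}`)
	c, err := parseConfig(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(c.SplitMethod, c.NRandomFeature) // prints: gini 3
}
```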

func New

func New(D tabula.ClasetInterface, splitMethod string, nRandomFeature int) (
	*Runtime, error,
)

New creates a new Runtime object.

func (*Runtime) Build

func (runtime *Runtime) Build(D tabula.ClasetInterface) (e error)

Build creates a tree using the CART algorithm.

func (*Runtime) Classify

func (runtime *Runtime) Classify(data *tabula.Row) (class string)

Classify returns the predicted class of one sample.
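
To make the idea of classifying one sample with a binary decision tree concrete, here is a minimal, self-contained sketch of a root-to-leaf walk. The node type is a hypothetical stand-in that copies a few NodeValue fields and adds left/right links; the package itself stores NodeValue in a binary.Tree and takes a *tabula.Row. The branching convention (left when a continuous value is below SplitV, or when a discrete value is a member of SplitV) and the concrete types held in SplitV are assumptions made for this example.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a tree node carrying NodeValue-like
// fields. It is not the type used by package cart.
type node struct {
	Class        string
	IsLeaf       bool
	IsContinu    bool
	SplitAttrIdx int
	SplitV       interface{} // assumed: float64 if IsContinu, []string otherwise
	Left, Right  *node
}

// classify walks from root to a leaf and returns the leaf's class.
// Assumed convention: go left when a continuous value is below SplitV or
// a discrete value is a member of SplitV; otherwise go right.
func classify(root *node, row []interface{}) string {
	n := root
	for n != nil && !n.IsLeaf {
		goLeft := false
		if n.IsContinu {
			v, okV := row[n.SplitAttrIdx].(float64)
			split, okS := n.SplitV.(float64)
			goLeft = okV && okS && v < split
		} else {
			v, _ := row[n.SplitAttrIdx].(string)
			set, _ := n.SplitV.([]string)
			for _, s := range set {
				if s == v {
					goLeft = true
					break
				}
			}
		}
		if goLeft {
			n = n.Left
		} else {
			n = n.Right
		}
	}
	if n == nil {
		return ""
	}
	return n.Class
}

// sampleTree builds a two-level tree: a continuous split on attribute 0,
// then a discrete split on attribute 1.
func sampleTree() *node {
	return &node{
		IsContinu: true, SplitAttrIdx: 0, SplitV: 2.5,
		Left: &node{IsLeaf: true, Class: "small"},
		Right: &node{
			SplitAttrIdx: 1, SplitV: []string{"red"},
			Left:  &node{IsLeaf: true, Class: "big-red"},
			Right: &node{IsLeaf: true, Class: "big-other"},
		},
	}
}

func main() {
	tree := sampleTree()
	fmt.Println(classify(tree, []interface{}{1.0, "red"}))  // prints small
	fmt.Println(classify(tree, []interface{}{3.0, "red"}))  // prints big-red
	fmt.Println(classify(tree, []interface{}{3.0, "blue"})) // prints big-other
}
```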

func (*Runtime) ClassifySet

func (runtime *Runtime) ClassifySet(data tabula.ClasetInterface) (e error)

ClassifySet sets the class attribute based on the tree's classification.

func (*Runtime) CountOOBError

func (runtime *Runtime) CountOOBError(oob tabula.Claset) (
	errval float64,
	e error,
)

CountOOBError runs the out-of-bag data through the tree and returns the error value.
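
An out-of-bag error is commonly reported as the fraction of out-of-bag samples that the tree misclassifies. The sketch below computes that fraction from two label slices. It assumes this misclassification-rate definition, which the documentation above does not spell out, and errorRate is a hypothetical helper that is not part of package cart.

```go
package main

import "fmt"

// errorRate returns the fraction of samples whose predicted class differs
// from the actual class.
// Hypothetical helper, not part of package cart.
func errorRate(actual, predicted []string) (float64, error) {
	if len(actual) != len(predicted) {
		return 0, fmt.Errorf("length mismatch: %d actual, %d predicted",
			len(actual), len(predicted))
	}
	if len(actual) == 0 {
		return 0, nil
	}
	miss := 0
	for i := range actual {
		if actual[i] != predicted[i] {
			miss++
		}
	}
	return float64(miss) / float64(len(actual)), nil
}

func main() {
	actual := []string{"a", "b", "a", "b"}
	predicted := []string{"a", "b", "b", "b"}
	rate, err := errorRate(actual, predicted)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(rate) // prints 0.25
}
```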

func (*Runtime) SelectRandomFeature

func (runtime *Runtime) SelectRandomFeature(D tabula.ClasetInterface)

SelectRandomFeature, if NRandomFeature is greater than zero, selects n random features and computes gain on them instead of on all features.

func (*Runtime) String

func (runtime *Runtime) String() (s string)

String returns the Runtime in a JSON-like format.
