Documentation ¶
Overview ¶
Package cart implements the Classification and Regression Tree (CART) algorithm by Breiman et al. CART builds a binary decision tree.
Breiman, Leo, et al. Classification and regression trees. CRC Press, 1984.
The implementation is based on the Data Mining book:
Han, Jiawei, Micheline Kamber, and Jian Pei. Data mining: concepts and techniques. Elsevier, 2011.
Index ¶
- Constants
- type NodeValue
- type Runtime
- func New(claset tabula.ClasetInterface, splitMethod string, nRandomFeature int) (*Runtime, error)
- func (runtime *Runtime) Build(claset tabula.ClasetInterface) (e error)
- func (runtime *Runtime) Classify(data *tabula.Row) (class string)
- func (runtime *Runtime) ClassifySet(data tabula.ClasetInterface) (e error)
- func (runtime *Runtime) CountOOBError(oob tabula.Claset) (errval float64, e error)
- func (runtime *Runtime) SelectRandomFeature(claset tabula.ClasetInterface)
- func (runtime *Runtime) String() (s string)
Constants ¶
const (
	// ColFlagParent denotes that the column is a parent/split node.
	ColFlagParent = 1
	// ColFlagSkip denotes that the column will be skipped.
	ColFlagSkip = 2
)
const (
	// SplitMethodGini, if set in Runtime, causes the dataset to be split
	// using the Gini gain for each possible value or partition.
	//
	// This option is used in Runtime.SplitMethod.
	SplitMethodGini = "gini"
)
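The Gini gain mentioned above can be illustrated with a minimal, self-contained sketch. It is not the package's own code: the function names giniIndex and giniGain are hypothetical, and class labels are assumed to be plain strings. The Gini index of a set is 1 minus the sum of squared class proportions, and the gain of a binary split is the parent's index minus the size-weighted index of the two partitions.

```go
package main

import "fmt"

// giniIndex returns 1 - sum(p_i^2) over the class proportions in labels.
// An empty set is treated as pure and yields 0.
// (Hypothetical helper, not part of package cart.)
func giniIndex(labels []string) float64 {
	if len(labels) == 0 {
		return 0
	}
	counts := make(map[string]int)
	for _, l := range labels {
		counts[l]++
	}
	n := float64(len(labels))
	gini := 1.0
	for _, c := range counts {
		p := float64(c) / n
		gini -= p * p
	}
	return gini
}

// giniGain returns the impurity reduction obtained by splitting the union
// of left and right into those two partitions.
// (Hypothetical helper, not part of package cart.)
func giniGain(left, right []string) float64 {
	parent := append(append([]string{}, left...), right...)
	n := float64(len(parent))
	if n == 0 {
		return 0
	}
	weighted := float64(len(left))/n*giniIndex(left) +
		float64(len(right))/n*giniIndex(right)
	return giniIndex(parent) - weighted
}

func main() {
	left := []string{"yes", "yes", "yes"}
	right := []string{"no", "no", "yes"}
	fmt.Printf("gini(left)=%.3f gini(right)=%.3f gain=%.3f\n",
		giniIndex(left), giniIndex(right), giniGain(left, right))
}
```

A split with a higher gain separates the classes better, so a tree builder would typically prefer the candidate split with the largest gain.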
Types ¶
type NodeValue ¶
type NodeValue struct {
	// SplitV defines the split value.
	SplitV interface{}
	// Class of the leaf node.
	Class string
	// SplitAttrName defines the name of the attribute that causes the split.
	SplitAttrName string
	// Size defines the number of samples that this node holds before splitting.
	Size int
	// SplitAttrIdx defines the index of the attribute that causes the split.
	SplitAttrIdx int
	// IsLeaf defines whether the node is a leaf or not.
	IsLeaf bool
	// IsContinu defines whether the node split is continuous or discrete.
	IsContinu bool
}
NodeValue is the value stored in each node of a CART tree.
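To show how fields such as SplitV, SplitAttrIdx, IsLeaf, and IsContinu could drive classification, here is a rough sketch of a root-to-leaf walk. Everything in it is an assumption made for illustration: the node and nodeValue types are local stand-ins (the package itself stores nodes in binary.Tree), a continuous split is assumed to send values below SplitV to the left child, and a discrete split is assumed to store in SplitV the set of values that go left.

```go
package main

import "fmt"

// nodeValue mirrors the NodeValue fields needed for classification.
type nodeValue struct {
	SplitV       interface{}
	Class        string
	SplitAttrIdx int
	IsLeaf       bool
	IsContinu    bool
}

// node is a hypothetical binary tree node used only in this sketch.
type node struct {
	Value       nodeValue
	Left, Right *node
}

// classify walks from root to a leaf and returns the leaf's class.
// Assumptions: continuous splits go left when the row value is below
// SplitV (a float64); discrete splits go left when the row value is a
// member of SplitV (a []string). Anything else goes right.
func classify(root *node, row []interface{}) string {
	n := root
	for n != nil {
		v := n.Value
		if v.IsLeaf {
			return v.Class
		}
		goLeft := false
		if v.IsContinu {
			x, okX := row[v.SplitAttrIdx].(float64)
			t, okT := v.SplitV.(float64)
			goLeft = okX && okT && x < t
		} else {
			s, okS := row[v.SplitAttrIdx].(string)
			set, okSet := v.SplitV.([]string)
			if okS && okSet {
				for _, member := range set {
					if member == s {
						goLeft = true
						break
					}
				}
			}
		}
		if goLeft {
			n = n.Left
		} else {
			n = n.Right
		}
	}
	return "" // reached a missing child: no class can be assigned
}

// exampleTree builds a two-level tree: a continuous split on attribute 0,
// then a discrete split on attribute 1.
func exampleTree() *node {
	leaf := func(class string) *node {
		return &node{Value: nodeValue{IsLeaf: true, Class: class}}
	}
	discrete := &node{
		Value: nodeValue{SplitAttrIdx: 1, SplitV: []string{"red"}},
		Left:  leaf("warm"),
		Right: leaf("cool"),
	}
	return &node{
		Value: nodeValue{SplitAttrIdx: 0, SplitV: 5.0, IsContinu: true},
		Left:  leaf("small"),
		Right: discrete,
	}
}

func main() {
	root := exampleTree()
	fmt.Println(classify(root, []interface{}{3.0, "blue"}))
	fmt.Println(classify(root, []interface{}{7.0, "red"}))
	fmt.Println(classify(root, []interface{}{7.0, "blue"}))
}
```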
type Runtime ¶
type Runtime struct {
	// Tree in classification.
	Tree binary.Tree
	// SplitMethod defines the criterion used for splitting.
	SplitMethod string `json:"SplitMethod"`
	// NRandomFeature, if less than or equal to zero, computes the gain on
	// all features; otherwise selects n random features and computes the
	// gain only on the selected features.
	NRandomFeature int `json:"NRandomFeature"`
	// OOBErrVal is the last out-of-bag error value in the tree.
	OOBErrVal float64
}
Runtime holds the data for building a CART tree.
func New ¶
func New(claset tabula.ClasetInterface, splitMethod string, nRandomFeature int) (*Runtime, error)
New creates a new Runtime object.
func (*Runtime) Build ¶
func (runtime *Runtime) Build(claset tabula.ClasetInterface) (e error)
Build creates a tree using the CART algorithm.
func (*Runtime) ClassifySet ¶
func (runtime *Runtime) ClassifySet(data tabula.ClasetInterface) (e error)
ClassifySet sets the class attribute based on the tree classification.
func (*Runtime) CountOOBError ¶
func (runtime *Runtime) CountOOBError(oob tabula.Claset) (errval float64, e error)
CountOOBError processes the out-of-bag data on the tree and returns the error value.
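An out-of-bag error is usually reported as the fraction of out-of-bag samples that the tree misclassifies. The following sketch computes that fraction from two parallel slices of class labels. It assumes this common definition and does not reproduce the package's code; oobError is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
)

// oobError returns the fraction of samples whose predicted class differs
// from the actual class. Both slices must have the same length.
// (Hypothetical helper, assuming the usual misclassification-rate definition.)
func oobError(actual, predicted []string) (float64, error) {
	if len(actual) != len(predicted) {
		return 0, errors.New("oobError: length mismatch")
	}
	if len(actual) == 0 {
		return 0, nil
	}
	miss := 0
	for i := range actual {
		if actual[i] != predicted[i] {
			miss++
		}
	}
	return float64(miss) / float64(len(actual)), nil
}

func main() {
	actual := []string{"a", "b", "a", "b"}
	predicted := []string{"a", "a", "a", "b"}
	errval, err := oobError(actual, predicted)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("OOB error: %.2f\n", errval)
}
```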
func (*Runtime) SelectRandomFeature ¶
func (runtime *Runtime) SelectRandomFeature(claset tabula.ClasetInterface)
SelectRandomFeature selects n random features when NRandomFeature is greater than zero, so that the gain is computed only on those features instead of on all features.
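The selection rule can be sketched independently of the dataset type. The example below picks n distinct feature indices at random, or all of them when n is less than or equal to zero. It is only a sketch: selectRandomFeatures, its seed parameter, and the fallback to all features when n is at least the feature count are assumptions, and how the package records the selection on the dataset is not shown here.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// selectRandomFeatures returns the indices of the features to evaluate.
// If nRandom <= 0 (or not smaller than nFeature, an assumption of this
// sketch), all indices are returned; otherwise nRandom distinct indices
// are drawn at random. The result is sorted in ascending order.
func selectRandomFeatures(seed int64, nFeature, nRandom int) []int {
	if nFeature <= 0 {
		return nil
	}
	if nRandom <= 0 || nRandom >= nFeature {
		all := make([]int, nFeature)
		for i := range all {
			all[i] = i
		}
		return all
	}
	r := rand.New(rand.NewSource(seed))
	// Perm returns a random permutation of [0, nFeature); its prefix is
	// a uniform sample without replacement.
	picked := r.Perm(nFeature)[:nRandom]
	sort.Ints(picked)
	return picked
}

func main() {
	fmt.Println(selectRandomFeatures(1, 10, 3))
	fmt.Println(selectRandomFeatures(1, 4, 0))
}
```

Restricting each tree to a random subset of features is the usual way a random forest decorrelates its trees, which is presumably why the option exists alongside the out-of-bag error.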