cart

package
v0.58.1
Warning

This package is not in the latest version of its module.

Published: Dec 7, 2024 | License: BSD-3-Clause | Imports: 7 | Imported by: 0

Documentation

Overview

Package cart implements the Classification and Regression Tree (CART) algorithm by Breiman et al. CART is a binary decision tree.

Breiman, Leo, et al. Classification and regression trees. CRC press, 1984.

The implementation is based on the Data Mining book:

Han, Jiawei, Micheline Kamber, and Jian Pei. Data mining: concepts and techniques. Elsevier, 2011.


Constants

const (
	// ColFlagParent denotes that the column is a parent/split node.
	ColFlagParent = 1
	// ColFlagSkip denotes that the column will be skipped.
	ColFlagSkip = 2
)
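The values 1 and 2 suggest that these flags are meant to be combined as a bit mask. The following is a minimal sketch under that assumption; the local constants and the `hasFlag` helper are hypothetical and only mirror the values above, they are not part of this package.

```go
package main

import "fmt"

// Local copies of the flag values shown above, for illustration only.
const (
	colFlagParent = 1
	colFlagSkip   = 2
)

// hasFlag reports whether every bit of flag is set in flags.
// Hypothetical helper, not part of the cart package.
func hasFlag(flags, flag int) bool {
	return flags&flag == flag
}

func main() {
	flags := 0

	// Mark a column as a parent (split) node.
	flags |= colFlagParent
	fmt.Println(hasFlag(flags, colFlagParent), hasFlag(flags, colFlagSkip))

	// Additionally mark it as skipped; both bits are now set.
	flags |= colFlagSkip
	fmt.Println(hasFlag(flags, colFlagParent), hasFlag(flags, colFlagSkip))
}
```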
const (
	// SplitMethodGini, if set in Runtime, causes the dataset to be split
	// using the Gini gain for each possible value or partition.
	//
	// This option is used in Runtime.SplitMethod.
	SplitMethodGini = "gini"
)
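As a rough illustration of what a Gini-based split criterion measures, the sketch below computes the Gini impurity of a set of class labels and the gain of one candidate binary partition. The names `giniImpurity` and `giniGain` are hypothetical and not part of this package, and the package's own computation may differ in detail.

```go
package main

import "fmt"

// giniImpurity returns 1 - sum(p_c^2) over the classes found in labels.
// An empty slice is treated as pure (impurity 0).
func giniImpurity(labels []string) float64 {
	if len(labels) == 0 {
		return 0
	}
	counts := make(map[string]int)
	for _, l := range labels {
		counts[l]++
	}
	n := float64(len(labels))
	sumSq := 0.0
	for _, c := range counts {
		p := float64(c) / n
		sumSq += p * p
	}
	return 1 - sumSq
}

// giniGain returns the impurity of the combined set minus the
// size-weighted impurity of the left and right partitions.
func giniGain(left, right []string) float64 {
	parent := append(append([]string{}, left...), right...)
	n := float64(len(parent))
	if n == 0 {
		return 0
	}
	weighted := float64(len(left))/n*giniImpurity(left) +
		float64(len(right))/n*giniImpurity(right)
	return giniImpurity(parent) - weighted
}

func main() {
	left := []string{"yes", "yes", "yes"}
	right := []string{"no", "no", "yes"}
	fmt.Printf("gain = %.4f\n", giniGain(left, right))
}
```

A tree builder would typically evaluate this gain for every candidate partition of an attribute and keep the partition with the largest value.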

Variables

This section is empty.

Functions

This section is empty.

Types

type NodeValue

type NodeValue struct {
	// SplitV defines the split value.
	SplitV interface{}

	// Class of the leaf node.
	Class string

	// SplitAttrName defines the name of the attribute that causes the
	// split.
	SplitAttrName string

	// Size defines the number of samples that this node holds before
	// splitting.
	Size int

	// SplitAttrIdx defines the index of the attribute that causes the
	// split.
	SplitAttrIdx int

	// IsLeaf defines whether the node is a leaf or not.
	IsLeaf bool

	// IsContinu defines whether the node split is continuous or discrete.
	IsContinu bool
}

NodeValue is the value stored in a node of the CART tree.

func (*NodeValue) String

func (nodev *NodeValue) String() (s string)

String returns the value of the node in printable form.

type Runtime

type Runtime struct {
	// Tree used in classification.
	Tree binary.Tree

	// SplitMethod defines the criterion used for splitting.
	SplitMethod string `json:"SplitMethod"`

	// NRandomFeature, if less than or equal to zero, computes the gain on
	// all features; otherwise it selects n random features and computes
	// the gain only on the selected features.
	NRandomFeature int `json:"NRandomFeature"`

	// OOBErrVal is the last out-of-bag error value in the tree.
	OOBErrVal float64
}

Runtime holds the data used for building a CART tree.

func New

func New(claset tabula.ClasetInterface, splitMethod string, nRandomFeature int) (
	*Runtime, error,
)

New creates a new Runtime object.

func (*Runtime) Build

func (runtime *Runtime) Build(claset tabula.ClasetInterface) (e error)

Build creates a tree using the CART algorithm.

func (*Runtime) Classify

func (runtime *Runtime) Classify(data *tabula.Row) (class string)

Classify returns the predicted class of one sample.
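To make the role of the NodeValue fields concrete, here is a minimal sketch of how a binary decision tree could route one sample to a leaf. The `node` type and `classify` function are hypothetical stand-ins, not this package's types. The routing rules are also assumptions: a continuous split sends values below the threshold to the left, and a discrete split sends values contained in the split subset to the left.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a tree node carrying a NodeValue.
type node struct {
	splitV       interface{} // float64 threshold or []string subset (assumed)
	class        string
	splitAttrIdx int
	isLeaf       bool
	isContinu    bool
	left, right  *node
}

// classify walks from root to a leaf and returns the leaf class.
// It returns an empty string if the walk falls off the tree.
func classify(root *node, sample []interface{}) string {
	cur := root
	for cur != nil && !cur.isLeaf {
		goLeft := false
		if cur.isContinu {
			// Continuous split: compare against the threshold.
			v, okV := sample[cur.splitAttrIdx].(float64)
			t, okT := cur.splitV.(float64)
			goLeft = okV && okT && v < t
		} else {
			// Discrete split: test membership in the split subset.
			v, _ := sample[cur.splitAttrIdx].(string)
			subset, _ := cur.splitV.([]string)
			for _, s := range subset {
				if s == v {
					goLeft = true
					break
				}
			}
		}
		if goLeft {
			cur = cur.left
		} else {
			cur = cur.right
		}
	}
	if cur == nil {
		return ""
	}
	return cur.class
}

func main() {
	tree := &node{
		splitV: 2.5, splitAttrIdx: 0, isContinu: true,
		left: &node{isLeaf: true, class: "small"},
		right: &node{
			splitV: []string{"red"}, splitAttrIdx: 1,
			left:  &node{isLeaf: true, class: "large-red"},
			right: &node{isLeaf: true, class: "large-other"},
		},
	}
	fmt.Println(classify(tree, []interface{}{1.0, "blue"}))
	fmt.Println(classify(tree, []interface{}{3.0, "red"}))
}
```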

func (*Runtime) ClassifySet

func (runtime *Runtime) ClassifySet(data tabula.ClasetInterface) (e error)

ClassifySet sets the class attribute based on the tree classification.

func (*Runtime) CountOOBError

func (runtime *Runtime) CountOOBError(oob tabula.Claset) (
	errval float64,
	e error,
)

CountOOBError processes out-of-bag data on the tree and returns the error value.
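An out-of-bag error is conventionally the fraction of out-of-bag samples that the tree misclassifies. The sketch below shows that calculation on plain slices; `oobErrorRate` is a hypothetical helper, and the exact definition this package uses for its error value is not stated here, so treat this as the common convention rather than the package's behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// oobErrorRate returns the fraction of positions where predicted and
// actual disagree. Hypothetical helper, not part of the cart package.
func oobErrorRate(predicted, actual []string) (float64, error) {
	if len(predicted) != len(actual) {
		return 0, errors.New("predicted and actual have different lengths")
	}
	if len(actual) == 0 {
		return 0, errors.New("no out-of-bag samples")
	}
	miss := 0
	for i := range actual {
		if predicted[i] != actual[i] {
			miss++
		}
	}
	return float64(miss) / float64(len(actual)), nil
}

func main() {
	predicted := []string{"a", "b", "a", "a"}
	actual := []string{"a", "a", "a", "b"}
	rate, err := oobErrorRate(predicted, actual)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("OOB error rate:", rate)
}
```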

func (*Runtime) SelectRandomFeature

func (runtime *Runtime) SelectRandomFeature(claset tabula.ClasetInterface)

SelectRandomFeature selects n random features and computes the gain only on them, instead of on all features, if NRandomFeature is greater than zero.

func (*Runtime) String

func (runtime *Runtime) String() (s string)

String returns the Runtime in a JSON-like format.
