confeito

package module
v0.0.0-...-f50cc68 Latest
Published: Jan 26, 2018 License: MIT Imports: 3 Imported by: 0

README

Package confeito provides fast ensemble tree inference

Copyright 2017- Tatsuhiro Aoshima (hiro4bbh@gmail.com).

Abstract

Package confeito provides fast ensemble tree inference. See documents on GoDoc for details.

This package is based on QuickScorer [Lucchese+ 2015]. We confirmed that the QuickScorer-based implementation is several times faster than the naive implementation on 65536 depth-12 trees with 65536-dimensional dense feature vectors (see the benchmarks BenchmarkForest and BenchmarkBasicEnsembleTrees).

References

  • [Lucchese+ 2015] C. Lucchese, F. M. Nardini, S. Orlando, R. Perego, N. Tonellotto, and R. Venturini. "QuickScorer: A Fast Algorithm to Rank Documents with Additive Ensembles of Regression Trees." Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2015.

Documentation

Overview

Package confeito provides fast ensemble tree inference.

Types

type DenseFeatureVector

type DenseFeatureVector []float32

DenseFeatureVector is a type for a data point with dense feature values. This implements interface FeatureVector.

func (DenseFeatureVector) Dim

func (v DenseFeatureVector) Dim() int

Dim is for interface FeatureVector.

func (DenseFeatureVector) Get

func (v DenseFeatureVector) Get(id FeatureID) (value float32, err error)

Get is for interface FeatureVector.

type FeatureID

type FeatureID uint32

FeatureID is the type of feature IDs. Its underlying type is uint32.

type FeatureVector

type FeatureVector interface {
	// Dim returns the vector dimension.
	Dim() int
	// Get returns the value of the feature.
	// The default value should be float32(0.0).
	// If id is larger than the vector dimension, Get returns the default value.
	//
	// The function should return an error if id is illegal.
	Get(id FeatureID) (value float32, err error)
}

FeatureVector is the interface for a data point. The interface can be implemented by any storage manager, so it provides read-only access.
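As an illustration of the contract described above, the following minimal sketch backs a feature vector with a map. FeatureID and FeatureVector are re-declared locally so that the snippet compiles on its own; MapFeatureVector is a hypothetical type and is not part of this package.

```go
package main

import "fmt"

// FeatureID and FeatureVector are local copies of the documented declarations,
// so that this sketch compiles without importing the package.
type FeatureID uint32

type FeatureVector interface {
	Dim() int
	Get(id FeatureID) (value float32, err error)
}

// MapFeatureVector is a hypothetical map-backed data point (an assumption of
// this sketch, not a type provided by confeito).
type MapFeatureVector struct {
	dim    int
	values map[FeatureID]float32
}

// Dim returns the declared vector dimension.
func (v MapFeatureVector) Dim() int { return v.dim }

// Get returns the stored value, or the default float32(0.0) when the feature
// is absent, including ids beyond the declared dimension.
func (v MapFeatureVector) Get(id FeatureID) (float32, error) {
	return v.values[id], nil
}

// Compile-time check that MapFeatureVector satisfies the interface.
var _ FeatureVector = MapFeatureVector{}

func main() {
	var x FeatureVector = MapFeatureVector{dim: 8, values: map[FeatureID]float32{2: 1.5}}
	present, _ := x.Get(2)
	absent, _ := x.Get(100)
	fmt.Println(x.Dim(), present, absent)
}
```

Because the interface exposes only Dim and Get, the backing storage can be swapped without touching the code that consumes the vector.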

type Forest

type Forest struct {
	// contains filtered or unexported fields
}

Forest is an ensemble of trees (*Leaf). It is designed for compact storage and fast online prediction. Consequently, individual trees cannot be modified; users can only enqueue or dequeue trees and get predicted values.

The result of a prediction is a slice of the values predicted by each tree. This design lets users combine the predicted values in estimators with arbitrary weights.
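A minimal sketch of such a weighted estimator follows. It assumes that every tree stores a float32 in its terminal leaves; weightedSum is a hypothetical helper, and the preds slice merely stands in for the slice returned by Forest.Predict.

```go
package main

import (
	"errors"
	"fmt"
)

// weightedSum combines per-tree predictions with caller-chosen weights.
// Assumption: every prediction is a float32; any other type is an error.
func weightedSum(preds []interface{}, weights []float32) (float32, error) {
	if len(preds) != len(weights) {
		return 0, errors.New("preds and weights differ in length")
	}
	var sum float32
	for i, p := range preds {
		v, ok := p.(float32)
		if !ok {
			return 0, fmt.Errorf("prediction %d is %T, not float32", i, p)
		}
		sum += weights[i] * v
	}
	return sum, nil
}

func main() {
	// Stand-in for the per-tree values returned by Forest.Predict.
	preds := []interface{}{float32(1), float32(-2), float32(4)}
	score, err := weightedSum(preds, []float32{0.5, 0.25, 0.25})
	fmt.Println(score, err)
}
```

Keeping the weights outside the forest means the same set of trees can serve several differently weighted estimators.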

NOTICE: Currently, Forest supports only trees having at most 64 terminal leaves.

func NewForest

func NewForest() *Forest

NewForest returns a new empty Forest.

func (*Forest) Dequeue

func (forest *Forest) Dequeue()

Dequeue dequeues the first enqueued tree from forest.

This may be slow, because the implementation is not designed for frequent dequeues.

func (*Forest) Enqueue

func (forest *Forest) Enqueue(trees ...*Leaf) error

Enqueue enqueues the given trees to forest in order.

This function returns an error if the number of leaves in any given tree is greater than 64.
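A size check of this kind could look like the sketch below, which counts terminal leaves on a simplified local tree type. The names node, countTerminal, fullTree, and checkSize are hypothetical, and treating the limit as a count of terminal leaves follows the NOTICE on Forest rather than the package source.

```go
package main

import "fmt"

// maxLeaves mirrors the documented limit of 64 terminal leaves per tree.
const maxLeaves = 64

// node is a simplified stand-in for a tree element; both children are nil
// for a terminal leaf.
type node struct {
	left, right *node
}

// countTerminal returns the number of terminal leaves below n.
func countTerminal(n *node) int {
	if n == nil {
		return 0
	}
	if n.left == nil && n.right == nil {
		return 1
	}
	return countTerminal(n.left) + countTerminal(n.right)
}

// fullTree builds a complete binary tree with 2^depth terminal leaves.
func fullTree(depth int) *node {
	if depth == 0 {
		return &node{}
	}
	return &node{left: fullTree(depth - 1), right: fullTree(depth - 1)}
}

// checkSize reports an error when the tree exceeds the leaf limit.
func checkSize(root *node) error {
	if n := countTerminal(root); n > maxLeaves {
		return fmt.Errorf("tree has %d terminal leaves, more than %d", n, maxLeaves)
	}
	return nil
}

func main() {
	fmt.Println(checkSize(fullTree(6))) // a tree with 64 terminal leaves
	fmt.Println(checkSize(fullTree(7))) // a tree with 128 terminal leaves
}
```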

func (*Forest) Predict

func (forest *Forest) Predict(x FeatureVector) ([]interface{}, error)

Predict returns a slice of the value predicted by each tree of forest.

This function returns an error if getting a feature value of x fails.

type KeyValue

type KeyValue struct {
	Key   FeatureID
	Value float32
}

KeyValue is the pair of the FeatureID key and float32 value.

type Leaf

type Leaf struct {
	// contains filtered or unexported fields
}

Leaf is an element in a tree. It is either non-terminal or terminal. If it is non-terminal, it has a left and a right leaf; otherwise it has a value, which can be any object (interface{}).

When predicting the value for a given feature vector x, if x[featureID] <= threshold, the left leaf is taken; otherwise the right one is taken. This process is repeated until the cursor points to a terminal leaf, whose value is returned.

Leaf is slow, because it is designed for manipulating the tree structure in the training phase or for testing correctness.

func NewLeaf

func NewLeaf(featureID FeatureID, threshold float32, leftValue, rightValue interface{}) (*Leaf, error)

NewLeaf returns a new non-terminal leaf with the given feature ID and threshold. The function also sets the default values of the left and right leaves from leftValue and rightValue.

This function returns an error if featureID is FEATURE_ID_TERMINAL_LEAF.

func NewTerminalLeaf

func NewTerminalLeaf(value interface{}) (*Leaf, error)

NewTerminalLeaf returns a new terminal leaf with value.

Currently, this function never returns an error.

func (*Leaf) IsTerminal

func (l *Leaf) IsTerminal() bool

IsTerminal returns true if l is terminal, otherwise false.

func (*Leaf) Left

func (l *Leaf) Left() *Leaf

Left returns the left leaf. If l is terminal, then this returns nil.

func (*Leaf) Predict

func (l *Leaf) Predict(x FeatureVector) (value interface{}, err error)

Predict returns the predicted value for the given feature vector x.

This function returns an error if getting a feature value of x fails.

func (*Leaf) Right

func (l *Leaf) Right() *Leaf

Right returns the right leaf. If l is terminal, then this returns nil.

func (*Leaf) SetLeft

func (l *Leaf) SetLeft(left *Leaf) error

SetLeft sets the left leaf.

This function returns an error if l is terminal, or the new leaf is nil.

func (*Leaf) SetRight

func (l *Leaf) SetRight(right *Leaf) error

SetRight sets the right leaf.

This function returns an error if l is terminal, or the new leaf is nil.

func (*Leaf) String

func (l *Leaf) String() string

String returns the human-readable string representation of l.

func (*Leaf) Threshold

func (l *Leaf) Threshold() (featureID FeatureID, threshold float32, err error)

Threshold returns the threshold with feature ID of l.

This function returns an error if l is terminal.

func (*Leaf) Value

func (l *Leaf) Value() (value interface{}, err error)

Value returns the value of the terminal leaf l.

This function returns an error if l is not terminal.

type SparseFeatureVector

type SparseFeatureVector []KeyValue

SparseFeatureVector is a type for a data point with sparse feature values. The features should be sorted in ascending order of their keys.

This implements the interfaces FeatureVector and sort.Interface.

func (SparseFeatureVector) Dim

func (v SparseFeatureVector) Dim() (d int)

Dim is for interface FeatureVector.

func (SparseFeatureVector) Get

func (v SparseFeatureVector) Get(id FeatureID) (value float32, err error)

Get is for interface FeatureVector.

func (SparseFeatureVector) Len

func (v SparseFeatureVector) Len() int

func (SparseFeatureVector) Less

func (v SparseFeatureVector) Less(i, j int) bool

func (SparseFeatureVector) Swap

func (v SparseFeatureVector) Swap(i, j int)
