Documentation ¶
Index ¶
- func GetIndex(arr []string, val string) float64
- func LoadValueEncoders()
- func MinMax(value float64, sum *ColumnSummary) (result string)
- func MinMaxIntArr(array []float64) (float64, float64)
- func SetConfig(c *Config)
- func StoreValueEncoders()
- func ZScore(value float64, sum *ColumnSummary) (result string)
- type ColumnSummary
- type ColumnType
- type Config
- type ValueEncoder
- func (m *ValueEncoder) Bool(b bool) string
- func (m *ValueEncoder) Float64(field string, val float64) string
- func (m *ValueEncoder) GetSummary(colType ColumnType, field string) *ColumnSummary
- func (m *ValueEncoder) Int(field string, val int) string
- func (m *ValueEncoder) Int32(field string, val int32) string
- func (m *ValueEncoder) Int64(field string, val int64) string
- func (m *ValueEncoder) String(field string, val string) string
- func (m *ValueEncoder) Uint32(field string, val uint32) string
- func (m *ValueEncoder) Uint64(field string, val uint64) string
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func LoadValueEncoders ¶ added in v0.6.6
func LoadValueEncoders()
LoadValueEncoders loads all value encoders from disk.
func MinMax ¶ added in v0.6.6
func MinMax(value float64, sum *ColumnSummary) (result string)
MinMax applies min-max encoding to value, using the statistics tracked in sum, and returns the result as a string.
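As a rough illustration of what min-max encoding usually computes, the sketch below rescales a value into the range [0, 1] using a column's minimum and maximum. The helper minMaxSketch, the trimmed-down summary type, the zero-range fallback, and the string formatting are all assumptions made for this example; the package's own implementation may differ.

```go
package main

import (
	"fmt"
	"strconv"
)

// summary is a hypothetical stand-in for ColumnSummary that keeps
// only the two fields this sketch needs.
type summary struct {
	Min float64
	Max float64
}

// minMaxSketch rescales value into [0, 1] relative to the range in sum
// and formats the result as a string. Returning "0" for an empty range
// is an assumption made here to avoid dividing by zero.
func minMaxSketch(value float64, sum *summary) string {
	span := sum.Max - sum.Min
	if span == 0 {
		return "0"
	}
	return strconv.FormatFloat((value-sum.Min)/span, 'f', -1, 64)
}

func main() {
	fmt.Println(minMaxSketch(5, &summary{Min: 0, Max: 10})) // prints 0.5
}
```

Values outside the tracked range fall outside [0, 1] in this sketch; whether the package clamps them is not stated in this documentation.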
func MinMaxIntArr ¶ added in v0.6.6
func MinMaxIntArr(array []float64) (float64, float64)
MinMaxIntArr returns the highest and the lowest numbers from a float64 array.
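A single pass over the slice is enough to find both extremes. The following is a minimal sketch of that idea, not the package's code: the name extremes, the order of the return values, and the ok flag for empty input are assumptions introduced for the example.

```go
package main

import "fmt"

// extremes scans values once and returns the lowest and highest entries.
// ok is false when values is empty. The name, the return order, and the
// ok flag are assumptions for illustration.
func extremes(values []float64) (lowest, highest float64, ok bool) {
	if len(values) == 0 {
		return 0, 0, false
	}
	lowest, highest = values[0], values[0]
	for _, v := range values[1:] {
		if v < lowest {
			lowest = v
		}
		if v > highest {
			highest = v
		}
	}
	return lowest, highest, true
}

func main() {
	lo, hi, ok := extremes([]float64{3, -1.5, 8, 2})
	fmt.Println(lo, hi, ok) // prints -1.5 8 true
}
```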
func SetConfig ¶ added in v0.6.6
func SetConfig(c *Config)
SetConfig will set the config for all registered encoders.
func StoreValueEncoders ¶ added in v0.6.6
func StoreValueEncoders()
StoreValueEncoders stores all value encoders on disk.
func ZScore ¶ added in v0.6.6
func ZScore(value float64, sum *ColumnSummary) (result string)
ZScore applies z-score encoding to value, using the statistics tracked in sum, and returns the result as a string.
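For orientation, a z-score expresses how many standard deviations a value lies from the column mean. The sketch below shows that calculation under stated assumptions: zScoreSketch, the reduced summary type, the zero-deviation fallback, and the output formatting are hypothetical and may not match the package's behavior.

```go
package main

import (
	"fmt"
	"strconv"
)

// summary is a hypothetical stand-in for ColumnSummary that keeps
// only the mean and standard deviation.
type summary struct {
	Mean float64
	Std  float64
}

// zScoreSketch computes (value - mean) / std and formats it as a string.
// Returning "0" when the standard deviation is zero is an assumption
// made here to avoid dividing by zero.
func zScoreSketch(value float64, sum *summary) string {
	if sum.Std == 0 {
		return "0"
	}
	return strconv.FormatFloat((value-sum.Mean)/sum.Std, 'f', -1, 64)
}

func main() {
	fmt.Println(zScoreSketch(12, &summary{Mean: 10, Std: 4})) // prints 0.5
}
```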
Types ¶
type ColumnSummary ¶ added in v0.6.6
type ColumnSummary struct {
	Version string `json:"version"`
	Col     string `json:"col"`

	// Data type of the column, eg: string or numeric
	Typ ColumnType `json:"typ"`

	// Map of strings mapped to their index
	// tracked as float64 to avoid additional type casts
	UniqueStrings map[string]float64 `json:"uniqueStrings"`

	// Current string index
	// tracked as float64 to avoid additional type casts
	Index float64

	// standard deviation and mean
	Std  float64 `json:"std"`
	Mean float64 `json:"mean"`

	// min, max
	Min float64 `json:"min"`
	Max float64 `json:"max"`

	sync.Mutex
}
ColumnSummary collects statistical information about a column in the dataset.
type ColumnType ¶ added in v0.6.6
type ColumnType int
ColumnType is the data type of the column.
const (
	// TypeString is a data type for text columns
	TypeString ColumnType = iota
	// TypeNumeric is a data type for numeric columns
	TypeNumeric
)
func (ColumnType) String ¶ added in v0.6.6
func (c ColumnType) String() string
type Config ¶
type Config struct {
	// use zscore for normalization
	ZScore bool

	// use minmax for normalization
	MinMax bool

	// normalize the categorical values after encoding them to numeric format
	NormalizeCategoricals bool
}
Config holds configuration parameters.
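The ZScore and MinMax flags suggest that an encoder picks one normalization strategy per value. One way such a dispatch could look is sketched below. The config and stats types, the normalize helper, the choice to check ZScore before MinMax, and passing the value through unchanged when neither flag is set are all assumptions; this documentation does not say how the package resolves these cases.

```go
package main

import "fmt"

// config mirrors the two normalization flags of the documented Config.
type config struct {
	ZScore bool
	MinMax bool
}

// stats is a hypothetical holder for per-column statistics.
type stats struct {
	Min, Max  float64
	Mean, Std float64
}

// normalize selects a strategy based on cfg. Checking ZScore first and
// returning v unchanged when no flag is set are assumptions.
func normalize(cfg config, s stats, v float64) float64 {
	switch {
	case cfg.ZScore:
		if s.Std == 0 {
			return 0
		}
		return (v - s.Mean) / s.Std
	case cfg.MinMax:
		if s.Max == s.Min {
			return 0
		}
		return (v - s.Min) / (s.Max - s.Min)
	default:
		return v
	}
}

func main() {
	s := stats{Min: 0, Max: 10, Mean: 10, Std: 4}
	fmt.Println(normalize(config{ZScore: true}, s, 12)) // prints 0.5
	fmt.Println(normalize(config{MinMax: true}, s, 5))  // prints 0.5
	fmt.Println(normalize(config{}, s, 7))              // prints 7
}
```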
type ValueEncoder ¶ added in v0.6.6
ValueEncoder handles online encoding of incoming data and keeps the required state for each feature.
func NewValueEncoder ¶ added in v0.6.6
func NewValueEncoder() *ValueEncoder
NewValueEncoder returns a new ValueEncoder instance.
func (*ValueEncoder) Bool ¶ added in v0.6.6
func (m *ValueEncoder) Bool(b bool) string
Bool handles encoding of boolean values to numeric format.
func (*ValueEncoder) Float64 ¶ added in v0.6.6
func (m *ValueEncoder) Float64(field string, val float64) string
Float64 handles encoding of 64bit float values according to the ValueEncoder configuration.
func (*ValueEncoder) GetSummary ¶ added in v0.6.6
func (m *ValueEncoder) GetSummary(colType ColumnType, field string) *ColumnSummary
GetSummary returns the summary for the given column type and field name. It will create a new one if none is being tracked yet.
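The get-or-create behavior described above can be sketched with a mutex-guarded map. The types summaryStore and columnSummary, the method name getOrCreate, the use of the field name alone as the map key, and the string-typed column type are hypothetical simplifications, not the package's actual data layout.

```go
package main

import (
	"fmt"
	"sync"
)

// columnSummary is a hypothetical, trimmed-down stand-in for ColumnSummary.
type columnSummary struct {
	Col string
	Typ string
}

// summaryStore is a hypothetical container for per-field summaries.
type summaryStore struct {
	mu        sync.Mutex
	summaries map[string]*columnSummary
}

// getOrCreate returns the summary tracked for field, creating and
// registering a new one the first time the field is seen.
func (s *summaryStore) getOrCreate(typ, field string) *columnSummary {
	s.mu.Lock()
	defer s.mu.Unlock()

	if sum, ok := s.summaries[field]; ok {
		return sum
	}
	sum := &columnSummary{Col: field, Typ: typ}
	s.summaries[field] = sum
	return sum
}

func main() {
	store := &summaryStore{summaries: make(map[string]*columnSummary)}
	first := store.getOrCreate("numeric", "Length")
	second := store.getOrCreate("numeric", "Length")
	fmt.Println(first == second) // prints true
}
```

Holding the lock across both the lookup and the insert keeps two concurrent callers from creating separate summaries for the same field.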
func (*ValueEncoder) Int ¶ added in v0.6.6
func (m *ValueEncoder) Int(field string, val int) string
Int handles encoding of integer values according to the ValueEncoder configuration.
func (*ValueEncoder) Int32 ¶ added in v0.6.6
func (m *ValueEncoder) Int32(field string, val int32) string
Int32 handles encoding of 32bit integer values according to the ValueEncoder configuration.
func (*ValueEncoder) Int64 ¶ added in v0.6.6
func (m *ValueEncoder) Int64(field string, val int64) string
Int64 handles encoding of 64bit integer values according to the ValueEncoder configuration.
func (*ValueEncoder) String ¶ added in v0.6.6
func (m *ValueEncoder) String(field string, val string) string
String handles encoding of categorical values according to the ValueEncoder configuration.
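The UniqueStrings and Index fields of ColumnSummary point to an index-based scheme for categorical values: each distinct string gets the next free index, and repeated strings reuse theirs. The sketch below illustrates that idea only. The categorySummary type, the encodeCategory helper, starting indices at 1, and the string formatting are assumptions, and the optional normalization controlled by NormalizeCategoricals is left out.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// categorySummary mirrors only the ColumnSummary fields relevant to
// categorical encoding.
type categorySummary struct {
	sync.Mutex
	UniqueStrings map[string]float64
	Index         float64
}

// encodeCategory returns the index assigned to val as a string,
// assigning the next free index the first time val is seen.
// Starting at 1 is an assumption made for this sketch.
func encodeCategory(sum *categorySummary, val string) string {
	sum.Lock()
	defer sum.Unlock()

	idx, seen := sum.UniqueStrings[val]
	if !seen {
		sum.Index++
		idx = sum.Index
		sum.UniqueStrings[val] = idx
	}
	return strconv.FormatFloat(idx, 'f', -1, 64)
}

func main() {
	sum := &categorySummary{UniqueStrings: make(map[string]float64)}
	for _, proto := range []string{"tcp", "udp", "tcp"} {
		fmt.Println(proto, "->", encodeCategory(sum, proto))
	}
}
```

Because the mapping grows as new values arrive, it works on streaming input without a prior pass over the data, which fits the online encoding that ValueEncoder is described as performing.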