data

package
v0.1.0
Warning

This package is not in the latest version of its module.

Published: Dec 14, 2018 License: BSD-3-Clause Imports: 6 Imported by: 6

Documentation


Constants

const (
	// Dataset relative path.
	DatasetPath = ".cache/data/"

	// Dataset extensions.
	DatasetExtension = ".data"
)

Variables

This section is empty.

Functions

func Caltech

func Caltech() (map[int]*learn.Variable, []map[int]int)

Caltech downloads a partition of the Caltech-101 dataset containing only certain categories. For more information: https://github.com/RenatoGeh/datasets. Returns scope (variables) and dataset indexed by variables' ID.

func Caltech3Bit

func Caltech3Bit() (map[int]*learn.Variable, []map[int]int)

Caltech3Bit downloads a partition of the Caltech-101 dataset in 3-bit resolution containing only certain categories. For more information: https://github.com/RenatoGeh/datasets. Returns scope (variables) and dataset indexed by variables' ID.

func Caltech4Bit

func Caltech4Bit() (map[int]*learn.Variable, []map[int]int)

Caltech4Bit downloads a partition of the Caltech-101 dataset in 4-bit resolution containing only certain categories. For more information: https://github.com/RenatoGeh/datasets. Returns scope (variables) and dataset indexed by variables' ID.

func Copy

func Copy(D []map[int]int, L []int) ([]map[int]int, []int)

Copy copies the dataset and labels. If the labels do not exist, returns only the dataset.

func Digits

func Digits() (map[int]*learn.Variable, []map[int]int)

Digits downloads the digits dataset containing handwritten digits from 0 to 9. For more information: https://github.com/RenatoGeh/datasets. Returns scope (variables) and dataset indexed by variables' ID.

func DigitsX

func DigitsX() (map[int]*learn.Variable, []map[int]int)

DigitsX downloads the digits-x dataset, an extended version of digits with more variance. For more information: https://github.com/RenatoGeh/datasets. Returns scope (variables) and dataset indexed by variables' ID.

func Divide

func Divide(D []map[int]int, L []int, n int) ([][]map[int]int, [][]int)

Divide takes a dataset D and labels L and divides them into n subdatasets and sublabels of approximately the same size, ignoring order or proportion of labels.
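The following is a minimal sketch of the behavior described above, not the package's implementation. The helper name divideSketch is hypothetical, and cutting D into contiguous chunks whose sizes differ by at most one is an assumption about what "approximately the same size" means.

```go
package main

import "fmt"

// divideSketch is a hypothetical helper, not the package's Divide. It cuts D
// and L into n contiguous chunks whose sizes differ by at most one instance.
func divideSketch(D []map[int]int, L []int, n int) ([][]map[int]int, [][]int) {
	subD := make([][]map[int]int, n)
	subL := make([][]int, n)
	size, rem := len(D)/n, len(D)%n
	start := 0
	for i := 0; i < n; i++ {
		end := start + size
		if i < rem {
			end++ // the first rem chunks take one extra instance
		}
		subD[i] = D[start:end]
		if L != nil {
			subL[i] = L[start:end]
		}
		start = end
	}
	return subD, subL
}

func main() {
	D := []map[int]int{{0: 1}, {0: 2}, {0: 3}, {0: 4}, {0: 5}}
	L := []int{0, 1, 0, 1, 0}
	subD, subL := divideSketch(D, L, 2)
	for i := range subD {
		fmt.Println(len(subD[i]), subL[i])
	}
}
```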

func ExtractLabels

func ExtractLabels(S map[int]*learn.Variable, D []map[int]int) (map[int]*learn.Variable, []map[int]int, *learn.Variable, []int)

ExtractLabels attempts to separate the real variable values and the labels from a dataset. A label is always the last variable in a .data file. The converse is not true, since a dataset may not contain labels if it's not a classification job. In this case, the ExtractLabels function still tries to extract the last real variable values as labels. It is up to the user to only use ExtractLabels when the dataset is known to have classification labels. Return values are the original scope unaltered, the dataset with label values taken out from the matrix, the label variable, and a slice where each value in index i contains the classification value of the i-th element of the design matrix.
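As an illustration only, the separation of labels from values might look like the sketch below. It is not the package's code: extractLabelsSketch is a hypothetical name, the scope and *learn.Variable are left out, and "last variable" is assumed to mean the variable with the largest ID.

```go
package main

import "fmt"

// extractLabelsSketch is a hypothetical stand-in for ExtractLabels. It assumes
// the label is the variable with the largest ID, copies every instance without
// that variable, and collects the removed values as the label slice.
func extractLabelsSketch(D []map[int]int) ([]map[int]int, int, []int) {
	labelID := -1
	for _, inst := range D {
		for id := range inst {
			if id > labelID {
				labelID = id
			}
		}
	}
	stripped := make([]map[int]int, len(D))
	labels := make([]int, len(D))
	for i, inst := range D {
		stripped[i] = make(map[int]int, len(inst))
		for id, v := range inst {
			if id == labelID {
				labels[i] = v
			} else {
				stripped[i][id] = v
			}
		}
	}
	return stripped, labelID, labels
}

func main() {
	// Variables 0 and 1 hold values; variable 2 is assumed to be the label.
	D := []map[int]int{{0: 1, 1: 0, 2: 7}, {0: 0, 1: 1, 2: 3}}
	X, id, labels := extractLabelsSketch(D)
	fmt.Println(X, id, labels)
}
```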

func Identical

func Identical(D []map[int]int, E []map[int]int, L []int, M []int) bool

Identical compares two datasets (and their labels if they exist) and returns whether they are identical in value and order.
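A minimal sketch of such a comparison, assuming that "identical in value and order" means the i-th instances hold the same variable-value pairs and the i-th labels are equal. The name identicalSketch is hypothetical and the code is not taken from the package.

```go
package main

import "fmt"

// identicalSketch is a hypothetical helper, not the package's Identical. It
// reports whether D and E (and labels L and M) match position by position.
func identicalSketch(D, E []map[int]int, L, M []int) bool {
	if len(D) != len(E) || len(L) != len(M) {
		return false
	}
	for i := range D {
		if len(D[i]) != len(E[i]) {
			return false
		}
		for id, v := range D[i] {
			if w, ok := E[i][id]; !ok || w != v {
				return false
			}
		}
	}
	for i := range L {
		if L[i] != M[i] {
			return false
		}
	}
	return true
}

func main() {
	D := []map[int]int{{0: 1}, {0: 2}}
	E := []map[int]int{{0: 2}, {0: 1}}
	fmt.Println(identicalSketch(D, D, nil, nil)) // same values, same order
	fmt.Println(identicalSketch(D, E, nil, nil)) // same values, different order
}
```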

func Join

func Join(D []map[int]int, E []map[int]int, L []int, M []int) ([]map[int]int, []int)

Join returns the concatenation of two datasets and their labels (if both exist).
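The concatenation can be sketched as follows. This is an illustration under assumptions, not the package's code: joinSketch is a hypothetical name, and returning nil labels when either label slice is missing is one reading of "if both exist".

```go
package main

import "fmt"

// joinSketch is a hypothetical helper, not the package's Join. It appends E
// after D and, when both label slices are present, M after L.
func joinSketch(D, E []map[int]int, L, M []int) ([]map[int]int, []int) {
	joined := make([]map[int]int, 0, len(D)+len(E))
	joined = append(joined, D...)
	joined = append(joined, E...)
	if L == nil || M == nil {
		return joined, nil // assumption: no labels unless both exist
	}
	labels := make([]int, 0, len(L)+len(M))
	labels = append(labels, L...)
	labels = append(labels, M...)
	return joined, labels
}

func main() {
	D := []map[int]int{{0: 1}, {0: 2}}
	E := []map[int]int{{0: 3}}
	J, K := joinSketch(D, E, []int{0, 1}, []int{1})
	fmt.Println(len(J), K)
}
```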

func MNIST1000

func MNIST1000() (map[int]*learn.Variable, []map[int]int, []map[int]int)

MNIST1000 downloads a subset of 1000 MNIST samples. For more information: https://github.com/RenatoGeh/datasets. Returns scope (variables) and a pair of test and training datasets indexed by variables' ID.

func MNIST2000

func MNIST2000() (map[int]*learn.Variable, []map[int]int, []map[int]int)

MNIST2000 downloads a subset of 2000 MNIST samples. For more information: https://github.com/RenatoGeh/datasets. Returns scope (variables) and a pair of test and training datasets indexed by variables' ID.

func MNIST3Bits1000

func MNIST3Bits1000() (map[int]*learn.Variable, []map[int]int, []map[int]int)

MNIST3Bits1000 downloads a subset of 1000 MNIST samples with 3-bit resolution. For more information: https://github.com/RenatoGeh/datasets. Returns scope (variables) and a pair of test and training datasets indexed by variables' ID.

func MNIST3Bits2000

func MNIST3Bits2000() (map[int]*learn.Variable, []map[int]int, []map[int]int)

MNIST3Bits2000 downloads a subset of 2000 MNIST samples with 3-bit resolution. For more information: https://github.com/RenatoGeh/datasets. Returns scope (variables) and a pair of test and training datasets indexed by variables' ID.

func MergeLabel

func MergeLabel(D []map[int]int, L []int, V *learn.Variable) []map[int]int

MergeLabel takes a dataset D, a label slice L and a variable V such that V is not in D's scope and L are the instances of V. The function then returns a single dataset T where V is in T's scope and the values of L are stored in T.
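A rough sketch of the merge, for illustration only. The name mergeLabelSketch is hypothetical, the variable is reduced to a plain integer ID instead of a *learn.Variable, and copying each instance rather than writing into D is an assumption.

```go
package main

import "fmt"

// mergeLabelSketch is a hypothetical helper, not the package's MergeLabel.
// It copies every instance of D and stores L[i] under labelID in the i-th copy.
func mergeLabelSketch(D []map[int]int, L []int, labelID int) []map[int]int {
	T := make([]map[int]int, len(D))
	for i, inst := range D {
		T[i] = make(map[int]int, len(inst)+1)
		for id, v := range inst {
			T[i][id] = v
		}
		T[i][labelID] = L[i]
	}
	return T
}

func main() {
	D := []map[int]int{{0: 1, 1: 0}, {0: 0, 1: 1}}
	L := []int{7, 3}
	// Variable ID 2 is assumed to be unused in D's scope.
	T := mergeLabelSketch(D, L, 2)
	fmt.Println(T)
}
```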

func Olivetti

func Olivetti() (map[int]*learn.Variable, []map[int]int)

Olivetti downloads a downscaled Olivetti Faces dataset from Bell Labs. For more information: https://github.com/RenatoGeh/datasets. Returns scope (variables) and dataset indexed by variables' ID.

func Olivetti3Bit

func Olivetti3Bit() (map[int]*learn.Variable, []map[int]int)

Olivetti3Bit downloads a 3-bit resolution version of Olivetti. For more information: https://github.com/RenatoGeh/datasets. Returns scope (variables) and dataset indexed by variables' ID.

func OlivettiBig

func OlivettiBig() (map[int]*learn.Variable, []map[int]int)

OlivettiBig downloads the original Olivetti Faces dataset. For more information: https://github.com/RenatoGeh/datasets. Returns scope (variables) and dataset indexed by variables' ID.

func OlivettiPadded

func OlivettiPadded() (map[int]*learn.Variable, []map[int]int)

OlivettiPadded downloads a downscaled Olivetti Faces dataset with left and right sides padded by uniformly distributed pixels such that both width and height are divisible by four. For more information: https://github.com/RenatoGeh/datasets. Returns scope (variables) and dataset indexed by variables' ID.

func OlivettiSmall

func OlivettiSmall() (map[int]*learn.Variable, []map[int]int)

OlivettiSmall downloads a smaller version of Olivetti. For more information: https://github.com/RenatoGeh/datasets. Returns scope (variables) and dataset indexed by variables' ID.

func Partition

func Partition(D []map[int]int, p []float64) [][]map[int]int

Partition partitions dataset D into random subdatasets following the proportions given by p. For example, if p=(0.3, 0.7), Partition will return a slice P of size |p| where |P[0]|=0.3*|D| and |P[1]|=0.7*|D|. This function assumes D has no labels. For a balanced partitioning with respect to the labels of the dataset, use PartitionByLabels.
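The p=(0.3, 0.7) example can be worked through with the sketch below. It is not the package's implementation: partitionSketch and its seed parameter are hypothetical, and rounding each size to the nearest integer with the last part taking the remainder is an assumption.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// partitionSketch is a hypothetical helper, not the package's Partition. It
// visits D in a random order and hands out consecutive runs of that order,
// sized by the proportions in p. The last part takes whatever remains.
func partitionSketch(D []map[int]int, p []float64, seed int64) [][]map[int]int {
	n := len(D)
	perm := rand.New(rand.NewSource(seed)).Perm(n)
	P := make([][]map[int]int, len(p))
	start := 0
	for i := range p {
		size := int(math.Round(p[i] * float64(n)))
		if i == len(p)-1 || start+size > n {
			size = n - start
		}
		P[i] = make([]map[int]int, 0, size)
		for _, j := range perm[start : start+size] {
			P[i] = append(P[i], D[j])
		}
		start += size
	}
	return P
}

func main() {
	D := make([]map[int]int, 10)
	for i := range D {
		D[i] = map[int]int{0: i}
	}
	P := partitionSketch(D, []float64{0.3, 0.7}, 1)
	fmt.Println(len(P[0]), len(P[1])) // 3 7
}
```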

func PartitionByLabels

func PartitionByLabels(D []map[int]int, L []int, c int, p []float64) ([][]map[int]int, [][]int)

PartitionByLabels partitions the dataset D in a similar fashion to Partition. However, PartitionByLabels tries to keep the same proportion of labels for each subdataset. If the result of the proportions multiplied by |D| is an integer, then PartitionByLabels returns an exact partitioning following given proportions. Otherwise, the function tries to best approximate the given proportions. Arguments are the original dataset D, slice L of true labels of each instance, the number of classes c, and p the proportions.
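One way to keep label proportions is to apply the proportions to each class separately, as in this sketch. It is an assumption-laden illustration, not the package's algorithm: partitionByLabelsSketch is a hypothetical name, random shuffling is omitted for brevity, and sizes are rounded per class.

```go
package main

import (
	"fmt"
	"math"
)

// partitionByLabelsSketch is a hypothetical helper, not the package's
// PartitionByLabels. It groups instances by class, then splits every class
// according to p so each part receives a similar share of every label.
func partitionByLabelsSketch(D []map[int]int, L []int, c int, p []float64) ([][]map[int]int, [][]int) {
	byClass := make([][]int, c) // instance indices per class
	for i, l := range L {
		byClass[l] = append(byClass[l], i)
	}
	P := make([][]map[int]int, len(p))
	Q := make([][]int, len(p))
	for _, idx := range byClass {
		start := 0
		for k := range p {
			size := int(math.Round(p[k] * float64(len(idx))))
			if k == len(p)-1 || start+size > len(idx) {
				size = len(idx) - start
			}
			for _, j := range idx[start : start+size] {
				P[k] = append(P[k], D[j])
				Q[k] = append(Q[k], L[j])
			}
			start += size
		}
	}
	return P, Q
}

func main() {
	// Six instances of class 0 followed by four of class 1.
	D := make([]map[int]int, 10)
	L := make([]int, 10)
	for i := range D {
		D[i] = map[int]int{0: i}
		if i >= 6 {
			L[i] = 1
		}
	}
	P, Q := partitionByLabelsSketch(D, L, 2, []float64{0.5, 0.5})
	fmt.Println(len(P[0]), Q[0], len(P[1]), Q[1])
}
```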

func Shuffle

func Shuffle(D []map[int]int, L []int)

Shuffle shuffles a dataset and reorders its labels accordingly (if they exist). It is an in-place shuffle.
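An in-place shuffle that keeps labels aligned with their instances can be sketched with the standard library's rand.Shuffle. The helper shuffleSketch and its seed parameter are hypothetical; the package may use a different random source or algorithm.

```go
package main

import (
	"fmt"
	"math/rand"
)

// shuffleSketch is a hypothetical helper, not the package's Shuffle. Every
// swap applied to D is also applied to L, so D[i] keeps its label L[i].
func shuffleSketch(D []map[int]int, L []int, seed int64) {
	rng := rand.New(rand.NewSource(seed))
	rng.Shuffle(len(D), func(i, j int) {
		D[i], D[j] = D[j], D[i]
		if L != nil {
			L[i], L[j] = L[j], L[i]
		}
	})
}

func main() {
	D := []map[int]int{{0: 0}, {0: 1}, {0: 2}, {0: 3}}
	L := []int{0, 1, 2, 3} // label i belongs to the instance with value i
	shuffleSketch(D, L, 42)
	fmt.Println(D, L)
}
```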

func Split

func Split(D []map[int]int, c int, L []int) [][]map[int]int

Split takes a dataset D, the number of classes c and label assignments L and returns the dataset split by labels. That is, it creates c subdatasets such that all items in the i-th subdataset belong to class i and the union of all subdatasets is D.
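A minimal sketch of splitting by class, assuming labels are integers in the range 0 to c-1. The name splitSketch is hypothetical and the code is not taken from the package.

```go
package main

import "fmt"

// splitSketch is a hypothetical helper, not the package's Split. It places
// the i-th instance of D into the subdataset indexed by its label L[i].
func splitSketch(D []map[int]int, c int, L []int) [][]map[int]int {
	S := make([][]map[int]int, c)
	for i, inst := range D {
		S[L[i]] = append(S[L[i]], inst)
	}
	return S
}

func main() {
	D := []map[int]int{{0: 10}, {0: 11}, {0: 12}, {0: 13}}
	L := []int{0, 1, 0, 2}
	S := splitSketch(D, 3, L)
	for class, sub := range S {
		fmt.Println(class, sub)
	}
}
```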

func SubtractLabel

func SubtractLabel(D []map[int]int, L []int, l int) ([]map[int]int, []int, []map[int]int, []int)

SubtractLabel takes a dataset D, its label array L and a label l and returns the result of removing from D every instance with label l. It also returns the set of instances with label l. It does not modify D. Instead, it copies D and returns the result of the subtraction.
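The subtraction can be illustrated with the sketch below. It is not the package's code: subtractLabelSketch is a hypothetical name, and the order of the four return values (remaining instances, remaining labels, removed instances, removed labels) is an assumption based on the description.

```go
package main

import "fmt"

// subtractLabelSketch is a hypothetical helper, not the package's
// SubtractLabel. It builds new slices and leaves D and L untouched.
func subtractLabelSketch(D []map[int]int, L []int, l int) ([]map[int]int, []int, []map[int]int, []int) {
	var keptD, removedD []map[int]int
	var keptL, removedL []int
	for i, inst := range D {
		if L[i] == l {
			removedD = append(removedD, inst)
			removedL = append(removedL, L[i])
		} else {
			keptD = append(keptD, inst)
			keptL = append(keptL, L[i])
		}
	}
	return keptD, keptL, removedD, removedL
}

func main() {
	D := []map[int]int{{0: 10}, {0: 11}, {0: 12}, {0: 13}}
	L := []int{0, 1, 0, 2}
	keptD, keptL, removedD, removedL := subtractLabelSketch(D, L, 0)
	fmt.Println(keptD, keptL)
	fmt.Println(removedD, removedL)
}
```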

func SubtractVariable

func SubtractVariable(D []map[int]int, v *learn.Variable) []map[int]int

SubtractVariable takes a dataset D and a variable v and returns the result of removing all entries that belong to variable v. It does not modify D. Instead, it copies D and returns the result of the subtraction.
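A short sketch of removing one variable from every instance, for illustration only. The name subtractVariableSketch is hypothetical, and the variable is identified by a plain integer ID rather than a *learn.Variable.

```go
package main

import "fmt"

// subtractVariableSketch is a hypothetical helper, not the package's
// SubtractVariable. It copies each instance and skips the entry for varID.
func subtractVariableSketch(D []map[int]int, varID int) []map[int]int {
	out := make([]map[int]int, len(D))
	for i, inst := range D {
		out[i] = make(map[int]int, len(inst))
		for id, v := range inst {
			if id != varID {
				out[i][id] = v
			}
		}
	}
	return out
}

func main() {
	D := []map[int]int{{0: 1, 1: 5, 2: 9}, {0: 2, 1: 6, 2: 8}}
	fmt.Println(subtractVariableSketch(D, 1))
	fmt.Println(D) // the original dataset is unchanged
}
```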

Types

This section is empty.
