Documentation ¶
Overview ¶
Package neighbors implements the k-nearest neighbors algorithm. It contains NearestCentroid, KNeighborsClassifier and KNeighborsRegressor.
Index ¶
- func EuclideanDistance(a, b mat.Vector) float64
- func MinkowskiDistanceP(a, b mat.Vector, p float64) float64
- func NewKNeighborsRegressor(K int, Weights interface{}) base.Predicter
- type Distance
- type InnerNode
- type KDTree
- type KNeighborsClassifier
- func (m *KNeighborsClassifier) Fit(Xmatrix, Ymatrix mat.Matrix) base.Fiter
- func (m *KNeighborsClassifier) GetNOutputs() int
- func (m *KNeighborsClassifier) Predict(X mat.Matrix, Ymutable mat.Mutable) *mat.Dense
- func (m *KNeighborsClassifier) PredictProba(X, Y *mat.Dense) *KNeighborsClassifier
- func (m *KNeighborsClassifier) Score(X, Y mat.Matrix) float64
- type KNeighborsRegressor
- func (m *KNeighborsRegressor) Fit(Xmatrix, Ymatrix mat.Matrix) base.Fiter
- func (m *KNeighborsRegressor) GetNOutputs() int
- func (*KNeighborsRegressor) IsClassifier() bool
- func (m *KNeighborsRegressor) Predict(X mat.Matrix, Ymutable mat.Mutable) *mat.Dense
- func (m *KNeighborsRegressor) PredicterClone() base.Predicter
- func (m *KNeighborsRegressor) Score(X, Y mat.Matrix) float64
- type LeafNode
- type NearestCentroid
- type NearestNeighbors
- func (m *NearestNeighbors) Fit(X, Y mat.Matrix)
- func (m *NearestNeighbors) KNeighbors(X mat.Matrix, NNeighbors int) (distances, indices *mat.Dense)
- func (m *NearestNeighbors) KNeighborsGraph(X *mat.Dense, NNeighbors int, mode string, includeSelf bool) (graph *mat.Dense)
- func (m *NearestNeighbors) RadiusNeighbors(X *mat.Dense, radius float64) (distances [][]float64, indices [][]int)
- type Node
- type Rectangle
- func (r *Rectangle) MaxDistancePoint(x []float64, p float64) float64
- func (r *Rectangle) MaxDistanceRectangle(other *Rectangle, p float64) float64
- func (r *Rectangle) MinDistancePoint(x []float64, p float64) float64
- func (r *Rectangle) MinDistanceRectangle(other *Rectangle, p float64) float64
- func (r *Rectangle) Split(d int, split float64) (less, greater *Rectangle)
- func (r *Rectangle) String() string
- func (r *Rectangle) Volume() float64
Examples ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func EuclideanDistance ¶
EuclideanDistance is a Distance function computing the Euclidean (L2) distance between two vectors.
Example ¶
a, b := mat.NewVecDense(4, []float64{0, 0, 0, 0}), mat.NewVecDense(4, []float64{1, 1, 0, 1})
fmt.Printf("%.8f\n", EuclideanDistance(a, b))
Output: 1.73205081
func MinkowskiDistanceP ¶
MinkowskiDistanceP computes the Minkowski distance between a and b for power p; like its scipy counterpart minkowski_distance_p, the result is the p-th power of the distance (the final 1/p root is omitted).
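As a sanity check on that reading, here is a tiny standalone sketch that mirrors scipy's minkowski_distance_p. The helper name is illustrative; it is not the package function.

```go
package main

import (
	"fmt"
	"math"
)

// minkowskiDistanceP is an illustrative re-implementation of the assumed
// semantics: the sum of |a[i]-b[i]|^p, i.e. the Minkowski distance raised
// to the p-th power, with the final 1/p root omitted.
func minkowskiDistanceP(a, b []float64, p float64) float64 {
	sum := 0.
	for i := range a {
		sum += math.Pow(math.Abs(a[i]-b[i]), p)
	}
	return sum
}

func main() {
	a := []float64{0, 0, 0, 0}
	b := []float64{1, 1, 0, 1}
	fmt.Println(minkowskiDistanceP(a, b, 2)) // 3, the square of the Euclidean distance 1.73205081
	fmt.Println(minkowskiDistanceP(a, b, 1)) // 3, the Manhattan distance
}
```

Taking the p-th root of this value recovers the ordinary Minkowski distance, as returned by MinkowskiDistance(p)(a, b).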
func NewKNeighborsRegressor ¶
NewKNeighborsRegressor returns an initialized *KNeighborsRegressor. Weights may be "uniform", "distance" or a func(dist []float64) []float64.
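Besides "uniform" and "distance", the constructor accepts a custom weighting function of the documented form func(dist []float64) []float64. A minimal sketch of one such function, inverse-distance weighting (the function name and the epsilon guard are illustrative, not part of the package):

```go
package main

import "fmt"

// inverseDistance is one possible custom Weights function: each neighbor's
// weight is the reciprocal of its distance, with a small epsilon to avoid
// division by zero when a query point coincides with a training point.
func inverseDistance(dist []float64) []float64 {
	const eps = 1e-12
	w := make([]float64, len(dist))
	for i, d := range dist {
		w[i] = 1 / (d + eps)
	}
	return w
}

func main() {
	fmt.Printf("%.4f\n", inverseDistance([]float64{1, 2, 4})) // [1.0000 0.5000 0.2500]
}
```

Such a function would then be passed in place of the string, e.g. NewKNeighborsRegressor(2, inverseDistance).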
Types ¶
type Distance ¶
Distance is the signature of distance functions: Distance(Vector, Vector) float64.
func MinkowskiDistance ¶
MinkowskiDistance returns a Distance for the Minkowski p-norm: p=1 is the Manhattan distance, p=2 the Euclidean distance, and p=+Inf the maximum coordinate difference.
Example ¶
a, b := mat.NewVecDense(4, []float64{0, 0, 0, 0}), mat.NewVecDense(4, []float64{1, 1, 0, 1})
fmt.Printf("%.8f\n", MinkowskiDistance(1)(a, b))
fmt.Printf("%.8f\n", MinkowskiDistance(2)(a, b))
fmt.Printf("%.8f\n", MinkowskiDistance(math.Inf(1))(a, b))
Output:
3.00000000
1.73205081
1.00000000
type InnerNode ¶
type InnerNode struct {
    Node
    // contains filtered or unexported fields
}
InnerNode is an internal (non-leaf) node of a KDTree.
type KDTree ¶
KDTree for quick nearest-neighbor lookup.

This type provides an index into a set of k-dimensional points which can be used to rapidly look up the nearest neighbors of any point.

Parameters

Data : (N,K) mat.Dense
    The data points to be indexed. This array is not copied, so modifying this data will result in bogus results.
Leafsize : int, optional
    The number of points at which the algorithm switches over to brute-force. Has to be positive.

Notes

The algorithm used is described in Maneewongvatana and Mount 1999. The general idea is that the kd-tree is a binary tree, each of whose nodes represents an axis-aligned hyperrectangle. Each node specifies an axis and splits the set of points based on whether their coordinate along that axis is greater than or less than a particular value. During construction, the axis and splitting point are chosen by the "sliding midpoint" rule, which ensures that the cells do not all become long and thin.

The tree can be queried for the r closest neighbors of any given point (optionally returning only those within some maximum distance of the point). It can also be queried, with a substantial gain in efficiency, for the r approximate closest neighbors.

For large dimensions (20 is already large) do not expect this to run significantly faster than brute force. High-dimensional nearest-neighbor queries are a substantial open problem in computer science.

The tree also supports all-neighbors queries, both with arrays of points and with other kd-trees. These do use a reasonably efficient algorithm, but the kd-tree is not necessarily the best data structure for this sort of calculation.
Example ¶
// v https://github.com/scipy/scipy/blob/v1.1.0/scipy/spatial/kdtree.py
X := mat.NewDense(30, 2, nil)
for i := 0; i < 5; i++ {
    for j := 0; j < 6; j++ {
        X.Set(i*6+j, 0, float64(i))
        X.Set(i*6+j, 1, float64(j+2))
    }
}
//fmt.Println(mat.Formatted(X))
tree := NewKDTree(X, 1)
pts := mat.NewDense(2, 2, []float64{0, 0, 2.1, 2.9})
distances, indices := tree.Query(pts, 1, 1e-15, 2, math.Inf(1))
fmt.Printf("%.6f\n", mat.Formatted(distances.T()))
fmt.Println(mat.Formatted(indices.T()))
Output:
[2.000000 0.141421]
[ 0 13]
func (*KDTree) Query ¶
func (tr *KDTree) Query(X mat.Matrix, k int, eps, p, distanceUpperBound float64) (dd, ii *mat.Dense)
Query queries the kd-tree for nearest neighbors.

Parameters

X : mat.Matrix
    The points to query, one per row.
k : int
    The number of nearest neighbors to return.
eps : nonnegative float64
    Return approximate nearest neighbors; the kth returned value is guaranteed to be no further than (1+eps) times the distance to the real kth nearest neighbor.
p : float64, 1<=p<=infinity
    Which Minkowski p-norm to use: 1 is the sum-of-absolute-values "Manhattan" distance, 2 is the usual Euclidean distance, infinity is the maximum-coordinate-difference distance.
distanceUpperBound : nonnegative float64
    Return only neighbors within this distance. This is used to prune tree searches, so if you are doing a series of nearest-neighbor queries, it may help to supply the distance to the nearest neighbor of the most recent point.

Returns

dd : *mat.Dense
    The distances to the nearest neighbors. Missing neighbors (e.g. when k > n or a finite distanceUpperBound is given) are indicated with infinite distances. Hits are sorted by distance (nearest first).
ii : *mat.Dense
    The locations of the neighbors in the indexed data; same shape as dd.
type KNeighborsClassifier ¶
type KNeighborsClassifier struct {
    base.Predicter
    NearestNeighbors
    K        int
    Weight   string
    Scale    bool
    Distance Distance
    // Runtime members
    Xscaled, Y *mat.Dense
    Classes    [][]float64
    // contains filtered or unexported fields
}
KNeighborsClassifier is a classifier based on the k nearest neighbors: the predicted class is a vote among the classes of the nearest neighbors in the training set.
Example ¶
X := mat.NewDense(4, 1, []float64{0, 1, 2, 3})
Y := mat.NewDense(4, 1, []float64{0, 0, 1, 1})
neigh := NewKNeighborsClassifier(3, "uniform")
neigh.Fit(X, Y)
Xtest := mat.NewDense(1, 1, []float64{1.1})
Ypred := mat.NewDense(1, 1, nil)
neigh.Predict(Xtest, Ypred)
fmt.Println(mat.Formatted(Ypred))
Xtest.Set(0, 0, 0.9)
Yprob := mat.NewDense(1, 2, nil)
neigh.PredictProba(Xtest, Yprob)
fmt.Printf("%.8f\n", mat.Formatted(Yprob))
Output:
[0]
[0.66666667 0.33333333]
func NewKNeighborsClassifier ¶
func NewKNeighborsClassifier(K int, Weights string) *KNeighborsClassifier
NewKNeighborsClassifier returns an initialized *KNeighborsClassifier
func (*KNeighborsClassifier) Fit ¶
func (m *KNeighborsClassifier) Fit(Xmatrix, Ymatrix mat.Matrix) base.Fiter
Fit fits the KNeighborsClassifier from the training data Xmatrix, Ymatrix.
func (*KNeighborsClassifier) GetNOutputs ¶
func (m *KNeighborsClassifier) GetNOutputs() int
GetNOutputs returns output columns number for Y to pass to predict
func (*KNeighborsClassifier) PredictProba ¶
func (m *KNeighborsClassifier) PredictProba(X, Y *mat.Dense) *KNeighborsClassifier
PredictProba for KNeighborsClassifier
type KNeighborsRegressor ¶
type KNeighborsRegressor struct {
    NearestNeighbors
    K int
    // Weights may be "uniform", "distance" or func(dstWeights, srcDists []float64)
    Weights  interface{}
    Scale    bool
    Distance Distance
    // Runtime members
    Xscaled, Y *mat.Dense
}
KNeighborsRegressor is a regression based on the k nearest neighbors. The target is predicted by local interpolation of the targets associated with the nearest neighbors in the training set.
Example ¶
ExampleKNeighborsRegressor from http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsRegressor.html#sklearn.neighbors.KNeighborsRegressor
X := mat.NewDense(4, 1, []float64{0, 1, 2, 3})
Y := mat.NewDense(4, 1, []float64{0, 0, 1, 1})
neigh := NewKNeighborsRegressor(2, "distance")
neigh.Fit(X, Y)
Xtest := mat.NewDense(1, 1, []float64{1.5})
Ypred := mat.NewDense(1, 1, nil)
neigh.Predict(Xtest, Ypred)
fmt.Println(mat.Formatted(Ypred))
Output: [0.5]
func (*KNeighborsRegressor) Fit ¶
func (m *KNeighborsRegressor) Fit(Xmatrix, Ymatrix mat.Matrix) base.Fiter
Fit fits the KNeighborsRegressor from the training data Xmatrix, Ymatrix.
func (*KNeighborsRegressor) GetNOutputs ¶
func (m *KNeighborsRegressor) GetNOutputs() int
GetNOutputs return Y width
func (*KNeighborsRegressor) IsClassifier ¶
func (*KNeighborsRegressor) IsClassifier() bool
IsClassifier returns false for KNeighborsRegressor
func (*KNeighborsRegressor) PredicterClone ¶
func (m *KNeighborsRegressor) PredicterClone() base.Predicter
PredicterClone returns a (possibly unfitted) copy of the predicter.
type LeafNode ¶
type LeafNode struct {
    Node
    // contains filtered or unexported fields
}
LeafNode is a leaf node of a KDTree.
type NearestCentroid ¶
type NearestCentroid struct {
    base.Predicter
    Metric          string
    ShrinkThreshold float64
    // runtime filled members
    NearestNeighbors
    Classes    [][]float64
    ClassCount [][]int
    Centroids  *mat.Dense
}
NearestCentroid is a classifier that represents each class by the centroid of its members: a sample is assigned the class of the nearest centroid.
Example ¶
// Example adapted from http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestCentroid.html#sklearn.neighbors.NearestCentroid
X := mat.NewDense(6, 2, []float64{-1, -1, -2, -1, -3, -2, 1, 1, 2, 1, 3, 2})
Y := mat.NewDense(6, 1, []float64{1, 1, 1, 2, 2, 2})
clf := NewNearestCentroid("euclidean", 0.)
clf.Fit(X, Y)
Xtest := mat.NewDense(1, 2, []float64{-0.8, -1})
Ypred := mat.NewDense(1, 1, nil)
clf.Predict(Xtest, Ypred)
fmt.Println(mat.Formatted(Ypred))
Output: [1]
func NewNearestCentroid ¶
func NewNearestCentroid(metric string, shrinkThreshold float64) *NearestCentroid
NewNearestCentroid returns an initialized *NearestCentroid. If metric is "manhattan", centroids are computed using the median; otherwise the mean is used.
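The median-vs-mean rule can be sketched in plain Go for a single feature column (an illustration of the documented behavior, not the package's internal code):

```go
package main

import (
	"fmt"
	"sort"
)

// centroid reduces one feature column of a class's members to a single
// coordinate: the median for the "manhattan" metric, the mean otherwise,
// following the rule documented for NewNearestCentroid.
func centroid(values []float64, metric string) float64 {
	if metric == "manhattan" {
		v := append([]float64(nil), values...)
		sort.Float64s(v)
		n := len(v)
		if n%2 == 1 {
			return v[n/2]
		}
		return (v[n/2-1] + v[n/2]) / 2
	}
	sum := 0.
	for _, x := range values {
		sum += x
	}
	return sum / float64(len(values))
}

func main() {
	col := []float64{-1, -2, -3, 10}                // one outlier
	fmt.Println(centroid(col, "euclidean")) // mean: 1
	fmt.Println(centroid(col, "manhattan")) // median: -1.5
}
```

The example shows why the distinction matters: the median centroid is robust to the outlier, which pairs naturally with the L1 ("manhattan") metric.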
func (*NearestCentroid) Fit ¶
func (m *NearestCentroid) Fit(Xmatrix, Ymatrix mat.Matrix) base.Fiter
Fit fits the NearestCentroid from the training data Xmatrix, Ymatrix.
func (*NearestCentroid) PredictProba ¶
func (m *NearestCentroid) PredictProba(X mat.Matrix, Y *mat.Dense) *NearestCentroid
PredictProba for NearestCentroid
type NearestNeighbors ¶
type NearestNeighbors struct {
    Algorithm string
    Metric    string
    P         float64
    NJobs     int
    LeafSize  int
    // Runtime filled members
    Distance func(a, b mat.Vector) float64
    X, Y     *mat.Dense
    Tree     *KDTree
}
NearestNeighbors is the unsupervised algorithm implementing search of the k nearest neighbors.
Algorithm is one of "auto", "ball_tree", "kd_tree", "brute"; defaults to "auto".
Metric is one of "cityblock", "cosine", "euclidean", "l1", "l2", "manhattan"; defaults to "euclidean" (= minkowski with P=2).
P is the power for "minkowski".
NJobs is the number of concurrent jobs; NJobs<0 means runtime.NumCPU(). Defaults to -1.
Example ¶
// test from http://scikit-learn.org/stable/modules/neighbors.html
X := mat.NewDense(6, 2, []float64{-1, -1, -2, -1, -3, -2, 1, 1, 2, 1, 3, 2})
nbrs := NewNearestNeighbors()
nbrs.Fit(X, mat.Matrix(nil))
distances, indices := nbrs.KNeighbors(X, 2)
fmt.Printf("indices:\n%v\n", mat.Formatted(indices))
fmt.Printf("distances:\n%v\n", mat.Formatted(distances))
Output:
indices:
⎡0 1⎤
⎢1 0⎥
⎢2 1⎥
⎢3 4⎥
⎢4 3⎥
⎣5 4⎦
distances:
⎡ 0 1⎤
⎢ 0 1⎥
⎢ 0 1.4142135623730951⎥
⎢ 0 1⎥
⎢ 0 1⎥
⎣ 0 1.4142135623730951⎦
func NewNearestNeighbors ¶
func NewNearestNeighbors() *NearestNeighbors
NewNearestNeighbors returns a *NearestNeighbors with default parameters.
func (*NearestNeighbors) Fit ¶
func (m *NearestNeighbors) Fit(X, Y mat.Matrix)
Fit for NearestNeighbors. Y is unused
func (*NearestNeighbors) KNeighbors ¶
KNeighbors returns distances and indices of first NNeighbors
func (*NearestNeighbors) KNeighborsGraph ¶
func (m *NearestNeighbors) KNeighborsGraph(X *mat.Dense, NNeighbors int, mode string, includeSelf bool) (graph *mat.Dense)
KNeighborsGraph computes the (weighted) graph of k-neighbors for points in X.

mode : {"connectivity", "distance"}, optional
    Type of returned matrix: "connectivity" returns the connectivity matrix with ones and zeros; with "distance" the edges are Euclidean distances between points.

Returns

graph : shape = [n_samples, n_samples_fit]
    n_samples_fit is the number of samples in the fitted data. graph[i, j] is assigned the weight of the edge that connects i to j.
Example ¶
X := mat.NewDense(3, 1, []float64{0, 3, 1})
neigh := NewNearestNeighbors()
neigh.Fit(X, mat.Matrix(nil))
A := neigh.KNeighborsGraph(X, 2, "connectivity", true)
fmt.Println(mat.Formatted(A))
Output:
⎡1 0 1⎤
⎢0 1 1⎥
⎣1 0 1⎦
func (*NearestNeighbors) RadiusNeighbors ¶
func (m *NearestNeighbors) RadiusNeighbors(X *mat.Dense, radius float64) (distances [][]float64, indices [][]int)
RadiusNeighbors finds the neighbors within a given radius of a point or points. It returns the indices and distances of each point from the dataset lying in a ball of size radius around the points of the query array. Points lying on the boundary are included in the results. The result points are *not* necessarily sorted by distance to their query point.

Parameters

X : *mat.Dense, (n_samples, n_features)
    The query point or points. If not provided, neighbors of each indexed point are returned; in this case, a query point is not considered its own neighbor.
radius : float64
    Limiting distance of neighbors to return.
Example ¶
// adapted example from http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.RadiusNeighborsRegressor.html#sklearn.neighbors.RadiusNeighborsRegressor
samples := mat.NewDense(3, 3, []float64{0, 0, 0, 0, .5, 0, 1, 1, .5})
neigh := NewNearestNeighbors()
neigh.Fit(samples, mat.Matrix(nil))
distances, indices := neigh.RadiusNeighbors(mat.NewDense(1, 3, []float64{1, 1, 1}), 1.6)
fmt.Println(distances[0])
fmt.Println(indices[0])
// unlike the python example, distances are sorted by increasing distances
Output:
[0.5 1.5]
[2 1]
type Rectangle ¶
type Rectangle struct{ Maxes, Mins []float64 }
Rectangle Hyperrectangle class. Represents a Cartesian product of intervals.
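Rectangle's methods are listed without doc comments below; for an axis-aligned hyperrectangle the natural definition of Volume is the product of its side lengths. A standalone sketch under that assumption (the lowercase type and method names are illustrative, not the package's own code):

```go
package main

import "fmt"

// rectangle mimics the exported Rectangle shape: an axis-aligned
// hyperrectangle stored as per-axis Maxes and Mins.
type rectangle struct{ Maxes, Mins []float64 }

// volume multiplies the side lengths Maxes[i]-Mins[i], which is what
// Rectangle.Volume is assumed to return.
func (r rectangle) volume() float64 {
	v := 1.
	for i := range r.Maxes {
		v *= r.Maxes[i] - r.Mins[i]
	}
	return v
}

func main() {
	r := rectangle{Maxes: []float64{2, 3, 4}, Mins: []float64{0, 1, 2}}
	fmt.Println(r.volume()) // 2*2*2 = 8
}
```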
func (*Rectangle) MaxDistancePoint ¶
MaxDistancePoint returns the maximum distance between the input point and points in the hyperrectangle.

Parameters
x : []float64
    Input point.
p : float64
    Which Minkowski p-norm to use.
func (*Rectangle) MaxDistanceRectangle ¶
MaxDistanceRectangle computes the maximum distance between points in the two hyperrectangles.

Parameters
other : *Rectangle
    Input hyperrectangle.
p : float64
    Which Minkowski p-norm to use.
func (*Rectangle) MinDistancePoint ¶
MinDistancePoint returns the minimum distance between the input point and points in the hyperrectangle.

Parameters
x : []float64
    Input point.
p : float64
    Which Minkowski p-norm to use.
func (*Rectangle) MinDistanceRectangle ¶
MinDistanceRectangle computes the minimum distance between points in the two hyperrectangles.

Parameters
other : *Rectangle
    Input hyperrectangle.
p : float64
    Which Minkowski p-norm to use.
func (*Rectangle) Split ¶
Split produces two hyperrectangles by splitting. In general, if you need to compute maximum and minimum distances to the children, it can be done more efficiently by updating the maximum and minimum distances to the parent.

Parameters
d : int
    Axis to split the hyperrectangle along.
split : float64
    Position along axis d to split at.
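The split itself is mechanical: the interval along axis d is cut at the split position and all other axes are shared between the two children. A standalone sketch under that reading (the lowercase rect type is illustrative, not the package's Rectangle):

```go
package main

import "fmt"

// rect mimics the exported Rectangle shape.
type rect struct{ Maxes, Mins []float64 }

// split cuts r along axis d at position s: the "less" child keeps the
// original Mins but caps Maxes[d] at s; the "greater" child keeps the
// original Maxes but raises Mins[d] to s.
func (r rect) split(d int, s float64) (less, greater rect) {
	less = rect{Maxes: append([]float64(nil), r.Maxes...), Mins: r.Mins}
	greater = rect{Maxes: r.Maxes, Mins: append([]float64(nil), r.Mins...)}
	less.Maxes[d] = s
	greater.Mins[d] = s
	return less, greater
}

func main() {
	r := rect{Maxes: []float64{1, 1}, Mins: []float64{0, 0}}
	less, greater := r.split(0, 0.3)
	fmt.Println(less.Maxes, less.Mins)       // [0.3 1] [0 0]
	fmt.Println(greater.Maxes, greater.Mins) // [1 1] [0.3 0]
}
```

Only the slice for the modified side is copied; the unmodified side is shared with the parent, which keeps splitting cheap during tree construction.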