Documentation
¶
Overview ¶
Package bitknn provides a fast exact k-nearest neighbors (k-NN) implementation for binary feature vectors.
Example ¶
package main import ( "fmt" "github.com/keilerkonzept/bitknn" ) func main() { // feature vectors packed into uint64s data := []uint64{0b101010, 0b111000, 0b000111} // class labels labels := []int{0, 1, 1} model := bitknn.Fit(data, labels, bitknn.WithLinearDistanceWeighting()) // one vote counter per class votes := make([]float64, 2) k := 2 model.Predict(k, 0b101011, bitknn.VoteSlice(votes)) // or, just return the nearest neighbor's distances and indices: // distances,indices := model.Find(k, 0b101011) fmt.Println("Votes:", votes) // you can also use a map for the votes. // this is good if you have a very large number of different labels: votesMap := make(map[int]float64) model.Predict(k, 0b101011, bitknn.VoteMap(votesMap)) fmt.Println("Votes for 0:", votesMap[0]) }
Output: Votes: [0.5 0.25] Votes for 0: 0.5
Index ¶
- Constants
- func DistanceWeightingFuncLinear(dist int) float64
- func DistanceWeightingFuncQuadratic(dist int) float64
- func Nearest(data []uint64, k int, x uint64, distances, indices []int) int
- func NearestWide(data [][]uint64, k int, x []uint64, distances, indices []int) int
- func NearestWideV(data [][]uint64, k int, x []uint64, batch []uint32, distances, indices []int) int
- type DistanceWeighting
- type Model
- func (me *Model) Find(k int, x uint64) ([]int, []int)
- func (me *Model) FindInto(k int, x uint64, distances []int, indices []int) ([]int, []int)
- func (me *Model) PreallocateHeap(k int)
- func (me *Model) Predict(k int, x uint64, votes VoteCounter)
- func (me *Model) PredictAlloc(k int, x uint64, votes VoteCounter)
- func (me *Model) PredictInto(k int, x uint64, distances []int, indices []int, votes VoteCounter)
- func (me *Model) Vote(k int, distances []int, indices []int, votes VoteCounter)
- type Option
- type VoteCounter
- type VoteMap
- type VoteSlice
- type WideModel
- func (me *WideModel) Find(k int, x []uint64) ([]int, []int)
- func (me *WideModel) FindInto(k int, x []uint64, distances []int, indices []int) ([]int, []int)
- func (me *WideModel) FindIntoV(k int, x []uint64, batch []uint32, distances []int, indices []int) ([]int, []int)
- func (me *WideModel) FindV(k int, x []uint64, batch []uint32) ([]int, []int)
- func (me *WideModel) PreallocateHeap(k int)
- func (me *WideModel) Predict(k int, x []uint64, votes VoteCounter) int
- func (me *WideModel) PredictInto(k int, x []uint64, distances []int, indices []int, votes VoteCounter) int
- func (me *WideModel) PredictIntoV(k int, x []uint64, batch []uint32, distances []int, indices []int, ...) int
- func (me *WideModel) PredictV(k int, x []uint64, batch []uint32, votes VoteCounter) int
Examples ¶
Constants ¶
const DiscardVotes = discardVotes(0)
DiscardVotes is a no-op vote counter.
Variables ¶
This section is empty.
Functions ¶
func DistanceWeightingFuncLinear ¶ added in v0.3.0
func DistanceWeightingFuncQuadratic ¶ added in v0.3.0
func Nearest ¶
Nearest finds the nearest neighbors of the given point `x` by Hamming distance in `data`. The neighbor's distances and indices (in `data`) are written to the slices `distances` and `indices`. The two slices should be pre-allocated to length `k+1`. pre:
cap(distances) = cap(indices) = k+1 >= 1
func NearestWide ¶ added in v0.4.0
Nearest, but for wide data.
func NearestWideV ¶ added in v0.7.0
NearestWide, but vectorizable (currently only on ARM64 with NEON instructions). The `batch` array must have at least length `k`, and is used to pre-compute batches of distances.
Types ¶
type DistanceWeighting ¶ added in v0.2.0
type DistanceWeighting distanceWeighting
const ( DistanceWeightingNone DistanceWeighting = iota DistanceWeightingLinear DistanceWeightingQuadratic DistanceWeightingCustom )
func (DistanceWeighting) String ¶ added in v0.2.0
func (me DistanceWeighting) String() string
type Model ¶
type Model struct { // Input data points. Data []uint64 // Class labels for each data point. Labels []int // Vote values for each data point. Values []float64 // Distance weighting function. DistanceWeighting DistanceWeighting // Custom function when [Model.DistanceWeighting] is [DistanceWeightingCustom]. DistanceWeightingFunc func(int) float64 HeapDistances []int HeapIndices []int }
A k-NN model for uint64s.
func (*Model) Find ¶ added in v0.6.0
Finds the nearest neighbors of the given point. Writes their distances and indices in the dataset into the pre-allocated slices. Returns the distance and index slices, truncated to the actual number of neighbors found.
func (*Model) FindInto ¶ added in v0.6.0
Finds the nearest neighbors of the given point. Writes their distances and indices in the dataset into the provided slices. The slices should be pre-allocated to length k+1. Returns the distance and index slices, truncated to the actual number of neighbors found.
func (*Model) PreallocateHeap ¶ added in v0.2.0
func (*Model) Predict ¶ added in v0.6.0
func (me *Model) Predict(k int, x uint64, votes VoteCounter)
Predicts the label of a single input point. Reuses two slices of length K+1 for the neighbor heap.
func (*Model) PredictAlloc ¶ added in v0.6.0
func (me *Model) PredictAlloc(k int, x uint64, votes VoteCounter)
Predicts the label of a single input point. Each call allocates two new slices of length K+1 for the neighbor heap.
func (*Model) PredictInto ¶ added in v0.6.0
Predicts the label of a single input point, using the given slices for the neighbor heap.
type Option ¶
type Option func(*Model)
func WithDistanceWeightingFunc ¶ added in v0.2.0
Use a custom distance weighting function.
func WithLinearDistanceWeighting ¶ added in v0.2.0
func WithLinearDistanceWeighting() Option
Apply linear distance weighting (`1 / (1 + dist)`).
func WithQuadraticDistanceWeighting ¶ added in v0.2.0
func WithQuadraticDistanceWeighting() Option
Apply quadratic distance weighting (`1 / (1 + dist^2)`).
type VoteCounter ¶ added in v0.5.1
type VoteCounter interface { // Clear removes all votes. Clear() // ArgMax returns the label with the highest vote count. // If there are no votes, it returns 0. ArgMax() int // Max returns the highest vote count. Max() float64 // Get returns the vote count for the given label. Get(label int) float64 // Add adds the specified delta to the vote count for the given label. Add(label int, delta float64) }
VoteCounter is a k-NN vote counter interface.
type VoteMap ¶ added in v0.4.0
VoteMap is a sparse vote counter that stores votes in a map. Good for large sets of class labels.
func (VoteMap) Add ¶ added in v0.4.0
Add adds the specified delta to the vote count for the given label.
func (VoteMap) ArgMax ¶ added in v0.4.0
ArgMax returns the label with the highest vote count. If there are no votes, it returns 0.
func (VoteMap) Clear ¶ added in v0.4.0
func (me VoteMap) Clear()
Clear resets all the votes in the map.
type VoteSlice ¶ added in v0.4.0
type VoteSlice []float64
VoteSlice is a dense vote counter that stores votes in a slice. It is efficient for small sets of class labels.
func (VoteSlice) Add ¶ added in v0.4.0
Add adds the specified delta to the vote count for the given label.
func (VoteSlice) ArgMax ¶ added in v0.4.0
ArgMax returns the index (label) of the highest vote. If there are no votes, it returns 0.
func (VoteSlice) Clear ¶ added in v0.4.0
func (me VoteSlice) Clear()
Clear resets all the votes in the slice to zero.
type WideModel ¶ added in v0.4.0
A k-NN model for slices of uint64s.
func FitWide ¶ added in v0.4.0
Create a k-NN model for the given data points and labels.
Example ¶
package main import ( "fmt" "github.com/keilerkonzept/bitknn" "github.com/keilerkonzept/bitknn/pack" ) func main() { // feature vectors packed into uint64s data := [][]uint64{ pack.String("foo"), pack.String("bar"), pack.String("baz"), } // class labels labels := []int{0, 1, 1} model := bitknn.FitWide(data, labels, bitknn.WithLinearDistanceWeighting()) // one vote counter per class votes := make([]float64, 2) k := 2 query := pack.String("fob") model.Predict(k, query, bitknn.VoteSlice(votes)) fmt.Println("Votes:", votes) }
Output: Votes: [0.25 0.16666666666666666]
func (*WideModel) Find ¶ added in v0.6.0
Finds the nearest neighbors of the given point. Writes their distances and indices in the dataset into the pre-allocated slices. Returns the distance and index slices, truncated to the actual number of neighbors found.
func (*WideModel) FindInto ¶ added in v0.6.0
Finds the nearest neighbors of the given point. Writes their distances and indices in the dataset into the provided slices. The slices should be pre-allocated to length k+1. Returns the distance and index slices, truncated to the actual number of neighbors found.
func (*WideModel) FindIntoV ¶ added in v0.7.0
func (me *WideModel) FindIntoV(k int, x []uint64, batch []uint32, distances []int, indices []int) ([]int, []int)
FindIntoV is WideModel.FindInto, but vectorizable (currently only on ARM64 with NEON instructions). The provided [batch] slice must have length >=k and is used to pre-compute batches of distances.
func (*WideModel) FindV ¶ added in v0.7.0
FindV is WideModel.Find, but vectorizable (currently only on ARM64 with NEON instructions). The provided [batch] slice must have length >=k and is used to pre-compute batches of distances.
func (*WideModel) PreallocateHeap ¶ added in v0.4.0
func (*WideModel) Predict ¶ added in v0.6.0
func (me *WideModel) Predict(k int, x []uint64, votes VoteCounter) int
Predicts the label of a single input point. Reuses two slices of length K+1 for the neighbor heap. Returns the number of neighbors found.
func (*WideModel) PredictInto ¶ added in v0.6.0
func (me *WideModel) PredictInto(k int, x []uint64, distances []int, indices []int, votes VoteCounter) int
Predicts the label of a single input point, using the given slices for the neighbor heap. Returns the number of neighbors found.
func (*WideModel) PredictIntoV ¶ added in v0.7.0
func (me *WideModel) PredictIntoV(k int, x []uint64, batch []uint32, distances []int, indices []int, votes VoteCounter) int
PredictIntoV is WideModel.PredictInto, but vectorizable (currently only on ARM64 with NEON instructions). The provided [batch] slice must have length >=k and is used to pre-compute batches of distances.
func (*WideModel) PredictV ¶ added in v0.7.0
PredictV is WideModel.Predict, but vectorizable (currently only on ARM64 with NEON instructions). The provided [batch] slice must have length >=k and is used to pre-compute batches of distances.