vector

package module
v0.1.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Aug 17, 2024 License: MIT Imports: 5 Imported by: 7

README

Vector

The library implement fast (highly optimized) vector algebra for Golang.

Version Documentation Build Status Git Hub Coverage Status Go Report Card

Inspiration

Hierarchical Navigable Small World (HNSW) algorithms are state-of-the-art techniques employed for approximate nearest neighbor searches. Vectors and the concept of distance are essential components of the algorithm. The algorithm requires large quantity of vector comparisons on the write path.

The specific count of vector operations performed by HNSW can vary, contingent on factors such as implementation details and the chosen parameters for the algorithm. We have observed a logarithmic progression, starting from 2.2K distance computations per write operation, with subsequent growth reaching 8K and beyond.

In handling such extensive data, the significance of nanoseconds cannot be overstated!

Getting started

The latest version of the library is available at main branch of this repository. All development, including new features and bug fixes, take place on the main branch using forking and pull requests as described in contribution guidelines. The stable version is available via Golang modules.

import "github.com/kshard/vector"

// Our vectors
a := []float32{60.1699, 24.9384, 24.9384, 60.1699}
b := []float32{59.9311, 30.3609, 30.3609, 59.9311}

// instantiate Euclidean distance function
euclidean := vector.Euclidean()

euclidean.Distance(a, b) // 58.921078

// instantiate Cosine distance function
cosine := vector.Cosine()

cosine.Distance(a, b)  // 0.001443

Note: One known limitation, the library requires vectors aligned to 4.

Supported CPU architectures

The library provides two versions of vector algebra functions. The library automatically selects appropriate version depending on the architecture.

  1. A Golang native version optimized for instruction pipelining.
  2. A pure assembly implementation leveraging the SIMD ("Single Instruction Multiple Data") instruction set.

Internally each version is implemented by the package

  1. internal/noasm - optimized Golang version.
  2. internal/simd - pure assembly implementation.
  3. internal/pure - reference implementation using idiomatic Golang.

The library provides Info function to determine configuration runtime

config = vector.Info()

config[vector.CONFIG_XXX]
// XXX_WITH_PURE
// XXX_WITH_NOASM
// XXX_WITH_SIMD

Performance

Instruction pipeline is 3.3x faster and SIMD is 3.8x faster (due to bounds check elimination)

go test -run=^$ -bench=. -cpu=1

BenchmarkPureEqualF32         12029072          99.30 ns/op
BenchmarkNoAsmEqualF32        13555147          87.51 ns/op
BenchmarkPureEuclideanF32      4425328         269.50 ns/op
BenchmarkNoAsmEuclideanF32    14897370          79.70 ns/op
BenchmarkSIMDEuclideanF32     16708311          70.73 ns/op
BenchmarkContraMapEuclidean   16567862          71.07 ns/op
BenchmarkPureCosineF32         3995820         299.70 ns/op
BenchmarkNoAsmCosineF32       11836580         100.80 ns/op

High-level abstraction

ContraMap turn morphisms around f: B ⟼ A.

type Node struct {
  ID     int
  Vector vector.F32
}

vector.ContraMap[vector.F32, Node]{
	Surface:   vector.Euclidean(),
	ContraMap: func(n Node) []float32 { return n.Vector },
}

How To Contribute

The library is MIT licensed and accepts contributions via GitHub pull requests:

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Added some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

The build and testing process requires Go version 1.13 or later.

build and test library.

git clone https://github.com/kshard/vector
cd vector
go test

Checklist:

  1. No new bounds check introduced: go build -gcflags="-d=ssa/check_bce"
  2. Check inlining: go build -gcflags "-m"
  3. No performance degradations: go test -run=^$ -bench=. -cpu=1
commit message

The commit message helps us to write a good release note, speed-up review process. The message should address two question what changed and why. The project follows the template defined by chapter Contributing to a Project of Git book.

bugs

If you experience any issues with the library, please let us know via GitHub issues. We appreciate detailed and accurate reports that help us to identity and replicate the issue.

License

See LICENSE

References

  1. A Quick Guide to Go's Assembler
  2. https://github.com/below/HelloSilicon
  3. https://github.com/hztools/go-sdr/tree/main/internal/simd
  4. Bounds Check Elimination

Documentation

Index

Constants

View Source
const (
	EUCLIDEAN_WITH_PURE = iota
	EUCLIDEAN_WITH_NOASM
	EUCLIDEAN_WITH_SIMD
)
View Source
const (
	COSINE_WITH_PURE = iota
	COSINE_WITH_NOASM
	COSINE_WITH_SIMD
)
View Source
const (
	CONFIG_EUCLIDEAN = "euclidean"
	CONFIG_COSINE    = "cosine"
)

Variables

This section is empty.

Functions

func Cosine

func Cosine() interface {
	Equal(F32, F32) bool
	Distance(F32, F32) float32
}

Cosine Distance

func Euclidean

func Euclidean() interface {
	Equal(F32, F32) bool
	Distance(F32, F32) float32
}

Squared Euclidean distance between two vectors

func Info

func Info() map[string]int

Info about config

Types

type ContraMap added in v0.0.2

type ContraMap[A, B any] struct {
	Surface[A]
	pure.ContraMap[A, B]
}

ContraMap is a combinator that build a new instance of type trait Surface[V] using existing instance of Distance[A] and f: b ⟼ a

func (ContraMap[A, B]) Distance added in v0.0.2

func (f ContraMap[A, B]) Distance(a, b B) float32

Distance contra variant functor

func (ContraMap[A, B]) Equal added in v0.0.5

func (f ContraMap[A, B]) Equal(a, b B) bool

Equality contra variant functor

type Distance added in v0.0.5

type Distance[Vector any] interface {
	Distance(Vector, Vector) float32
}

Generic trait to estimate "distance" between two vectors

type F32

type F32 = []float32

Vector of float32

type Surface added in v0.0.2

type Surface[Vector any] interface {
	eq.Eq[Vector]
	Distance[Vector]
}

Generic trait defines vector category

Directories

Path Synopsis
internal
vtest
Package vtest is test suite for vector algebra re-usable across provided implementation variants
Package vtest is test suite for vector algebra re-usable across provided implementation variants

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL