hnsw

package
v0.13.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 30, 2024 License: Apache-2.0, CC0-1.0 Imports: 17 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

View Source
var (
	ErrDifferentVectorLengths = fmt.Errorf("vectors have different lengths")
)

Functions

func CosineDistance

func CosineDistance(a, b []float32) (float32, error)

CosineDistance computes the cosine distance between two vectors.

func EuclideanDistance

func EuclideanDistance(a, b []float32) (float32, error)

EuclideanDistance computes the Euclidean distance between two vectors.

func RegisterDistanceFunc

func RegisterDistanceFunc(name string, fn DistanceFunc)

RegisterDistanceFunc registers a distance function with a name. A distance function must be registered here before a graph can be exported and imported.

Types

type Analyzer

type Analyzer[K cmp.Ordered] struct {
	Graph *Graph[K]
}

Analyzer is a struct that holds a graph and provides methods for analyzing it. It offers no compatibility guarantee as the methods of measuring the graph's health with change with the implementation.

func (*Analyzer[T]) Connectivity

func (a *Analyzer[T]) Connectivity() []float64

Connectivity returns the average number of edges in the graph for each non-empty layer.

func (*Analyzer[T]) Height

func (a *Analyzer[T]) Height() int

func (*Analyzer[T]) Topography

func (a *Analyzer[T]) Topography() []int

Topography returns the number of nodes in each layer of the graph.

type DistanceFunc

type DistanceFunc func(a, b []float32) (float32, error)

DistanceFunc is a function that computes the distance between two vectors.

type Graph

type Graph[K cmp.Ordered] struct {

	// Distance is the distance function used to compare embeddings.
	Distance DistanceFunc

	// Rng is used for level generation. It may be set to a deterministic value
	// for reproducibility. Note that deterministic number generation can lead to
	// degenerate graphs when exposed to adversarial inputs.
	Rng *rand.Rand

	// M is the maximum number of neighbors to keep for each node.
	// A good default for OpenAI embeddings is 16.
	M int

	// Ml is the level generation factor.
	// E.g., for Ml = 0.25, each layer is 1/4 the size of the previous layer.
	Ml float64

	// EfSearch is the number of nodes to consider in the search phase.
	// 20 is a reasonable default. Higher values improve search accuracy at
	// the expense of memory.
	EfSearch int

	// EfConstruction is the number of nodes to consider in the construction phase.
	// 16 is a reasonable default. Higher values improve graph quality at the
	// expense of memory.
	EfConstruction int
	// contains filtered or unexported fields
}

Graph is a Hierarchical Navigable Small World graph. All public parameters must be set before adding nodes to the graph. K is cmp.Ordered instead of of comparable so that they can be sorted.

func NewGraph

func NewGraph[K cmp.Ordered]() *Graph[K]

NewGraph returns a new graph with default parameters, roughly designed for storing OpenAI embeddings.

func (*Graph[K]) Add

func (g *Graph[K]) Add(nodes ...Node[K]) error

Add inserts nodes into the graph. If another node with the same ID exists, it is replaced.

func (*Graph[K]) Delete

func (h *Graph[K]) Delete(key K) bool

Delete removes a node from the graph by key. It tries to preserve the clustering properties of the graph by replenishing connectivity in the affected neighborhoods.

func (*Graph[K]) DeleteWithLock

func (h *Graph[K]) DeleteWithLock(key K) bool

func (*Graph[K]) Dims

func (g *Graph[K]) Dims() int

Dims returns the number of dimensions in the graph, or 0 if the graph is empty.

func (*Graph[K]) Export

func (h *Graph[K]) Export(w io.Writer) error

Export writes the graph to a writer.

T must implement io.WriterTo.

func (*Graph[K]) Import

func (h *Graph[K]) Import(r io.Reader) error

Import reads the graph from a reader. T must implement io.ReaderFrom. The imported graph does not have to match the exported graph's parameters (except for dimensionality). The graph will converge onto the new parameters.

func (*Graph[K]) Len

func (h *Graph[K]) Len() int

Len returns the number of nodes in the graph.

func (*Graph[K]) Lookup

func (h *Graph[K]) Lookup(key K) (Vector, bool)

Lookup returns the vector with the given key.

func (*Graph[K]) Search

func (h *Graph[K]) Search(near Vector, k int) ([]SearchResultNode[K], error)

Search finds the k nearest neighbors from the target node.

type Node

type Node[K cmp.Ordered] struct {
	Key   K
	Value Vector
}

Node is a node in the graph.

func MakeNode

func MakeNode[K cmp.Ordered](key K, vec Vector) Node[K]

func MakeNodes

func MakeNodes[K cmp.Ordered](keys []K, vecs []Vector) ([]Node[K], error)

type SavedGraph

type SavedGraph[K cmp.Ordered] struct {
	*Graph[K]
	Path string
}

SavedGraph is a wrapper around a graph that persists changes to a file upon calls to Save. It is more convenient but less powerful than calling Graph.Export and Graph.Import directly.

func LoadSavedGraph

func LoadSavedGraph[K cmp.Ordered](path string) (*SavedGraph[K], error)

LoadSavedGraph opens a graph from a file, reads it, and returns it.

If the file does not exist (i.e. this is a new graph), the equivalent of NewGraph is returned.

It does not hold open a file descriptor, so SavedGraph can be forgotten without ever calling Save.

func (*SavedGraph[K]) Save

func (g *SavedGraph[K]) Save() error

Save writes the graph to the file.

type SearchResultNode

type SearchResultNode[K cmp.Ordered] struct {
	Node[K]
	Distance float32
}

type Vector

type Vector = []float32

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL