vectorstore

package
v0.0.0-...-18b8ac3
Warning

This package is not in the latest version of its module.

Published: Oct 8, 2024 License: MIT Imports: 15 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type CompressedDocument

type CompressedDocument struct {
	*gollum.Document
	Encoded   []byte
	Unencoded []byte
}

type CompressedVectorStore

type CompressedVectorStore struct {
	Data       []CompressedDocument
	Compressor Compressor
}

func NewDummyVectorStore

func NewDummyVectorStore() *CompressedVectorStore

func NewGzipVectorStore

func NewGzipVectorStore() *CompressedVectorStore

func NewStdGzipVectorStore

func NewStdGzipVectorStore() *CompressedVectorStore

func NewZstdVectorStore

func NewZstdVectorStore() *CompressedVectorStore

func (*CompressedVectorStore) Insert

Insert compresses the document and inserts it into the store. An alternative implementation would ONLY store the compressed representation and decompress as necessary.
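
The behavior described above, keeping both the raw bytes and their compressed form at insert time, can be sketched as follows. This is a minimal illustration, not the package's implementation: the Document type is a stand-in for gollum.Document, the lowercase store type and insert method are hypothetical, the run-length encoder is a toy Compressor chosen to keep the example dependency-free, and compressing the document content (rather than some other field) is an assumption.

```go
package main

import "fmt"

// Document is a stand-in for gollum.Document (assumed fields).
type Document struct{ ID, Content string }

// Compressor mirrors the documented single-method interface.
type Compressor interface{ Compress(src []byte) []byte }

// CompressedDocument mirrors the documented struct shape.
type CompressedDocument struct {
	*Document
	Encoded   []byte
	Unencoded []byte
}

// store is a hypothetical analogue of CompressedVectorStore.
type store struct {
	Data       []CompressedDocument
	Compressor Compressor
}

// runLength is a toy compressor: each run becomes a (count, value) byte pair.
type runLength struct{}

func (runLength) Compress(src []byte) []byte {
	var out []byte
	for i := 0; i < len(src); {
		j := i
		for j < len(src) && src[j] == src[i] && j-i < 255 {
			j++
		}
		out = append(out, byte(j-i), src[i])
		i = j
	}
	return out
}

// insert stores both the raw and the compressed bytes of the content.
func (s *store) insert(d Document) {
	raw := []byte(d.Content)
	s.Data = append(s.Data, CompressedDocument{
		Document:  &d,
		Encoded:   s.Compressor.Compress(raw),
		Unencoded: raw,
	})
}

func main() {
	s := &store{Compressor: runLength{}}
	s.insert(Document{ID: "1", Content: "aaaab"})
	fmt.Println(len(s.Data[0].Unencoded), len(s.Data[0].Encoded))
}
```

Storing only Encoded, as the alternative in the comment suggests, would trade memory for a decompression step on every read, which in turn would require a decoder that the Compressor interface does not expose.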

func (*CompressedVectorStore) Query

func (*CompressedVectorStore) RetrieveAll

func (cvs *CompressedVectorStore) RetrieveAll(ctx context.Context) ([]gollum.Document, error)

type Compressor

type Compressor interface {
	Compress(src []byte) []byte
}

Compressor is a single-method interface that returns a compressed representation of its input bytes.
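
Because the interface has a single method and no error return, any type that can turn a byte slice into another byte slice satisfies it. The sketch below shows one possible implementation backed by the standard library's compress/gzip. The type name exampleGzip is hypothetical and is not the package's StdGzipCompressor; the Compressor declaration is repeated only to keep the example self-contained.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// Compressor mirrors the interface documented above.
type Compressor interface {
	Compress(src []byte) []byte
}

// exampleGzip is a hypothetical Compressor using the standard library.
type exampleGzip struct{}

func (exampleGzip) Compress(src []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	// Writes into a bytes.Buffer do not return errors, so they are
	// ignored here to fit the error-free method signature.
	_, _ = zw.Write(src)
	_ = zw.Close() // Close flushes the remaining data and the gzip footer.
	return buf.Bytes()
}

func main() {
	var c Compressor = exampleGzip{}
	src := bytes.Repeat([]byte("vector store "), 100)
	out := c.Compress(src)
	fmt.Printf("%d bytes in, %d bytes out\n", len(src), len(out))
}
```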

type DummyCompressor

type DummyCompressor struct {
}

func (*DummyCompressor) Compress

func (g *DummyCompressor) Compress(src []byte) []byte

type GzipCompressor

type GzipCompressor struct {
	// contains filtered or unexported fields
}

GzipCompressor uses the klauspost/compress gzip compressor. We generally suggest using this optimized implementation over the stdlib.

func (*GzipCompressor) Compress

func (g *GzipCompressor) Compress(src []byte) []byte

type Heap

type Heap []NodeSimilarity

Heap is a custom heap implementation that avoids interface{} conversions. I think a memory arena would theoretically be useful here, but that feels a bit beyond the pale, even for me. In benchmarking, we see that the number of allocations scales with k. Since k is known up front, we should be able to allocate a fixed-size arena and use that. That being said, let's revisit this in the future.
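
The idea in this comment, a heap specialized to one element type and sized from a known k, can be sketched as below. This is an illustration under assumptions, not the package's Heap: the topK type and its lowercase methods are hypothetical, Document is a stand-in for gollum.Document, and k is assumed to be non-negative. The slice is created with capacity k+1, so pushes never grow it, which is one simple way to get the fixed-size behavior the comment describes.

```go
package main

import "fmt"

// Document is a stand-in for gollum.Document.
type Document struct{ Content string }

// NodeSimilarity mirrors the documented struct shape.
type NodeSimilarity struct {
	Document   *Document
	Similarity float32
}

// topK is a hypothetical typed min-heap ordered by Similarity.
type topK []NodeSimilarity

func (h topK) less(i, j int) bool { return h[i].Similarity < h[j].Similarity }
func (h topK) swap(i, j int)      { h[i], h[j] = h[j], h[i] }

// push appends e and sifts it up to restore the heap order.
func (h *topK) push(e NodeSimilarity) {
	*h = append(*h, e)
	for i := len(*h) - 1; i > 0; {
		parent := (i - 1) / 2
		if !h.less(i, parent) {
			break
		}
		h.swap(i, parent)
		i = parent
	}
}

// pop removes and returns the element with the lowest similarity.
func (h *topK) pop() NodeSimilarity {
	old := *h
	n := len(old) - 1
	old.swap(0, n)
	top := old[n]
	*h = old[:n]
	for i := 0; ; {
		l, r, smallest := 2*i+1, 2*i+2, i
		if l < n && h.less(l, smallest) {
			smallest = l
		}
		if r < n && h.less(r, smallest) {
			smallest = r
		}
		if smallest == i {
			break
		}
		h.swap(i, smallest)
		i = smallest
	}
	return top
}

// keepTopK returns the k highest-scoring entries, best first.
func keepTopK(k int, scored []NodeSimilarity) []NodeSimilarity {
	h := make(topK, 0, k+1) // fixed capacity: length never exceeds k+1
	for _, s := range scored {
		h.push(s)
		if len(h) > k {
			h.pop() // evict the current minimum
		}
	}
	out := make([]NodeSimilarity, len(h))
	for i := len(h) - 1; i >= 0; i-- {
		out[i] = h.pop()
	}
	return out
}

func main() {
	scored := []NodeSimilarity{{Similarity: 0.2}, {Similarity: 0.9}, {Similarity: 0.5}, {Similarity: 0.7}}
	for _, ns := range keepTopK(2, scored) {
		fmt.Println(ns.Similarity)
	}
}
```

Because the element type is concrete, no value is boxed into interface{} on push or pop, which is the conversion cost that container/heap would otherwise add.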

func (*Heap) Init

func (h *Heap) Init(k int)

func (*Heap) Len

func (h *Heap) Len() int

func (Heap) Less

func (h Heap) Less(i, j int) bool

func (*Heap) Pop

func (h *Heap) Pop() NodeSimilarity

func (*Heap) Push

func (h *Heap) Push(e NodeSimilarity)

func (Heap) Swap

func (h Heap) Swap(i, j int)

type MemoryVectorStore

type MemoryVectorStore struct {
	Documents []gollum.Document
	LLM       gollum.Embedder
}

MemoryVectorStore embeds documents on insert and stores them in memory.

func NewMemoryVectorStore

func NewMemoryVectorStore(llm gollum.Embedder) *MemoryVectorStore

func NewMemoryVectorStoreFromDisk

func NewMemoryVectorStoreFromDisk(ctx context.Context, bucket *blob.Bucket, path string, llm gollum.Embedder) (*MemoryVectorStore, error)

func (*MemoryVectorStore) Insert

func (*MemoryVectorStore) Persist

func (m *MemoryVectorStore) Persist(ctx context.Context, bucket *blob.Bucket, path string) error

func (*MemoryVectorStore) Query

func (*MemoryVectorStore) RetrieveAll

func (m *MemoryVectorStore) RetrieveAll(ctx context.Context) ([]gollum.Document, error)

RetrieveAll returns all documents held in the store.

type NodeSimilarity

type NodeSimilarity struct {
	Document   *gollum.Document
	Similarity float32
}

type QueryRequest

type QueryRequest struct {
	// Query is the text to query
	Query string
	// EmbeddingStrings is a list of strings to concatenate and embed instead of Query
	EmbeddingStrings []string
	// EmbeddingFloats is a query vector to use instead of Query
	EmbeddingFloats []float32
	// K is the number of results to return
	K int
}

QueryRequest contains the query text, the optional strings or embedding vector to use in its place, and the number of results to return.

type StdGzipCompressor

type StdGzipCompressor struct {
	// contains filtered or unexported fields
}

StdGzipCompressor uses the standard library gzip compressor.

func (*StdGzipCompressor) Compress

func (g *StdGzipCompressor) Compress(src []byte) []byte

type VectorStore

type VectorStore interface {
	Insert(context.Context, gollum.Document) error
	Query(ctx context.Context, qb QueryRequest) ([]*gollum.Document, error)
	RetrieveAll(ctx context.Context) ([]gollum.Document, error)
}

type ZstdCompressor

type ZstdCompressor struct {
	// contains filtered or unexported fields
}

ZstdCompressor uses the klauspost/compress zstd compressor.

func (*ZstdCompressor) Compress

func (g *ZstdCompressor) Compress(src []byte) []byte
