llm

package
v0.0.2
Warning

This package is not in the latest version of its module.

Published: Jun 25, 2024 License: BSD-3-Clause Imports: 2 Imported by: 0

Documentation

Overview

Package llm defines interfaces implemented by LLMs (or LLM-related services).

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func UnquoteVector

func UnquoteVector(v Vector) string

UnquoteVector recovers the original text prefix passed to a QuoteEmbedder's EmbedDocs method. Like QuoteEmbedder, UnquoteVector is only useful in tests.

Types

type EmbedDoc

type EmbedDoc struct {
	Title string // title of document
	Text  string // text of document
}

An EmbedDoc is a single document to be embedded.

type Embedder

type Embedder interface {
	EmbedDocs(docs []EmbedDoc) ([]Vector, error)
}

An Embedder computes vector embeddings of a list of documents.

EmbedDocs accepts an arbitrary number of documents and returns their embeddings. If the underlying service limits the batch size, the implementation should make multiple requests in order to process all the documents. If an error occurs after some, but not all, documents have been processed, EmbedDocs can return the error along with a shortened vector slice holding the vectors for a prefix of the document slice.

See QuoteEmbedder for a semantically useless embedder that can nonetheless be helpful when writing tests, and see rsc.io/gaby/internal/gemini for a real implementation.

func QuoteEmbedder

func QuoteEmbedder() Embedder

QuoteEmbedder returns an implementation of Embedder that can be useful for testing but is completely pointless for real use. It encodes up to the first 122 bytes of each document directly into the first 122 elements of a 123-element unit vector.

type Vector

type Vector []float32

A Vector is an embedding vector, typically a high-dimensional unit vector.

func (*Vector) Decode

func (v *Vector) Decode(enc []byte)

Decode decodes the byte encoding enc into the vector v. The length of enc should be a multiple of 4; any trailing bytes are ignored.

func (Vector) Dot

func (v Vector) Dot(w Vector) float64

Dot returns the dot product of v and w.

TODO(rsc): Using a float64 for the result gives slightly higher precision, which may be worth having in the intermediate calculation but may not be worth the type conversions involved in returning a float64. Perhaps the return type should still be float32 even if the math is float64.
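The trade-off raised in the TODO can be seen in a minimal sketch. The functions dot and dot32 are hypothetical stand-ins, not the package's code, and truncating to the shorter vector when lengths differ is an assumption the documentation does not confirm.

```go
package main

import "fmt"

// Vector is a local copy of the documented type.
type Vector []float32

// dot accumulates in float64 and returns a float64, matching the documented
// signature. Assumption: extra elements of the longer vector are ignored.
func dot(v, w Vector) float64 {
	n := len(v)
	if len(w) < n {
		n = len(w)
	}
	var sum float64
	for i := 0; i < n; i++ {
		sum += float64(v[i]) * float64(w[i])
	}
	return sum
}

// dot32 is the alternative the TODO considers: float64 math internally,
// float32 result.
func dot32(v, w Vector) float32 {
	return float32(dot(v, w))
}

func main() {
	v := Vector{0.5, 0.5}
	w := Vector{1, 2}
	fmt.Println(dot(v, w), dot32(v, w))
}
```

For unit vectors the dot product equals the cosine of the angle between them, so it can be used directly as a similarity score.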

func (Vector) Encode

func (v Vector) Encode() []byte

Encode returns a byte encoding of the vector v, suitable for storing in a database.
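A round trip through Encode and Decode might look like the sketch below. The 4-bytes-per-element layout follows from the Decode documentation, but the big-endian IEEE 754 byte order is an assumption, and encode and decode are hypothetical stand-ins for the methods.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// Vector is a local copy of the documented type.
type Vector []float32

// encode writes each element as 4 bytes. Big-endian order is an assumption;
// the documentation does not state the byte order.
func encode(v Vector) []byte {
	enc := make([]byte, 4*len(v))
	for i, f := range v {
		binary.BigEndian.PutUint32(enc[4*i:], math.Float32bits(f))
	}
	return enc
}

// decode replaces *v with the elements stored in enc. Trailing bytes that do
// not form a complete 4-byte group are ignored.
func decode(v *Vector, enc []byte) {
	out := make(Vector, len(enc)/4)
	for i := range out {
		out[i] = math.Float32frombits(binary.BigEndian.Uint32(enc[4*i:]))
	}
	*v = out
}

func main() {
	orig := Vector{0.25, -1.5, 3}
	enc := encode(orig)

	var got Vector
	decode(&got, append(enc, 0xff)) // one stray trailing byte is dropped
	fmt.Println(len(enc), got)
}
```

Because the encoding is fixed-width, the element count can be recovered from the stored length alone, which suits a database value column.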
