llm

package
v0.0.0-...-e03ab5e
Published: Dec 2, 2024 License: Apache-2.0 Imports: 27 Imported by: 0

Documentation

Overview

Package llm runs an LLM locally via llama.cpp, llamafile, or a Python server. It takes care of everything, including fetching GGUF-packed models from Hugging Face.
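
A minimal usage sketch, using unqualified names for brevity. The cache directory and model reference are illustrative placeholders, and passing nil for the known-model list is an assumption, not verified behavior:

ctx := context.Background()
opts := &Options{
	// Hypothetical reference following the documented PackedFileRef form.
	Model:         PackedFileRef("hf:author/repo/HEAD/model.gguf"),
	ContextLength: 8192,
}
s, err := New(ctx, "/tmp/llm-cache", opts, nil)
if err != nil {
	log.Fatal(err)
}
defer s.Close()
reply, err := s.Prompt(ctx, []Message{
	{Role: System, Content: "You are a terse assistant."},
	{Role: User, Content: "Say hello."},
}, 512, 0, 1.0)
if err != nil {
	log.Fatal(err)
}
fmt.Println(reply)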

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Conversation

type Conversation struct {
	User       string
	Channel    string
	Started    time.Time
	LastUpdate time.Time
	Messages   []Message
	// contains filtered or unexported fields
}

Conversation is a conversation with one user.

type KnownLLM

type KnownLLM struct {
	// Source is the repository in the form "hf:<author>/<repo>/<basename>".
	Source PackedFileRef `yaml:"source"`
	// PackagingType is the file format used in the model. It can be one of
	// "safetensors" or "gguf".
	PackagingType string
	// Upstream is the upstream repo in the form "hf:<author>/<repo>" when the
	// model is based on another one.
	Upstream PackedRepoRef `yaml:"upstream"`
	// PromptEncoding is only used when using llama-server in /completion mode.
	// When not present, llama-server is used in OpenAI compatible API mode.
	PromptEncoding *PromptEncoding `yaml:"prompt_encoding"`
	// contains filtered or unexported fields
}

KnownLLM is a known model.

It currently assumes the model is hosted on Hugging Face.

func (*KnownLLM) Validate

func (k *KnownLLM) Validate() error

Validate checks for obvious errors in the fields.
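
A sketch of declaring and validating a known model. The repository names are illustrative, and the Source value follows the PackedFileRef form documented below:

k := KnownLLM{
	Source:        PackedFileRef("hf:author/repo-GGUF/HEAD/model.Q4_K_M.gguf"),
	PackagingType: "gguf",
	Upstream:      PackedRepoRef("hf:author/repo"),
}
if err := k.Validate(); err != nil {
	log.Fatal(err)
}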

type Memory

type Memory struct {
	// contains filtered or unexported fields
}

Memory holds the bot's conversations.

func (*Memory) Forget

func (m *Memory) Forget()

Forget forgets old conversations.

func (*Memory) Get

func (m *Memory) Get(user, channel string) *Conversation

Get returns the previous conversation for the user and channel, or a new one if none exists.

func (*Memory) Load

func (m *Memory) Load(r io.Reader) error

Load loads previous memory.

func (*Memory) Save

func (m *Memory) Save(w io.Writer) error

Save saves the memory for later reuse.
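
A round-trip sketch, assuming the zero value of Memory is usable; the listing hides unexported fields, so a constructor may actually be required:

var m Memory
c := m.Get("alice", "#general")
c.Messages = append(c.Messages, Message{Role: User, Content: "hi"})

var buf bytes.Buffer
if err := m.Save(&buf); err != nil {
	log.Fatal(err)
}
var restored Memory
if err := restored.Load(&buf); err != nil {
	log.Fatal(err)
}
restored.Forget() // drops conversations considered old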

type Message

type Message struct {
	Role    Role   `json:"role"`
	Content string `json:"content"`
}

Message is a message to send to the LLM as part of the exchange.

type Metrics

type Metrics struct {
	Prompt             TokenPerformance
	Generated          TokenPerformance
	KVCacheUsage       float64
	KVCacheTokens      int
	RequestsProcessing int
	RequestedPending   int
}

Metrics represents the metrics for the LLM server.

type Options

type Options struct {
	// Remote is the host:port of a pre-existing server to use instead of
	// starting our own.
	Remote string
	// Model specifies a model to use.
	//
	// It will be selected automatically from KnownLLMs.
	//
	// Use "python" to use the integrated python backend.
	Model PackedFileRef
	// ContextLength limits the context length. This is useful with the newer
	// 128K context window models, which require too much memory and are quite
	// slow to run at full length. Good values are 8192 or 32768.
	ContextLength int `yaml:"context_length"`
	// contains filtered or unexported fields
}

Options for New.

func (*Options) Validate

func (o *Options) Validate() error

Validate checks for obvious errors in the fields.
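
A construction sketch; the context_length YAML tag suggests Options is usually decoded from a config file, and the values here are illustrative:

opts := &Options{
	// Or "python" to use the integrated Python backend.
	Model:         PackedFileRef("hf:author/repo/HEAD/model.gguf"),
	ContextLength: 8192,
}
if err := opts.Validate(); err != nil {
	log.Fatal(err)
}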

type PackedFileRef

type PackedFileRef string

PackedFileRef is a packed reference to a file in a Hugging Face repository.

The form is "hf:<author>/<repo>/HEAD/<file>"

HEAD is the git commit reference or "revision". HEAD means the default branch. It can be replaced with a branch name or a commit hash. The default branch used by huggingface_hub official python library is "main".

DEFAULT_REVISION in https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/constants.py

func MakePackedFileRef

func MakePackedFileRef(author, repo, revision, file string) PackedFileRef

MakePackedFileRef returns a PackedFileRef assembled from the given parts.

func (PackedFileRef) Author

func (p PackedFileRef) Author() string

Author returns the <author> part of the packed reference.

func (PackedFileRef) Basename

func (p PackedFileRef) Basename() string

Basename returns the basename part of this reference.

func (PackedFileRef) ModelRef

func (p PackedFileRef) ModelRef() huggingface.ModelRef

ModelRef returns the ModelRef reference to the repo containing this file.

func (PackedFileRef) Repo

func (p PackedFileRef) Repo() string

Repo returns the <repo> part of the packed reference.

func (PackedFileRef) RepoID

func (p PackedFileRef) RepoID() string

RepoID returns the canonical "<author>/<repo>" for this repository.

func (PackedFileRef) RepoURL

func (p PackedFileRef) RepoURL() string

RepoURL returns the canonical URL for this repository.

func (PackedFileRef) Revision

func (p PackedFileRef) Revision() string

Revision returns the HEAD part of the packed reference.

func (PackedFileRef) Validate

func (p PackedFileRef) Validate() error

Validate checks for obvious errors in the string.
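
A sketch of composing and decomposing a reference. The expected results in the comments follow from the documented form; they are assumptions, not verified outputs:

p := MakePackedFileRef("author", "repo", "main", "model.gguf")
if err := p.Validate(); err != nil {
	log.Fatal(err)
}
_ = p.Author()   // "author"
_ = p.Repo()     // "repo"
_ = p.RepoID()   // "author/repo"
_ = p.Revision() // "main"
_ = p.RepoURL()  // canonical Hugging Face URL for author/repo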

type PackedRepoRef

type PackedRepoRef string

PackedRepoRef is a packed reference to a Hugging Face repository.

The form is "hf:<author>/<repo>"

func (PackedRepoRef) ModelRef

func (p PackedRepoRef) ModelRef() huggingface.ModelRef

ModelRef converts to a ModelRef reference.

func (PackedRepoRef) RepoID

func (p PackedRepoRef) RepoID() string

RepoID returns the canonical "<author>/<repo>" for this repository.

func (PackedRepoRef) RepoURL

func (p PackedRepoRef) RepoURL() string

RepoURL returns the canonical URL for this repository.

func (PackedRepoRef) Validate

func (p PackedRepoRef) Validate() error

Validate checks for obvious errors in the string.
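
The same pattern applies to repository-only references (values illustrative):

r := PackedRepoRef("hf:author/repo")
if err := r.Validate(); err != nil {
	log.Fatal(err)
}
_ = r.RepoID()  // "author/repo"
_ = r.RepoURL() // canonical Hugging Face URL for the repository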

type PromptEncoding

type PromptEncoding struct {
	// Prompt encoding.
	BeginOfText              string `yaml:"begin_of_text"`
	SystemTokenStart         string `yaml:"system_token_start"`
	SystemTokenEnd           string `yaml:"system_token_end"`
	UserTokenStart           string `yaml:"user_token_start"`
	UserTokenEnd             string `yaml:"user_token_end"`
	AssistantTokenStart      string `yaml:"assistant_token_start"`
	AssistantTokenEnd        string `yaml:"assistant_token_end"`
	ToolsAvailableTokenStart string `yaml:"tools_available_token_start"`
	ToolsAvailableTokenEnd   string `yaml:"tools_available_token_end"`
	ToolCallTokenStart       string `yaml:"tool_call_token_start"`
	ToolCallTokenEnd         string `yaml:"tool_call_token_end"`
	ToolCallResultTokenStart string `yaml:"tool_call_result_token_start"`
	ToolCallResultTokenEnd   string `yaml:"tool_call_result_token_end"`
	// contains filtered or unexported fields
}

PromptEncoding describes how to encode the prompt.
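
A hedged sketch using Llama-3-style markers; the exact strings depend on the model's prompt template and are not taken from this package:

enc := &PromptEncoding{
	BeginOfText:         "<|begin_of_text|>",
	SystemTokenStart:    "<|start_header_id|>system<|end_header_id|>\n\n",
	SystemTokenEnd:      "<|eot_id|>",
	UserTokenStart:      "<|start_header_id|>user<|end_header_id|>\n\n",
	UserTokenEnd:        "<|eot_id|>",
	AssistantTokenStart: "<|start_header_id|>assistant<|end_header_id|>\n\n",
	AssistantTokenEnd:   "<|eot_id|>",
}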

type Role

type Role string

Role is one of the LLM known roles.

const (
	System    Role = "system"
	User      Role = "user"
	Assistant Role = "assistant"
	// Specific to Mistral models.
	AvailableTools Role = "available_tools"
	ToolCall       Role = "tool_call"
	ToolCallResult Role = "tool_call_result"
)

LLM known roles.

type Session

type Session struct {
	HF       *huggingface.Client
	Model    PackedFileRef
	Encoding *PromptEncoding
	// contains filtered or unexported fields
}

Session runs a llama.cpp or llamafile server and sends queries to it.

While the model is expected to be an Instruct variant, this is not a requirement.

func New

func New(ctx context.Context, cache string, opts *Options, knownLLMs []KnownLLM) (*Session, error)

New instantiates a llama.cpp or llamafile server, or optionally uses Python instead.

func (*Session) Close

func (l *Session) Close() error

func (*Session) GetHealth

func (l *Session) GetHealth(ctx context.Context) (string, error)

GetHealth retrieves the health of the server.

func (*Session) GetMetrics

func (l *Session) GetMetrics(ctx context.Context, m *Metrics) error

GetMetrics retrieves the performance statistics from the server.
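
A polling sketch, given a Session s from New; the interpretation of the fields is inferred from their names:

health, err := s.GetHealth(ctx)
if err != nil {
	log.Fatal(err)
}
var m Metrics
if err := s.GetMetrics(ctx, &m); err != nil {
	log.Fatal(err)
}
fmt.Printf("health=%s prompt=%.1f tok/s generated=%.1f tok/s\n",
	health, m.Prompt.Rate(), m.Generated.Rate())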

func (*Session) Prompt

func (l *Session) Prompt(ctx context.Context, msgs []Message, maxtoks, seed int, temperature float64) (string, error)

Prompt prompts the LLM and returns the reply.

See PromptStreaming for the arguments values.

The first message is assumed to be the system prompt.

func (*Session) PromptStreaming

func (l *Session) PromptStreaming(ctx context.Context, msgs []Message, maxtoks, seed int, temperature float64, words chan<- string) error

PromptStreaming prompts the LLM and returns the reply in the supplied channel.

Use a non-zero seed to get deterministic output (without strong guarantees).

Use low temperature (<1.0) to get more deterministic and repetitive output.

Use high temperature (>1.0) to get more creative and random text. High values can result in nonsensical responses.

It is recommended to use 1.0 by default, except that some models (like Mistral-Nemo) require a much lower value, <=0.3.

The first message is assumed to be the system prompt.
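
A consumption sketch. Whether PromptStreaming or the caller closes the channel is not stated in this listing, so the close below is an assumption:

msgs := []Message{
	{Role: System, Content: "You are a terse assistant."},
	{Role: User, Content: "Tell a story."},
}
words := make(chan string)
errc := make(chan error, 1)
go func() {
	errc <- s.PromptStreaming(ctx, msgs, 512, 1, 1.0, words)
	close(words) // assumption: the callee does not close the channel itself
}()
for w := range words {
	fmt.Print(w)
}
if err := <-errc; err != nil {
	log.Fatal(err)
}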

type TokenPerformance

type TokenPerformance struct {
	Count    int
	Duration time.Duration
}

TokenPerformance is the performance (token count and processing duration) reported in Metrics.

func (*TokenPerformance) Rate

func (t *TokenPerformance) Rate() float64

Rate is the number of tokens per second.

Directories

Path	Synopsis
tools	Package tools contains structures to generate function calls and tool calling from LLMs.
