vllm

package
v0.301.0 Latest
Published: Sep 10, 2024 License: Apache-2.0 Imports: 8 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Manager

type Manager struct {
	// contains filtered or unexported fields
}

Manager manages the Ollama service.

TODO(kenji): Refactor this type once we completely switch to the one-model-per-pod implementation, where inference-manager-engine doesn't directly run vLLM or Ollama.

func New

func New(modelDir string, s3Client s3Client) *Manager

New returns a new Manager.

func (*Manager) CreateNewModelOfGGUF added in v0.273.0

func (m *Manager) CreateNewModelOfGGUF(modelName string, spec *ollama.ModelSpec) error

CreateNewModelOfGGUF creates a new model with the given name and spec that uses a GGUF model file.

func (*Manager) DownloadAndCreateNewModel added in v0.273.0

func (m *Manager) DownloadAndCreateNewModel(ctx context.Context, modelName string, resp *mv1.GetBaseModelPathResponse) error

DownloadAndCreateNewModel downloads the model from the given path and creates a new model.

func (*Manager) UpdateModelTemplateToLatest added in v0.222.0

func (m *Manager) UpdateModelTemplateToLatest(modelName string) error

UpdateModelTemplateToLatest updates the template of the named model to the latest version.
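
The following is a minimal sketch of how a caller might refresh templates for several models and collect the failures. It does not import the real package: `templateUpdater`, `fakeManager`, and `refreshTemplates` are hypothetical names introduced only for illustration, and the only thing taken from the documentation above is the signature of `UpdateModelTemplateToLatest`. The error-joining behavior is an assumption about what a caller would want, not something the package is known to do.

```go
package main

import (
	"errors"
	"fmt"
)

// templateUpdater is a hypothetical interface for this sketch. It copies the
// documented signature of (*Manager).UpdateModelTemplateToLatest so the
// example can run without the real package.
type templateUpdater interface {
	UpdateModelTemplateToLatest(modelName string) error
}

// fakeManager is a stand-in for *Manager (assumption: the real type's
// behavior is not modeled here). It records the models it updated and
// fails for any name listed in failFor.
type fakeManager struct {
	updated []string
	failFor map[string]bool
}

func (f *fakeManager) UpdateModelTemplateToLatest(modelName string) error {
	if f.failFor[modelName] {
		return fmt.Errorf("update template for %q: simulated failure", modelName)
	}
	f.updated = append(f.updated, modelName)
	return nil
}

// refreshTemplates calls UpdateModelTemplateToLatest for every model name,
// keeps going after a failure, and returns all failures joined into one
// error (nil if every update succeeded).
func refreshTemplates(u templateUpdater, modelNames []string) error {
	var errs []error
	for _, name := range modelNames {
		if err := u.UpdateModelTemplateToLatest(name); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...)
}

func main() {
	m := &fakeManager{failFor: map[string]bool{"model-b": true}}
	err := refreshTemplates(m, []string{"model-a", "model-b", "model-c"})
	fmt.Println("updated:", m.updated)
	fmt.Println("error:", err)
}
```

Depending on a small interface rather than on `*Manager` directly keeps the calling code testable without a model directory or an S3 client. `errors.Join` requires Go 1.20 or later.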
