Documentation ¶
Index ¶
- Variables
- func ChatLatency(model Model) *latency.MovingAverage
- func ChatStreamLatency(model Model) *latency.MovingAverage
- type LangModel
- type LangModelConfig
- type LangProvider
- type LanguageModel
- func (m *LanguageModel) Chat(ctx context.Context, params *schemas.ChatParams) (*schemas.ChatResponse, error)
- func (m LanguageModel) ChatLatency() *latency.MovingAverage
- func (m *LanguageModel) ChatStream(ctx context.Context, params *schemas.ChatParams) (<-chan *clients.ChatStreamResult, error)
- func (m LanguageModel) ChatStreamLatency() *latency.MovingAverage
- func (m LanguageModel) Healthy() bool
- func (m LanguageModel) ID() string
- func (m LanguageModel) LatencyUpdateInterval() *fields.Duration
- func (m *LanguageModel) ModelName() string
- func (m *LanguageModel) Provider() string
- func (m *LanguageModel) SupportChatStream() bool
- func (m LanguageModel) Weight() int
- type Model
- type ModelProvider
Constants ¶
This section is empty.
Variables ¶
var ErrProviderNotFound = errors.New("provider not found")
Functions ¶
func ChatLatency ¶
func ChatLatency(model Model) *latency.MovingAverage
func ChatStreamLatency ¶
func ChatStreamLatency(model Model) *latency.MovingAverage
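Both helpers surface a model's per-action latency average for routing decisions. Below is a minimal, hypothetical sketch of comparing two models by chat latency; the Value() accessor on latency.MovingAverage is an assumption, since this page does not document that type's methods:

// pickFasterByChat returns whichever model currently reports the lower
// moving-average chat latency. Value() is an assumed accessor on
// latency.MovingAverage, not documented on this page.
func pickFasterByChat(a, b Model) Model {
	if ChatLatency(a).Value() <= ChatLatency(b).Value() {
		return a
	}
	return b
}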
Types ¶
type LangModel ¶
type LangModel interface {
	Model
	Provider() string
	ModelName() string
	Chat(ctx context.Context, params *schemas.ChatParams) (*schemas.ChatResponse, error)
	ChatStream(ctx context.Context, params *schemas.ChatParams) (<-chan *clients.ChatStreamResult, error)
}
type LangModelConfig ¶
type LangModelConfig struct {
	ID          string                `yaml:"id" json:"id" validate:"required"`           // Model instance ID (unique in scope of the router)
	Enabled     bool                  `yaml:"enabled" json:"enabled" validate:"required"` // Is the model enabled?
	ErrorBudget *health.ErrorBudget   `yaml:"error_budget" json:"error_budget" swaggertype:"primitive,string"`
	Latency     *latency.Config       `yaml:"latency" json:"latency"`
	Weight      int                   `yaml:"weight" json:"weight"`
	Client      *clients.ClientConfig `yaml:"client" json:"client"`
	// Add other providers like
	OpenAI      *openai.Config      `yaml:"openai,omitempty" json:"openai,omitempty"`
	AzureOpenAI *azureopenai.Config `yaml:"azureopenai,omitempty" json:"azureopenai,omitempty"`
	Cohere      *cohere.Config      `yaml:"cohere,omitempty" json:"cohere,omitempty"`
	OctoML      *octoml.Config      `yaml:"octoml,omitempty" json:"octoml,omitempty"`
	Anthropic   *anthropic.Config   `yaml:"anthropic,omitempty" json:"anthropic,omitempty"`
	Bedrock     *bedrock.Config     `yaml:"bedrock,omitempty" json:"bedrock,omitempty"`
	Ollama      *ollama.Config      `yaml:"ollama,omitempty" json:"ollama,omitempty"`
}
func DefaultLangModelConfig ¶
func DefaultLangModelConfig() *LangModelConfig
func (*LangModelConfig) ToModel ¶
func (c *LangModelConfig) ToModel(tel *telemetry.Telemetry) (*LanguageModel, error)
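A minimal sketch of converting a config into a routable model, using only the standard errors and fmt packages on top of the signatures above. Whether ToModel reports ErrProviderNotFound when the config carries no provider section is an assumption based on the variable's name:

// buildModel converts a LangModelConfig into a *LanguageModel, surfacing a
// friendlier error when no provider is configured. The ErrProviderNotFound
// condition is assumed, not confirmed by this page.
func buildModel(cfg *LangModelConfig, tel *telemetry.Telemetry) (*LanguageModel, error) {
	model, err := cfg.ToModel(tel)
	if errors.Is(err, ErrProviderNotFound) {
		return nil, fmt.Errorf("model config %q references no known provider: %w", cfg.ID, err)
	}
	return model, err
}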
func (*LangModelConfig) UnmarshalYAML ¶
func (c *LangModelConfig) UnmarshalYAML(unmarshal func(interface{}) error) error
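Since UnmarshalYAML takes an unmarshal func(interface{}) error, the struct appears to implement gopkg.in/yaml.v2's Unmarshaler interface. A minimal loading sketch under that assumption; the keys follow the struct's yaml tags, and provider-specific fields live in their own packages and are not shown here:

// loadConfig parses a YAML document into a LangModelConfig via yaml.Unmarshal
// from gopkg.in/yaml.v2 (inferred from the UnmarshalYAML signature; an assumption).
func loadConfig(doc []byte) (*LangModelConfig, error) {
	var cfg LangModelConfig
	if err := yaml.Unmarshal(doc, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

var exampleDoc = []byte(`
id: my-model
enabled: true
weight: 1
# provider sections such as openai or anthropic carry provider-specific
# fields defined in their own packages (not shown on this page)
`)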
type LangProvider ¶
type LangProvider interface {
	ModelProvider
	SupportChatStream() bool
	Chat(ctx context.Context, params *schemas.ChatParams) (*schemas.ChatResponse, error)
	ChatStream(ctx context.Context, params *schemas.ChatParams) (clients.ChatStream, error)
}
LangProvider defines the interface a provider must implement to serve language chat requests.
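For example, a caller might gate streaming on SupportChatStream and fall back to a blocking chat call. A sketch using only the methods shown above; the fields of schemas.ChatParams and the clients.ChatStream API are not documented on this page:

// chatWithFallback prefers a streaming response when the provider supports
// one. Consuming the returned clients.ChatStream is left abstract because its
// API is not shown on this page.
func chatWithFallback(ctx context.Context, p LangProvider, params *schemas.ChatParams) error {
	if p.SupportChatStream() {
		stream, err := p.ChatStream(ctx, params)
		if err != nil {
			return err
		}
		_ = stream // read chunks via the clients.ChatStream API
		return nil
	}
	resp, err := p.Chat(ctx, params)
	if err != nil {
		return err
	}
	_ = resp // use the full chat response
	return nil
}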
type LanguageModel ¶
type LanguageModel struct {
// contains filtered or unexported fields
}
LanguageModel wraps a provider client and extends it with health & latency tracking.
The model's health is assumed to be independent of the model action (e.g. chat & chatStream), while latency is assumed to be action-specific (e.g. streaming chat chunks arrive with much lower latency than the full chat action).
func NewLangModel ¶
func NewLangModel(modelID string, client LangProvider, budget *health.ErrorBudget, latencyConfig latency.Config, weight int) *LanguageModel
func (*LanguageModel) Chat ¶
func (m *LanguageModel) Chat(ctx context.Context, params *schemas.ChatParams) (*schemas.ChatResponse, error)
func (LanguageModel) ChatLatency ¶
func (m LanguageModel) ChatLatency() *latency.MovingAverage
func (*LanguageModel) ChatStream ¶
func (m *LanguageModel) ChatStream(ctx context.Context, params *schemas.ChatParams) (<-chan *clients.ChatStreamResult, error)
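ChatStream returns a receive-only channel of stream results, so a caller can drain it with range. A minimal sketch; the accessors of *clients.ChatStreamResult, and whether the model closes the channel when the stream ends, are assumptions not confirmed by this page:

// drainStream consumes every result the model emits for one streaming request.
func drainStream(ctx context.Context, m *LanguageModel, params *schemas.ChatParams) error {
	resultC, err := m.ChatStream(ctx, params)
	if err != nil {
		return err
	}
	for result := range resultC { // assumes the model closes the channel at end of stream
		_ = result // inspect each chunk or error via ChatStreamResult's API (not shown here)
	}
	return nil
}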
func (LanguageModel) ChatStreamLatency ¶
func (m LanguageModel) ChatStreamLatency() *latency.MovingAverage
func (LanguageModel) Healthy ¶
func (m LanguageModel) Healthy() bool
func (LanguageModel) ID ¶
func (m LanguageModel) ID() string
func (LanguageModel) LatencyUpdateInterval ¶
func (m LanguageModel) LatencyUpdateInterval() *fields.Duration
func (*LanguageModel) ModelName ¶
func (m *LanguageModel) ModelName() string
func (*LanguageModel) Provider ¶
func (m *LanguageModel) Provider() string
func (*LanguageModel) SupportChatStream ¶
func (m *LanguageModel) SupportChatStream() bool
func (LanguageModel) Weight ¶
func (m LanguageModel) Weight() int
type Model ¶
type Model interface {
	ID() string
	Healthy() bool
	LatencyUpdateInterval() *fields.Duration
	Weight() int
}
Model represents a configured external, modality-agnostic model together with its routing properties and status.
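Because Model exposes only routing-relevant state, a router can select among heterogeneous models generically. A hypothetical weighted pick over healthy models, using nothing beyond the interface's own methods plus math/rand (an illustration, not the package's actual routing strategy):

// pickWeighted selects a healthy model with probability proportional to its
// weight, or returns nil when no healthy model has positive weight.
func pickWeighted(models []Model, rnd *rand.Rand) Model {
	healthy := make([]Model, 0, len(models))
	total := 0
	for _, m := range models {
		if m.Healthy() {
			healthy = append(healthy, m)
			total += m.Weight()
		}
	}
	if total <= 0 {
		return nil
	}
	n := rnd.Intn(total)
	for _, m := range healthy {
		if n < m.Weight() {
			return m
		}
		n -= m.Weight()
	}
	return nil
}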
type ModelProvider ¶
ModelProvider exposes the provider context.