Documentation ¶
Index ¶
- Variables
- type LLM
- type Option
- func WithChatMode(chatMode bool) Option
- func WithCustomTemplate(template string) Option
- func WithDoSample(doSample bool) Option
- func WithFormat(format string) Option
- func WithHTTPClient(client *http.Client) Option
- func WithMaxTokens(maxTokens int) Option
- func WithModel(model string) Option
- func WithRepetitionPenalty(repetitionPenalty float64) Option
- func WithServerURL(rawURL string) Option
- func WithStoppingTokens(tokens []string) Option
- func WithStream(stream bool) Option
- func WithSystemPrompt(p string) Option
- func WithTemperature(temperature float64) Option
- func WithToken(token string) Option
- func WithTokensPerMessage(tokensPerMessage int) Option
- func WithTopP(topP float64) Option
Constants ¶
This section is empty.
Variables ¶
var (
	ErrEmptyResponse       = errors.New("no response")
	ErrIncompleteEmbedding = errors.New("not all input got embedded")
)
Functions ¶
This section is empty.
Types ¶
type LLM ¶
LLM is a maritaca LLM implementation.
func (*LLM) GenerateContent ¶
func (o *LLM) GenerateContent(ctx context.Context, messages []llms.MessageContent, options ...llms.CallOption) (*llms.ContentResponse, error)
GenerateContent implements the Model interface.
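A minimal usage sketch follows. The maritaca.New constructor, the import path, the environment variable, and the model name are assumptions for illustration; they are not documented on this page.

package main

import (
	"context"
	"errors"
	"fmt"
	"log"
	"os"

	"github.com/tmc/langchaingo/llms"
	"github.com/tmc/langchaingo/llms/maritaca"
)

func main() {
	// Assumed constructor; the token source and model name are placeholders.
	llm, err := maritaca.New(
		maritaca.WithToken(os.Getenv("MARITACA_API_KEY")),
		maritaca.WithModel("your-model-name"),
	)
	if err != nil {
		log.Fatal(err)
	}

	msgs := []llms.MessageContent{
		llms.TextParts(llms.ChatMessageTypeHuman, "Summarize nucleus sampling in one sentence."),
	}

	resp, err := llm.GenerateContent(context.Background(), msgs)
	if errors.Is(err, maritaca.ErrEmptyResponse) {
		log.Fatal("the model returned no content")
	}
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Choices[0].Content)
}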
type Option ¶
type Option func(*options)
func WithChatMode ¶
WithChatMode sets the chat mode. default: true If True, the model runs in chat mode, where messages contains either a string with the user's message or a list of messages covering the turns of the conversation between user and assistant. If False, messages must be a string containing the desired prompt.
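A short sketch of the non-chat path, under the same assumptions as the example above; llms.GenerateFromSinglePrompt is a helper from the llms package that wraps GenerateContent for one-shot prompts.

// With chat mode disabled, the request carries a single prompt string
// rather than a conversation. New and the token source are assumed.
llm, err := maritaca.New(
	maritaca.WithToken(os.Getenv("MARITACA_API_KEY")),
	maritaca.WithChatMode(false),
)
if err != nil {
	log.Fatal(err)
}
out, err := llms.GenerateFromSinglePrompt(context.Background(), llm, "Complete: the capital of Brazil is")
if err != nil {
	log.Fatal(err)
}
fmt.Println(out)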
func WithCustomTemplate ¶
WithCustomTemplate overrides the templating done on the maritaca model side.
func WithDoSample ¶
WithDoSample sets whether the model's generation is sampled via top-k sampling. default: true If True, the model's generation is sampled via top-k sampling; otherwise, generation always selects the token with the highest probability. Using do_sample=False leads to a deterministic result, but with less diversity.
func WithFormat ¶
WithFormat sets the maritaca output format (currently maritaca only supports "json").
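A brief sketch under the same assumptions as the examples above; whether the prompt must also ask for JSON explicitly depends on the model and is not documented here.

// "json" is the only format value documented on this page.
llm, err := maritaca.New(
	maritaca.WithToken(os.Getenv("MARITACA_API_KEY")),
	maritaca.WithFormat("json"),
)
if err != nil {
	log.Fatal(err)
}
_ = llm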
func WithHTTPClient ¶
WithHTTPClient sets a custom HTTP client.
func WithMaxTokens ¶
WithMaxTokens sets the maximum number of tokens that will be generated by the model.
minimum: 1
func WithRepetitionPenalty ¶
WithRepetitionPenalty sets the repetition penalty.
minimum: 0
default: 1
Positive values encourage the model not to repeat previously generated tokens.
func WithServerURL ¶
WithServerURL sets the URL of the maritaca instance to use.
func WithStoppingTokens ¶
WithStoppingTokens sets the stopping tokens: a list of tokens that, when generated, indicate that the model should stop generating tokens.
func WithStream ¶
WithStream sets whether the model runs in streaming mode. default: false If True, the model runs in streaming mode, where tokens are generated and returned to the client as they are produced. If False, the model runs in batch mode, where all tokens are generated before being returned to the client.
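A streaming sketch under the same assumptions as the earlier examples. Whether this backend honors the per-call llms.WithStreamingFunc callback is itself an assumption here.

llm, err := maritaca.New(
	maritaca.WithToken(os.Getenv("MARITACA_API_KEY")),
	maritaca.WithStream(true),
)
if err != nil {
	log.Fatal(err)
}
msgs := []llms.MessageContent{
	llms.TextParts(llms.ChatMessageTypeHuman, "Tell me a short story."),
}
_, err = llm.GenerateContent(context.Background(), msgs,
	llms.WithStreamingFunc(func(ctx context.Context, chunk []byte) error {
		fmt.Print(string(chunk)) // handle each chunk as it is produced
		return nil
	}),
)
if err != nil {
	log.Fatal(err)
}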
func WithSystemPrompt ¶
WithSystemPrompt sets the system prompt. This is only valid if WithCustomTemplate is not set and the maritaca model uses .System in its model template, OR if WithCustomTemplate is set using {{.System}}.
func WithTemperature ¶
WithTemperature sets the sampling temperature.
minimum: 0
default: 0.7
Sampling temperature (greater than or equal to zero). Higher values lead to greater diversity in generation but also increase the likelihood of generating nonsensical text. Values closer to zero result in more plausible text but increase the chance of generating repetitive text.
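A sketch combining the sampling-related options, using the documented defaults as illustrative values; the constructor and token source are assumed as in the earlier examples.

llm, err := maritaca.New(
	maritaca.WithToken(os.Getenv("MARITACA_API_KEY")),
	maritaca.WithDoSample(true),         // top-k sampling rather than greedy decoding
	maritaca.WithTemperature(0.7),       // >= 0; higher values increase diversity
	maritaca.WithRepetitionPenalty(1.0), // >= 0; larger values discourage repetition
	maritaca.WithMaxTokens(256),         // at least 1
)
if err != nil {
	log.Fatal(err)
}
_ = llm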
func WithTokensPerMessage ¶
WithTokensPerMessage sets the number of tokens that will be returned per message.
minimum: 1
default: 4
This field is ignored if stream=False.
func WithTopP ¶
WithTopP sets the nucleus sampling probability (top_p).
exclusiveMaximum: 1
exclusiveMinimum: 0
default: 0.95
If less than 1, only the top tokens with cumulative probability >= top_p are retained for sampling (nucleus filtering). For example, 0.95 means that only the tokens that make up the top 95% of the probability mass are considered when predicting the next token.
Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751).
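A final sketch for nucleus filtering, under the same assumptions as the examples above.

llm, err := maritaca.New(
	maritaca.WithToken(os.Getenv("MARITACA_API_KEY")),
	maritaca.WithTopP(0.95), // keep only the top 95% of probability mass; must be in (0, 1)
)
if err != nil {
	log.Fatal(err)
}
_ = llm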