Documentation ¶
Overview ¶
Package shorttermmemory provides functionality for managing the runtime state of message processing, including message aggregation, forking, and joining of message streams, as well as usage tracking. It maintains detailed statistics about token consumption across different aspects of model usage, which is essential for monitoring, billing, and optimizing context window utilization.
Design decisions:
- Granular tracking: Separate tracking for prompts and completions
- Detailed breakdowns: Sub-categories for different token types
- Aggregation support: Easy combining of usage across multiple operations
- JSON compatibility: Full serialization support for persistence/API integration
- Thread safety: Usage objects can be safely updated concurrently
- Extensible structure: Easy to add new token categories as models evolve
Usage hierarchy:
Usage: Top-level structure tracking overall token consumption
├── CompletionTokensDetails: Breakdown of completion token usage
│   ├── AcceptedPredictionTokens: Tokens from successful predictions
│   ├── RejectedPredictionTokens: Tokens from unused predictions
│   ├── ReasoningTokens: Tokens used for model reasoning
│   └── AudioTokens: Tokens from audio processing
└── PromptTokensDetails: Breakdown of prompt token usage
    ├── AudioTokens: Tokens from audio inputs
    └── CachedTokens: Tokens retrieved from cache
Example usage:
// Track usage for a model interaction
usage := &Usage{
    CompletionTokens: 150,
    PromptTokens:     100,
    TotalTokens:      250,
    CompletionTokensDetails: CompletionTokensDetails{
        ReasoningTokens:          50,
        AcceptedPredictionTokens: 100,
    },
    PromptTokensDetails: PromptTokensDetails{
        CachedTokens: 20,
    },
}

// Aggregate usage from multiple operations
totalUsage := &Usage{}
totalUsage.AddUsage(usage1)
totalUsage.AddUsage(usage2)
The package is designed to be internal, providing essential token tracking functionality while keeping implementation details private. It's particularly useful for:
- Cost monitoring and billing
- Context window optimization
- Performance analysis
- Cache effectiveness measurement
- Model behavior analysis
Index ¶
- func AddMessage[T messages.ModelMessage](a *Aggregator, m messages.Message[T])
- type AggregatedMessages
- type Aggregator
- func (a *Aggregator) AddAssistantMessage(m messages.Message[messages.AssistantMessage])
- func (a *Aggregator) AddToolCall(m messages.Message[messages.ToolCallMessage])
- func (a *Aggregator) AddToolResponse(m messages.Message[messages.ToolResponse])
- func (a *Aggregator) AddUsage(u *Usage)
- func (a *Aggregator) AddUserPrompt(m messages.Message[messages.UserMessage])
- func (a *Aggregator) Checkpoint() Checkpoint
- func (a *Aggregator) Fork() *Aggregator
- func (a *Aggregator) ID() uuid.UUID
- func (a *Aggregator) Join(b *Aggregator)
- func (a *Aggregator) Len() int
- func (a *Aggregator) Messages() AggregatedMessages
- func (a *Aggregator) MessagesIter() iter.Seq[messages.Message[messages.ModelMessage]]
- func (a *Aggregator) TurnLen() int
- func (a *Aggregator) Usage() Usage
- type Checkpoint
- type CompletionTokensDetails
- type PromptTokensDetails
- type Usage
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func AddMessage ¶
func AddMessage[T messages.ModelMessage](a *Aggregator, m messages.Message[T])
AddMessage adds any message type that implements ModelMessage to the aggregator. This is a generic function that can handle any valid message type in the system. For common message types, prefer using the specific Add methods (AddUserPrompt, AddAssistantMessage, etc.) as they provide better type safety and clarity.
Example:
agg := &Aggregator{...}
msg := messages.New().UserPrompt("hello")
AddMessage(agg, msg)
Types ¶
type AggregatedMessages ¶
type AggregatedMessages []messages.Message[messages.ModelMessage]
AggregatedMessages represents a collection of model messages that can be processed together. It provides a type-safe way to handle multiple messages while maintaining their order.
func (AggregatedMessages) Len ¶
func (a AggregatedMessages) Len() int
Len returns the number of messages in the collection.
type Aggregator ¶
type Aggregator struct {
// contains filtered or unexported fields
}
Aggregator manages a collection of messages and their associated usage statistics. It supports fork-join operations to allow parallel processing of message streams while maintaining message order and proper usage tracking.
func New ¶ added in v0.1.0
func New() *Aggregator
New creates and initializes a new Aggregator instance. It sets up:
- A new unique identifier
- An empty message collection
- Zero-initialized usage statistics
Example:
agg := New()
// agg is ready to accept messages and track usage
func (*Aggregator) AddAssistantMessage ¶
func (a *Aggregator) AddAssistantMessage(m messages.Message[messages.AssistantMessage])
AddAssistantMessage adds an assistant's response message to the aggregator. This is used for messages that represent responses or outputs from the assistant.
Example:
msg := messages.New().AssistantMessage("The weather is sunny.")
agg.AddAssistantMessage(msg)
func (*Aggregator) AddToolCall ¶
func (a *Aggregator) AddToolCall(m messages.Message[messages.ToolCallMessage])
AddToolCall adds a tool call message to the aggregator. This is used when the assistant needs to invoke an external tool or service.
Example:
toolCall := messages.New().ToolCall("weather-api", []ToolCallData{...})
agg.AddToolCall(toolCall)
func (*Aggregator) AddToolResponse ¶
func (a *Aggregator) AddToolResponse(m messages.Message[messages.ToolResponse])
AddToolResponse adds a tool's response message to the aggregator. This is used to store the results returned from external tool invocations.
Example:
response := messages.New().ToolResponse("call-id", "weather-api", "Temperature: 72°F")
agg.AddToolResponse(response)
func (*Aggregator) AddUsage ¶ added in v0.1.1
func (a *Aggregator) AddUsage(u *Usage)
func (*Aggregator) AddUserPrompt ¶
func (a *Aggregator) AddUserPrompt(m messages.Message[messages.UserMessage])
AddUserPrompt adds a user message to the aggregator. This is typically used for adding messages that represent user input or queries.
Example:
msg := messages.New().UserPrompt("What's the weather?")
agg.AddUserPrompt(msg)
func (*Aggregator) Checkpoint ¶
func (a *Aggregator) Checkpoint() Checkpoint
Checkpoint creates a snapshot of the current aggregator state. This allows saving the current state of messages and usage statistics for later reference or restoration. The checkpoint includes:
- The aggregator's unique ID
- A deep copy of all current messages
- The current usage statistics
Example:
agg := &Aggregator{...}
checkpoint := agg.Checkpoint() // Save current state
// ... make changes to agg ...
// checkpoint still holds the original state
func (*Aggregator) Fork ¶
func (a *Aggregator) Fork() *Aggregator
Fork creates a new aggregator that starts with a copy of the current messages. The new aggregator gets:
- A new unique ID
- A copy of all current messages
- An initLen set to the current message count

This allows for parallel processing of message streams that can be joined later.
func (*Aggregator) ID ¶
func (a *Aggregator) ID() uuid.UUID
ID returns the unique identifier of this aggregator. This ID is generated when the aggregator is created or forked.
func (*Aggregator) Join ¶
func (a *Aggregator) Join(b *Aggregator)
Join combines messages from a forked aggregator back into this one. It:
- Appends only the messages that were added to the forked aggregator after it was forked (determined using b.initLen)
- Combines usage statistics from both aggregators
The join operation maintains message order by:
1. Keeping all original messages
2. Keeping any messages added to this aggregator after the fork
3. Appending only new messages from the forked aggregator (those after b.initLen)
Example:
original := &Aggregator{...}  // has messages [1,2]
forked := original.Fork()     // forked has [1,2] and initLen=2
AddMessage(original, msg3)    // original now has [1,2,3]
AddMessage(forked, msg4)      // forked now has [1,2,4]
original.Join(forked)         // original ends with [1,2,3,4]
func (*Aggregator) Len ¶
func (a *Aggregator) Len() int
Len returns the total number of messages currently held by the aggregator.
func (*Aggregator) Messages ¶
func (a *Aggregator) Messages() AggregatedMessages
Messages returns a copy of all messages in the aggregator. The returned slice is a deep copy, so modifications to it won't affect the original messages in the aggregator.
func (*Aggregator) MessagesIter ¶
func (a *Aggregator) MessagesIter() iter.Seq[messages.Message[messages.ModelMessage]]
MessagesIter returns an iterator over all messages in the aggregator. This provides a memory-efficient way to process messages sequentially without creating a copy of the entire message slice.
func (*Aggregator) TurnLen ¶
func (a *Aggregator) TurnLen() int
TurnLen returns the number of messages added to the aggregator since it was forked.
func (*Aggregator) Usage ¶
func (a *Aggregator) Usage() Usage
Usage returns the current usage statistics for this aggregator. This includes token counts for prompts and completions, as well as detailed breakdowns of token usage by category.
type Checkpoint ¶
type Checkpoint struct {
// contains filtered or unexported fields
}
Checkpoint represents a snapshot of an aggregator's state at a specific point in time. It contains an immutable copy of the aggregator's state, including:
- The unique identifier of the source aggregator
- A snapshot of all messages at checkpoint time
- The usage statistics at checkpoint time
Checkpoints are useful for:
- Creating save points in long-running operations
- Comparing states at different points in time
- Rolling back to previous states if needed
func (*Checkpoint) ID ¶
func (c *Checkpoint) ID() uuid.UUID
ID returns the unique identifier of the aggregator that created this checkpoint. This ID matches the source aggregator's ID at the time the checkpoint was created.
func (Checkpoint) MarshalJSON ¶
func (c Checkpoint) MarshalJSON() ([]byte, error)
func (*Checkpoint) MergeInto ¶
func (c *Checkpoint) MergeInto(other *Aggregator)
MergeInto merges the checkpoint's state into another aggregator. This operation:
- Appends messages from the checkpoint that were added after its fork point
- Combines the checkpoint's usage statistics with the target aggregator's
This is useful when you want to apply a saved state to a different or new aggregator instance.
Example:
checkpoint := sourceAgg.Checkpoint()
targetAgg := New()
checkpoint.MergeInto(targetAgg)
// targetAgg now contains checkpoint's state
func (*Checkpoint) Messages ¶
func (c *Checkpoint) Messages() AggregatedMessages
Messages returns a copy of all messages that were present in the aggregator at the time this checkpoint was created. The returned slice is a deep copy, so modifications won't affect the checkpoint's stored messages.
func (*Checkpoint) UnmarshalJSON ¶
func (c *Checkpoint) UnmarshalJSON(data []byte) error
func (*Checkpoint) Usage ¶
func (c *Checkpoint) Usage() Usage
Usage returns the usage statistics that were recorded in the aggregator at the time this checkpoint was created. This includes all token counts and usage metrics up to the checkpoint time.
type CompletionTokensDetails ¶
type CompletionTokensDetails struct {
	// When using Predicted Outputs, the number of tokens in the
	// prediction that appeared in the completion.
	AcceptedPredictionTokens int64 `json:"accepted_prediction_tokens"`
	// Audio input tokens generated by the model.
	AudioTokens int64 `json:"audio_tokens"`
	// Tokens generated by the model for reasoning.
	ReasoningTokens int64 `json:"reasoning_tokens"`
	// When using Predicted Outputs, the number of tokens in the
	// prediction that did not appear in the completion. However, like
	// reasoning tokens, these tokens are still counted in the total
	// completion tokens for purposes of billing, output, and context
	// window limits.
	RejectedPredictionTokens int64 `json:"rejected_prediction_tokens"`
}
CompletionTokensDetails provides a detailed breakdown of token usage in model completions. This structure is particularly important for understanding how tokens are being used across different aspects of model output, including predictions and reasoning.
The token counts help in:
- Optimizing model usage and costs
- Debugging model behavior
- Monitoring prediction efficiency
- Tracking audio processing usage
func (*CompletionTokensDetails) AddUsage ¶
func (c *CompletionTokensDetails) AddUsage(details *CompletionTokensDetails)
AddUsage combines token counts from another CompletionTokensDetails instance. This method is used internally by Usage.AddUsage to maintain accurate token counts across all completion aspects.
The method safely handles nil inputs and updates all token categories:
- Accepted prediction tokens
- Audio processing tokens
- Reasoning tokens
- Rejected prediction tokens
Example:
details := &CompletionTokensDetails{ReasoningTokens: 100}
newDetails := &CompletionTokensDetails{ReasoningTokens: 50}
details.AddUsage(newDetails)
// details.ReasoningTokens is now 150
type PromptTokensDetails ¶
type PromptTokensDetails struct {
	// Audio input tokens present in the prompt.
	AudioTokens int64 `json:"audio_tokens"`
	// Cached tokens present in the prompt.
	CachedTokens int64 `json:"cached_tokens"`
}
PromptTokensDetails tracks token usage specifically for prompt inputs. It separates tokens used for audio inputs and cached content, which is useful for optimizing prompt construction and monitoring caching efficiency.
This breakdown helps in:
- Understanding prompt composition
- Monitoring cache effectiveness
- Tracking audio input usage
- Optimizing prompt design
func (*PromptTokensDetails) AddUsage ¶
func (p *PromptTokensDetails) AddUsage(details *PromptTokensDetails)
AddUsage combines token counts from another PromptTokensDetails instance. This method is used internally by Usage.AddUsage to maintain accurate token counts for prompt-related usage.
The method safely handles nil inputs and updates:
- Audio input tokens
- Cached tokens
Example:
details := &PromptTokensDetails{AudioTokens: 100}
newDetails := &PromptTokensDetails{AudioTokens: 50}
details.AddUsage(newDetails)
// details.AudioTokens is now 150
type Usage ¶
type Usage struct {
	// Number of tokens in the generated completion.
	CompletionTokens int64 `json:"completion_tokens"`
	// Number of tokens in the prompt.
	PromptTokens int64 `json:"prompt_tokens"`
	// Total number of tokens used in the request (prompt + completion).
	TotalTokens int64 `json:"total_tokens"`
	// Breakdown of tokens used in a completion.
	CompletionTokensDetails CompletionTokensDetails `json:"completion_tokens_details"`
	// Breakdown of tokens used in the prompt.
	PromptTokensDetails PromptTokensDetails `json:"prompt_tokens_details"`
}
Usage tracks token consumption across different aspects of AI model interactions. It provides detailed breakdowns of token usage for both prompts and completions, which is essential for monitoring costs and optimizing resource usage.
Example usage:
usage := &Usage{
    CompletionTokens: 150,
    PromptTokens:     200,
    TotalTokens:      350,
    CompletionTokensDetails: CompletionTokensDetails{
        ReasoningTokens: 100,
        AudioTokens:     50,
    },
    PromptTokensDetails: PromptTokensDetails{
        AudioTokens:  150,
        CachedTokens: 50,
    },
}
func (*Usage) AddUsage ¶
func (u *Usage) AddUsage(other *Usage)
AddUsage combines the token counts from another Usage instance with this one. This is useful for aggregating usage across multiple interactions or turns in a conversation.
The method safely handles nil inputs and updates all token counts and details. It's particularly useful for:
- Tracking cumulative usage across conversation turns
- Aggregating usage across multiple model calls
- Maintaining usage statistics for billing purposes
Example:
baseUsage := &Usage{CompletionTokens: 100}
additionalUsage := &Usage{CompletionTokens: 50}
baseUsage.AddUsage(additionalUsage)
// baseUsage.CompletionTokens is now 150
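The accumulation semantics can be sketched in a self-contained form. The AddUsage below is a plausible reading of the documented behavior (nil-safe, field-by-field addition), not the package's actual source; the detail structs are omitted for brevity:

```go
package main

import "fmt"

// Usage is a local re-declaration with only the top-level counters.
type Usage struct {
	CompletionTokens int64
	PromptTokens     int64
	TotalTokens      int64
}

// AddUsage accumulates counts from other into u. A nil input is a
// no-op, matching the documented nil-safety.
func (u *Usage) AddUsage(other *Usage) {
	if other == nil {
		return
	}
	u.CompletionTokens += other.CompletionTokens
	u.PromptTokens += other.PromptTokens
	u.TotalTokens += other.TotalTokens
}

func main() {
	base := &Usage{CompletionTokens: 100, PromptTokens: 40, TotalTokens: 140}
	base.AddUsage(&Usage{CompletionTokens: 50, PromptTokens: 10, TotalTokens: 60})
	base.AddUsage(nil) // safely ignored
	fmt.Println(base.TotalTokens) // 200
}
```

This is the shape used to roll per-turn usage into a conversation-level total: each model call's Usage is folded into one accumulator.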