Documentation ¶
Index ¶
- Constants
- Variables
- type Agent
- func (a *Agent) CheckID(logger *zap.Logger, timeout time.Duration) error
- func (a *Agent) EnsureUnregistered(logger *zap.Logger) (wasCurrent bool)
- func (a *Agent) Resume(logger *zap.Logger, timeout time.Duration) error
- func (a *Agent) SpawnRequestUpscale(logger *zap.Logger, timeout time.Duration, handleError func(error))
- func (a *Agent) Suspend(logger *zap.Logger, timeout time.Duration, handleError func(error))
- type AgentSet
- func (s *AgentSet) Current() *Agent
- func (s *AgentSet) CurrentIdStr() string
- func (s *AgentSet) Get(id uuid.UUID) (_ *Agent, ok bool)
- func (s *AgentSet) ReceivedUpscale()
- func (s *AgentSet) RegisterNewAgent(logger *zap.Logger, info *api.AgentDesc) (api.InformantProtoVersion, int, error)
- func (s *AgentSet) RequestUpscale(logger *zap.Logger)
- type CgroupConfig
- type CgroupManager
- func (c *CgroupManager) CurrentMemoryUsage() (uint64, error)
- func (c *CgroupManager) FetchMemoryHighBytes() (*uint64, error)
- func (c *CgroupManager) FetchState() (cgroup2.State, error)
- func (c *CgroupManager) Freeze() error
- func (c *CgroupManager) SetMemHighBytes(bytes uint64) error
- func (c *CgroupManager) SetMemLimits(limits memoryLimits) error
- func (c *CgroupManager) Thaw() error
- type CgroupState
- type FileCacheConfig
- type FileCacheState
- type NewStateOpts
- type State
- func (s *State) HealthCheck(ctx context.Context, logger *zap.Logger, info *api.AgentIdentification) (*api.InformantHealthCheckResp, int, error)
- func (s *State) NotifyUpscale(ctx context.Context, logger *zap.Logger, ...) (*struct{}, int, error)
- func (s *State) RegisterAgent(ctx context.Context, logger *zap.Logger, info *api.AgentDesc) (*api.InformantDesc, int, error)
- func (s *State) TryDownscale(ctx context.Context, logger *zap.Logger, target *api.AgentResourceMessage) (*api.DownscaleResult, int, error)
- func (s *State) UnregisterAgent(ctx context.Context, logger *zap.Logger, info *api.AgentDesc) (*api.UnregisterAgent, int, error)
- type StateConfig
Constants ¶
const ( MinProtocolVersion api.InformantProtoVersion = api.InformantProtoV2_0 MaxProtocolVersion api.InformantProtoVersion = api.InformantProtoV2_0 )
The VM informant currently supports v1.1 and v1.2 of the agent<->informant protocol.
If you update either of these values, make sure to also update VERSIONING.md.
const ( PrometheusPort uint16 = 9100 CheckDeadlockDelay time.Duration = 1 * time.Second CheckDeadlockTimeout time.Duration = 250 * time.Millisecond AgentBackgroundCheckDelay time.Duration = 10 * time.Second AgentBackgroundCheckTimeout time.Duration = 250 * time.Millisecond AgentResumeTimeout time.Duration = 100 * time.Millisecond AgentSuspendTimeout time.Duration = 5 * time.Second // may take a while; it /suspend intentionally waits AgentUpscaleTimeout time.Duration = 400 * time.Millisecond // does not include waiting for /upscale response )
Variables ¶
var ( // DefaultStateConfig is the default state passed to NewState DefaultStateConfig StateConfig = StateConfig{ SysBufferBytes: 100 * (1 << 20), } // DefaultCgroupConfig is the default CgroupConfig used for cgroup interaction logic DefaultCgroupConfig CgroupConfig = CgroupConfig{ OOMBufferBytes: 100 * (1 << 20), MemoryHighBufferBytes: 100 * (1 << 20), MaxUpscaleWaitMillis: 20, DoNotFreezeMoreOftenThanMillis: 1000, MemoryHighIncreaseByBytes: 10 * (1 << 20), MemoryHighIncreaseEveryMillis: 25, } // DefaultFileCacheConfig is the default FileCacheConfig used for managing the file cache DefaultFileCacheConfig FileCacheConfig = FileCacheConfig{ InMemory: true, ResourceMultiplier: 0.75, MinRemainingAfterCache: 640 * (1 << 20), SpreadFactor: 0.1, } )
Functions ¶
This section is empty.
Types ¶
type Agent ¶
type Agent struct {
// contains filtered or unexported fields
}
func (*Agent) CheckID ¶
CheckID checks that the Agent's ID matches what's expected
If the agent has already been registered, then a failure in this method will unregister the agent.
If the Agent is unregistered before the call to CheckID() completes, the request will be cancelled and this method will return context.Canceled.
func (*Agent) EnsureUnregistered ¶
EnsureUnregistered unregisters the Agent if it is currently registered, signalling the AgentSet to use a new Agent if it isn't already
Returns whether the agent was the current Agent in use.
func (*Agent) Resume ¶
Resume attempts to restore the Agent as the current one in use, sending a request to its /resume endpoint
If the Agent is unregistered before the call to Resume() completes, the request will be cancelled and this method will return context.Canceled.
If the request fails, the Agent will be unregistered.
func (*Agent) SpawnRequestUpscale ¶ added in v0.1.4
func (a *Agent) SpawnRequestUpscale(logger *zap.Logger, timeout time.Duration, handleError func(error))
SpawnRequestUpscale requests that the Agent increase the resource allocation to this VM
This method blocks until the request is picked up by the message queue, and returns without waiting for the request to complete (it'll do that on its own).
The timeout applies only once the request is in-flight.
This method MUST NOT be called while holding a.parent.lock; if that happens, it may deadlock.
func (*Agent) Suspend ¶ added in v0.1.4
Suspend signals to the Agent that it is not *currently* in use, sending a request to its /suspend endpoint
If the Agent is unregistered before the call to Suspend() completes, the request will be cancelled and this method will return context.Canceled.
If the request fails, the Agent will be unregistered.
type AgentSet ¶
type AgentSet struct {
// contains filtered or unexported fields
}
AgentSet is the global state handling various autoscaler-agents that we could connect to
func NewAgentSet ¶
NewAgentSet creates a new AgentSet and starts the necessary background tasks
On completion, the background tasks should be ended with the Stop method.
func (*AgentSet) CurrentIdStr ¶ added in v0.10.0
Returns the id of the AgentSet's current agent as a string. If the current agent is nil, returns "<nil>"
func (*AgentSet) ReceivedUpscale ¶ added in v0.1.4
func (s *AgentSet) ReceivedUpscale()
ReceivedUpscale marks any desired upscaling from a prior s.RequestUpscale() as resolved
Typically, (*CgroupState).ReceivedUpscale() is also called alongside this method.
func (*AgentSet) RegisterNewAgent ¶
func (s *AgentSet) RegisterNewAgent(logger *zap.Logger, info *api.AgentDesc) (api.InformantProtoVersion, int, error)
RegisterNewAgent instantiates our local information about the autsocaler-agent
Returns: protocol version, status code, error (if unsuccessful)
func (*AgentSet) RequestUpscale ¶ added in v0.1.4
RequestUpscale requests an immediate upscale for more memory, if there's an agent currently enabled
If there's no current agent, then RequestUpscale marks the upscale as desired, and will request upscaling from the next agent we connect to.
type CgroupConfig ¶ added in v0.1.4
type CgroupConfig struct { // OOMBufferBytes gives the target difference between the total memory reserved for the cgroup // and the value of the cgroup's memory.high. // // In other words, memory.high + OOMBufferBytes will equal the total memory that the cgroup may // use (equal to system memory, minus whatever's taken out for the file cache). OOMBufferBytes uint64 // MemoryHighBufferBytes gives the amount of memory, in bytes, below a proposed new value for // memory.high that the cgroup's memory usage must be for us to downscale // // In other words, we can downscale only when: // // memory.current + MemoryHighBufferBytes < (proposed) memory.high // // TODO: there's some minor issues with this approach -- in particular, that we might have // memory in use by the kernel's page cache that we're actually ok with getting rid of. MemoryHighBufferBytes uint64 // MaxUpscaleWaitMillis gives the maximum duration, in milliseconds, that we're allowed to pause // the cgroup for while waiting for the autoscaler-agent to upscale us MaxUpscaleWaitMillis uint // DoNotFreezeMoreOftenThanMillis gives a required minimum time, in milliseconds, that we must // wait before re-freezing the cgroup while waiting for the autoscaler-agent to upscale us. DoNotFreezeMoreOftenThanMillis uint // MemoryHighIncreaseByBytes gives the amount of memory, in bytes, that we should periodically // increase memory.high by while waiting for the autoscaler-agent to upscale us. // // This exists to avoid the excessive throttling that happens when a cgroup is above its // memory.high for too long. See more here: // https://github.com/neondatabase/autoscaling/issues/44#issuecomment-1522487217 MemoryHighIncreaseByBytes uint64 // MemoryHighIncreaseEveryMillis gives the period, in milliseconds, at which we should // repeatedly increase the value of the cgroup's memory.high while we're waiting on upscaling // and memory.high is still being hit. // // Technically speaking, this actually serves as a rate limit to moderate responding to // memory.high events, but these are roughly equivalent if the process is still allocating // memory. MemoryHighIncreaseEveryMillis uint }
CgroupConfig provides some configuration options for State cgroup handling
type CgroupManager ¶ added in v0.1.4
type CgroupManager struct { MemoryHighEvent util.CondChannelReceiver ErrCh <-chan error // contains filtered or unexported fields }
func NewCgroupManager ¶ added in v0.1.4
func NewCgroupManager(logger *zap.Logger, groupName string) (*CgroupManager, error)
func (*CgroupManager) CurrentMemoryUsage ¶ added in v0.1.4
func (c *CgroupManager) CurrentMemoryUsage() (uint64, error)
CurrentMemoryUsage returns the value at memory.current -- the cgroup's current memory usage.
func (*CgroupManager) FetchMemoryHighBytes ¶ added in v0.8.0
func (c *CgroupManager) FetchMemoryHighBytes() (*uint64, error)
func (*CgroupManager) FetchState ¶ added in v0.1.4
func (c *CgroupManager) FetchState() (cgroup2.State, error)
FetchState returns a cgroup2.State indicating whether the cgroup is currently frozen
func (*CgroupManager) Freeze ¶ added in v0.1.4
func (c *CgroupManager) Freeze() error
func (*CgroupManager) SetMemHighBytes ¶ added in v0.8.0
func (c *CgroupManager) SetMemHighBytes(bytes uint64) error
func (*CgroupManager) SetMemLimits ¶ added in v0.8.0
func (c *CgroupManager) SetMemLimits(limits memoryLimits) error
SetMemLimits sets the cgroup's memory.high and memory.max to the values provided by the memoryLimits.
func (*CgroupManager) Thaw ¶ added in v0.1.4
func (c *CgroupManager) Thaw() error
type CgroupState ¶ added in v0.1.4
type CgroupState struct {
// contains filtered or unexported fields
}
CgroupState provides the high-level cgroup handling logic, building upon the low-level plumbing provided by CgroupManager.
func (*CgroupState) ReceivedUpscale ¶ added in v0.1.4
func (s *CgroupState) ReceivedUpscale()
ReceivedUpscale notifies s.upscaleEventsRecvr
Typically, (*AgentSet).ReceivedUpscale() is also called alongside this method.
type FileCacheConfig ¶ added in v0.1.14
type FileCacheConfig struct { // InMemory indicates whether the file cache is *actually* stored in memory (e.g. by writing to // a tmpfs or shmem file). If true, the size of the file cache will be counted against the // memory available for the cgroup. InMemory bool // ResourceMultiplier gives the size of the file cache, in terms of the size of the resource it // consumes (currently: only memory) // // For example, setting ResourceMultiplier = 0.75 gives the cache a target size of 75% of total // resources. // // This value must be strictly between 0 and 1. ResourceMultiplier float64 // MinRemainingAfterCache gives the required minimum amount of memory, in bytes, that must // remain available after subtracting the file cache. // // This value must be non-zero. MinRemainingAfterCache uint64 // SpreadFactor controls the rate of increase in the file cache's size as it grows from zero // (when total resources equals MinRemainingAfterCache) to the desired size based on // ResourceMultiplier. // // A SpreadFactor of zero means that all additional resources will go to the cache until it // reaches the desired size. Setting SpreadFactor to N roughly means "for every 1 byte added to // the cache's size, N bytes are reserved for the rest of the system, until the cache gets to // its desired size". // // This value must be >= 0, and must retain an increase that is more than what would be given by // ResourceMultiplier. For example, setting ResourceMultiplier = 0.75 but SpreadFactor = 1 would // be invalid, because SpreadFactor would induce only 50% usage - never reaching the 75% as // desired by ResourceMultiplier. // // SpreadFactor is too large if (SpreadFactor+1) * ResourceMultiplier is >= 1. SpreadFactor float64 }
func (*FileCacheConfig) CalculateCacheSize ¶ added in v0.1.14
func (c *FileCacheConfig) CalculateCacheSize(total uint64) uint64
CalculateCacheSize returns the desired size of the cache, given the total memory.
func (*FileCacheConfig) Validate ¶ added in v0.1.14
func (c *FileCacheConfig) Validate() error
type FileCacheState ¶ added in v0.1.14
type FileCacheState struct {
// contains filtered or unexported fields
}
func (*FileCacheState) GetFileCacheSize ¶ added in v0.1.14
func (s *FileCacheState) GetFileCacheSize(ctx context.Context) (uint64, error)
GetFileCacheSize returns the current size of the file cache, in bytes
func (*FileCacheState) SetFileCacheSize ¶ added in v0.1.14
func (s *FileCacheState) SetFileCacheSize(ctx context.Context, logger *zap.Logger, sizeInBytes uint64) (uint64, error)
SetFileCacheSize sets the size of the file cache, returning the actual size it was set to
type NewStateOpts ¶ added in v0.1.4
type NewStateOpts struct {
// contains filtered or unexported fields
}
NewStateOpts are individual options provided to NewState
func WithCgroup ¶ added in v0.1.4
func WithCgroup(cgm *CgroupManager, config CgroupConfig) NewStateOpts
WithCgroup creates a NewStateOpts that sets its CgroupHandler
This function will panic if the provided CgroupConfig is invalid.
func WithPostgresFileCache ¶ added in v0.1.14
func WithPostgresFileCache(connStr string, config FileCacheConfig) NewStateOpts
WithPostgresFileCache creates a NewStateOpts that enables connections to the postgres file cache
type State ¶
type State struct {
// contains filtered or unexported fields
}
State is the global state of the informant
func NewState ¶
func NewState(logger *zap.Logger, agents *AgentSet, config StateConfig, opts ...NewStateOpts) (*State, error)
NewState instantiates a new State object, starting whatever background processes might be required
Optional configuration may be provided by NewStateOpts - see WithCgroup and WithPostgresFileCache.
func (*State) HealthCheck ¶ added in v0.7.0
func (s *State) HealthCheck(ctx context.Context, logger *zap.Logger, info *api.AgentIdentification) (*api.InformantHealthCheckResp, int, error)
HealthCheck is a dummy endpoint that allows the autoscaler-agent to check that (a) the informant is up and running, and (b) the agent is still registered.
Returns: body (if successful), status code, error (if unsuccessful)
func (*State) NotifyUpscale ¶
func (s *State) NotifyUpscale( ctx context.Context, logger *zap.Logger, newResources *api.AgentResourceMessage, ) (*struct{}, int, error)
NotifyUpscale signals that the VM's resource usage has been increased to the new amount
Returns: body (if successful), status code and error (if unsuccessful)
func (*State) RegisterAgent ¶
func (s *State) RegisterAgent(ctx context.Context, logger *zap.Logger, info *api.AgentDesc) (*api.InformantDesc, int, error)
RegisterAgent registers a new or updated autoscaler-agent
Returns: body (if successful), status code, error (if unsuccessful)
func (*State) TryDownscale ¶
func (s *State) TryDownscale(ctx context.Context, logger *zap.Logger, target *api.AgentResourceMessage) (*api.DownscaleResult, int, error)
TryDownscale tries to downscale the VM's current resource usage, returning whether the proposed amount is ok
Returns: body (if successful), status code and error (if unsuccessful)
func (*State) UnregisterAgent ¶
func (s *State) UnregisterAgent(ctx context.Context, logger *zap.Logger, info *api.AgentDesc) (*api.UnregisterAgent, int, error)
UnregisterAgent unregisters the autoscaler-agent given by info, if it is currently registered
If a different autoscaler-agent is currently registered, this method will do nothing.
Returns: body (if successful), status code and error (if unsuccessful)
type StateConfig ¶ added in v0.1.14
type StateConfig struct { // SysBufferBytes gives the estimated amount of memory, in bytes, that the kernel uses before // handing out the rest to userspace. This value is the estimated difference between the // *actual* physical memory and the amount reported by `grep MemTotal /proc/meminfo`. // // For more information, refer to `man 5 proc`, which defines MemTotal as "Total usable RAM // (i.e., physical RAM minus a few reserved bits and the kernel binary code)". // // We only use SysBufferBytes when calculating the system memory from the *external* memory // size, rather than the self-reported memory size, according to the kernel. // // TODO: this field is only necessary while we still have to trust the autoscaler-agent's // upscale resource amounts (because we might not *actually* have been upscaled yet). This field // should be removed once we have a better solution there. SysBufferBytes uint64 }