informant

package
v0.13.3 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jul 18, 2023 License: Apache-2.0 Imports: 26 Imported by: 0

Documentation

Index

Constants

The VM informant currently supports v1.1 and v1.2 of the agent<->informant protocol.

If you update either of these values, make sure to also update VERSIONING.md.

View Source
const (
	PrometheusPort uint16 = 9100

	CheckDeadlockDelay   time.Duration = 1 * time.Second
	CheckDeadlockTimeout time.Duration = 250 * time.Millisecond

	AgentBackgroundCheckDelay   time.Duration = 10 * time.Second
	AgentBackgroundCheckTimeout time.Duration = 250 * time.Millisecond

	AgentResumeTimeout  time.Duration = 100 * time.Millisecond
	AgentSuspendTimeout time.Duration = 5 * time.Second        // may take a while; it /suspend intentionally waits
	AgentUpscaleTimeout time.Duration = 400 * time.Millisecond // does not include waiting for /upscale response
)

Variables

View Source
var (
	// DefaultStateConfig is the default state passed to NewState
	DefaultStateConfig StateConfig = StateConfig{
		SysBufferBytes: 100 * (1 << 20),
	}

	// DefaultCgroupConfig is the default CgroupConfig used for cgroup interaction logic
	DefaultCgroupConfig CgroupConfig = CgroupConfig{
		OOMBufferBytes:        100 * (1 << 20),
		MemoryHighBufferBytes: 100 * (1 << 20),

		MaxUpscaleWaitMillis:           20,
		DoNotFreezeMoreOftenThanMillis: 1000,

		MemoryHighIncreaseByBytes:     10 * (1 << 20),
		MemoryHighIncreaseEveryMillis: 25,
	}

	// DefaultFileCacheConfig is the default FileCacheConfig used for managing the file cache
	DefaultFileCacheConfig FileCacheConfig = FileCacheConfig{
		InMemory:               true,
		ResourceMultiplier:     0.75,
		MinRemainingAfterCache: 640 * (1 << 20),
		SpreadFactor:           0.1,
	}
)

Functions

This section is empty.

Types

type Agent

type Agent struct {
	// contains filtered or unexported fields
}

func (*Agent) CheckID

func (a *Agent) CheckID(logger *zap.Logger, timeout time.Duration) error

CheckID checks that the Agent's ID matches what's expected

If the agent has already been registered, then a failure in this method will unregister the agent.

If the Agent is unregistered before the call to CheckID() completes, the request will be cancelled and this method will return context.Canceled.

func (*Agent) EnsureUnregistered

func (a *Agent) EnsureUnregistered(logger *zap.Logger) (wasCurrent bool)

EnsureUnregistered unregisters the Agent if it is currently registered, signalling the AgentSet to use a new Agent if it isn't already

Returns whether the agent was the current Agent in use.

func (*Agent) Resume

func (a *Agent) Resume(logger *zap.Logger, timeout time.Duration) error

Resume attempts to restore the Agent as the current one in use, sending a request to its /resume endpoint

If the Agent is unregistered before the call to Resume() completes, the request will be cancelled and this method will return context.Canceled.

If the request fails, the Agent will be unregistered.

func (*Agent) SpawnRequestUpscale added in v0.1.4

func (a *Agent) SpawnRequestUpscale(logger *zap.Logger, timeout time.Duration, handleError func(error))

SpawnRequestUpscale requests that the Agent increase the resource allocation to this VM

This method blocks until the request is picked up by the message queue, and returns without waiting for the request to complete (it'll do that on its own).

The timeout applies only once the request is in-flight.

This method MUST NOT be called while holding a.parent.lock; if that happens, it may deadlock.

func (*Agent) Suspend added in v0.1.4

func (a *Agent) Suspend(logger *zap.Logger, timeout time.Duration, handleError func(error))

Suspend signals to the Agent that it is not *currently* in use, sending a request to its /suspend endpoint

If the Agent is unregistered before the call to Suspend() completes, the request will be cancelled and this method will return context.Canceled.

If the request fails, the Agent will be unregistered.

type AgentSet

type AgentSet struct {
	// contains filtered or unexported fields
}

AgentSet is the global state handling various autoscaler-agents that we could connect to

func NewAgentSet

func NewAgentSet(logger *zap.Logger) *AgentSet

NewAgentSet creates a new AgentSet and starts the necessary background tasks

On completion, the background tasks should be ended with the Stop method.

func (*AgentSet) Current added in v0.10.0

func (s *AgentSet) Current() *Agent

Returns the current agent, which can be nil

func (*AgentSet) CurrentIdStr added in v0.10.0

func (s *AgentSet) CurrentIdStr() string

Returns the id of the AgentSet's current agent as a string. If the current agent is nil, returns "<nil>"

func (*AgentSet) Get

func (s *AgentSet) Get(id uuid.UUID) (_ *Agent, ok bool)

Get returns the requested Agent, if it exists

func (*AgentSet) ReceivedUpscale added in v0.1.4

func (s *AgentSet) ReceivedUpscale()

ReceivedUpscale marks any desired upscaling from a prior s.RequestUpscale() as resolved

Typically, (*CgroupState).ReceivedUpscale() is also called alongside this method.

func (*AgentSet) RegisterNewAgent

func (s *AgentSet) RegisterNewAgent(logger *zap.Logger, info *api.AgentDesc) (api.InformantProtoVersion, int, error)

RegisterNewAgent instantiates our local information about the autsocaler-agent

Returns: protocol version, status code, error (if unsuccessful)

func (*AgentSet) RequestUpscale added in v0.1.4

func (s *AgentSet) RequestUpscale(logger *zap.Logger)

RequestUpscale requests an immediate upscale for more memory, if there's an agent currently enabled

If there's no current agent, then RequestUpscale marks the upscale as desired, and will request upscaling from the next agent we connect to.

type CgroupConfig added in v0.1.4

type CgroupConfig struct {
	// OOMBufferBytes gives the target difference between the total memory reserved for the cgroup
	// and the value of the cgroup's memory.high.
	//
	// In other words, memory.high + OOMBufferBytes will equal the total memory that the cgroup may
	// use (equal to system memory, minus whatever's taken out for the file cache).
	OOMBufferBytes uint64

	// MemoryHighBufferBytes gives the amount of memory, in bytes, below a proposed new value for
	// memory.high that the cgroup's memory usage must be for us to downscale
	//
	// In other words, we can downscale only when:
	//
	//   memory.current + MemoryHighBufferBytes < (proposed) memory.high
	//
	// TODO: there's some minor issues with this approach -- in particular, that we might have
	// memory in use by the kernel's page cache that we're actually ok with getting rid of.
	MemoryHighBufferBytes uint64

	// MaxUpscaleWaitMillis gives the maximum duration, in milliseconds, that we're allowed to pause
	// the cgroup for while waiting for the autoscaler-agent to upscale us
	MaxUpscaleWaitMillis uint

	// DoNotFreezeMoreOftenThanMillis gives a required minimum time, in milliseconds, that we must
	// wait before re-freezing the cgroup while waiting for the autoscaler-agent to upscale us.
	DoNotFreezeMoreOftenThanMillis uint

	// MemoryHighIncreaseByBytes gives the amount of memory, in bytes, that we should periodically
	// increase memory.high by while waiting for the autoscaler-agent to upscale us.
	//
	// This exists to avoid the excessive throttling that happens when a cgroup is above its
	// memory.high for too long. See more here:
	// https://github.com/neondatabase/autoscaling/issues/44#issuecomment-1522487217
	MemoryHighIncreaseByBytes uint64

	// MemoryHighIncreaseEveryMillis gives the period, in milliseconds, at which we should
	// repeatedly increase the value of the cgroup's memory.high while we're waiting on upscaling
	// and memory.high is still being hit.
	//
	// Technically speaking, this actually serves as a rate limit to moderate responding to
	// memory.high events, but these are roughly equivalent if the process is still allocating
	// memory.
	MemoryHighIncreaseEveryMillis uint
}

CgroupConfig provides some configuration options for State cgroup handling

type CgroupManager added in v0.1.4

type CgroupManager struct {
	MemoryHighEvent util.CondChannelReceiver
	ErrCh           <-chan error
	// contains filtered or unexported fields
}

func NewCgroupManager added in v0.1.4

func NewCgroupManager(logger *zap.Logger, groupName string) (*CgroupManager, error)

func (*CgroupManager) CurrentMemoryUsage added in v0.1.4

func (c *CgroupManager) CurrentMemoryUsage() (uint64, error)

CurrentMemoryUsage returns the value at memory.current -- the cgroup's current memory usage.

func (*CgroupManager) FetchMemoryHighBytes added in v0.8.0

func (c *CgroupManager) FetchMemoryHighBytes() (*uint64, error)

func (*CgroupManager) FetchState added in v0.1.4

func (c *CgroupManager) FetchState() (cgroup2.State, error)

FetchState returns a cgroup2.State indicating whether the cgroup is currently frozen

func (*CgroupManager) Freeze added in v0.1.4

func (c *CgroupManager) Freeze() error

func (*CgroupManager) SetMemHighBytes added in v0.8.0

func (c *CgroupManager) SetMemHighBytes(bytes uint64) error

func (*CgroupManager) SetMemLimits added in v0.8.0

func (c *CgroupManager) SetMemLimits(limits memoryLimits) error

SetMemLimits sets the cgroup's memory.high and memory.max to the values provided by the memoryLimits.

func (*CgroupManager) Thaw added in v0.1.4

func (c *CgroupManager) Thaw() error

type CgroupState added in v0.1.4

type CgroupState struct {
	// contains filtered or unexported fields
}

CgroupState provides the high-level cgroup handling logic, building upon the low-level plumbing provided by CgroupManager.

func (*CgroupState) ReceivedUpscale added in v0.1.4

func (s *CgroupState) ReceivedUpscale()

ReceivedUpscale notifies s.upscaleEventsRecvr

Typically, (*AgentSet).ReceivedUpscale() is also called alongside this method.

type FileCacheConfig added in v0.1.14

type FileCacheConfig struct {
	// InMemory indicates whether the file cache is *actually* stored in memory (e.g. by writing to
	// a tmpfs or shmem file). If true, the size of the file cache will be counted against the
	// memory available for the cgroup.
	InMemory bool

	// ResourceMultiplier gives the size of the file cache, in terms of the size of the resource it
	// consumes (currently: only memory)
	//
	// For example, setting ResourceMultiplier = 0.75 gives the cache a target size of 75% of total
	// resources.
	//
	// This value must be strictly between 0 and 1.
	ResourceMultiplier float64

	// MinRemainingAfterCache gives the required minimum amount of memory, in bytes, that must
	// remain available after subtracting the file cache.
	//
	// This value must be non-zero.
	MinRemainingAfterCache uint64

	// SpreadFactor controls the rate of increase in the file cache's size as it grows from zero
	// (when total resources equals MinRemainingAfterCache) to the desired size based on
	// ResourceMultiplier.
	//
	// A SpreadFactor of zero means that all additional resources will go to the cache until it
	// reaches the desired size. Setting SpreadFactor to N roughly means "for every 1 byte added to
	// the cache's size, N bytes are reserved for the rest of the system, until the cache gets to
	// its desired size".
	//
	// This value must be >= 0, and must retain an increase that is more than what would be given by
	// ResourceMultiplier. For example, setting ResourceMultiplier = 0.75 but SpreadFactor = 1 would
	// be invalid, because SpreadFactor would induce only 50% usage - never reaching the 75% as
	// desired by ResourceMultiplier.
	//
	// SpreadFactor is too large if (SpreadFactor+1) * ResourceMultiplier is >= 1.
	SpreadFactor float64
}

func (*FileCacheConfig) CalculateCacheSize added in v0.1.14

func (c *FileCacheConfig) CalculateCacheSize(total uint64) uint64

CalculateCacheSize returns the desired size of the cache, given the total memory.

func (*FileCacheConfig) Validate added in v0.1.14

func (c *FileCacheConfig) Validate() error

type FileCacheState added in v0.1.14

type FileCacheState struct {
	// contains filtered or unexported fields
}

func (*FileCacheState) GetFileCacheSize added in v0.1.14

func (s *FileCacheState) GetFileCacheSize(ctx context.Context) (uint64, error)

GetFileCacheSize returns the current size of the file cache, in bytes

func (*FileCacheState) SetFileCacheSize added in v0.1.14

func (s *FileCacheState) SetFileCacheSize(ctx context.Context, logger *zap.Logger, sizeInBytes uint64) (uint64, error)

SetFileCacheSize sets the size of the file cache, returning the actual size it was set to

type NewStateOpts added in v0.1.4

type NewStateOpts struct {
	// contains filtered or unexported fields
}

NewStateOpts are individual options provided to NewState

func WithCgroup added in v0.1.4

func WithCgroup(cgm *CgroupManager, config CgroupConfig) NewStateOpts

WithCgroup creates a NewStateOpts that sets its CgroupHandler

This function will panic if the provided CgroupConfig is invalid.

func WithPostgresFileCache added in v0.1.14

func WithPostgresFileCache(connStr string, config FileCacheConfig) NewStateOpts

WithPostgresFileCache creates a NewStateOpts that enables connections to the postgres file cache

type State

type State struct {
	// contains filtered or unexported fields
}

State is the global state of the informant

func NewState

func NewState(logger *zap.Logger, agents *AgentSet, config StateConfig, opts ...NewStateOpts) (*State, error)

NewState instantiates a new State object, starting whatever background processes might be required

Optional configuration may be provided by NewStateOpts - see WithCgroup and WithPostgresFileCache.

func (*State) HealthCheck added in v0.7.0

func (s *State) HealthCheck(ctx context.Context, logger *zap.Logger, info *api.AgentIdentification) (*api.InformantHealthCheckResp, int, error)

HealthCheck is a dummy endpoint that allows the autoscaler-agent to check that (a) the informant is up and running, and (b) the agent is still registered.

Returns: body (if successful), status code, error (if unsuccessful)

func (*State) NotifyUpscale

func (s *State) NotifyUpscale(
	ctx context.Context,
	logger *zap.Logger,
	newResources *api.AgentResourceMessage,
) (*struct{}, int, error)

NotifyUpscale signals that the VM's resource usage has been increased to the new amount

Returns: body (if successful), status code and error (if unsuccessful)

func (*State) RegisterAgent

func (s *State) RegisterAgent(ctx context.Context, logger *zap.Logger, info *api.AgentDesc) (*api.InformantDesc, int, error)

RegisterAgent registers a new or updated autoscaler-agent

Returns: body (if successful), status code, error (if unsuccessful)

func (*State) TryDownscale

func (s *State) TryDownscale(ctx context.Context, logger *zap.Logger, target *api.AgentResourceMessage) (*api.DownscaleResult, int, error)

TryDownscale tries to downscale the VM's current resource usage, returning whether the proposed amount is ok

Returns: body (if successful), status code and error (if unsuccessful)

func (*State) UnregisterAgent

func (s *State) UnregisterAgent(ctx context.Context, logger *zap.Logger, info *api.AgentDesc) (*api.UnregisterAgent, int, error)

UnregisterAgent unregisters the autoscaler-agent given by info, if it is currently registered

If a different autoscaler-agent is currently registered, this method will do nothing.

Returns: body (if successful), status code and error (if unsuccessful)

type StateConfig added in v0.1.14

type StateConfig struct {
	// SysBufferBytes gives the estimated amount of memory, in bytes, that the kernel uses before
	// handing out the rest to userspace. This value is the estimated difference between the
	// *actual* physical memory and the amount reported by `grep MemTotal /proc/meminfo`.
	//
	// For more information, refer to `man 5 proc`, which defines MemTotal as "Total usable RAM
	// (i.e., physical RAM minus a few reserved bits and the kernel binary code)".
	//
	// We only use SysBufferBytes when calculating the system memory from the *external* memory
	// size, rather than the self-reported memory size, according to the kernel.
	//
	// TODO: this field is only necessary while we still have to trust the autoscaler-agent's
	// upscale resource amounts (because we might not *actually* have been upscaled yet). This field
	// should be removed once we have a better solution there.
	SysBufferBytes uint64
}

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL