overlord

package
v0.0.0-...-a24fe7a Latest
Warning

This package is not in the latest version of its module.

Published: Jan 8, 2025 License: GPL-3.0 Imports: 35 Imported by: 27

README

Notes on state and changes

State is central to the consistency and integrity of any snap system. It’s maintained by snapd by managing the external on-disk state of snap installations and its own persistent working state of metadata, expectations, and in-progress operations.

Working persistent state is implemented by overlord/state.State with a global lock, State.Lock/Unlock, to govern updates. If state is modified after acquiring the lock, it’s atomically updated to disk when the lock is released.
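The commit-on-unlock behaviour can be sketched roughly as follows. This is a toy model, not snapd's state.State: the toyState type, the backend interface and the in-memory memBackend are all hypothetical names introduced for illustration, and the real implementation writes atomically to disk rather than to memory.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// backend is a hypothetical stand-in for whatever persists the state.
type backend interface {
	Checkpoint(data []byte) error
}

// memBackend keeps the last checkpoint in memory instead of on disk.
type memBackend struct {
	saved       []byte
	checkpoints int
}

func (b *memBackend) Checkpoint(data []byte) error {
	b.saved = append([]byte(nil), data...)
	b.checkpoints++
	return nil
}

// toyState is an illustrative model only, not snapd's state.State.
type toyState struct {
	mu       sync.Mutex
	data     map[string]string
	modified bool
	backend  backend
}

func newToyState(b backend) *toyState {
	return &toyState{data: make(map[string]string), backend: b}
}

func (s *toyState) Lock() { s.mu.Lock() }

// Unlock checkpoints the data first if anything changed while the lock was held.
func (s *toyState) Unlock() {
	defer s.mu.Unlock()
	if !s.modified {
		return
	}
	data, err := json.Marshal(s.data)
	if err != nil {
		panic(err) // a real implementation needs a proper error policy
	}
	if err := s.backend.Checkpoint(data); err != nil {
		panic(err)
	}
	s.modified = false
}

// Set must be called with the lock held.
func (s *toyState) Set(key, value string) {
	s.data[key] = value
	s.modified = true
}

func main() {
	b := &memBackend{}
	st := newToyState(b)

	st.Lock()
	st.Set("snap", "hello")
	st.Unlock() // modified, so this commits

	st.Lock()
	st.Unlock() // nothing modified, so no checkpoint is written

	fmt.Println(string(b.saved), b.checkpoints)
}
```

Tying the commit to Unlock means callers cannot forget to persist a mutation, at the cost of making every lock release a potential write.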

State managers

State managers are used to manage both the working state and the on-disk snap state. They all implement the overlord.StateManager interface. Code-wise, together with a few other auxiliary components, they live in overlord and its subpackages. overlord.Overlord itself is responsible for the wiring and coordination of all of these.

In broad terms, state managers have assigned responsibilities for different subsystems, and these are in mostly orthogonal areas. They then participate in the management and bookkeeping of the state via various mechanisms.

During startup, and after construction, an optional StartUp method is invoked on each state manager. This is followed by the activation of an ensure loop, which calls each state manager’s Ensure method at least once every 5 minutes.

The ensure loop is intended to initiate any automatic state management and corresponding transitions, state repair, and any other consistency-maintaining operations.

state.Change

A state.Change is a graph of state.Task structs and their inter-dependencies as edges. The purpose of both a state.Change and a state.Task is identified by their kind (which should be an explanatory string value).

Time-consuming and user-initiated operations, usually initiated from the API provided by the daemon package, should be performed using the state.Change functionality.

state.Change and state.Task instances use the working state to remain persistent, and they can carry input parameters, and their own state, accessible with Get and Set methods.

The goals of the state.Change mechanisms are such that operations should survive restarts and reboots and that, on error, snapd should try to bring back the external state to a previous good state if possible.

state.TaskRunner

The state.TaskRunner is responsible for state.Change and state.Task execution, and their state management. The do and undo logic of a state.Task is defined by Task kind using TaskRunner.AddHandler.

During execution, a Task goes through a series of statuses. These are represented by state.Status and will finish in a ready status of either DoneStatus, UndoneStatus, ErrorStatus or HoldStatus.
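The notion of a ready status can be expressed as a simple predicate over a status enumeration. In this sketch the four ready statuses follow the names given above; the local status type and the intermediate statuses (doStatus, doingStatus, undoStatus, undoingStatus) are assumptions made for illustration and are not a copy of state.Status.

```go
package main

import "fmt"

// status is a hypothetical local enumeration, not state.Status itself.
type status int

const (
	doStatus status = iota
	doingStatus
	doneStatus
	undoStatus
	undoingStatus
	undoneStatus
	holdStatus
	errorStatus
)

// ready reports whether a task in this status has finished executing.
func (s status) ready() bool {
	switch s {
	case doneStatus, undoneStatus, errorStatus, holdStatus:
		return true
	}
	return false
}

func main() {
	fmt.Println(doingStatus.ready(), doneStatus.ready())
}
```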

If errors are encountered, the TaskRunner will normally try to recursively execute the undo logic of any previously depended-upon Tasks, with the exception of the Task that generated the error. For that Task, any desired undo logic is instead expected to be part of its own error paths.

Different Changes and independent Tasks are normally executed concurrently.

Tasks and State locking and consistency

Currently, the Task do and undo handlers are started without holding the State lock, but to simplify consistency, it's easier if a Task executes while holding the State lock.

Strictly, the State lock must only be released when performing slow operations, such as:

  • copying, compressing or uncompressing large amounts of on-disk data
  • network operations

So in practice, most handler code should start with:

st.Lock()
defer st.Unlock()

where st is the runtime state.State instance, accessible via Task.State() or the handler manager.

The deferred Unlock will implicitly commit any working state mutations at the end of the handler.

Due to potential restarts, the do or undo handler logic in a Task may be re-executed if it hasn't already completed. This necessitates the following considerations:

  • on-disk/external state manipulation should be idempotent or equivalent
  • working state manipulation should either be idempotent or designed to combine working state mutations with setting the next status of the task. This approach currently requires using Task.SetStatus before returning from the handler

If slow operations need to be performed, the required Unlock/Lock should happen before any working state manipulation.

If the State lock is released and reacquired in a handler, the code needs to consider that other code could have manipulated some relevant working state in the meantime. There may also be cases where it’s neither possible nor desirable to hold the State lock for the entirety of a state manipulation, such as when a manipulation spans multiple subsystems, and so spans multiple tasks. For all such cases, and to simplify reasoning, snapd offers other coordination mechanisms with a different granularity from the State lock.

See also the comment in overlord/snapstate/handlers.go about state locking.

Conflicts and Task precondition blocking

At a higher level, it may be appropriate and simpler to ensure that at most one Change/sequence of Tasks is operating on a given snap at a time. This could, for example, stop the system from connecting an interface on a snap that’s being removed, or prevent service manipulation while the snap is being installed.

While creating a new Change that will operate on a snap, snapd checks whether there are already any in-progress operations for the snap. If there are, a conflict error is returned rather than initiating the Change.

The central logic for such checking lives in overlord/snapstate/conflict.go.

Some tasks, or families of tasks, need to release the State lock but cannot run together with some other tasks. Such tasks include:

  • hook tasks where at most one should be running at a time for a given snap
  • interface-related tasks that might touch more than one snap at a time, beyond what conflicts can take care of, so preferably at most one of them should be running at a time

To address this, precondition predicates can be hooked into the TaskRunner via TaskRunner.AddBlocked.

Before running a task, the precondition predicates are invoked and, if none return a value of true, the task is run. The input for these predicates is any candidate-for-running task and the set of currently running tasks.
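The predicate mechanism can be modelled in a few lines. The sketch below uses hypothetical local types (task, runner, blockedFunc) and an illustrative "run-hook" kind; it mirrors the described inputs, a candidate task plus the currently running tasks, but it is not the TaskRunner implementation.

```go
package main

import "fmt"

// task is a hypothetical, minimal task description.
type task struct {
	kind string
	snap string
}

// blockedFunc reports whether candidate must wait, given the running tasks.
type blockedFunc func(candidate *task, running []*task) bool

// runner is an illustrative stand-in holding the registered predicates.
type runner struct {
	blocked []blockedFunc
}

func (r *runner) addBlocked(pred blockedFunc) {
	r.blocked = append(r.blocked, pred)
}

// canRun is true only if no predicate blocks the candidate.
func (r *runner) canRun(candidate *task, running []*task) bool {
	for _, pred := range r.blocked {
		if pred(candidate, running) {
			return false
		}
	}
	return true
}

// oneHookPerSnap blocks a hook task while another hook runs for the same snap.
func oneHookPerSnap(candidate *task, running []*task) bool {
	if candidate.kind != "run-hook" {
		return false
	}
	for _, t := range running {
		if t.kind == "run-hook" && t.snap == candidate.snap {
			return true
		}
	}
	return false
}

func main() {
	r := &runner{}
	r.addBlocked(oneHookPerSnap)

	running := []*task{{kind: "run-hook", snap: "foo"}}
	fmt.Println(r.canRun(&task{kind: "run-hook", snap: "foo"}, running))
	fmt.Println(r.canRun(&task{kind: "run-hook", snap: "bar"}, running))
}
```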

Documentation

Overview

Package overlord implements the overall control of a snappy system.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Overlord

type Overlord struct {
	// contains filtered or unexported fields
}

Overlord is the central manager of a snappy system, keeping track of all available state managers and related helpers.

func Mock

func Mock() *Overlord

Mock creates an Overlord without any managers and with a backend not using disk. Managers can be added with AddManager. For testing.

func MockWithState

func MockWithState(s *state.State) *Overlord

MockWithState creates an Overlord with the given state unless it is nil in which case it uses a state backend not using disk. Managers can be added with AddManager. For testing.

func New

func New(restartHandler restart.Handler) (*Overlord, error)

New creates a new Overlord with all its state managers. It can be provided with an optional restart.Handler.

func (*Overlord) AddManager

func (o *Overlord) AddManager(mgr StateManager)

AddManager adds a manager to the overlord created with Mock. For testing.

func (*Overlord) AssertManager

func (o *Overlord) AssertManager() *assertstate.AssertManager

AssertManager returns the assertion manager enforcing assertions under the overlord.

func (*Overlord) CanStandby

func (o *Overlord) CanStandby() bool

func (*Overlord) CommandManager

func (o *Overlord) CommandManager() *cmdstate.CommandManager

CommandManager returns the manager responsible for running odd jobs.

func (*Overlord) DeviceManager

func (o *Overlord) DeviceManager() *devicestate.DeviceManager

DeviceManager returns the device manager responsible for the device identity and policies.

func (*Overlord) HookManager

func (o *Overlord) HookManager() *hookstate.HookManager

HookManager returns the hook manager responsible for running hooks under the overlord.

func (*Overlord) InterfaceManager

func (o *Overlord) InterfaceManager() *ifacestate.InterfaceManager

InterfaceManager returns the interface manager maintaining interface connections under the overlord.

func (*Overlord) Loop

func (o *Overlord) Loop()

Loop runs a loop in a goroutine to ensure the current state regularly through StateEngine Ensure.

func (*Overlord) RestartManager

func (o *Overlord) RestartManager() *restart.RestartManager

RestartManager returns the manager responsible for restart state.

func (*Overlord) ServiceManager

func (o *Overlord) ServiceManager() *servicestate.ServiceManager

ServiceManager returns the manager responsible for services under the overlord.

func (*Overlord) Settle

func (o *Overlord) Settle(timeout time.Duration) error

Settle first runs a state engine Ensure and then waits for activities to settle. That's done by waiting for all managers' activities to settle while making sure no immediate further Ensure is scheduled. It then waits similarly for all ready changes to reach the clean state. Chiefly for tests. Cannot be used in conjunction with Loop. If timeout is non-zero and settling takes longer than timeout, returns an error. Calls StartUp as well.

func (*Overlord) SettleObserveBeforeCleanups

func (o *Overlord) SettleObserveBeforeCleanups(timeout time.Duration, beforeCleanups func()) error

SettleObserveBeforeCleanups first runs a state engine Ensure and then waits for activities to settle. That's done by waiting for all managers' activities to settle while making sure no immediate further Ensure is scheduled. It then waits similarly for all ready changes to reach the clean state, but calls the provided callback once before doing that. Chiefly for tests. Cannot be used in conjunction with Loop. If timeout is non-zero and settling takes longer than timeout, returns an error. Calls StartUp as well.

func (*Overlord) SnapManager

func (o *Overlord) SnapManager() *snapstate.SnapManager

SnapManager returns the snap manager responsible for snaps under the overlord.

func (*Overlord) SnapshotManager

func (o *Overlord) SnapshotManager() *snapshotstate.SnapshotManager

SnapshotManager returns the manager responsible for snapshots.

func (*Overlord) StartUp

func (o *Overlord) StartUp() error

StartUp proceeds to run any expensive Overlord or managers initialization. After this is done once it is a noop.

func (*Overlord) StartupTimeout

func (o *Overlord) StartupTimeout() (timeout time.Duration, reasoning string, err error)

StartupTimeout computes a usable timeout for the startup initializations by using a pessimistic estimate.

func (*Overlord) State

func (o *Overlord) State() *state.State

State returns the system state managed by the overlord.

func (*Overlord) StateEngine

func (o *Overlord) StateEngine() *StateEngine

StateEngine returns the state engine used by overlord.

func (*Overlord) Stop

func (o *Overlord) Stop() error

Stop stops the ensure loop and the managers under the StateEngine.

func (*Overlord) TaskRunner

func (o *Overlord) TaskRunner() *state.TaskRunner

TaskRunner returns the shared task runner responsible for running tasks for all managers under the overlord.

type StateEngine

type StateEngine struct {
	// contains filtered or unexported fields
}

StateEngine controls the dispatching of state changes to state managers.

Most of the actual work performed by the state engine is in fact done by the individual managers registered. These managers must be able to cope with Ensure calls in any order, coordinating among themselves solely via the state.

func NewStateEngine

func NewStateEngine(s *state.State) *StateEngine

NewStateEngine returns a new state engine.

func (*StateEngine) AddManager

func (se *StateEngine) AddManager(m StateManager)

AddManager adds the provided manager to take part in state operations.

func (*StateEngine) Ensure

func (se *StateEngine) Ensure() error

Ensure asks every manager to ensure that they are doing the necessary work to put the current desired system state in place by calling their respective Ensure methods.

Managers must evaluate the desired state completely when they receive that request, and report whether they found any critical issues. They must not perform long running activities during that operation, though. These should be performed in properly tracked changes and tasks.

func (*StateEngine) StartUp

func (se *StateEngine) StartUp() error

StartUp asks all managers to perform any expensive initialization. It is a noop after the first invocation.

func (*StateEngine) State

func (se *StateEngine) State() *state.State

State returns the current system state.

func (*StateEngine) Stop

func (se *StateEngine) Stop()

Stop asks all managers to terminate activities running concurrently.

func (*StateEngine) Wait

func (se *StateEngine) Wait()

Wait waits for all managers' current activities.

type StateManager

type StateManager interface {
	// Ensure forces a complete evaluation of the current state.
	// See StateEngine.Ensure for more details.
	Ensure() error
}

StateManager is implemented by types responsible for observing the system and manipulating it to reflect the desired state.

type StateStarterUp

type StateStarterUp interface {
	// StartUp asks manager to perform any expensive initialization.
	StartUp() error
}

StateStarterUp is optionally implemented by StateManagers that have expensive initialization to perform before the main Overlord loop.

type StateStopper

type StateStopper interface {
	// Stop asks the manager to terminate all activities running
	// concurrently.  It must not return before these activities
	// are finished.
	Stop()
}

StateStopper is optionally implemented by StateManagers that have running activities that can be terminated.

type StateWaiter

type StateWaiter interface {
	// Wait asks manager to wait for all running activities to
	// finish.
	Wait()
}

StateWaiter is optionally implemented by StateManagers that have running activities that can be waited on.

Directories

Path Synopsis
assertstate
Package assertstate implements the manager and state aspects responsible for the enforcement of assertions in the system and manages the system-wide assertion database.
cmdstate
Package cmdstate implements an overlord.StateManager that executes arbitrary commands as tasks.
configstate
Package configstate implements the manager and state aspects responsible for the configuration of snaps.
devicestate
Package devicestate implements the manager and state aspects responsible for the device identity and policies.
devicestate/internal
Package internal (of devicestate) provides functions to access and set the device state for use only by devicestate; for convenience they are also exposed via devicestatetest for use in tests.
hookstate
Package hookstate implements the manager and state aspects responsible for the running of hooks.
hookstate/ctlcmd
Package ctlcmd contains the various snapctl subcommands.
ifacestate
Package ifacestate implements the manager and state aspects responsible for the maintenance of interfaces in the system.
ifacestate/schema
Package schema holds structs for reading and writing interface-related state data.
install
Package install implements installation logic details for UC20+ systems.
restart
Package restart implements requesting restarts from any part of the code that has access to state.
snapstate
Package snapstate implements the manager and state aspects responsible for the installation and removal of snaps.
snapstate/backend
Package backend implements the low-level primitives to manage the snaps and their installation on disk.
snapstate/policy
Package policy implements fine-grained decision-making for snapstate.
snapstate/sequence
Package sequence contains types representing a sequence of snap revisions (with components) that describe current and past states of the snap in the system.
state
Package state implements the representation of system state.
storecontext
Package storecontext supplies a pluggable implementation of store.DeviceAndAuthContext.
