statemachine

package
v1.11.0-beta.2
Published: Aug 13, 2021 License: Apache-2.0 Imports: 1 Imported by: 0

README

OKT State Machine

Principle

An application can be in one of several states. Exposing its current state during a run is like pinpointing the location of a moving object on a map.

Having a Graph that describes what the K8S Operator manages should help a human operator, or external components in charge of bringing "observability" features to a K8S operator.

Traversing a Graph also has the advantage of maintaining a consistent path of all traversed nodes. This too can help to understand or investigate what is happening during a run.

Offering a development framework to build this graph in a consistent way for a K8S operator should normalize the view of the application's lifecycle, as long as the application is operated by an operator implemented with this framework (using a standard like CNCF CloudEvents could reinforce this normalization and interoperability).

A Graph contains nodes and leaf nodes. Moving between nodes is constrained by defined transitions, which are taken only on known, verified events.

The principle is not to traverse the whole graph during 1 reconciliation but, at each reconciliation, to:

  • get the current state of the application (a database or any other application),
  • collect the events that happened and generate, if needed, the event telling the machine to go to the state expected in the CR (adding it to the collected events),
  • trigger (or not) a next state according to the collected events,
  • in case of a state change, perform asynchronously the actions related to the new state,
  • if the expected state is not the current state, requeue a controller-runtime event for a further Reconciliation.
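
The steps above can be sketched as a single function that takes at most one transition per reconciliation. This is only an illustration under stated assumptions: the State type, the transitions map and reconcileOnce are hypothetical stand-ins written for this sketch, not the OKT API, and the asynchronous action is left as a comment.

```go
package main

import "fmt"

// State is a local stand-in for the package's LCGState type.
type State int

const (
	Start State = iota
	Running
	Servicing
)

// transitions is a hypothetical graph: for each state, the states it may move to.
var transitions = map[State][]State{
	Start:     {Running},
	Running:   {Servicing},
	Servicing: {Running},
}

// allowed reports whether the graph defines a transition from one state to another.
func allowed(from, to State) bool {
	for _, s := range transitions[from] {
		if s == to {
			return true
		}
	}
	return false
}

// reconcileOnce runs one reconciliation pass and takes at most one transition.
// current is the state read from the application, expected comes from the CR,
// observed holds the events collected since the last pass.
func reconcileOnce(current, expected State, observed []State) (next State, requeue bool) {
	events := append([]State{}, observed...)
	if expected != current {
		// Generate the event asking to go to the expected state.
		events = append(events, expected)
	}
	next = current
	for _, ev := range events {
		if allowed(current, ev) {
			next = ev
			// The action attached to the new state would be started asynchronously here.
			break
		}
	}
	// Requeue as long as the expected state has not been reached.
	return next, next != expected
}

func main() {
	state, expected := Start, Servicing
	observed := []State{Running} // e.g. the start action reported "go-to-running"
	for i := 1; ; i++ {
		var requeue bool
		state, requeue = reconcileOnce(state, expected, observed)
		fmt.Printf("reconciliation %d: state=%d requeue=%v\n", i, state, requeue)
		if !requeue {
			break
		}
		observed = nil
	}
}
```

In a real operator the requeue flag would presumably map onto the controller-runtime reconcile result rather than a loop in main.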

The proposal is to implement an OKT resource that manages the application life cycle, just as Kubernetes has resources of different kinds:

  • The application's state evolves through multiple Reconciliations.
  • The expected state is specified in the CR.
  • The current state is obtained through a specific Client call that picks up the information and maps it to the corresponding state representation.

It is based on:

  • A state machine built with the OKT Go module tools/statemachine
  • A Client implementing OKT's Client interface (CRUD) that communicates with the application
  • An OKT application resource that can be registered by the OKT Reconciler

Purpose

This utility builds a state machine from a Life Cycle Graph that you provide and that fits your application's needs.

A graph is a list of directed nodes with 1 or more leaves.

1 action is attached to each node. During the action, 1 or more events can be triggered and provided to the state machine as an events list.

What is an event? It is simply the name of the node to reach, raised at a given moment in a specific context.

The State Machine decides which node to go to next by testing each event sequentially against the current state.

One thing to know

A special state exists: DefaultState, of type LCGState, as defined in the OKT state machine module.

It is a convenience for designating the Next step without knowing its name during the execution of a node action.

So an LCGGraph can be browsed through a "normal" path or branch (from start to end, always going to the "Next" node), while some "debranching" events divert from the normal path to another branch of the LCGGraph. These "debranching" events are called "Triggers" and are specified in each node description when they exist (debranching to ErrorManagement, for example).

LC Graph types

LCGState - A state node

LCGGraph - A graph (see below) with state nodes. Each state node can have several children. The children list is ordered (by priority, see LCGChildren).

LCGEvents - Events list generated during the application lifecycle. Events are named exactly like state nodes. The event "1" (i.e. itoa(Running) with the constants of the example below) is the event raised to go to the running state ("go-to-running"). In fact, an event is an LCGState.

LCGChildren - A state node's children list. The list is ordered by priority. The child with the highest priority is triggered first if a matching event exists.

Note that leaf nodes and the graph's entry node have NO action attached to them.

Example:

// oktsm is assumed to be the import alias of this statemachine package.
const (
	Start     oktsm.LCGState = iota // Is equal to 0
	Running                         // Is equal to 1
	Servicing
	Stopping
	End // The state End is the unique leaf node of this Graph (described below)
)

// LCGGraph maps each state to an LCGNodeInfo (a name and its ordered children).
var graph = oktsm.LCGGraph{
	Start: {Name: "Start", Children: oktsm.LCGChildren{
		Running,
	}},
	Running: {Name: "Running", Children: oktsm.LCGChildren{
		Servicing, // Highest priority when Servicing, Stopping and DefaultState events compete
		Stopping,  // The last child is the default node (raised by either a Stopping or a DefaultState event)
	}},
	Servicing: {Name: "Servicing", Children: oktsm.LCGChildren{
		Running,
		End, // Default
	}},
	Stopping: {Name: "Stopping", Children: oktsm.LCGChildren{
		End, // Default
	}},
	End: {Name: "End", Children: oktsm.LCGChildren{}}, // A leaf node without any action, which closes the state machine run
}
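
To make the priority and default rules concrete, here is a minimal sketch of how a next state could be chosen for the example graph. It is one possible reading of the rules described in this README, not the package's actual algorithm: the lowercase types, contains and nextState are hypothetical helpers defined locally so the sketch runs on its own.

```go
package main

import "fmt"

// state is a local stand-in for LCGState.
type state int

// defaultState mirrors the documented DefaultState value (-1).
const defaultState state = -1

const (
	start state = iota
	running
	servicing
	stopping
	end
)

// graph mirrors the example: children are ordered by priority, the last one is the default.
var graph = map[state][]state{
	start:     {running},
	running:   {servicing, stopping},
	servicing: {running, end},
	stopping:  {end},
	end:       {},
}

// contains reports whether s is present in the events list.
func contains(events []state, s state) bool {
	for _, e := range events {
		if e == s {
			return true
		}
	}
	return false
}

// nextState returns the child to enter from cur, and false when no event triggers a transition.
func nextState(cur state, events []state) (state, bool) {
	children := graph[cur]
	// Children are tested in priority order: the first one named by an event wins.
	for _, child := range children {
		if contains(events, child) {
			return child, true
		}
	}
	// Otherwise a DefaultState event selects the last child, if there is one.
	if len(children) > 0 && contains(events, defaultState) {
		return children[len(children)-1], true
	}
	return cur, false
}

func main() {
	next, ok := nextState(running, []state{stopping, servicing})
	fmt.Println(next, ok)
	next, ok = nextState(running, []state{defaultState})
	fmt.Println(next, ok)
	next, ok = nextState(end, []state{defaultState})
	fmt.Println(next, ok)
}
```

With this reading, a node action that does not know the name of its successor only has to emit the default event, and the graph decides where to go.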

The story behind this implementation (/!\ not yet completed at this time)

Now, right after diving into a "simple" implementation with Story 1, I have to go further in the Operator's capability level. In particular, I need a way to handle the different "States" my application (a database, for example, or any application) will go through. For example, beyond the infrastructure resource management seen previously, I now want to deal with the fact that my database's life passes through some specific states, as follows:

  • start - the database is being started but not yet available (when this action is completed a "go-to-running" event is generated)
  • running - now the database is ready to accept client connections (It is a stable state while no "go-to-servicing" nor "go-to-stopping" events are raised)
  • servicing - a service operation is in progress (a backup, a configuration change) that can affect user experience. Once done, a "go-to-running" or "go-to-end" event can be generated
  • stopping - the database will stop its service, all clients must disconnect
  • ended - the service is no longer available

For these steps, I want an easy way to manage them through changes in my CR, and I'd also like the CR status to be updated as they occur. However, these steps happen at the application level, not at the infrastructure level (not entirely, actually, as we can imagine some dependencies between the two). Here we are fully in need of driving the application lifecycle through my operator. But how will we manage that?

In Story 1 we described a Reconciliation cycle triggered at each event and trying to traverse a list of steps (a branch) as follows:

CRChecker->ObjectsCreator->Mutator->Updator->ManageSuccess  (+ 2 "debranching" steps to ManageError & CRFinalizer)

Going from 1 step to the next is conditioned by the success of all actions taken during the step. Otherwise we debranch to the ManageError step. All of this happens during 1 Reconciliation cycle.

For my application lifecycle, I have 1 graph (call it the App LC Graph) of steps representing the application states I want to manage. At each step some actions have to be done, which may take a while:

 Start->Running<->Servicing -> End
                ->Stopping  -> End

Going from 1 step to the next depends on conditions that may be met over N Reconciliation cycles.

I like the idea of having a clear view of the steps I defined previously, so I'll complete my work with the OperatorSDK and the OKT addon.

OKT comes with a statemachine feature that should help in defining these steps and let me focus on the code I need to implement at each step. To allow this, OKT provides:

  • a sidecar for my application that helps me get my database status and launch actions on it asynchronously.
  • a utility to model my graph of application states in my CRD
  • a Go type to implement this graph and the transition rules that condition how I validate the transition from one step to another

In my CR I set the desired state (e.g. Servicing) I want to reach, while the current application state is maintained in the CR status with a new Condition.

Once the application is added to the OKT registry (like any other resource), the OKT Reconciler knows that it has to manage this resource as follows:

  • on Start: Create() it!
  • on End: Delete() it!
  • on any other state: Update() it!
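
The mapping above can be expressed as a small switch. This is a sketch only: the state constants, the operation type and operationFor are hypothetical names introduced for illustration, and the real OKT Reconciler may dispatch differently.

```go
package main

import "fmt"

// state is a local stand-in for LCGState, using the states of the example graph.
type state int

const (
	start state = iota
	running
	servicing
	stopping
	end
)

// operation names the CRUD call applied to the application resource (hypothetical type).
type operation string

const (
	opCreate operation = "Create"
	opUpdate operation = "Update"
	opDelete operation = "Delete"
)

// operationFor maps a lifecycle state to the CRUD operation described in the list above.
func operationFor(s state) operation {
	switch s {
	case start:
		return opCreate
	case end:
		return opDelete
	default:
		return opUpdate
	}
}

func main() {
	for _, s := range []state{start, running, servicing, stopping, end} {
		fmt.Printf("state %d -> %s()\n", s, operationFor(s))
	}
}
```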

Like any other resource, it implements an idempotent mechanism and detects changes (and thus does nothing during a Reconciliation if there is nothing new). Here is what triggers a change:

  • a state change (in App LC Graph) due to a CR modification
  • a state change from the observation of a change at the application level. This observability has to be implemented by the application sidecar.

A state change (in the App LC Graph) is handled asynchronously so as not to burden the Controller with an overly long task. In such a case (long task), 1 or more requeue orders are issued to wait for the observable change once the task is done.
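
As an illustration of this pattern, the sketch below starts a long task in a goroutine on the first reconciliation and asks for a requeue until the task has finished. Everything here is an assumption made for the example: asyncAction, its reconcile method and pollInterval are hypothetical and not part of OKT, and the returned duration only imitates the idea of a delayed requeue.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// pollInterval is the hypothetical delay requested between two reconciliations.
const pollInterval = 10 * time.Millisecond

// asyncAction tracks one long-running state action (hypothetical helper).
type asyncAction struct {
	mu       sync.Mutex
	started  bool
	done     bool
	finished chan struct{} // closed when the work is over
}

func newAsyncAction() *asyncAction {
	return &asyncAction{finished: make(chan struct{})}
}

// reconcile never blocks on the work itself. It starts the work once, then
// returns a non-zero delay while the work is still running and 0 once it is done.
func (a *asyncAction) reconcile(work func()) time.Duration {
	a.mu.Lock()
	defer a.mu.Unlock()
	if !a.started {
		a.started = true
		go func() {
			work()
			a.mu.Lock()
			a.done = true
			a.mu.Unlock()
			close(a.finished)
		}()
	}
	if !a.done {
		return pollInterval
	}
	return 0
}

func main() {
	action := newAsyncAction()
	backup := func() { time.Sleep(30 * time.Millisecond) } // stands for a long servicing task
	for {
		delay := action.reconcile(backup)
		if delay == 0 {
			fmt.Println("action completed, no requeue needed")
			return
		}
		fmt.Println("action in progress, requeue after", delay)
		time.Sleep(delay)
	}
}
```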

It also maintains a Status condition in the CR that reflects the application's current state, and errors if any.

To sum up:

  • an application lifecycle is managed like an infrastructure resource from OKT's point of view,
  • a clear view of what is implemented in terms of application lifecycle is provided thanks to the App LC Graph described by the CRD
    • Having all the operators in an organization built upon the same model should help human operators (or intelligent automated ones) deal with several kinds of K8S operators.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type LCGChildren

type LCGChildren []LCGState

LCGChildren defines the children nodes ordered by priority. The first has the highest priority. The last is the default, the one chosen when DefaultState has been added to the event list and no other event matches a child.

func (LCGChildren) Default

func (t LCGChildren) Default() (exists bool, dft LCGState)

func (LCGChildren) IsDefault

func (t LCGChildren) IsDefault(state LCGState) bool

type LCGEvents

type LCGEvents []LCGState

func (LCGEvents) Contains

func (e LCGEvents) Contains(state LCGState) bool

func (LCGEvents) IsTriggeringState

func (e LCGEvents) IsTriggeringState(state LCGState, isDefaultState bool) bool

type LCGGraph

type LCGGraph map[LCGState]LCGNodeInfo

func (LCGGraph) IsLeafNode

func (g LCGGraph) IsLeafNode(state LCGState) bool

func (LCGGraph) StateName

func (g LCGGraph) StateName(state LCGState) (name string)

type LCGNodeInfo

type LCGNodeInfo struct {
	Name     string
	Children LCGChildren
}

type LCGState

type LCGState int
const DefaultState LCGState = -1

func (LCGState) String

func (s LCGState) String() string

type LCGStateAction

type LCGStateAction interface {
	Enter(state LCGState) error
}

LCGStateAction defines the hook "Enter", which is called each time the machine enters a state. This is the action to perform on that state.
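
A minimal sketch of an implementation of this interface follows. LCGState and LCGStateAction are redeclared locally, copied from the declarations in this documentation, so that the example compiles on its own. The dbActions type and its behaviour are hypothetical.

```go
package main

import "fmt"

// Local copies of the documented declarations, so the sketch is self-contained.
type LCGState int

type LCGStateAction interface {
	Enter(state LCGState) error
}

const (
	Start LCGState = iota
	Running
	Servicing
)

// dbActions is a hypothetical implementation that records the states it entered.
type dbActions struct {
	entered []LCGState
}

// Enter performs the action attached to a state. Here only Running and
// Servicing carry an action; any other state is reported as an error.
func (a *dbActions) Enter(state LCGState) error {
	switch state {
	case Running, Servicing:
		a.entered = append(a.entered, state)
		return nil
	default:
		return fmt.Errorf("no action attached to state %d", state)
	}
}

// Compile-time check that dbActions satisfies the interface.
var _ LCGStateAction = (*dbActions)(nil)

func main() {
	var actions LCGStateAction = &dbActions{}
	fmt.Println(actions.Enter(Running))
	fmt.Println(actions.Enter(Start))
}
```

Such a value would presumably be assigned to the Actions field of a Machine, next to the Graph.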

type Machine

type Machine struct {
	Graph   LCGGraph
	Actions LCGStateAction
	// contains filtered or unexported fields
}

func (*Machine) DisablePathInGraph

func (m *Machine) DisablePathInGraph()

func (*Machine) EnablePathInGraph

func (m *Machine) EnablePathInGraph()

func (*Machine) EnterNextState

func (m *Machine) EnterNextState(events LCGEvents) (entered bool, err error)

EnterNextState tries each event in turn, up to the first one that allows entering a new state.

func (*Machine) GetPathInGraph

func (m *Machine) GetPathInGraph() (path string)

func (*Machine) GetState

func (m *Machine) GetState() LCGState

GetState gets the current state of this machine.

func (Machine) IsOFF

func (m Machine) IsOFF() bool

func (*Machine) IsPathInGraphEnabled

func (m *Machine) IsPathInGraphEnabled() bool

func (*Machine) SetPathLengthLimit

func (m *Machine) SetPathLengthLimit(max uint)

SetPathLengthLimit defines the size of the slice storing the path in the graph, i.e. the maximum number of states to store. Note that loops in the graph count for 2 states. This slice also stores the separator (">") in addition to the states. If not set, the maximum defaults to 512. You can set a higher value if needed. The minimum is 5 and the maximum is 1024.

func (*Machine) SetState

func (m *Machine) SetState(state LCGState) (entered bool)

SetState sets the current state of this machine. The same state cannot be set twice.

Directories

Path Synopsis
* This file is intended to replace the reconciler engine file `reconciler/engine/stepper.go`. For the moment the test does not pass because of a weird behaviour of the Build() function that should be fixed quickly.
