statemachine

package

v1.11.0-beta.2 Latest Published: Aug 13, 2021 License: Apache-2.0 Imports: 1 Imported by: 0

Repository

github.com/Orange-OpenSource/Operators-Karma-Tools

README ¶

OKT State Machine

Principle

An application can be in several states, and exposing its current state during a run is like pointing to the location of a moving object on a map.

Having a graph that describes what the K8S Operator manages should help a human operator, or external components in charge of bringing "observability" features to a K8S operator.

Traversing a graph also has the advantage of maintaining a consistent path of all traversed nodes. This can also help to understand or investigate what is happening during a run.

Offering a development framework to build this graph consistently for a K8S operator should make it possible to normalize the view of the application's lifecycle, as long as the application is operated by an operator implemented with this framework (using a standard like CNCF CloudEvents could reinforce this normalization and interoperability).

A graph contains nodes and leaf nodes. Moving between nodes is constrained by defined transitions, which are taken on known, verified events.

The principle is not to traverse the whole graph during one reconciliation but, at each reconciliation, to:

get the current state of the application (a database or any other application),
collect the events that happened and, if needed, generate the event asking to move toward the expected state given in the CR (adding it to the collected events),
trigger (or not) a next state according to the collected events,
in case of a state change, perform asynchronously the actions related to the new state,
if the expected state is not the current state, requeue a controller-runtime event for a further Reconciliation.
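
The steps above can be sketched as a single function called once per reconciliation. This is only an illustration under stated assumptions: the types `State`, the `transitions` table and the function `reconcileOnce` are hypothetical stand-ins, not part of the OKT API, and the transition rule is deliberately simplified (an event is accepted when it names an allowed next state). The asynchronous execution of the state's action is left out.

```go
package main

import "fmt"

// State is a hypothetical stand-in for an application state.
type State int

const (
	Start State = iota
	Running
	End
)

// transitions lists the states that may be entered from each state
// (an assumption made for this sketch only).
var transitions = map[State][]State{
	Start:   {Running},
	Running: {End},
	End:     {},
}

// reconcileOnce performs at most one transition and reports whether the
// caller should requeue because the expected state is not reached yet.
func reconcileOnce(current, expected State, observed []State) (next State, requeue bool) {
	events := append([]State{}, observed...)
	if current != expected {
		// Generate the event asking to move toward the state expected by the CR.
		events = append(events, expected)
	}
	for _, ev := range events {
		for _, allowed := range transitions[current] {
			if ev == allowed {
				return ev, ev != expected
			}
		}
	}
	return current, current != expected
}

func main() {
	state, expected := Start, End
	for cycle := 1; cycle <= 5; cycle++ {
		var observed []State
		if state == Start {
			// The application reports that it is up once started.
			observed = append(observed, Running)
		}
		var requeue bool
		state, requeue = reconcileOnce(state, expected, observed)
		fmt.Printf("cycle %d: state=%d requeue=%t\n", cycle, state, requeue)
		if !requeue {
			break
		}
	}
}
```

Each call moves by at most one node, so reaching a distant expected state takes several reconciliations, which is the behavior the principle describes.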

The proposal is to implement an OKT resource that manages the application life cycle, just as Kubernetes has resources of different kinds:

The application's state evolves through multiple Reconciliations.
The expected state is specified in the CR.
The current state is obtained from a specific Client call that picks up the information and maps it to the corresponding state representation.

It is based on:

A state machine built on what is offered by OKT's Go module tools/statemachine
A Client implementing OKT's Client interface (CRUD) that communicates with the application
An OKT application resource that can be registered by the OKT Reconciler

Purpose

This utility lets you build a state machine from a Life Cycle Graph that you provide and that fits your application's needs.

A graph is a list of directed nodes with one or more leaves.

One action is attached to each node. During the action, one or more events can be triggered and provided to the state machine as an events list.

What is an event? It is simply the name of the node to reach, triggered at a given moment in a specific context.

The State Machine decides to go to the next possible node by testing each event sequentially from the current state.

One thing to know

A special state exists: DefaultState, of type LCGState, as defined in the OKT state machine module.

It is a convenience to designate the next step without knowing its name during the execution of a node action.

So an LCGGraph can be traversed along a "normal" path or branch (from start to end, always going to the "Next" node), and some "debranching" events divert from the normal path to another branch of the LCGGraph. These "debranching" events are called "Triggers" and are specified in each node description when they exist (debranching to ErrorManagement, for example).

LC Graph types

LCGState - A state node.

LCGGraph - A graph (see below) with state nodes. Each state node can have several children. The children list is ordered by priority (see LCGChildren).

LCGEvents - The list of events generated during the application lifecycle. Events are named exactly like state nodes: an event is actually an LCGState. For instance, the event Running (value 1 in the example below) is the event raised to "go to running" state.

LCGChildren - The children list of a state node. The list is ordered by priority. The child with the highest priority is triggered first if a matching event exists.

Note that leaf nodes and the graph's entry node have NO action attached to them.

Example:

const (
	Start     oktsm.LCGState = iota // Is equal to 0
	Running                         // Is equal to 1
	Servicing
	Stopping
	End // The state End is the unique leaf node of this graph (described below)
)

// Each entry is an oktsm.LCGNodeInfo: a node name and its ordered children list.
var graph = oktsm.LCGGraph{
	Start: {Name: "Start", Children: oktsm.LCGChildren{
		Running,
	}},
	Running: {Name: "Running", Children: oktsm.LCGChildren{
		Servicing, // It has the highest priority in case of a conflict between Servicing, Stopping or DefaultState events
		Stopping,  // The last child is the default node (raised by either Stopping or DefaultState events)
	}},
	Servicing: {Name: "Servicing", Children: oktsm.LCGChildren{
		Running,
		End, // Default
	}},
	Stopping: {Name: "Stopping", Children: oktsm.LCGChildren{
		End, // Default
	}},
	End: {Name: "End", Children: oktsm.LCGChildren{}}, // A leaf node without any action, which closes the state machine run
}
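
To make the priority and default rules concrete, here is a minimal sketch of how the next node could be selected on the graph above. It is an interpretation of this README, not the package's actual implementation: the types are local stand-ins mirroring the documented declarations, the graph is simplified to a map from state to children, and `pickNext` and `contains` are hypothetical helpers. The assumed rule is: the first child (in priority order) that matches an event wins; otherwise, if DefaultState is among the events, the last child is chosen.

```go
package main

import "fmt"

// Local stand-ins mirroring the declarations shown in the package documentation.
type LCGState int

const DefaultState LCGState = -1

type LCGChildren []LCGState
type LCGEvents []LCGState

const (
	Start LCGState = iota
	Running
	Servicing
	Stopping
	End
)

// graph is a simplified form of the example graph: state -> ordered children.
var graph = map[LCGState]LCGChildren{
	Start:     {Running},
	Running:   {Servicing, Stopping},
	Servicing: {Running, End},
	Stopping:  {End},
	End:       {},
}

func contains(events LCGEvents, s LCGState) bool {
	for _, e := range events {
		if e == s {
			return true
		}
	}
	return false
}

// pickNext returns the child to enter from current, honoring child priority
// first and falling back to the last child when DefaultState was raised.
func pickNext(current LCGState, events LCGEvents) (LCGState, bool) {
	children := graph[current]
	for _, child := range children {
		if contains(events, child) {
			return child, true
		}
	}
	if len(children) > 0 && contains(events, DefaultState) {
		return children[len(children)-1], true
	}
	return current, false
}

func main() {
	// Follow the "normal" path by always raising DefaultState.
	state := Start
	for {
		next, ok := pickNext(state, LCGEvents{DefaultState})
		if !ok {
			break // leaf node reached
		}
		fmt.Println(state, "->", next)
		state = next
	}
	// A trigger debranches from the normal path.
	next, _ := pickNext(Running, LCGEvents{DefaultState, Servicing})
	fmt.Println("debranch:", Running, "->", next)
}
```

With only DefaultState raised, the walk follows Start, Running, Stopping, End; adding a Servicing event while in Running diverts to the Servicing branch because Servicing comes first in the children list.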

The story behind this implementation (/!\ not yet completed at this time)

Now, right after diving, with Story 1, into a "simple" implementation, I have to go further in the Operator's capability level. In particular, I have to handle the different "States" my application (a database, for example, or any application) will go through. For example, beyond the resource infrastructure management seen previously, I now want to deal with the fact that my database's life goes through some specific states, as follows:

start - the database is being started but not yet available (when this action is completed a "go-to-running" event is generated)
running - now the database is ready to accept client connections (It is a stable state while no "go-to-servicing" nor "go-to-stopping" events are raised)
servicing - a service operation is in progress (a backup, a configuration change) that can affect user experience. Once done, a "go-to-running" or "go-to-end" can be generated
stopping - the database will stop its service, all clients must disconnect
ended - the service is no longer available

For these steps, I want an easy way to manage them through changes in my CR, and I'd like the CR status to be updated as they occur. However, these steps happen at the application level, not at the infrastructure level (actually not completely, as we can imagine some dependencies between both). Here we are squarely in need of driving the application lifecycle through my operator. But how will we manage that?

In Story 1 we described a Reconciliation cycle triggered at each event and trying to traverse a list of steps (a branch) as follows:

CRChecker->ObjectsCreator->Mutator->Updator->ManageSuccess  (+ 2 "debranching" steps to ManageError & CRFinalizer)

Going from one step to the next is conditioned on the success of all actions taken during the step. Otherwise we debranch to the ManageError step. All of this happens during one Reconciliation cycle.

For my application lifecycle, I have one graph (call it the App LC Graph) of steps representing the application states I want to manage. At each step some actions have to be done, which may take a while:

 Start->Running<->Servicing -> End
                ->Stopping  -> End

Going from one step to the next depends on conditions that may be met over N Reconciliation cycles.

I like the idea of having a clear view of the steps I defined previously, so I'll complete my work with the OperatorSDK and the OKT addon.

OKT comes with a statemachine feature that should help in defining these steps and let me focus on the code I need to implement at each step. To allow this, OKT provides:

a sidecar for my application to help me get my database status and launch actions on it asynchronously,
a utility to model my graph of application states in my CRD,
a Go type to implement this graph and the transition rules that condition how I validate the transition from one step to another.

In my CR I set the desired state (e.g. Servicing) I want to reach, while the current application state is maintained in the CR status with a new Condition.

Once the application is added to the OKT registry (like any other resource), the OKT Reconciler knows that it has to manage this resource as follows:

on Start: Create() it!
on End: Delete() it!
on any other state: Update it!
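
The mapping above amounts to a small dispatch on the lifecycle state. The sketch below only illustrates that rule: `State` and `operationFor` are hypothetical names, and the operations are returned as strings rather than calling any real OKT Client method.

```go
package main

import "fmt"

// State is a hypothetical stand-in for an application lifecycle state.
type State int

const (
	Start State = iota
	Running
	Servicing
	Stopping
	End
)

// operationFor returns the CRUD operation the reconciler would apply to the
// application resource for a given state, following the rule in the text.
func operationFor(s State) string {
	switch s {
	case Start:
		return "Create"
	case End:
		return "Delete"
	default:
		return "Update"
	}
}

func main() {
	for _, s := range []State{Start, Running, Servicing, Stopping, End} {
		fmt.Printf("state %d: %s\n", s, operationFor(s))
	}
}
```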

Like any other resource, it relies on an idempotent mechanism and detects changes (and thus does nothing during a Reconciliation if there is nothing new). Here is what triggers a change:

a state change (in the App LC Graph) due to a CR modification,
a state change resulting from the observation of a change at the application level. This observability has to be implemented by the application sidecar.

A state change (in the App LC Graph) is handled asynchronously so that a long task does not block the Controller. In such a case (long task), one or more requeue requests are issued to wait for the observable change once the task is done.

It also maintains a Status condition in the CR that reflects the application's current state and errors, if any.
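
One possible way to picture this "start the action, then requeue until it is done" pattern is shown below. It is a minimal sketch using a goroutine and a channel; `asyncAction`, `startAction` and `reconcile` are hypothetical names, and how OKT actually runs and observes long tasks is not described in this document.

```go
package main

import "fmt"

// asyncAction tracks a long-running action started outside the reconcile loop.
type asyncAction struct {
	done chan struct{}
}

// startAction launches work in a goroutine and returns immediately.
func startAction(work func()) *asyncAction {
	a := &asyncAction{done: make(chan struct{})}
	go func() {
		defer close(a.done)
		work()
	}()
	return a
}

// finished reports, without blocking, whether the action has completed.
func (a *asyncAction) finished() bool {
	select {
	case <-a.done:
		return true
	default:
		return false
	}
}

// reconcile returns true when the caller should requeue and check again later.
func reconcile(a *asyncAction) (requeue bool) {
	return !a.finished()
}

func main() {
	release := make(chan struct{})
	a := startAction(func() { <-release }) // simulated long task
	fmt.Println("requeue while running:", reconcile(a))

	close(release)
	<-a.done // wait here only to keep the demo deterministic
	fmt.Println("requeue once done:", reconcile(a))
}
```

The reconcile call never blocks on the task: it only checks whether the task has completed and asks to be called again otherwise.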

To sum up:

an application lifecycle is managed like an infrastructure resource from OKT's point of view,
a clear view of what is implemented in terms of application lifecycle is provided thanks to the App LC Graph described by the CRD,
having all the operators in an organization built upon the same model should help human operators (or intelligent automated ones) deal with several kinds of K8S operators.

Documentation ¶

Index ¶

type LCGChildren
- func (t LCGChildren) Default() (exists bool, dft LCGState)
- func (t LCGChildren) IsDefault(state LCGState) bool
type LCGEvents
- func (e LCGEvents) Contains(state LCGState) bool
- func (e LCGEvents) IsTriggeringState(state LCGState, isDefaultState bool) bool
type LCGGraph
- func (g LCGGraph) IsLeafNode(state LCGState) bool
- func (g LCGGraph) StateName(state LCGState) (name string)
type LCGNodeInfo
type LCGState
- func (s LCGState) String() string
type LCGStateAction
type Machine

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type LCGChildren ¶

type LCGChildren []LCGState

LCGChildren defines the children nodes ordered by priority. The first has the highest priority. The last is the default, the one chosen when DefaultState has been added to the event list and no other event matches a child.

func (LCGChildren) Default ¶

func (t LCGChildren) Default() (exists bool, dft LCGState)

func (LCGChildren) IsDefault ¶

func (t LCGChildren) IsDefault(state LCGState) bool

type LCGEvents ¶

type LCGEvents []LCGState

func (LCGEvents) Contains ¶

func (e LCGEvents) Contains(state LCGState) bool

func (LCGEvents) IsTriggeringState ¶

func (e LCGEvents) IsTriggeringState(state LCGState, isDefaultState bool) bool

type LCGGraph ¶

type LCGGraph map[LCGState]LCGNodeInfo

func (LCGGraph) IsLeafNode ¶

func (g LCGGraph) IsLeafNode(state LCGState) bool

func (LCGGraph) StateName ¶

func (g LCGGraph) StateName(state LCGState) (name string)

type LCGNodeInfo ¶

type LCGNodeInfo struct {
	Name     string
	Children LCGChildren
}

type LCGState ¶

type LCGState int

const DefaultState LCGState = -1

func (LCGState) String ¶

func (s LCGState) String() string

type LCGStateAction ¶

type LCGStateAction interface {
	Enter(state LCGState) error
}

LCGStateAction holds the hook "Enter", which is called each time the machine enters a state. This is the action to perform on a state.
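
As an illustration, an implementation of this interface could dispatch on the state it is given. The sketch below redeclares `LCGState` and `LCGStateAction` locally so that it is self-contained; `dbActions` and its behavior (recording entered states, rejecting states without an action) are hypothetical choices made for the example.

```go
package main

import "fmt"

// Local copies of the documented declarations, to keep the sketch self-contained.
type LCGState int

type LCGStateAction interface {
	Enter(state LCGState) error
}

const (
	Start LCGState = iota
	Running
	Servicing
)

// dbActions is a hypothetical action set for a database application.
type dbActions struct {
	entered []LCGState // states entered so far, kept for inspection
}

// Enter runs the action attached to the state being entered.
func (a *dbActions) Enter(state LCGState) error {
	a.entered = append(a.entered, state)
	switch state {
	case Running:
		fmt.Println("database is accepting connections")
		return nil
	case Servicing:
		fmt.Println("maintenance operation in progress")
		return nil
	default:
		return fmt.Errorf("no action defined for state %d", state)
	}
}

// Compile-time check that dbActions satisfies the interface.
var _ LCGStateAction = (*dbActions)(nil)

func main() {
	var actions LCGStateAction = &dbActions{}
	for _, s := range []LCGState{Running, Servicing, Start} {
		if err := actions.Enter(s); err != nil {
			fmt.Println("error:", err)
		}
	}
}
```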

type Machine ¶

type Machine struct {
	Graph   LCGGraph
	Actions LCGStateAction
	// contains filtered or unexported fields
}

func (*Machine) DisablePathInGraph ¶

func (m *Machine) DisablePathInGraph()

func (*Machine) EnablePathInGraph ¶

func (m *Machine) EnablePathInGraph()

func (*Machine) EnterNextState ¶

func (m *Machine) EnterNextState(events LCGEvents) (entered bool, err error)

EnterNextState tries each event in turn, up to the first one that allows entering a new state.

func (*Machine) GetPathInGraph ¶

func (m *Machine) GetPathInGraph() (path string)

func (*Machine) GetState ¶

func (m *Machine) GetState() LCGState

GetState gets the current state of this machine.

func (Machine) IsOFF ¶

func (m Machine) IsOFF() bool

func (*Machine) IsPathInGraphEnabled ¶

func (m *Machine) IsPathInGraphEnabled() bool

func (*Machine) SetPathLengthLimit ¶

func (m *Machine) SetPathLengthLimit(max uint)

SetPathLengthLimit defines the size of the slice storing the path in the graph, i.e. the maximum number of states to store. Note that loops in the graph count for 2 states. This slice also stores the separator (">") in addition to the states. If not set, the maximum defaults to 512. You can define more if needed. The minimum is 5 and the maximum is 1024.
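
Because separators occupy slots too, a limit of N slots holds fewer than N states. The sketch below illustrates that arithmetic only. It is not the package's code: `clampLimit` and `joinPath` are hypothetical helpers, and two behaviors are assumptions made for the example, namely that out-of-range limits are clamped to the documented bounds and that recording simply stops when the slice is full.

```go
package main

import (
	"fmt"
	"strings"
)

// Bounds taken from the documentation of SetPathLengthLimit.
const (
	minLimit     = 5
	defaultLimit = 512
	maxLimit     = 1024
)

// clampLimit forces a requested limit into the documented range
// (assumption: the real method may handle out-of-range values differently).
func clampLimit(max uint) uint {
	if max < minLimit {
		return minLimit
	}
	if max > maxLimit {
		return maxLimit
	}
	return max
}

// joinPath stores states and ">" separators in a slice of at most limit
// slots, then renders the stored path as a string.
func joinPath(states []string, limit uint) string {
	limit = clampLimit(limit)
	var slots []string
	for _, s := range states {
		need := 1
		if len(slots) > 0 {
			need = 2 // one slot for the separator, one for the state
		}
		if uint(len(slots)+need) > limit {
			break // assumption: stop recording once the slice is full
		}
		if len(slots) > 0 {
			slots = append(slots, ">")
		}
		slots = append(slots, s)
	}
	return strings.Join(slots, "")
}

func main() {
	states := []string{"Start", "Running", "Servicing", "Running", "Stopping", "End"}
	fmt.Println(joinPath(states, defaultLimit))
	fmt.Println(joinPath(states, minLimit)) // only 3 states fit in 5 slots
}
```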

func (*Machine) SetState ¶

func (m *Machine) SetState(state LCGState) (entered bool)

SetState sets the current state of this machine. The same state cannot be set twice.

Source Files ¶

sm.go

Directories ¶

Path	Synopsis
examples	This file is intended to replace the reconciler engine file in `reconciler/engine/stepper.go`. For the moment the test does not pass because of a weird behaviour of the Build() function that should be fixed quickly.