esx

package
v0.16.0 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 14, 2022 License: Apache-2.0 Imports: 19 Imported by: 0

README

ESX Controller

The ESX controller does two things for Kubernetes nodes running on virtual machines managed by a VMware vCenter. First, it regularly checks whether a node's underlying ESX host is in, or is entering, maintenance mode. If so, the label cloud.sap/esx-in-maintenance is set to true.

Second, an ESX host can only finish entering maintenance mode once all virtual machines on it are turned off. When the cloud.sap/esx-reboot-ok label is set to true on every node (within the cluster) that belongs to an ESX host entering maintenance mode, the controller cordons, drains and shuts down these nodes (and keeps them shut down). When the ESX host leaves maintenance mode, the controller turns the nodes back on and uncordons them. This behavior only occurs if the cloud.sap/esx-reboot-initiated annotation is set to true, so it does not interfere with other maintenance activities. The controller manages the cloud.sap/esx-reboot-initiated annotation itself, based on the cloud.sap/esx-in-maintenance and cloud.sap/esx-reboot-ok labels.

Using the cloud.sap/esx-in-maintenance label together with the cloud.sap/esx-reboot-ok label allows ESX maintenance to be managed flexibly with the "main" maintenance controller.

It is assumed that node names equal the names of the hosting virtual machines. The availability zone within a cloud region is assumed to be the last character of the value of the failure-domain.beta.kubernetes.io/zone label. Each relevant node is expected to name its ESX host in the kubernetes.cloud.sap/host label.
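
The label conventions above can be illustrated with a small sketch. It uses a plain map instead of a Kubernetes node object so that it runs with the standard library alone. The helper hostInfoFromLabels, the hostInfo type, and the sample label values are hypothetical and only mirror the assumptions stated above; they are not the package's implementation.

```go
package main

import "fmt"

// Label keys named in the README.
const (
	hostLabel = "kubernetes.cloud.sap/host"
	zoneLabel = "failure-domain.beta.kubernetes.io/zone"
)

// hostInfo mirrors the fields of the package's HostInfo type for illustration.
type hostInfo struct {
	Name             string
	AvailabilityZone string
}

// hostInfoFromLabels is a hypothetical helper. It reads the ESX host name from
// the host label and takes the last character of the zone label value as the
// availability zone.
func hostInfoFromLabels(labels map[string]string) (hostInfo, error) {
	name := labels[hostLabel]
	if name == "" {
		return hostInfo{}, fmt.Errorf("label %s is missing or empty", hostLabel)
	}
	zone := labels[zoneLabel]
	if zone == "" {
		return hostInfo{}, fmt.Errorf("label %s is missing or empty", zoneLabel)
	}
	// Assumes the zone value ends in a single-byte (ASCII) character.
	return hostInfo{Name: name, AvailabilityZone: zone[len(zone)-1:]}, nil
}

func main() {
	info, err := hostInfoFromLabels(map[string]string{
		hostLabel: "esx-host-01",       // hypothetical host name
		zoneLabel: "example-region-1a", // hypothetical zone value
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("host=%s az=%s\n", info.Name, info.AvailabilityZone)
}
```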

Installation

The ESX controller is bundled within the maintenance controller binary. It needs to be enabled using the --enable-esx-maintenance flag.

Configuration

Place the configuration in ./config/esx.yaml.

intervals:
  # Defines how frequently the controller checks for ESX hosts entering maintenance mode
  check: # changing the check interval requires a pod restart to take effect
    jitter: 0.1 # required
    period: 5m # required
  # Defines how long and how frequently to check for pod deletions while draining
  podDeletion:
    period: 5s # required
    timeout: 2m # required
  # Defines how long to wait after a node has been drained.
  # As node shutdowns are performed in a loop, this helps stagger them.
  stagger: 20s # optional
vCenters:
  # Defines the URLs of the vCenters in the different availability zones.
  # $AZ is replaced with the single character availability zone.
  templateUrl: https://some-vcenter-url-$AZ # required
  # If true, the vCenter certificates are not validated
  insecure: # optional, defaults to false
  # Credentials for the vCenter per availability zone
  credentials: # required
    a:
      username: user # required
      password: pass # required

Documentation

Constants

const AvailabilityZoneReplacer string = "$AZ"

The placeholder in a vCenter URL template that is replaced by the availability zone.
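
As a sketch of how the placeholder might be resolved, the following replaces "$AZ" in a template and parses the result with net/url. The function vCenterURL is a hypothetical stand-in and not the package's (*VCenters).URL method; the template is the sample value from the configuration above.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Same value as the package's AvailabilityZoneReplacer constant.
const availabilityZoneReplacer = "$AZ"

// vCenterURL is a hypothetical helper. It substitutes the availability zone
// into the template and parses the result.
func vCenterURL(template, availabilityZone string) (*url.URL, error) {
	raw := strings.ReplaceAll(template, availabilityZoneReplacer, availabilityZone)
	parsed, err := url.Parse(raw)
	if err != nil {
		return nil, fmt.Errorf("parsing vCenter URL %q: %w", raw, err)
	}
	return parsed, nil
}

func main() {
	u, err := vCenterURL("https://some-vcenter-url-$AZ", "a")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u.String())
}
```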

Functions

func ShouldShutdown

func ShouldShutdown(esx *Host) bool

Checks whether all nodes on an ESX host need maintenance and are allowed to be shut down. If so, the cloud.sap/esx-reboot-initiated annotation is set on the affected nodes.

func ShouldStart

func ShouldStart(node *v1.Node) bool

Checks if the controller initiated the maintenance and the underlying ESX is not in maintenance.
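
The two checks above can be sketched on a simplified node type. This is an illustration of the documented behavior under stated assumptions, not the package's code: the node struct replaces v1.Node, an ESX host without nodes is treated as "do not shut down", and shouldStart requires the maintenance label to be exactly "false" (so "unknown" does not start a node). The label and annotation keys are taken from the README.

```go
package main

import "fmt"

const (
	inMaintenanceLabel        = "cloud.sap/esx-in-maintenance"
	rebootOkLabel             = "cloud.sap/esx-reboot-ok"
	rebootInitiatedAnnotation = "cloud.sap/esx-reboot-initiated"
)

// node is a simplified stand-in for v1.Node.
type node struct {
	Name        string
	Labels      map[string]string
	Annotations map[string]string
}

// shouldShutdown reports whether every node is in maintenance and approved for
// reboot. If so, it marks each node with the reboot-initiated annotation.
func shouldShutdown(nodes []node) bool {
	if len(nodes) == 0 {
		return false // assumption: nothing to shut down
	}
	for _, n := range nodes {
		if n.Labels[inMaintenanceLabel] != "true" || n.Labels[rebootOkLabel] != "true" {
			return false
		}
	}
	for i := range nodes {
		if nodes[i].Annotations == nil {
			nodes[i].Annotations = map[string]string{}
		}
		nodes[i].Annotations[rebootInitiatedAnnotation] = "true"
	}
	return true
}

// shouldStart reports whether the controller initiated the shutdown and the
// ESX host is no longer in maintenance.
func shouldStart(n node) bool {
	return n.Annotations[rebootInitiatedAnnotation] == "true" &&
		n.Labels[inMaintenanceLabel] == "false"
}

func main() {
	nodes := []node{
		{Name: "node-1", Labels: map[string]string{inMaintenanceLabel: "true", rebootOkLabel: "true"}},
		{Name: "node-2", Labels: map[string]string{inMaintenanceLabel: "true", rebootOkLabel: "true"}},
	}
	fmt.Println("shutdown:", shouldShutdown(nodes))
	nodes[0].Labels[inMaintenanceLabel] = "false" // ESX host left maintenance
	fmt.Println("start node-1:", shouldStart(nodes[0]))
}
```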

Types

type CheckParameters

type CheckParameters struct {
	VCenters *VCenters
	Host     HostInfo
	Log      logr.Logger
}

type Config

type Config struct {
	Intervals struct {
		Check struct {
			Jitter float64       `config:"jitter" validate:"min=0.001"`
			Period time.Duration `config:"period" validate:"required"`
		} `config:"check" validate:"required"`
		PodDeletion struct {
			Period  time.Duration
			Timeout time.Duration
		} `config:"podDeletion" validate:"required"`
		Stagger time.Duration `config:"stagger"`
	} `config:"intervals" validate:"required"`
	VCenters VCenters `config:"vCenters" validate:"required"`
}

type Credential

type Credential struct {
	Username string `config:"username" validate:"required"`
	Password string `config:"password"`
}

type Host

type Host struct {
	HostInfo
	Nodes []v1.Node
}

func ParseHostList

func ParseHostList(nodes []v1.Node) ([]Host, error)

Assigns nodes to their underlying ESX hosts.
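
A grouping of this kind can be sketched as follows. The node and host types are simplified stand-ins for v1.Node and Host, groupByHost is a hypothetical name, and the availability zone is left out for brevity. Returning an error for a node without the kubernetes.cloud.sap/host label and sorting the result by host name are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sort"
)

const hostLabel = "kubernetes.cloud.sap/host"

// node is a simplified stand-in for v1.Node.
type node struct {
	Name   string
	Labels map[string]string
}

// host is a simplified stand-in for the package's Host type.
type host struct {
	Name  string
	Nodes []node
}

// groupByHost collects nodes under the ESX host named in their host label.
func groupByHost(nodes []node) ([]host, error) {
	byName := map[string]*host{}
	for _, n := range nodes {
		name := n.Labels[hostLabel]
		if name == "" {
			return nil, fmt.Errorf("node %s has no %s label", n.Name, hostLabel)
		}
		h, ok := byName[name]
		if !ok {
			h = &host{Name: name}
			byName[name] = h
		}
		h.Nodes = append(h.Nodes, n)
	}
	hosts := make([]host, 0, len(byName))
	for _, h := range byName {
		hosts = append(hosts, *h)
	}
	// Map iteration order is random, so sort for a stable result.
	sort.Slice(hosts, func(i, j int) bool { return hosts[i].Name < hosts[j].Name })
	return hosts, nil
}

func main() {
	hosts, err := groupByHost([]node{
		{Name: "node-1", Labels: map[string]string{hostLabel: "esx-b"}},
		{Name: "node-2", Labels: map[string]string{hostLabel: "esx-a"}},
		{Name: "node-3", Labels: map[string]string{hostLabel: "esx-b"}},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, h := range hosts {
		fmt.Println(h.Name, len(h.Nodes))
	}
}
```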

type HostInfo

type HostInfo struct {
	Name             string
	AvailabilityZone string
}

type Maintenance

type Maintenance string
const InMaintenance Maintenance = "true"
const NoMaintenance Maintenance = "false"
const UnknownMaintenance Maintenance = "unknown"

func CheckForMaintenance

func CheckForMaintenance(ctx context.Context, params CheckParameters) (Maintenance, error)

Performs a check for the specified host if allowed by timestamps.

type Runnable

type Runnable struct {
	client.Client
	Log logr.Logger
}

func (*Runnable) CheckMaintenance

func (r *Runnable) CheckMaintenance(ctx context.Context, vCenters *VCenters, esx *Host) error

Checks the maintenance mode of the given ESX host and attaches the corresponding maintenance label.

func (*Runnable) NeedLeaderElection

func (r *Runnable) NeedLeaderElection() bool

func (*Runnable) Reconcile

func (r *Runnable) Reconcile(ctx context.Context)

func (*Runnable) ShutdownNodes

func (r *Runnable) ShutdownNodes(ctx context.Context, vCenters *VCenters, esx *Host, conf *Config) error

Shuts down the nodes on the given ESX host if all of them carry the RebootAllowed="true" label.

func (*Runnable) Start

func (r *Runnable) Start(ctx context.Context) error

func (*Runnable) StartNodes

func (r *Runnable) StartNodes(ctx context.Context, vCenters *VCenters, esx *Host, conf *Config)

Starts the nodes on the given ESX, if this controller shut them down and the underlying ESX is no longer in maintenance.

type VCenters

type VCenters struct {
	// URL to regional vCenters with the availability zone replaced by AvailabilityZoneReplacer.
	Template string `config:"templateUrl" validate:"required"`
	// If true, the vCenter certificates are not validated.
	Insecure bool `config:"insecure"`
	// Pair of credentials per availability zone.
	Credentials map[string]Credential `config:"credentials" validate:"required"`
	// contains filtered or unexported fields
}

VCenters contains connection information to regional vCenters.

func (*VCenters) ClearCache

func (vc *VCenters) ClearCache(ctx context.Context)

func (*VCenters) Client

func (vc *VCenters) Client(ctx context.Context, availabilityZone string) (*govmomi.Client, error)

Returns a ready-to-use vCenter client for the given availability zone.

func (*VCenters) URL

func (vc *VCenters) URL(availabilityZone string) (*url.URL, error)

Gets the URL for connecting to the vCenter in a specific availability zone.
