rancher

package
v0.0.0-...-03e2795
Published: Dec 30, 2024 License: Apache-2.0 Imports: 25 Imported by: 0

README

Cluster Autoscaler for Rancher with RKE2

This cluster autoscaler for Rancher scales nodes in clusters which use RKE2 provisioning (Rancher v2.6+). It uses a combination of the Rancher API and the underlying cluster-api types of RKE2.

Configuration

The cluster-autoscaler for Rancher requires a configuration file, which is passed with the --cloud-config parameter. Each cluster-autoscaler instance targets a single downstream RKE2 cluster, which is specified in the config. An up-to-date example can be found in examples/config.yaml.

Configuration via environment variables

To override the URL, token, or cluster name, use the following environment variables:

  • RANCHER_URL
  • RANCHER_TOKEN
  • RANCHER_CLUSTER_NAME
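
The override behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the provider's actual code: the cloudConfig struct, its field names, and the applyEnvOverrides helper are hypothetical, and treating an empty variable as "not set" is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// cloudConfig is a hypothetical stand-in for the values read from the
// --cloud-config file; the real provider's struct may look different.
type cloudConfig struct {
	URL         string
	Token       string
	ClusterName string
}

// applyEnvOverrides returns a copy of cfg in which each field is replaced by
// its environment variable when that variable holds a non-empty value
// (assumption: an empty value means "not set").
func applyEnvOverrides(cfg cloudConfig, getenv func(string) string) cloudConfig {
	if v := getenv("RANCHER_URL"); v != "" {
		cfg.URL = v
	}
	if v := getenv("RANCHER_TOKEN"); v != "" {
		cfg.Token = v
	}
	if v := getenv("RANCHER_CLUSTER_NAME"); v != "" {
		cfg.ClusterName = v
	}
	return cfg
}

func main() {
	// Placeholder values standing in for the parsed config file.
	fromFile := cloudConfig{
		URL:         "https://rancher.example.org",
		Token:       "token-from-file",
		ClusterName: "cluster-a",
	}
	cfg := applyEnvOverrides(fromFile, os.Getenv)
	fmt.Println(cfg.URL, cfg.ClusterName)
}
```

Passing the lookup function as a parameter keeps the helper easy to exercise without touching the process environment.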

Permissions

The Rancher server account provided in the cloud-config requires the following permissions on the Rancher server:

  • Get/Update of the clusters.provisioning.cattle.io resource to autoscale
  • List of machines.cluster.x-k8s.io in the namespace of the cluster resource

Running the Autoscaler

The cluster-autoscaler can be run inside the RKE2 cluster, on the Rancher server cluster or on a completely separate machine. To run it outside the RKE2 cluster, make sure to provide a kubeconfig with --kubeconfig.

To start the autoscaler with the Rancher provider, the cloud provider needs to be specified:

cluster-autoscaler --cloud-provider=rancher --cloud-config=config.yaml

Enabling Autoscaling

In order for the autoscaler to function, the RKE2 cluster needs to be configured accordingly. The autoscaler works by adjusting the quantity of a machinePool dynamically. For the autoscaler to know the min/max size of a machinePool we need to set a few annotations using the machineDeploymentAnnotations field. That field has been chosen because updating it does not trigger a full rollout of a machinePool.

apiVersion: provisioning.cattle.io/v1
kind: Cluster
spec:
  rkeConfig:
    machinePools:
    - name: pool-1
      quantity: 1
      workerRole: true
      machineDeploymentAnnotations:
        cluster.provisioning.cattle.io/autoscaler-min-size: "1"
        cluster.provisioning.cattle.io/autoscaler-max-size: "3"
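
As a rough illustration of how the two size annotations might be read, the sketch below parses them from a plain annotation map. The annotation keys come from the example above; the helper names and the validation rules (both annotations required, no negative values, min not above max) are assumptions rather than the provider's documented behaviour.

```go
package main

import (
	"fmt"
	"strconv"
)

const (
	minSizeAnnotation = "cluster.provisioning.cattle.io/autoscaler-min-size"
	maxSizeAnnotation = "cluster.provisioning.cattle.io/autoscaler-max-size"
)

// parseScalingAnnotations is a hypothetical helper that extracts the min/max
// size of a machinePool from its machineDeploymentAnnotations.
func parseScalingAnnotations(annotations map[string]string) (minSize, maxSize int, err error) {
	minSize, err = parseSize(annotations, minSizeAnnotation)
	if err != nil {
		return 0, 0, err
	}
	maxSize, err = parseSize(annotations, maxSizeAnnotation)
	if err != nil {
		return 0, 0, err
	}
	if minSize > maxSize {
		return 0, 0, fmt.Errorf("min size %d is greater than max size %d", minSize, maxSize)
	}
	return minSize, maxSize, nil
}

// parseSize reads one annotation and converts it to a non-negative integer.
func parseSize(annotations map[string]string, key string) (int, error) {
	raw, ok := annotations[key]
	if !ok {
		return 0, fmt.Errorf("missing annotation %q", key)
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("annotation %q: %w", key, err)
	}
	if n < 0 {
		return 0, fmt.Errorf("annotation %q must not be negative, got %d", key, n)
	}
	return n, nil
}

func main() {
	minSize, maxSize, err := parseScalingAnnotations(map[string]string{
		minSizeAnnotation: "1",
		maxSizeAnnotation: "3",
	})
	fmt.Println(minSize, maxSize, err)
}
```

Annotation values are always strings, which is why the sizes are quoted in the YAML above and converted with strconv.Atoi here.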

Optionally, to enable scaling a machinePool to and from 0 nodes, add a few more annotations that tell the autoscaler which resources a single node in the pool provides:

apiVersion: provisioning.cattle.io/v1
kind: Cluster
spec:
  rkeConfig:
    machinePools:
    - name: pool-1
      machineDeploymentAnnotations:
        cluster.provisioning.cattle.io/autoscaler-min-size: "0"
        cluster.provisioning.cattle.io/autoscaler-max-size: "3"
        cluster.provisioning.cattle.io/autoscaler-resource-cpu: "1"
        cluster.provisioning.cattle.io/autoscaler-resource-ephemeral-storage: 50Gi
        cluster.provisioning.cattle.io/autoscaler-resource-memory: 4Gi
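
The resource annotations share a common prefix, so one way to collect them is to strip that prefix and keep the remainder as the resource name. The sketch below does this with the standard library only. The nodeResources helper is hypothetical, and the quantities are kept as raw strings; real code would presumably parse them, for example with resource.ParseQuantity from k8s.io/apimachinery.

```go
package main

import (
	"fmt"
	"strings"
)

const resourceAnnotationPrefix = "cluster.provisioning.cattle.io/autoscaler-resource-"

// nodeResources is a hypothetical helper that maps resource names (cpu,
// memory, ephemeral-storage, ...) to the raw quantity strings found in the
// annotations. Annotations without the resource prefix are ignored.
func nodeResources(annotations map[string]string) map[string]string {
	resources := make(map[string]string)
	for key, value := range annotations {
		if !strings.HasPrefix(key, resourceAnnotationPrefix) {
			continue
		}
		name := strings.TrimPrefix(key, resourceAnnotationPrefix)
		if name != "" {
			resources[name] = value
		}
	}
	return resources
}

func main() {
	resources := nodeResources(map[string]string{
		"cluster.provisioning.cattle.io/autoscaler-min-size":                   "0",
		"cluster.provisioning.cattle.io/autoscaler-max-size":                   "3",
		"cluster.provisioning.cattle.io/autoscaler-resource-cpu":               "1",
		"cluster.provisioning.cattle.io/autoscaler-resource-ephemeral-storage": "50Gi",
		"cluster.provisioning.cattle.io/autoscaler-resource-memory":            "4Gi",
	})
	fmt.Println(resources["cpu"], resources["memory"], resources["ephemeral-storage"])
}
```

With an empty pool there is no existing node to inspect, which is why the per-node resources have to be declared up front in this way.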

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func BuildRancher

BuildRancher builds the Rancher cloud provider.

Types

type RancherCloudProvider

type RancherCloudProvider struct {
	// contains filtered or unexported fields
}

RancherCloudProvider implements the CloudProvider interface for Rancher.

func (*RancherCloudProvider) Cleanup

func (provider *RancherCloudProvider) Cleanup() error

Cleanup cleans up all resources before the cloud provider is removed.

func (*RancherCloudProvider) GPULabel

func (provider *RancherCloudProvider) GPULabel() string

GPULabel returns the label added to nodes with GPU resource.

func (*RancherCloudProvider) GetAvailableGPUTypes

func (provider *RancherCloudProvider) GetAvailableGPUTypes() map[string]struct{}

GetAvailableGPUTypes returns all available GPU types that the cloud provider supports.

func (*RancherCloudProvider) GetAvailableMachineTypes

func (provider *RancherCloudProvider) GetAvailableMachineTypes() ([]string, error)

GetAvailableMachineTypes gets all machine types that can be requested from the cloud provider. Implementation optional.

func (*RancherCloudProvider) GetNodeGpuConfig

func (provider *RancherCloudProvider) GetNodeGpuConfig(node *corev1.Node) *cloudprovider.GpuConfig

GetNodeGpuConfig returns the label, type and resource name for the GPU added to the node. If the node doesn't have any GPUs, it returns nil.

func (*RancherCloudProvider) GetResourceLimiter

func (provider *RancherCloudProvider) GetResourceLimiter() (*cloudprovider.ResourceLimiter, error)

GetResourceLimiter returns a struct containing limits (max, min) for resources (cores, memory, etc.).

func (*RancherCloudProvider) HasInstance

func (provider *RancherCloudProvider) HasInstance(node *corev1.Node) (bool, error)

HasInstance returns whether a given node has a corresponding instance in this cloud provider.

func (*RancherCloudProvider) Name

func (provider *RancherCloudProvider) Name() string

Name returns the name of the cloud provider.

func (*RancherCloudProvider) NewNodeGroup

func (provider *RancherCloudProvider) NewNodeGroup(machineType string, labels map[string]string, systemLabels map[string]string,
	taints []corev1.Taint,
	extraResources map[string]resource.Quantity) (cloudprovider.NodeGroup, error)

NewNodeGroup builds a theoretical node group based on the node definition provided.

func (*RancherCloudProvider) NodeGroupForNode

func (provider *RancherCloudProvider) NodeGroupForNode(node *corev1.Node) (cloudprovider.NodeGroup, error)

NodeGroupForNode returns the node group for the given node.

func (*RancherCloudProvider) NodeGroups

func (provider *RancherCloudProvider) NodeGroups() []cloudprovider.NodeGroup

NodeGroups returns all node groups configured for this cloud provider.

func (*RancherCloudProvider) Pricing

Pricing returns the pricing model for this cloud provider, or an error if not available.

func (*RancherCloudProvider) Refresh

func (provider *RancherCloudProvider) Refresh() error

Refresh is called before every main loop and can be used to dynamically update cloud provider state. In particular the list of node groups returned by NodeGroups can change as a result of CloudProvider.Refresh().
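
The relationship between Refresh and NodeGroups can be illustrated with a reduced, standard-library-only model. Everything in the sketch is hypothetical: the provider interface is a simplified stand-in for the real CloudProvider interface (node groups are plain strings here), and fakeProvider with its load function is not how RancherCloudProvider is implemented.

```go
package main

import "fmt"

// provider is a hypothetical, reduced view of the two methods discussed here.
type provider interface {
	Refresh() error
	NodeGroups() []string
}

// fakeProvider caches pool names and reloads them on every Refresh.
type fakeProvider struct {
	load  func() ([]string, error) // hypothetical source of truth for pools
	pools []string
}

// Refresh reloads the cached pools; on failure the previous cache is kept.
func (p *fakeProvider) Refresh() error {
	pools, err := p.load()
	if err != nil {
		return err
	}
	p.pools = pools
	return nil
}

// NodeGroups returns the pools as of the most recent successful Refresh.
func (p *fakeProvider) NodeGroups() []string { return p.pools }

func main() {
	upstream := []string{"pool-1"}
	var p provider = &fakeProvider{load: func() ([]string, error) { return upstream, nil }}

	// Two iterations of a simplified main loop: refresh first, then list.
	for i := 0; i < 2; i++ {
		if err := p.Refresh(); err != nil {
			fmt.Println("refresh failed:", err)
			continue
		}
		fmt.Println(p.NodeGroups())
		// Simulate a pool being added upstream between iterations.
		upstream = []string{"pool-1", "pool-2"}
	}
}
```

The point of the pattern is that NodeGroups only reflects upstream changes after the next Refresh, so callers should not assume the list is stable across loop iterations.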

Directories

Path                         Synopsis
provisioning.cattle.io/v1
