equinixmetal

package
v0.0.0-...-30e57c9
Published: Nov 21, 2024 License: Apache-2.0 Imports: 30 Imported by: 0

README

Cluster Autoscaler for Equinix Metal

The cluster autoscaler for Equinix Metal worker nodes performs autoscaling within any specified nodepools. It will run as a Deployment in your cluster. The nodepools are specified using tags on Equinix Metal.

Note: Packet was acquired by Equinix in 2020 and renamed to Equinix Metal.

This README covers the steps required to get the cluster autoscaler up and running.

Permissions and credentials

The autoscaler needs a ServiceAccount with permissions for Kubernetes and requires credentials for interacting with Equinix Metal.

An example ServiceAccount is given in examples/cluster-autoscaler-svcaccount.yaml.

The credentials for authenticating with Equinix Metal are stored in a secret and provided as an env var to the container. An example is given in examples/cluster-autoscaler-secret. In that file you can modify the following fields:

| Secret | Key | Value |
|---|---|---|
| cluster-autoscaler-equinixmetal | authtoken | Your Equinix Metal API token. It must be base64 encoded. |
| cluster-autoscaler-cloud-config | Global/project-id | Your Equinix Metal project id |
| cluster-autoscaler-cloud-config | Global/api-server | The ip:port for your cluster's k8s api (e.g. K8S_MASTER_PUBLIC_IP:6443) |
| cluster-autoscaler-cloud-config | Global/facility | The Equinix Metal facility for the devices in your nodepool (e.g. sv15) |
| cluster-autoscaler-cloud-config | Global/plan | The Equinix Metal plan (aka size/flavor) for new nodes in the nodepool (e.g. c3.small.x86) |
| cluster-autoscaler-cloud-config | Global/billing | The billing interval for new nodes (default: hourly) |
| cluster-autoscaler-cloud-config | Global/os | The OS image to use for new nodes (default: ubuntu_18_04). If you change this, also update cloudinit. |
| cluster-autoscaler-cloud-config | Global/cloudinit | The base64 encoded user data submitted when provisioning devices. In the example file, the default value has been tested with Ubuntu 18.04 to install Docker and kubelet and then to bootstrap the node into the cluster using kubeadm. kubeadm, kubelet and kubectl are pinned to version 1.17.4. For a different base OS or bootstrap method, this needs to be customized accordingly. |
| cluster-autoscaler-cloud-config | Global/reservation | The values "require" or "prefer" will request the next available hardware reservation for new devices in the selected facility and plan. If no hardware reservations match, "require" will trigger a failure, while "prefer" will launch on-demand devices instead (default: none) |
| cluster-autoscaler-cloud-config | Global/hostname-pattern | The pattern for the names of new Equinix Metal devices (default: "k8s-{{.ClusterName}}-{{.NodeGroup}}-{{.RandString8}}") |

You can always update the secret with more nodepool definitions (with different plans etc.) as shown in the example, but you should always provide a default nodepool configuration.

Configure nodepool and cluster names using Equinix Metal tags

The Equinix Metal API does not yet have native support for groups or pools of devices, so tags are used to specify them. Each Equinix Metal device that is a member of the "cluster1" cluster should have the tag k8s-cluster-cluster1. The devices that are members of the "pool1" nodepool should also have the tag k8s-nodepool-pool1. Once you have a Kubernetes cluster running on Equinix Metal, use the Equinix Metal Portal or API to tag the nodes accordingly.
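The tagging convention above can be sketched in Go as a simple membership check. This is an illustration only: the helper names hasTag and inNodepool are hypothetical and are not part of this package; only the two tag prefixes come from the text above.

```go
package main

// Tag prefixes as described in the README text above.
const (
	clusterTagPrefix  = "k8s-cluster-"
	nodepoolTagPrefix = "k8s-nodepool-"
)

// hasTag reports whether tags contains want (hypothetical helper).
func hasTag(tags []string, want string) bool {
	for _, t := range tags {
		if t == want {
			return true
		}
	}
	return false
}

// inNodepool reports whether a device's tags mark it as a member of both
// the given cluster and the given nodepool (hypothetical helper).
func inNodepool(tags []string, cluster, pool string) bool {
	return hasTag(tags, clusterTagPrefix+cluster) &&
		hasTag(tags, nodepoolTagPrefix+pool)
}
```

A device tagged k8s-cluster-cluster1 and k8s-nodepool-pool1 would match inNodepool(tags, "cluster1", "pool1") but not any other pool.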

Autoscaler deployment

The deployment in examples/cluster-autoscaler-deployment.yaml can be used, but the arguments passed to the autoscaler will need to be changed to match your cluster.

| Argument | Usage |
|---|---|
| --cluster-name | The name of your Kubernetes cluster. It should correspond to the tags that have been applied to the nodes. |
| --nodes | Of the form min:max:NodepoolName. For multiple nodepools you can add the same argument multiple times. E.g. for pool1, pool2 you would add --nodes=0:10:pool1 and --nodes=0:10:pool2. In addition, each node provisioned by the autoscaler will have a label with key: pool and with value: NodepoolName. These labels can be useful when there is a need to target specific nodepools. |
| --expander=price | This is an optional argument which allows the cluster-autoscaler to take into account the pricing of the Equinix Metal nodes when scaling with multiple nodepools. |
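To make the min:max:NodepoolName form of --nodes concrete, here is a minimal Go sketch of how such a value could be split and validated. The type nodeGroupSpec and the function parseNodesFlag are hypothetical names for illustration; the autoscaler's own parsing may differ in its details and error messages.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nodeGroupSpec holds one parsed --nodes value (hypothetical type).
type nodeGroupSpec struct {
	Min  int
	Max  int
	Name string
}

// parseNodesFlag parses a value of the form min:max:NodepoolName.
func parseNodesFlag(value string) (nodeGroupSpec, error) {
	parts := strings.SplitN(value, ":", 3)
	if len(parts) != 3 || parts[2] == "" {
		return nodeGroupSpec{}, fmt.Errorf("expected min:max:name, got %q", value)
	}
	minSize, err := strconv.Atoi(parts[0])
	if err != nil {
		return nodeGroupSpec{}, fmt.Errorf("invalid min in %q: %w", value, err)
	}
	maxSize, err := strconv.Atoi(parts[1])
	if err != nil {
		return nodeGroupSpec{}, fmt.Errorf("invalid max in %q: %w", value, err)
	}
	if minSize < 0 || maxSize < minSize {
		return nodeGroupSpec{}, fmt.Errorf("invalid size range in %q", value)
	}
	return nodeGroupSpec{Min: minSize, Max: maxSize, Name: parts[2]}, nil
}
```

With this sketch, "0:10:pool1" yields a spec with Min 0, Max 10 and Name "pool1", while "10:pool1" or "5:1:pool1" is rejected.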

Target Specific Nodepools (New!)

To target one or more specific nodepools, for example from a Deployment, add a nodeAffinity with the key pool and the name of the nodepool as its value. This functionality is not backwards compatible: nodes provisioned with older cluster-autoscaler images do not carry the pool label. You can work around this limitation by adding the label to those nodes manually. Here are some examples:

Target a nodepool with a specific name:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: pool
          operator: In
          values:
          - pool3

Target a nodepool with a specific Equinix Metal instance type:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: beta.kubernetes.io/instance-type
          operator: In
          values:
          - c3.small.x86

CCM and Controller node labels

CCM

By default, the autoscaler assumes that an older, deprecated version of packet-ccm is installed in your cluster. If you have migrated to the new cloud-provider-equinix-metal CCM, you must tell the autoscaler by setting an environment variable in the deployment:

env:
  - name: INSTALLED_CCM
    value: cloud-provider-equinix-metal

NOTE: As a prerequisite, ensure that all worker nodes in your cluster have the prefix equinixmetal:// in the Node field .spec.providerID. If any existing worker node still has the prefix packet://, drain the node, remove it from the cluster, and restart the kubelet on that worker node so that it re-registers. This ensures that the cloud-provider-equinix-metal CCM sets .spec.providerID to the device UUID with the prefix equinixmetal://.
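The prefix check described in the note can be sketched in Go as follows. This is not the autoscaler's own code: splitProviderID is a hypothetical helper that only shows how a providerID value could be classified by its prefix.

```go
package main

import "strings"

const (
	metalPrefix  = "equinixmetal://"
	packetPrefix = "packet://" // deprecated prefix, as described above
)

// splitProviderID returns the device ID that follows a known prefix and
// reports whether the node still uses the deprecated packet:// prefix.
// ok is false when the value carries neither prefix.
func splitProviderID(providerID string) (id string, deprecated bool, ok bool) {
	switch {
	case strings.HasPrefix(providerID, metalPrefix):
		return strings.TrimPrefix(providerID, metalPrefix), false, true
	case strings.HasPrefix(providerID, packetPrefix):
		return strings.TrimPrefix(providerID, packetPrefix), true, true
	default:
		return "", false, false
	}
}
```

Nodes for which deprecated is true are the ones that would need to be drained, removed and re-registered before switching INSTALLED_CCM.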

Controller node labels

The autoscaler assumes that control plane nodes in your cluster are identified by the label node-role.kubernetes.io/master. If that assumption does not hold in your cluster, set the following environment variable in the deployment:

env:
  - name: METAL_CONTROLLER_NODE_IDENTIFIER_LABEL
    value: <label>

Notes

The autoscaler will not remove nodes which have non-default kube-system pods. This prevents the node that the autoscaler is running on from being scaled down. If you are deploying the autoscaler into a cluster which already has more than one node, it is best to deploy it onto a node which already has non-default kube-system pods, to minimise the number of nodes which cannot be removed when scaling. For this reason, the autoscaler pod in the provided example has a nodeAffinity which forces it to deploy on the control plane (previously referred to as master) node.

Changes
  1. It is now possible to use multiple nodepools, scale nodepools to 0 nodes and prioritize scaling of specific nodepools by taking into account the pricing of the Equinix Metal instances.

  2. In order to take advantage of the new features mentioned above, you might need to update the cloud-config and the autoscaler deployment as shown in the examples. For example, the default/global cloud-config is applied to all the nodepools and if you want to override it for a specific nodepool you have to modify the cloud-config according to the examples.

  3. You can target specific nodepools, as described above.

  4. Cloud inits in the examples have pinned versions for Kubernetes in order to minimize potential incompatibilities as a result of nodes provisioned with different Kubernetes versions.

  5. In the provided cluster-autoscaler deployment example, the autoscaler pod has a nodeaffinity which forces it to deploy on the control plane (previously referred to as master) node, so that the cluster-autoscaler can scale down all of the worker nodes. Without this change there was a possibility for the cluster-autoscaler to be deployed on a worker node that could not be downscaled.

Documentation

Constants

const (
	// GPULabel is the label added to nodes with GPU resource.
	GPULabel = "cloud.google.com/gke-accelerator"
	// DefaultControllerNodeLabelKey is the label added to Master/Controller to identify as
	// master/controller node.
	DefaultControllerNodeLabelKey = "node-role.kubernetes.io/master"
	// ControllerNodeIdentifierEnv is the string for the environment variable.
	// Deprecated: this env var is deprecated following Packet's acquisition by Equinix.
	// Please use 'ControllerNodeIdentifierMetalEnv'.
	ControllerNodeIdentifierEnv = "PACKET_CONTROLLER_NODE_IDENTIFIER_LABEL"
	// ControllerNodeIdentifierMetalEnv is the string for the environment variable of controller node id labels for equinix metal.
	ControllerNodeIdentifierMetalEnv = "METAL_CONTROLLER_NODE_IDENTIFIER_LABEL"
)
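As a rough sketch of how these constants might be consumed, the Go snippet below resolves the controller node label from the environment. The lookup order (new variable first, then the deprecated one, then the default) is an assumption made for illustration, and the functions controllerNodeLabelFrom and controllerNodeLabel are hypothetical, not part of this package.

```go
package main

import "os"

// Constants copied from the listing above.
const (
	DefaultControllerNodeLabelKey    = "node-role.kubernetes.io/master"
	ControllerNodeIdentifierEnv      = "PACKET_CONTROLLER_NODE_IDENTIFIER_LABEL"
	ControllerNodeIdentifierMetalEnv = "METAL_CONTROLLER_NODE_IDENTIFIER_LABEL"
)

// controllerNodeLabelFrom resolves the label key using the supplied lookup
// function. Assumed order: new variable, deprecated variable, default.
func controllerNodeLabelFrom(getenv func(string) string) string {
	if v := getenv(ControllerNodeIdentifierMetalEnv); v != "" {
		return v
	}
	if v := getenv(ControllerNodeIdentifierEnv); v != "" {
		return v
	}
	return DefaultControllerNodeLabelKey
}

// controllerNodeLabel reads the label key from the process environment.
func controllerNodeLabel() string {
	return controllerNodeLabelFrom(os.Getenv)
}
```

Passing the lookup function in separately keeps the resolution logic testable without changing the real environment.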

Variables

var InstanceTypes = map[string]*instanceType{
	"a3.large.x86": {
		InstanceName: "a3.large.x86",
		CPU:          64,
		MemoryMb:     1048576,
		GPU:          0,
	},
	"c2.medium.x86": {
		InstanceName: "c2.medium.x86",
		CPU:          24,
		MemoryMb:     65536,
		GPU:          0,
	},
	"c3.large.arm64": {
		InstanceName: "c3.large.arm64",
		CPU:          80,
		MemoryMb:     262144,
		GPU:          0,
	},
	"c3.medium.x86": {
		InstanceName: "c3.medium.x86",
		CPU:          24,
		MemoryMb:     65536,
		GPU:          0,
	},
	"c3.small.x86": {
		InstanceName: "c3.small.x86",
		CPU:          8,
		MemoryMb:     32768,
		GPU:          0,
	},
	"g2.large.x86": {
		InstanceName: "g2.large.x86",
		CPU:          24,
		MemoryMb:     196608,
		GPU:          1,
	},
	"m2.xlarge.x86": {
		InstanceName: "m2.xlarge.x86",
		CPU:          28,
		MemoryMb:     393216,
		GPU:          0,
	},
	"m3.large.x86": {
		InstanceName: "m3.large.x86",
		CPU:          32,
		MemoryMb:     262144,
		GPU:          0,
	},
	"m3.small.x86": {
		InstanceName: "m3.small.x86",
		CPU:          8,
		MemoryMb:     65536,
		GPU:          0,
	},
	"n2.xlarge.x86": {
		InstanceName: "n2.xlarge.x86",
		CPU:          28,
		MemoryMb:     393216,
		GPU:          0,
	},
	"n3.xlarge.x86": {
		InstanceName: "n3.xlarge.x86",
		CPU:          32,
		MemoryMb:     524288,
		GPU:          0,
	},
	"s3.xlarge.x86": {
		InstanceName: "s3.xlarge.x86",
		CPU:          24,
		MemoryMb:     196608,
		GPU:          0,
	},
	"t3.small.x86": {
		InstanceName: "t3.small.x86",
		CPU:          4,
		MemoryMb:     16384,
		GPU:          0,
	},
	"x2.xlarge.x86": {
		InstanceName: "x2.xlarge.x86",
		CPU:          28,
		MemoryMb:     393216,
		GPU:          1,
	},
}

InstanceTypes is a map from Equinix Metal plan names to the CPU, memory and GPU resources of each plan.
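A lookup against a map of this shape might look like the sketch below. Because instanceType is unexported, the snippet declares a local mirror of it; the field types (int64), the sampleTypes variable and the helpers lookupPlan and memoryGiB are assumptions for illustration. The two sample entries are copied from the listing above.

```go
package main

import "fmt"

// instanceType mirrors the shape shown above; field types are assumed.
type instanceType struct {
	InstanceName string
	CPU          int64
	MemoryMb     int64
	GPU          int64
}

// sampleTypes holds two entries copied from the InstanceTypes listing.
var sampleTypes = map[string]*instanceType{
	"c3.small.x86": {InstanceName: "c3.small.x86", CPU: 8, MemoryMb: 32768, GPU: 0},
	"g2.large.x86": {InstanceName: "g2.large.x86", CPU: 24, MemoryMb: 196608, GPU: 1},
}

// lookupPlan returns the resources for a plan, or an error if it is unknown.
func lookupPlan(types map[string]*instanceType, plan string) (*instanceType, error) {
	it, ok := types[plan]
	if !ok {
		return nil, fmt.Errorf("unknown plan %q", plan)
	}
	return it, nil
}

// memoryGiB converts the MemoryMb field to GiB, treating it as MiB.
func memoryGiB(it *instanceType) int64 {
	return it.MemoryMb / 1024
}
```

Returning an error for unknown plans, rather than a nil pointer, makes a misconfigured plan name visible at the point of lookup.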

Functions

func BuildCloudProvider

BuildCloudProvider is called by the autoscaler to build an Equinix Metal cloud provider.

The equinixMetalManager is created here, and the node groups are created based on the specs provided via the command line parameters.

func BuildGenericLabels

func BuildGenericLabels(nodegroup string, instanceType string) map[string]string

BuildGenericLabels builds basic labels for equinix metal nodes

func Contains

func Contains(a []string, x string) bool

Contains tells whether a contains x.

func Find

func Find(a []string, x string) int

Find returns the smallest index i at which x == a[i], or len(a) if there is no such index.
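The documented behaviour of these two helpers can be illustrated with a short reimplementation. The code below is written from the doc comments above and is not copied from the package source.

```go
package main

// Find returns the smallest index i at which x == a[i],
// or len(a) if there is no such index.
func Find(a []string, x string) int {
	for i, v := range a {
		if v == x {
			return i
		}
	}
	return len(a)
}

// Contains tells whether a contains x.
func Contains(a []string, x string) bool {
	return Find(a, x) < len(a)
}
```

Note that a "not found" result from Find is len(a), not -1, so callers should compare the result against the slice length.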

Types

type CloudInitTemplateData

type CloudInitTemplateData struct {
	BootstrapTokenID     string
	BootstrapTokenSecret string
	APIServerEndpoint    string
	NodeGroup            string
}

CloudInitTemplateData represents the variables that can be used in cloudinit templates

type ConfigFile

type ConfigFile struct {
	DefaultNodegroupdef ConfigNodepool             `gcfg:"global"`
	Nodegroupdef        map[string]*ConfigNodepool `gcfg:"nodegroupdef"`
}

ConfigFile is used to read and store information from the cloud configuration file

type ConfigNodepool

type ConfigNodepool struct {
	ClusterName       string `gcfg:"cluster-name"`
	ProjectID         string `gcfg:"project-id"`
	APIServerEndpoint string `gcfg:"api-server-endpoint"`
	Metro             string `gcfg:"metro"`
	Plan              string `gcfg:"plan"`
	OS                string `gcfg:"os"`
	Billing           string `gcfg:"billing"`
	CloudInit         string `gcfg:"cloudinit"`
	Reservation       string `gcfg:"reservation"`
	HostnamePattern   string `gcfg:"hostname-pattern"`
}

ConfigNodepool holds the per-nodepool options read from the cloud configuration file, such as project-id, metro, plan and os.

type Device

type Device struct {
	ID          string   `json:"id"`
	ShortID     string   `json:"short_id"`
	Hostname    string   `json:"hostname"`
	Description string   `json:"description"`
	State       string   `json:"state"`
	Tags        []string `json:"tags"`
}

Device represents an Equinix Metal device

type DeviceCreateRequest

type DeviceCreateRequest struct {
	Hostname              string                   `json:"hostname"`
	Plan                  string                   `json:"plan"`
	Metro                 string                   `json:"metro"`
	OS                    string                   `json:"operating_system"`
	BillingCycle          string                   `json:"billing_cycle"`
	ProjectID             string                   `json:"project_id"`
	UserData              string                   `json:"userdata"`
	Storage               string                   `json:"storage,omitempty"`
	Tags                  []string                 `json:"tags"`
	CustomData            string                   `json:"customdata,omitempty"`
	IPAddresses           []IPAddressCreateRequest `json:"ip_addresses,omitempty"`
	HardwareReservationID string                   `json:"hardware_reservation_id,omitempty"`
}

DeviceCreateRequest represents a request to create a new Equinix Metal device. Used by createNodes
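To show how the JSON tags shape the request body, the following sketch marshals a request with only the required fields set. The struct definitions are copied from above; marshalCreateRequest is a hypothetical helper, and the field values used in any call are placeholders rather than real identifiers.

```go
package main

import "encoding/json"

// IPAddressCreateRequest is copied from the type definition in this package.
type IPAddressCreateRequest struct {
	AddressFamily int  `json:"address_family"`
	Public        bool `json:"public"`
}

// DeviceCreateRequest is copied from the type definition above.
type DeviceCreateRequest struct {
	Hostname              string                   `json:"hostname"`
	Plan                  string                   `json:"plan"`
	Metro                 string                   `json:"metro"`
	OS                    string                   `json:"operating_system"`
	BillingCycle          string                   `json:"billing_cycle"`
	ProjectID             string                   `json:"project_id"`
	UserData              string                   `json:"userdata"`
	Storage               string                   `json:"storage,omitempty"`
	Tags                  []string                 `json:"tags"`
	CustomData            string                   `json:"customdata,omitempty"`
	IPAddresses           []IPAddressCreateRequest `json:"ip_addresses,omitempty"`
	HardwareReservationID string                   `json:"hardware_reservation_id,omitempty"`
}

// marshalCreateRequest encodes the request as JSON (hypothetical helper).
// Fields tagged omitempty are dropped when they hold their zero value.
func marshalCreateRequest(req DeviceCreateRequest) (string, error) {
	b, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

Fields such as storage, customdata, ip_addresses and hardware_reservation_id are left out of the output when unset, whereas userdata is always present, even when empty.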

type Devices

type Devices struct {
	Devices []Device `json:"devices"`
}

Devices represents a list of Equinix Metal devices

type ErrorResponse

type ErrorResponse struct {
	Response    *http.Response
	Errors      []string `json:"errors"`
	SingleError string   `json:"error"`
}

ErrorResponse is the http response used on errors

func (*ErrorResponse) Error

func (r *ErrorResponse) Error() string

Error implements the error interface

type HostnameTemplateData

type HostnameTemplateData struct {
	ClusterName string
	NodeGroup   string
	RandString8 string
}

HostnameTemplateData represents the template variables used to construct host names for new nodes
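The default hostname pattern from the README can be rendered with Go's text/template, as the sketch below shows. The helpers randString and renderHostname are hypothetical, and the character set used for the random suffix is an assumption; only the struct and the default pattern come from this document.

```go
package main

import (
	"math/rand"
	"strings"
	"text/template"
)

// HostnameTemplateData is copied from the type definition above.
type HostnameTemplateData struct {
	ClusterName string
	NodeGroup   string
	RandString8 string
}

// Default pattern as documented for Global/hostname-pattern.
const defaultHostnamePattern = "k8s-{{.ClusterName}}-{{.NodeGroup}}-{{.RandString8}}"

// randString returns n random lowercase letters (assumed character set).
func randString(n int) string {
	const letters = "abcdefghijklmnopqrstuvwxyz"
	b := make([]byte, n)
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}

// renderHostname fills a hostname pattern with the given data.
func renderHostname(pattern string, data HostnameTemplateData) (string, error) {
	tmpl, err := template.New("hostname").Parse(pattern)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, data); err != nil {
		return "", err
	}
	return out.String(), nil
}
```

A caller would typically set RandString8 to randString(8) so that each new device gets a distinct name within its nodepool.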

type IPAddressCreateRequest

type IPAddressCreateRequest struct {
	AddressFamily int  `json:"address_family"`
	Public        bool `json:"public"`
}

IPAddressCreateRequest represents a request to create a new IP address within a DeviceCreateRequest

type NodeRef

type NodeRef struct {
	Name       string
	MachineID  string
	ProviderID string
	IPs        []string
}

NodeRef stores the name, machineID, providerID and IP addresses of a node.

type Price

type Price struct {
}

Price implements Price interface for Equinix Metal.

func (*Price) NodePrice

func (model *Price) NodePrice(node *apiv1.Node, startTime time.Time, endTime time.Time) (float64, error)

NodePrice returns a price of running the given node for a given period of time. All prices are in USD.

func (*Price) PodPrice

func (model *Price) PodPrice(pod *apiv1.Pod, startTime time.Time, endTime time.Time) (float64, error)

PodPrice returns a theoretical minimum price of running a pod for a given period of time on a perfectly matching machine.
