README

Cluster Autoscaler on Huawei Cloud

Overview

The cluster autoscaler works with self-built Kubernetes clusters on Huawei Cloud ECS and specified Huawei Cloud Auto Scaling Groups. It runs as a Deployment on a worker node in the cluster. This README covers the steps required to get the cluster autoscaler up and running.

Deployment Steps

Build Image
Environment
  1. Download Project

    Get the latest autoscaler project and download it to ${GOPATH}/src/k8s.io.

    This is used for building your image, so the machine you use here should be able to access GCR. Do not use a Huawei Cloud ECS.

  2. Go environment

    Make sure you have Go installed on the above machine.

  3. Docker environment

    Make sure you have Docker installed on the above machine.

Build and push the image

Execute the following commands in the autoscaler/cluster-autoscaler directory of the autoscaler project downloaded previously. The following steps use Huawei SoftWare Repository for Container (SWR) as an example registry.

  1. Build the cluster-autoscaler binary:

    make build-in-docker
    
  2. Build the docker image:

    docker build -t {Image repository address}/{Organization name}/{Image name:tag} .
    

    For example:

    docker build -t swr.cn-north-4.myhuaweicloud.com/{Organization name}/cluster-autoscaler:dev .
    

    Follow the Pull/Push Image section of Interactive Walkthroughs under the SWR console to find the image repository address and organization name; also refer to My Images -> Upload Through Docker Client in the SWR console.

  3. Login to SWR:

    docker login -u {Encoded username} -p {Encoded password} {SWR endpoint}
    

    For example:

    docker login -u cn-north-4@ABCD1EFGH2IJ34KLMN -p 1a23bc45678def9g01hi23jk4l56m789nop01q2r3s4t567u89v0w1x23y4z5678 swr.cn-north-4.myhuaweicloud.com
    

    Follow the Pull/Push Image section of Interactive Walkthroughs under the SWR console to find the encoded username, encoded password and SWR endpoint; also refer to My Images -> Upload Through Docker Client in the SWR console.

  4. Push the docker image to SWR:

    docker push {Image repository address}/{Organization name}/{Image name:tag}
    

    For example:

    docker push swr.cn-north-4.myhuaweicloud.com/{Organization name}/cluster-autoscaler:dev
    
  5. For the cluster autoscaler to function normally, make sure the Sharing Type of the image is Public. If the cluster has trouble pulling the image, go to the SWR console and check whether the Sharing Type of the image is Private. If it is, click the Edit button at the top right and set the Sharing Type to Public.

Build Kubernetes Cluster on ECS

1. Install kubelet, kubeadm and kubectl

Please see the installation instructions here

For example:

  • OS: CentOS 8

  • Note: The following example should be run on an ECS instance that has access to the Google Container Registry (GCR)

    cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
    [kubernetes]
    name=Kubernetes
    baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-\$basearch
    enabled=1
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
    exclude=kubelet kubeadm kubectl
    EOF
    
    sudo setenforce 0
    sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
    
    sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
    
    sudo systemctl enable --now kubelet
    
2. Install Docker

Please see the installation instructions here

For example:

  • OS: CentOS 8

  • Note: The following example should be run on an ECS instance that has access to the Google Container Registry (GCR)

    sudo yum install -y yum-utils
    
    sudo yum-config-manager \
        --add-repo \
        https://download.docker.com/linux/centos/docker-ce.repo
    
    sudo yum install docker-ce docker-ce-cli containerd.io
    
    sudo systemctl start docker
    
3. Initialize Cluster

Create a kubeadm.yaml file with the following content:

apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 1.2.3.4
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  imagePullPolicy: IfNotPresent
  name: node
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.k8s.io
kind: ClusterConfiguration
kubernetesVersion: 1.22.0
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: cgroupfs

Note: replace the advertiseAddress with your ECS IP address, then run:

kubeadm init --config kubeadm.yaml

Modify the following two files and comment out the line - --port=0:

sudo vim /etc/kubernetes/manifests/kube-controller-manager.yaml
sudo vim /etc/kubernetes/manifests/kube-scheduler.yaml
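
A minimal sketch of the edited section of one of these manifests (kube-scheduler shown here; the other flags are omitted and vary by Kubernetes version) after commenting out the line:

spec:
  containers:
  - command:
    - kube-scheduler
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    # - --port=0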

Restart the kubelet service and set up kubectl access:

systemctl restart kubelet.service
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
4. Install Flannel Network
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
5. Generate Token
kubeadm token create --ttl 0

Generate a token that never expires. Remember this token since it will be used later.

Get the CA certificate hash. Remember this hash since it will be used later.

openssl x509 -in /etc/kubernetes/pki/ca.crt -noout -pubkey | openssl rsa -pubin -outform DER 2>/dev/null | sha256sum | cut -d' ' -f1
6. Create OS Image with K8S Tools
  • Launch a new ECS instance, and install kubeadm, kubectl and Docker.

    cat <<EOF > /etc/yum.repos.d/kubernetes.repo
    [kubernetes]
    name=Kubernetes
    baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
    enabled=1
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
    EOF
    
    sudo yum install -y kubeadm kubectl --disableexcludes=kubernetes
    
    sudo yum install -y yum-utils
    
    sudo yum-config-manager \
        --add-repo \
        http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
    
    sudo yum install docker-ce docker-ce-cli containerd.io
    
  • Create a script to join the new instance to the k8s cluster.

    cat <<'EOF' >/etc/rc.d/init.d/init-k8s.sh
    #!/bin/bash
    #chkconfig: 2345 80 90
    setenforce 0
    swapoff -a
    sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
    yum install -y kubelet
    sudo systemctl enable --now kubelet
    
    systemctl start docker
    systemctl enable docker.service
    kubeadm join --token $TOKEN $API_Server_EndPoint --discovery-token-ca-cert-hash sha256:$HASHKEY
    EOF
    

    Replace the $TOKEN with the one created above.

    Replace the $API_Server_EndPoint; it can be found in the kubeconfig file:

    cat ~/.kube/config
    
    # server: https://192.168.0.239:6443
    # the API_Server_EndPoint is 192.168.0.239:6443
    
  • Add this script to chkconfig so that it runs automatically after the instance starts.

    chmod +x /etc/rc.d/init.d/init-k8s.sh
    chkconfig --add /etc/rc.d/init.d/init-k8s.sh
    
  • Copy ~/.kube/config from a control plane (previously referred to as master) node to ~/.kube/config on this ECS instance to set up kubectl on this instance.

  • Go to the Huawei Cloud Image Management Service and click Create Image. Select type System disk image, select your ECS instance as Source, give it a name, and create the image.

  • Remember this ECS instance ID since it will be used later.

7. Create AS Group
  • Follow the Huawei Cloud instructions to create an AS Group.
  • Create an AS Configuration and select the private image just created. Make sure the AS Configuration assigns an EIP automatically.
  • While creating the AS Configuration, add the following script into Advanced Settings.
    #!/bin/bash
    
    IDS=$(ls /var/lib/cloud/instances/)
    while true
    do
        for ID in $IDS
        do
            if [ $ID != $ECS_INSTANCE_ID ]; then
                /usr/bin/kubectl --kubeconfig ~/.kube/config patch node $HOSTNAME -p "{\"spec\":{\"providerID\":\"$ID\"}}"
            fi
        done
    sleep 30
    done
    
    Replace the $ECS_INSTANCE_ID with the ECS instance ID recorded earlier.
  • Bind the AS Group to this AS Configuration.
Deploy Cluster Autoscaler
Configure credentials

The autoscaler needs a ServiceAccount that is granted permissions to the cluster's resources, and a Secret that stores credentials (AK/SK in this case) for authenticating with Huawei Cloud.

Examples of ServiceAccount and Secret are provided in examples/cluster-autoscaler-svcaccount.yaml and examples/cluster-autoscaler-secret.yaml. Modify the Secret object yaml file with your credentials.

The following parameters are required in the Secret object yaml file:

  • as-endpoint

    Find the AS endpoint for different regions here.

    For example, for region cn-north-4, the endpoint is

    as.cn-north-4.myhuaweicloud.com
    
  • ecs-endpoint

    Find the ECS endpoint for different regions here.

    For example, for region cn-north-4, the endpoint is

    ecs.cn-north-4.myhuaweicloud.com
    
  • project-id

    Follow this link to find the project-id: Obtaining a Project ID

  • access-key and secret-key

    Create and find the Huawei cloud access-key and secret-key required by the Secret object yaml file by referring to Access Keys and My Credentials.
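
As an illustration, a filled-in Secret might look roughly like the sketch below. This is only a hypothetical outline: start from examples/cluster-autoscaler-secret.yaml and keep the metadata and key names defined there; the name, namespace, and all values shown here are placeholders.

apiVersion: v1
kind: Secret
metadata:
  name: cluster-autoscaler-cloud-config   # illustrative name; use the one from the example file
  namespace: kube-system
type: Opaque
stringData:
  as-endpoint: as.cn-north-4.myhuaweicloud.com
  ecs-endpoint: ecs.cn-north-4.myhuaweicloud.com
  project-id: "{your project id}"
  access-key: "{your access key}"
  secret-key: "{your secret key}"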

Configure deployment

An example deployment file is provided at examples/cluster-autoscaler-deployment.yaml. Change the image to the image you just pushed, the cluster-name to the cluster's ID, and nodes to your own node pool configuration in the format

{Minimum number of nodes}:{Maximum number of nodes}:{Node pool name}

The above parameters should match the parameters of the AS Group you created.

More configuration options can be added to the cluster autoscaler, such as scale-down-delay-after-add, scale-down-unneeded-time, etc. See available configuration options here.
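
As a sketch of how these settings might appear in the deployment's container spec, the snippet below shows the relevant args only. The flag names are standard cluster-autoscaler command-line options; the command path, image, cluster ID, and AS Group name are placeholders, and the full manifest should come from examples/cluster-autoscaler-deployment.yaml.

containers:
- name: cluster-autoscaler
  image: swr.cn-north-4.myhuaweicloud.com/{Organization name}/cluster-autoscaler:dev
  command:
  - ./cluster-autoscaler
  - --cloud-provider=huaweicloud
  - --cluster-name={your cluster id}
  - --nodes=1:5:{your AS Group name}   # {min}:{max}:{node pool name}
  # optional tuning flags
  - --scale-down-delay-after-add=10m
  - --scale-down-unneeded-time=10m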

Deploy cluster autoscaler on the cluster
  1. Log in to a machine which can manage the cluster with kubectl.

    Make sure the machine has kubectl access to the cluster.

  2. Create the Service Account:

    kubectl create -f cluster-autoscaler-svcaccount.yaml
    
  3. Create the Secret:

    kubectl create -f cluster-autoscaler-secret.yaml
    
  4. Create the cluster autoscaler deployment:

    kubectl create -f cluster-autoscaler-deployment.yaml
    
Testing

Now the cluster autoscaler should be successfully deployed on the cluster. Check it by executing

kubectl get pods -n kube-system

To see whether it functions correctly, deploy a Service to the cluster, then increase and decrease the workload on the Service. The cluster autoscaler should autoscale the AS Group to accommodate the load.

A simple testing method is as follows:

  • Create a Service listening for HTTP requests (a minimal sketch is given after this list)

  • Create an HPA policy for the pods to be autoscaled

    • Install a metrics server yourself and create an HPA policy by executing something like this:
      kubectl autoscale deployment [Deployment name] --cpu-percent=10 --min=1 --max=20
      
      The above command creates an HPA policy on the deployment with a target average CPU usage of 10%. The number of pods will grow if average CPU usage is above 10%, and will shrink otherwise. The min and max parameters set the minimum and maximum number of pods for this deployment.
  • Generate load on the above Service

    Example tools for generating load against an HTTP service are:

    • Use the hey command
    • Use busybox image:
      kubectl run -it --rm load-generator --image=busybox --restart=Never -- /bin/sh
      
      # send an infinite loop of queries to the service
      while true; do wget -q -O- {Service access address}; done
      

    Feel free to use other tools which have a similar function.

  • Wait for pods to be added: as load increases, more pods will be added by HPA

  • Wait for nodes to be added: when there are insufficient resources for additional pods, new nodes will be added to the cluster by the cluster autoscaler

  • Stop the load

  • Wait for pods to be removed: as load decreases, pods will be removed by HPA

  • Wait for nodes to be removed: as pods are removed from nodes, some nodes will become underutilized or empty, and will be removed by the cluster autoscaler
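
For the first step above, the Service could be as simple as the following sketch: a hypothetical nginx Deployment with a CPU request (which the HPA needs to compute CPU utilization), exposed through a ClusterIP Service. All names and values are illustrative.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: http-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: http-test
  template:
    metadata:
      labels:
        app: http-test
    spec:
      containers:
      - name: web
        image: nginx
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m   # required for the HPA's --cpu-percent target
---
apiVersion: v1
kind: Service
metadata:
  name: http-test
spec:
  selector:
    app: http-test
  ports:
  - port: 80
    targetPort: 80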

Support & Contact Info

Interested in Cluster Autoscaler on Huawei Cloud? Want to talk? Have questions, concerns or great ideas?

Please reach out to us at shiqi.wang1@huawei.com.

Documentation

Constants

const (
	// GPULabel is the label added to nodes with GPU resource.
	GPULabel = "cloud.google.com/gke-accelerator"
)

Variables

This section is empty.

Functions

func BuildHuaweiCloud

BuildHuaweiCloud is called by the autoscaler/cluster-autoscaler/builder to build a huaweicloud cloud provider.

Types

type AutoScalingGroup

type AutoScalingGroup struct {
	// contains filtered or unexported fields
}

AutoScalingGroup represents a HuaweiCloud 'Auto Scaling Group', which can also be treated as a node group.

func (*AutoScalingGroup) AtomicIncreaseSize

func (asg *AutoScalingGroup) AtomicIncreaseSize(delta int) error

AtomicIncreaseSize is not implemented.

func (*AutoScalingGroup) Autoprovisioned

func (asg *AutoScalingGroup) Autoprovisioned() bool

Autoprovisioned returns true if the node group is autoprovisioned. An autoprovisioned group was created by CA and can be deleted when scaled to 0.

Always returns false because the node group should be maintained by the user.

func (*AutoScalingGroup) Create

func (asg *AutoScalingGroup) Create() (cloudprovider.NodeGroup, error)

Create creates the node group on the cloud provider side. Implementation optional.

func (*AutoScalingGroup) Debug

func (asg *AutoScalingGroup) Debug() string

Debug returns a string containing all information regarding this node group.

func (*AutoScalingGroup) DecreaseTargetSize

func (asg *AutoScalingGroup) DecreaseTargetSize(delta int) error

DecreaseTargetSize decreases the target size of the node group. This function doesn't permit to delete any existing node and can be used only to reduce the request for new nodes that have not been yet fulfilled. Delta should be negative. It is assumed that cloud provider will not delete the existing nodes when there is an option to just decrease the target. Implementation required.

func (*AutoScalingGroup) Delete

func (asg *AutoScalingGroup) Delete() error

Delete deletes the node group on the cloud provider side. This will be executed only for autoprovisioned node groups, once their size drops to 0. Implementation optional.

func (*AutoScalingGroup) DeleteNodes

func (asg *AutoScalingGroup) DeleteNodes(nodes []*apiv1.Node) error

DeleteNodes deletes nodes from this node group. Error is returned either on failure or if the given node doesn't belong to this node group. This function should wait until node group size is updated. Implementation required.

func (*AutoScalingGroup) Exist

func (asg *AutoScalingGroup) Exist() bool

Exist checks if the node group really exists on the cloud provider side. Allows to tell the theoretical node group from the real one. Implementation required.

func (*AutoScalingGroup) ForceDeleteNodes

func (asg *AutoScalingGroup) ForceDeleteNodes(nodes []*apiv1.Node) error

ForceDeleteNodes deletes nodes from the group regardless of constraints.

func (*AutoScalingGroup) GetOptions

GetOptions returns NodeGroupAutoscalingOptions that should be used for this particular NodeGroup. Returning a nil will result in using default options.

func (*AutoScalingGroup) Id

func (asg *AutoScalingGroup) Id() string

Id returns a unique identifier of the node group.

func (*AutoScalingGroup) IncreaseSize

func (asg *AutoScalingGroup) IncreaseSize(delta int) error

IncreaseSize increases the size of the node group. To delete a node you need to explicitly name it and use DeleteNode. This function should wait until node group size is updated. Implementation required.

func (*AutoScalingGroup) MaxSize

func (asg *AutoScalingGroup) MaxSize() int

MaxSize returns maximum size of the node group.

func (*AutoScalingGroup) MinSize

func (asg *AutoScalingGroup) MinSize() int

MinSize returns minimum size of the node group.

func (*AutoScalingGroup) Nodes

func (asg *AutoScalingGroup) Nodes() ([]cloudprovider.Instance, error)

Nodes returns a list of all nodes that belong to this node group. It is required that Instance objects returned by this method have Id field set. Other fields are optional. This list should include also instances that might have not become a kubernetes node yet.

func (*AutoScalingGroup) String

func (asg *AutoScalingGroup) String() string

String dumps the current group's metadata.

func (*AutoScalingGroup) TargetSize

func (asg *AutoScalingGroup) TargetSize() (int, error)

TargetSize returns the current target size of the node group. It is possible that the number of nodes in Kubernetes is different at the moment but should be equal to Size() once everything stabilizes (new nodes finish startup and registration or removed nodes are deleted completely). Implementation required.

Target size is the desired instance number of the auto scaling group; it will not equal the current instance number while the auto scaling group is scaling up or down.

func (*AutoScalingGroup) TemplateNodeInfo

func (asg *AutoScalingGroup) TemplateNodeInfo() (*framework.NodeInfo, error)

TemplateNodeInfo returns a framework.NodeInfo structure of an empty (as if just started) node. This will be used in scale-up simulations to predict what would a new node look like if a node group was expanded. The returned NodeInfo is expected to have a fully populated Node object, with all of the labels, capacity and allocatable information as well as all pods that are started on the node by default, using manifest (most likely only kube-proxy). Implementation optional.

type AutoScalingService

type AutoScalingService interface {
	// ListScalingGroups lists all scaling groups.
	ListScalingGroups() ([]AutoScalingGroup, error)

	// GetDesireInstanceNumber gets the desired instance number of a specific auto scaling group.
	GetDesireInstanceNumber(groupID string) (int, error)

	// GetInstances gets the instances in an auto scaling group.
	GetInstances(groupID string) ([]cloudprovider.Instance, error)

	// IncreaseSizeInstance increases the instance number of a specific auto scaling group.
	// The delta should be non-negative.
	// IncreaseSizeInstance waits until the instance number is updated.
	IncreaseSizeInstance(groupID string, delta int) error

	// GetAsgForInstance returns auto scaling group for the given instance.
	GetAsgForInstance(instanceID string) (*AutoScalingGroup, error)

	// RegisterAsg registers an auto scaling group with the manager
	RegisterAsg(asg *AutoScalingGroup)

	// DeleteScalingInstances is used to delete instances from auto scaling group by instanceIDs.
	DeleteScalingInstances(groupID string, instanceIds []string) error
	// contains filtered or unexported methods
}

AutoScalingService represents the auto scaling service interfaces. It should contain all requests against the auto scaling service.

type CloudConfig

type CloudConfig struct {
	Global struct {
		ECSEndpoint string `gcfg:"ecs-endpoint"`
		ASEndpoint  string `gcfg:"as-endpoint"`
		ProjectID   string `gcfg:"project-id"`
		AccessKey   string `gcfg:"access-key"`
		SecretKey   string `gcfg:"secret-key"`
	}
}

CloudConfig is the cloud config file for huaweicloud.

type CloudServiceManager

type CloudServiceManager interface {
	// ElasticCloudServerService represents the elastic cloud server interfaces.
	ElasticCloudServerService

	// AutoScalingService represents the auto scaling service interfaces.
	AutoScalingService
}

CloudServiceManager represents the cloud service interfaces. It should contain all requests against cloud services.

type ElasticCloudServerService

type ElasticCloudServerService interface {
	// DeleteServers deletes a group of servers by ID.
	DeleteServers(serverIDs []string) error
}

ElasticCloudServerService represents the elastic cloud server interfaces. It should contain all requests against the elastic cloud server service.
