coscheduling

package
v0.18.800
Published: Aug 28, 2020 License: Apache-2.0 Imports: 15 Imported by: 1

README

Overview

This folder holds the coscheduling plugin implementation, which follows the design "Lightweight coscheduling based on back-to-back queue sorting".

Maturity Level

  • 💡 Sample (for demonstrating and inspiring purpose)
  • 👶 Alpha (used in companies for pilot projects)
  • 👦 Beta (used in companies and developed actively)
  • 👨 Stable (used in companies for production workloads)

Tutorial

PodGroup

We use a special label named pod-group.scheduling.sigs.k8s.io/name to define a PodGroup. Pods that set this label and use the same value belong to the same PodGroup.

labels:
  pod-group.scheduling.sigs.k8s.io/name: nginx
  pod-group.scheduling.sigs.k8s.io/min-available: "2"

The scheduler sums the Running pods and the Waiting pods (assumed but not yet bound) of a PodGroup. If the sum is greater than or equal to minAvailable (the value of the pod-group.scheduling.sigs.k8s.io/min-available label), the Waiting pods are allowed to proceed to binding.
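
The check described above can be sketched in a few lines of Go. This is a minimal illustration, not the plugin's code: the groupState type and the quorumReached function are hypothetical names introduced here, and the counts are assumed to be already known to the scheduler.

```go
package main

import "fmt"

// groupState is a hypothetical snapshot of one PodGroup as seen by the scheduler.
type groupState struct {
	running      int // pods that are already running
	waiting      int // pods assumed on a node but not yet bound
	minAvailable int // value of the min-available label
}

// quorumReached reports whether running plus waiting pods reach minAvailable.
func quorumReached(g groupState) bool {
	return g.running+g.waiting >= g.minAvailable
}

func main() {
	// One waiting pod out of a required two: the group keeps waiting.
	fmt.Println(quorumReached(groupState{running: 0, waiting: 1, minAvailable: 2})) // false
	// One running and one waiting pod: the group reaches its quorum.
	fmt.Println(quorumReached(groupState{running: 1, waiting: 1, minAvailable: 2})) // true
}
```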

Pods in the same PodGroup with different priorities might lead to unintended behavior, so make sure that all Pods in a PodGroup have the same priority.

Expectation

  1. If two PodGroups with different priorities arrive, the PodGroup with the higher priority takes precedence.
  2. If two PodGroups with the same priority arrive while resources are limited, the PodGroup that was created first takes precedence.

Config

  1. The queueSort, permit and unreserve extension points must be enabled for Coscheduling.
  2. preFilter is an optional enhancement that reduces the overall scheduling time for the whole group. It checks the total number of pods belonging to the same PodGroup. If that total is less than minAvailable, the pod is rejected in preFilter and its scheduling cycle is aborted. Whether to enable preFilter depends on your workload: if the PodGroup's minAvailable is relatively small (for example, less than 5), you can disable it; if minAvailable is relatively large, enable it to reduce the overall scheduling time.

apiVersion: kubescheduler.config.k8s.io/v1alpha2
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: false
clientConnection:
  kubeconfig: "REPLACE_ME_WITH_KUBE_CONFIG_PATH"
profiles:
- schedulerName: default-scheduler
  plugins:
    queueSort:
      enabled:
        - name: Coscheduling
      disabled:
        - name: "*"
    preFilter:
      enabled:
        - name: Coscheduling
    permit:
      enabled:
        - name: Coscheduling
    unreserve:
      enabled:
        - name: Coscheduling

Demo

Suppose we have a cluster that can only fit 3 nginx pods. We create a ReplicaSet with replicas=6 and set minAvailable to 3.

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 6
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
        pod-group.scheduling.sigs.k8s.io/name: nginx
        pod-group.scheduling.sigs.k8s.io/min-available: "3"
    spec:
      containers:
      - name: nginx
        image: nginx
        resources:
          limits:
            cpu: 3000m
            memory: 500Mi
          requests:
            cpu: 3000m
            memory: 500Mi

$ kubectl get pods
NAME          READY   STATUS    RESTARTS   AGE
nginx-4jw2m   0/1     Pending   0          55s
nginx-4mn52   1/1     Running   0          55s
nginx-c9gv8   1/1     Running   0          55s
nginx-frm24   0/1     Pending   0          55s
nginx-hsflk   0/1     Pending   0          55s
nginx-qtj5f   1/1     Running   0          55s

If min-available is instead set to 4, all nginx pods stay in the Pending state, because the cluster's resources cannot satisfy minAvailable.

$ kubectl get pods
NAME          READY   STATUS    RESTARTS   AGE
nginx-4vqrk   0/1     Pending   0          3s
nginx-bw9nn   0/1     Pending   0          3s
nginx-gnjsv   0/1     Pending   0          3s
nginx-hqhhz   0/1     Pending   0          3s
nginx-n47r7   0/1     Pending   0          3s
nginx-n7vtq   0/1     Pending   0          3s

Documentation

Constants

const (
	// Name is the name of the plugin used in Registry and configurations.
	Name = "Coscheduling"
	// PodGroupName is the name of a pod group that defines a coscheduling pod group.
	PodGroupName = "pod-group.scheduling.sigs.k8s.io/name"
	// PodGroupMinAvailable specifies the minimum number of pods to be scheduled together in a pod group.
	PodGroupMinAvailable = "pod-group.scheduling.sigs.k8s.io/min-available"
)

Variables

This section is empty.

Functions

func GetPodGroupLabels

func GetPodGroupLabels(pod *v1.Pod) (string, int, error)

GetPodGroupLabels checks if the pod belongs to a PodGroup. If so, it returns the podGroupName and the minAvailable of the PodGroup. If not, it returns "" and 0.
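
To make the return contract concrete, here is a minimal sketch that works on a plain label map instead of a *v1.Pod. The podGroupFromLabels helper is hypothetical, and the conditions under which it returns an error (a missing, non-numeric, or non-positive min-available value) are assumptions made for illustration; the source does not state when GetPodGroupLabels returns an error.

```go
package main

import (
	"fmt"
	"strconv"
)

// Label keys copied from the package constants.
const (
	podGroupName         = "pod-group.scheduling.sigs.k8s.io/name"
	podGroupMinAvailable = "pod-group.scheduling.sigs.k8s.io/min-available"
)

// podGroupFromLabels is a hypothetical helper that mirrors the documented
// contract: "" and 0 for pods outside any PodGroup, otherwise name and minAvailable.
func podGroupFromLabels(labels map[string]string) (string, int, error) {
	name := labels[podGroupName]
	if name == "" {
		return "", 0, nil // not part of a PodGroup
	}
	raw, ok := labels[podGroupMinAvailable]
	if !ok {
		// Assumption: a group name without min-available is treated as an error.
		return "", 0, fmt.Errorf("pod group %q has no %s label", name, podGroupMinAvailable)
	}
	minAvailable, err := strconv.Atoi(raw)
	if err != nil {
		return "", 0, fmt.Errorf("parsing %s=%q: %w", podGroupMinAvailable, raw, err)
	}
	if minAvailable < 1 {
		// Assumption: values below 1 are rejected.
		return "", 0, fmt.Errorf("%s must be at least 1, got %d", podGroupMinAvailable, minAvailable)
	}
	return name, minAvailable, nil
}

func main() {
	labels := map[string]string{
		podGroupName:         "nginx",
		podGroupMinAvailable: "2",
	}
	name, minAvailable, err := podGroupFromLabels(labels)
	fmt.Println(name, minAvailable, err) // nginx 2 <nil>
}
```

Label values are always strings, which is why min-available is quoted in the manifests and has to be parsed before it can be compared with a pod count.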

func New

New initializes a new plugin and returns it.

Types

type Args

type Args struct {
	// PermitWaitingTimeSeconds is the wait timeout in seconds.
	PermitWaitingTimeSeconds int64
	// PodGroupGCIntervalSeconds is the period to run gc of PodGroup in seconds.
	PodGroupGCIntervalSeconds int64
	// If the deleted PodGroup stays longer than PodGroupExpirationTimeSeconds,
	// the PodGroup will be deleted from PodGroupInfos.
	PodGroupExpirationTimeSeconds int64
}

Args defines the scheduling parameters for Coscheduling plugin.
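
All three fields are whole seconds stored as int64, whereas Permit returns a time.Duration. A minimal sketch of the conversion follows; the args type is a local copy of the fields for illustration, the seconds helper is hypothetical, and the values are made-up examples rather than the plugin's defaults.

```go
package main

import (
	"fmt"
	"time"
)

// args is a local copy of the Args fields, used only for this illustration.
type args struct {
	PermitWaitingTimeSeconds      int64
	PodGroupGCIntervalSeconds     int64
	PodGroupExpirationTimeSeconds int64
}

// seconds converts a whole number of seconds into a time.Duration.
func seconds(n int64) time.Duration {
	return time.Duration(n) * time.Second
}

func main() {
	// Hypothetical values, not the plugin's defaults.
	a := args{
		PermitWaitingTimeSeconds:      10,
		PodGroupGCIntervalSeconds:     30,
		PodGroupExpirationTimeSeconds: 600,
	}
	fmt.Println(seconds(a.PermitWaitingTimeSeconds))      // 10s
	fmt.Println(seconds(a.PodGroupGCIntervalSeconds))     // 30s
	fmt.Println(seconds(a.PodGroupExpirationTimeSeconds)) // 10m0s
}
```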

type Coscheduling

type Coscheduling struct {
	// contains filtered or unexported fields
}

Coscheduling is a plugin that implements the mechanism of gang scheduling.

func (*Coscheduling) Less

func (cs *Coscheduling) Less(podInfo1, podInfo2 *framework.PodInfo) bool

Less is used to sort pods in the scheduling queue.

  1. Compare the priorities of Pods.
  2. Compare the initialization timestamps of PodGroups/Pods.
  3. Compare the keys of PodGroups/Pods, i.e., if two pods are tied at priority and creation time, the one without a PodGroup goes ahead of the one with a PodGroup.
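
The three comparisons can be sketched with a simplified comparator. This is an illustration under stated assumptions, not the plugin's implementation: queuedItem, less and demoOrder are hypothetical names, and the key is assumed to be an empty string for a pod without a PodGroup so that it sorts before any non-empty key.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// queuedItem is a hypothetical, simplified stand-in for a queued pod.
type queuedItem struct {
	name      string
	priority  int32
	timestamp time.Time // initialization time of the PodGroup, or of the pod if it has none
	key       string    // assumed: empty when the pod has no PodGroup
}

// less applies the three comparisons in order.
func less(a, b queuedItem) bool {
	if a.priority != b.priority {
		return a.priority > b.priority // higher priority first
	}
	if !a.timestamp.Equal(b.timestamp) {
		return a.timestamp.Before(b.timestamp) // earlier timestamp first
	}
	return a.key < b.key // empty key (no PodGroup) sorts first
}

// demoOrder sorts a small sample queue and returns the pod names in order.
func demoOrder() []string {
	t0 := time.Date(2020, time.August, 28, 0, 0, 0, 0, time.UTC)
	queue := []queuedItem{
		{"low", 1, t0, ""},
		{"late", 10, t0.Add(time.Second), "default/a"},
		{"grouped", 10, t0, "default/nginx"},
		{"solo", 10, t0, ""},
	}
	sort.SliceStable(queue, func(i, j int) bool { return less(queue[i], queue[j]) })
	names := make([]string, 0, len(queue))
	for _, q := range queue {
		names = append(names, q.name)
	}
	return names
}

func main() {
	fmt.Println(demoOrder()) // [solo grouped late low]
}
```

Because every pod of a PodGroup shares the group's priority and timestamp, a comparator of this shape keeps the members of one group adjacent in the queue.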

func (*Coscheduling) Name

func (cs *Coscheduling) Name() string

Name returns name of the plugin. It is used in logs, etc.

func (*Coscheduling) Permit

func (cs *Coscheduling) Permit(ctx context.Context, state *framework.CycleState, pod *v1.Pod, nodeName string) (*framework.Status, time.Duration)

Permit is the function invoked by the framework at the "Permit" extension point.

func (*Coscheduling) PreFilter

func (cs *Coscheduling) PreFilter(ctx context.Context, state *framework.CycleState, pod *v1.Pod) *framework.Status

PreFilter performs the following validations.

  1. Validate that the minAvailable values and priorities of all the pods in a PodGroup are the same.
  2. Validate whether the total number of pods belonging to the same PodGroup is less than minAvailable. If so, the scheduling process is interrupted immediately, so that a partially scheduled group does not hold system resources until a timeout. This reduces the overall scheduling time for the whole group.
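
The second validation amounts to counting the pods that carry the same group label and comparing the count with minAvailable. A minimal sketch, assuming the pods are available as plain label maps; countGroupMembers and enoughMembers are hypothetical helpers, not part of this package.

```go
package main

import "fmt"

// Label key copied from the package constants.
const podGroupNameLabel = "pod-group.scheduling.sigs.k8s.io/name"

// countGroupMembers counts the label sets that belong to the given PodGroup.
func countGroupMembers(podLabels []map[string]string, group string) int {
	n := 0
	for _, labels := range podLabels {
		if labels[podGroupNameLabel] == group {
			n++
		}
	}
	return n
}

// enoughMembers reports whether the group has at least minAvailable pods in total.
// When it returns false, the pod would be rejected before filtering starts.
func enoughMembers(podLabels []map[string]string, group string, minAvailable int) bool {
	return countGroupMembers(podLabels, group) >= minAvailable
}

func main() {
	pods := []map[string]string{
		{podGroupNameLabel: "nginx"},
		{podGroupNameLabel: "nginx"},
		{"app": "other"},
	}
	fmt.Println(enoughMembers(pods, "nginx", 2)) // true
	fmt.Println(enoughMembers(pods, "nginx", 3)) // false
}
```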

func (*Coscheduling) PreFilterExtensions

func (cs *Coscheduling) PreFilterExtensions() framework.PreFilterExtensions

PreFilterExtensions returns nil.

func (*Coscheduling) Unreserve

func (cs *Coscheduling) Unreserve(ctx context.Context, state *framework.CycleState, pod *v1.Pod, nodeName string)

Unreserve rejects all other Pods in the PodGroup when one of the pods in the group times out.

type PodGroupInfo

type PodGroupInfo struct {
	// contains filtered or unexported fields
}

PodGroupInfo is a wrapper to a PodGroup with additional information. A PodGroup's priority, timestamp and minAvailable are set according to the values of the PodGroup's first pod that is added to the scheduling queue.
