noderesourcetopology

package

Version: v0.20.11 (latest) · Published: Oct 1, 2021 · License: Apache-2.0 · Imports: 17 · Imported by: 0

Repository

github.com/muffin-rice/scheduler-plugins

README ¶

Overview

This folder holds the topology-aware scheduler plugin implementation based on the NodeResourceTopology CRD. The plugin enables scheduling decisions based on worker node hardware topology, overcoming the issue described here.

The document capturing the NodeResourceTopology API Custom Resource Definition standard can be found here.

Maturity Level

💡 Sample (for demonstrating and inspiring purpose)
👶 Alpha (used in companies for pilot projects)
👦 Beta (used in companies and developed actively)
👨 Stable (used in companies for production workloads)

Tutorial

Expectation

When the cumulative allocatable resources appear to be the same on both nodes in the cluster, the topology-aware scheduler plugin uses the CRD instances corresponding to the nodes to obtain resource topology information and make a topology-aware scheduling decision.

Config

Enable the "NodeResourceTopologyMatch" Filter and Score plugins via KubeSchedulerConfiguration.

apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: false
clientConnection:
  kubeconfig: "/etc/kubernetes/scheduler.conf"
profiles:
- schedulerName: topo-aware-scheduler
  plugins:
    filter:
      enabled:
      - name: NodeResourceTopologyMatch
    score:
      enabled:
      - name: NodeResourceTopologyMatch
# optional plugin configs
  pluginConfig:
  - name: NodeResourceTopologyMatch
    args:
      kubeconfigpath: "/etc/kubernetes/scheduler.conf"
      namespaces:
        - default
        - production
        - test-namespace
      # other strategies are MostAllocatable and BalancedAllocation
      scoringStrategy:
        type: "LeastAllocatable"

Demo

Let us assume we have two nodes in a cluster deployed with sample-device-plugin with the hardware topology described by the diagram below:

Setup

The hardware topology of both nodes is represented by the CRD instances below. These CRD instances are expected to be created by node agents such as Resource Topology Exporter (RTE) or Node Feature Discovery (NFD). Please refer to the issue Exposing Hardware Topology through CRDs in NFD and the Design document, which capture the details of enhancing NFD to expose node resource topology through CRDs. The NodeResourceTopology plugin works with namespaces, so each CRD instance can be namespace-specific. If no namespaces are given in the plugin's configuration, the default namespace is used.

# Worker Node A CRD spec
apiVersion: topology.node.k8s.io/v1alpha1
kind: NodeResourceTopology
metadata:
  name: worker-node-A
  namespace: test-namespace
topologyPolicies: ["SingleNUMANodeContainerLevel"]
zones:
  - name: numa-node-0
    type: Node
    resources:
      - name: cpu
        capacity: 4
        allocatable: 3
      - name: example.com/deviceA
        capacity: 1
        allocatable: 1
      - name: example.com/deviceB
        capacity: 2
        allocatable: 2
  - name: numa-node-1
    type: Node
    resources:
      - name: cpu
        capacity: 4
        allocatable: 3
      - name: example.com/deviceA
        capacity: 2
        allocatable: 2
      - name: example.com/deviceB
        capacity: 1
        allocatable: 1

# Worker Node B CRD spec
apiVersion: topology.node.k8s.io/v1alpha1
kind: NodeResourceTopology
metadata:
  name: worker-node-B
  namespace: test-namespace
topologyPolicies: ["SingleNUMANodeContainerLevel"]
zones:
  - name: numa-node-0
    type: Node
    resources:
      - name: cpu
        capacity: 4
        allocatable: 3
      - name: example.com/deviceA
        capacity: 3
        allocatable: 3
  - name: numa-node-1
    type: Node
    resources:
      - name: cpu
        capacity: 4
        allocatable: 3
      - name: example.com/deviceB
        capacity: 3
        allocatable: 3

Verify that the CRD has been created:
1. If NFD/RTE is deployed in the cluster, ensure that the CRD and CRD instances are created by running
```
  $ kubectl get noderesourcetopologies.topology.node.k8s.io
```
2. Alternatively, if you only want to test the scheduler plugin, use the manifests in the manifest directory to deploy the CRD and CRs as follows:
  1. Deploy the Custom Resource Definition manifest
```
$ kubectl create -f crd.yaml
```
  2. Check if the noderesourcetopologies.topology.node.k8s.io CRD is created
```
 $ kubectl get crd
 $ kubectl get noderesourcetopologies.topology.node.k8s.io
```
  3. Deploy the CRs that represent the hardware topology of worker-node-A and worker-node-B:
```
 $ kubectl create -f ns.yaml
 $ kubectl create -f worker-node-A.yaml
 $ kubectl create -f worker-node-B.yaml
```
    NOTE: If you are testing this demo by creating CRs manually, ensure that the CR names match the names of the nodes in the cluster.

Copy the cluster kubeconfig file to /etc/kubernetes/scheduler.conf.

Build the image locally:
```
$  make local-image
```

Push the built image to the image registry:

$ docker push <IMAGE_REGISTRY>/scheduler-plugins/kube-scheduler:latest

Deploy the topology-aware scheduler plugin config
```
$ kubectl  create -f scheduler-configmap.yaml
```

Deploy the Scheduler plugin

$ kubectl  create  -f cluster-role.yaml
serviceaccount/topo-aware-scheduler created
clusterrole.rbac.authorization.k8s.io/noderesourcetoplogy-handler created
clusterrolebinding.rbac.authorization.k8s.io/topo-aware-scheduler-as-kube-scheduler created
clusterrolebinding.rbac.authorization.k8s.io/my-scheduler-as-volume-scheduler created
rolebinding.rbac.authorization.k8s.io/topo-aware-scheduler-as-kube-scheduler created
clusterrolebinding.rbac.authorization.k8s.io/noderesourcetoplogy created

$ kubectl create -f deploy.yaml
deployment.apps/topo-aware-scheduler created

Check if the scheduler plugin is deployed correctly by running the following

$ kubectl get pods -n kube-system -o wide
NAME                                         READY   STATUS    RESTARTS   AGE   IP            NODE                 NOMINATED NODE   READINESS GATES
topo-aware-scheduler-764c475854-vpmcw        1/1     Running   0          2s    10.244.0.14   kind-control-plane   <none>           <none>

Deploy the pod to be scheduled by the topology-aware scheduler plugin by setting schedulerName: topo-aware-scheduler in the pod spec:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-deployment
spec:
  selector:
    matchLabels:
      name: test
  template:
    metadata:
      labels:
        name: test
    spec:
      schedulerName: topo-aware-scheduler
      containers:
      - name: test-deployment-1-container-1
        image: nginx:1.7.9
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            cpu: 1
            memory: 100Mi
            example.com/deviceA: 1
            example.com/deviceB: 1
          requests:
            cpu: 1
            memory: 100Mi
            example.com/deviceA: 1
            example.com/deviceB: 1

$ kubectl create -f test-deployment.yaml
deployment.apps/test-deployment created

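The test-deployment pod should be scheduled on worker-node-A: it is the only node where a single NUMA zone (either numa-node-0 or numa-node-1) can supply the requested cpu, example.com/deviceA, and example.com/deviceB together. On worker-node-B the two devices sit on different NUMA nodes.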

$ kubectl get pods -o wide
NAME                               READY   STATUS    RESTARTS   AGE     IP           NODE                 NOMINATED NODE   READINESS GATES
device-plugin-a-ds-9bpsj           1/1     Running   0          3h13m   172.17.0.3   worker-node-B          <none>           <none>
device-plugin-a-ds-dv55t           1/1     Running   0          3h13m   172.17.0.2   worker-node-A          <none>           <none>
device-plugin-b-ds-8t7lh           1/1     Running   0          3h13m   172.17.0.2   worker-node-A          <none>           <none>
device-plugin-b-ds-lt4pr           1/1     Running   0          3h13m   172.17.0.3   worker-node-B          <none>           <none>
test-deployment-6dccf65ddb-pkg9j   1/1     Running   0          18s     172.17.0.2   worker-node-A          <none>           <none>
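
The outcome above can be reproduced with a minimal Go sketch. It is not the plugin's implementation: the `zone` type and the `totals` and `fitsSingleNUMA` helpers are hypothetical stand-ins, and the check assumes a single-NUMA-node policy where one zone must satisfy every request of the container. The allocatable values are copied from the worker-node-A and worker-node-B CRs in the Setup section.

```go
package main

import "fmt"

// zone is a hypothetical stand-in for one NUMA zone of a NodeResourceTopology CR,
// mapping a resource name to its allocatable quantity.
type zone map[string]int64

// Allocatable values copied from the demo CRs.
var (
	nodeA = []zone{
		{"cpu": 3, "example.com/deviceA": 1, "example.com/deviceB": 2},
		{"cpu": 3, "example.com/deviceA": 2, "example.com/deviceB": 1},
	}
	nodeB = []zone{
		{"cpu": 3, "example.com/deviceA": 3},
		{"cpu": 3, "example.com/deviceB": 3},
	}
)

// totals sums the allocatable resources across all zones of a node.
func totals(zones []zone) map[string]int64 {
	sum := map[string]int64{}
	for _, z := range zones {
		for name, qty := range z {
			sum[name] += qty
		}
	}
	return sum
}

// fitsSingleNUMA reports whether at least one zone can satisfy every request on its own.
func fitsSingleNUMA(zones []zone, request map[string]int64) bool {
	for _, z := range zones {
		fits := true
		for name, qty := range request {
			if z[name] < qty { // a missing resource reads as 0
				fits = false
				break
			}
		}
		if fits {
			return true
		}
	}
	return false
}

func main() {
	// Requests of the test-deployment container; memory is left out because
	// the demo CRs do not report it per zone.
	request := map[string]int64{"cpu": 1, "example.com/deviceA": 1, "example.com/deviceB": 1}

	fmt.Println(totals(nodeA)) // same node-level totals as nodeB
	fmt.Println(totals(nodeB))
	fmt.Println("worker-node-A fits:", fitsSingleNUMA(nodeA, request)) // true
	fmt.Println("worker-node-B fits:", fitsSingleNUMA(nodeB, request)) // false
}
```

Both nodes report the same node-level totals (6 CPUs and 3 of each device), so only the per-zone view distinguishes them.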

Documentation ¶

Index ¶

Constants
func MakeTopologyResInfo(name, capacity, available string) topologyv1alpha1.ResourceInfo
func New(args runtime.Object, handle framework.Handle) (framework.Plugin, error)
type NUMANode
type NUMANodeList
type PolicyHandler
type PolicyHandlerMap
type TopologyMatch

Constants ¶

const (
	// Name is the name of the plugin used in the plugin registry and configurations.
	Name = "NodeResourceTopologyMatch"
)

Variables ¶

This section is empty.

Functions ¶

func MakeTopologyResInfo ¶

func MakeTopologyResInfo(name, capacity, available string) topologyv1alpha1.ResourceInfo

func New ¶

func New(args runtime.Object, handle framework.Handle) (framework.Plugin, error)

New initializes a new plugin and returns it.

Types ¶

type NUMANode ¶

type NUMANode struct {
	NUMAID    int
	Resources v1.ResourceList
}

type NUMANodeList ¶

type NUMANodeList []NUMANode

type PolicyHandler ¶

type PolicyHandler func(pod *v1.Pod, zoneMap topologyv1alpha1.ZoneList) *framework.Status

type PolicyHandlerMap ¶

type PolicyHandlerMap map[topologyv1alpha1.TopologyManagerPolicy]tmScopeHandler
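
A map keyed by topology policy suggests a dispatch-table design: look up the handler registered for each policy a node advertises and run it. The sketch below illustrates that pattern only. The `pod`, `zoneList`, and `status` types are hypothetical stand-ins for v1.Pod, topologyv1alpha1.ZoneList, and framework.Status, and skipping unknown policies is an assumption rather than documented behavior of this package.

```go
package main

import "fmt"

// Hypothetical stand-ins for v1.Pod, topologyv1alpha1.ZoneList and framework.Status.
type pod struct{ name string }
type zoneList []string
type status struct{ msg string }

// policyHandler mirrors the shape of PolicyHandler using the stand-in types.
type policyHandler func(p *pod, zones zoneList) *status

// handlers maps a policy name to the function that evaluates it.
var handlers = map[string]policyHandler{
	"SingleNUMANodeContainerLevel": func(p *pod, zones zoneList) *status {
		if len(zones) == 0 {
			return &status{msg: "no NUMA zones reported"}
		}
		return nil // nil means success, as with a nil *framework.Status
	},
}

// evaluate runs the handler registered for each policy and stops at the first failure.
// Policies without a registered handler are skipped (an assumption of this sketch).
func evaluate(policies []string, p *pod, zones zoneList) *status {
	for _, policy := range policies {
		h, ok := handlers[policy]
		if !ok {
			continue
		}
		if st := h(p, zones); st != nil {
			return st
		}
	}
	return nil
}

func main() {
	p := &pod{name: "test"}
	policies := []string{"SingleNUMANodeContainerLevel"}

	fmt.Println(evaluate(policies, p, zoneList{"numa-node-0"}) == nil) // true
	fmt.Println(evaluate(policies, p, nil).msg)                        // no NUMA zones reported
}
```

Keeping the handlers in a map means a new policy can be supported by registering one more function, without touching the evaluation loop.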

type TopologyMatch ¶

type TopologyMatch struct {
	// contains filtered or unexported fields
}

TopologyMatch is a plugin that runs a simplified version of the TopologyManager's admit handler.

func (*TopologyMatch) EventsToRegister ¶

func (tm *TopologyMatch) EventsToRegister() []framework.ClusterEvent

EventsToRegister returns the possible events that may make a Pod failed by this plugin schedulable. NOTE: if in-place-update (KEP 1287) gets implemented, then PodUpdate event should be registered for this plugin since a Pod update may free up resources that make other Pods schedulable.

func (*TopologyMatch) Filter ¶

func (tm *TopologyMatch) Filter(ctx context.Context, cycleState *framework.CycleState, pod *v1.Pod, nodeInfo *framework.NodeInfo) *framework.Status

Filter currently supports only the single-numa-node policy.

func (*TopologyMatch) Name ¶

func (tm *TopologyMatch) Name() string

Name returns name of the plugin. It is used in logs, etc.

func (*TopologyMatch) Score ¶

func (tm *TopologyMatch) Score(ctx context.Context, state *framework.CycleState, pod *v1.Pod, nodeName string) (int64, *framework.Status)

func (*TopologyMatch) ScoreExtensions ¶

func (tm *TopologyMatch) ScoreExtensions() framework.ScoreExtensions
