Intel RMD Operator
Kubernetes Operator designed to provision and manage Intel Resource Management Daemon (RMD) instances in a Kubernetes cluster.
Prerequisites
- Node Feature Discovery (NFD) should be deployed in the cluster before running the operator. Once NFD has applied labels to nodes with capabilities compatible with RMD, such as Intel L3 Cache Allocation Technology, the operator can deploy RMD on those nodes.
Note: NFD is recommended, but not essential. Node labels can also be applied manually (see the example after this list). See the NFD repo for a full list of feature labels.
- A working RMD container image from the RMD repo compatible with the RMD Operator (see compatibility table below).
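If NFD is not deployed, a suitable label can be applied by hand. A minimal sketch, assuming the RDT L3 cache allocation feature label used by recent NFD releases (the exact label key depends on the NFD version):

# label key shown is an assumption; check your NFD version's feature labels
kubectl label node <node-name> feature.node.kubernetes.io/cpu-rdt.RDTL3CA=true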
Compatibility
| RMD Version | RMD Operator Version |
|-------------|----------------------|
| v0.1        | N/A                  |
| v0.2        | v0.1                 |
| v0.3        | v0.2                 |
Setup
Debug Mode
To use the operator with RMD in debug mode, the port number in build/manifests/rmd-pod.yaml must be set to 8081 before building the operator. Debug mode is advised for testing only.
TLS Enablement
To use the operator with RMD with TLS enabled, the port number in build/manifests/rmd-pod.yaml must be set to 8443 before building the operator. The certificates provided in this repository are taken from the RMD repository and should be used for testing only. For production, users should generate their own certificates and replace the existing ones. The client certs for the RMD operator are stored in the following locations:
CA: build/certs/public/ca.pem
Public Key: build/certs/public/cert.pem
Private Key: build/certs/private/key.pem
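As a quick sanity check that TLS is configured correctly, these client certificates can be used to query the RMD API directly. A minimal sketch, with hostname as a placeholder for a node running RMD:

$ curl --cacert build/certs/public/ca.pem \
  --cert build/certs/public/cert.pem \
  --key build/certs/private/key.pem \
  https://hostname:8443/v1/workloads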
Build
Note: The operator deploys pods with the RMD container. The Dockerfile for this container is located on the RMD repo and is out of scope for this project.
The operator supports RMD v0.2 only.
The pod spec used by the operator to deploy the RMD container is located at build/manifests/rmd-pod.yaml. Alterations to the image name/tag should be made here.
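For example, to pull the RMD image from a private registry, the image field in that manifest would be updated. A sketch only; the registry path, tag, and container name are illustrative, and the surrounding fields of the actual manifest are omitted:

containers:
- name: rmd                                # container name is illustrative
  image: registry.example.com/rmd:v0.2     # replace with your registry/tag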
Build binaries and create docker images for the operator and the node agent:
make all
Note: The Docker images built are intel-rmd-operator:latest and intel-rmd-node-agent:latest. Once built, these images should be stored in a remote Docker repository for use throughout the cluster.
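For example, the images might be tagged and pushed as follows (the registry address is a placeholder):

docker tag intel-rmd-operator:latest registry.example.com/intel-rmd-operator:latest
docker push registry.example.com/intel-rmd-operator:latest
docker tag intel-rmd-node-agent:latest registry.example.com/intel-rmd-node-agent:latest
docker push registry.example.com/intel-rmd-node-agent:latest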
Deploy
The deploy directory contains all specifications for the required RBAC objects. These objects can be inspected and deployed individually or created all at once using rbac.yaml:
kubectl create -f deploy/rbac.yaml
Create RmdNodeState CRD:
kubectl create -f deploy/crds/intel.com_rmdnodestates_crd.yaml
Create RmdWorkloads CRD:
kubectl create -f deploy/crds/intel.com_rmdworkloads_crd.yaml
Create Operator Pod:
kubectl create -f deploy/operator.yaml
Note: For the operator to deploy and run RMD instances, an up-to-date RMD Docker image is required.
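Once created, the operator pod should reach the Running state; a quick check (the pod name suffix will differ):

kubectl get pods | grep intel-rmd-operator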
Custom Resource Definitions (CRDs)
RmdWorkload
The RmdWorkload custom resource is the object used to define a workload for RMD.
RmdWorkload objects can be created directly via the RmdWorkload spec or automatically via the pod spec.
Direct configuration affords the user more control over specific cores and specific nodes on which they wish to configure a particular RmdWorkload. This section describes the direct configuration approach.
Automatic configuration utilizes pod annotations and the intel.com/l3_cache_ways extended resource to create an RmdWorkload for the same CPUs that are allocated to the pod. The automatic configuration approach is described later. This approach has a number of limitations and is less stable than direct configuration.
Examples
See the samples directory for RmdWorkload templates.
Cache
See samples/rmd-workload-guaranteed-cache.yaml:

apiVersion: intel.com/v1alpha1
kind: RmdWorkload
metadata:
  name: rmdworkload-guaranteed-cache
spec:
  coreIds: ["0","1","2","3"]
  cache:
    max: 2
    min: 2
  nodes: ["worker-node-1", "worker-node-2"]
This workload requests cache from the guaranteed group for CPUs 0 to 3 on nodes "worker-node-1" and "worker-node-2". See intel/rmd for details on cache pools/groups.
Note: Replace "worker-node-1" and "worker-node-2" in the nodes field with the actual node name(s) you wish to target with your RmdWorkload spec.
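Node names in the cluster can be listed with:

kubectl get nodes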
Creating this workload is the equivalent of:
$ curl -H "Content-Type: application/json" --request POST --data \
'{"core_ids":["0","1","2","3"],
"cache" : {"max": 2, "min": 2 } }' \
https://hostname:port/v1/workloads
P-State
See samples/rmd-workload-guaranteed-cache-pstate.yaml:

apiVersion: intel.com/v1alpha1
kind: RmdWorkload
metadata:
  name: rmdworkload-guaranteed-cache-pstate
spec:
  coreIds: ["4","5","6","7"]
  cache:
    max: 2
    min: 2
  pstate:
    ratio: "3.0"
    monitoring: "on"
  nodes: ["worker-node-1", "worker-node-2"]
This workload expands on the previous example by adding P-State parameters (ratio and monitoring), which take effect when the RMD P-State plugin is enabled.
Creating this workload is the equivalent of:
$ curl -H "Content-Type: application/json" --request POST --data \
'{"core_ids":["4","5","6","7"],
"cache" : {"max": 2, "min": 2 },
"pstate" : {"ratio": 3.0, "monitoring" : "on"} }' \
https://hostname:port/v1/workloads
Create RmdWorkload
kubectl create -f samples/rmd-workload-guaranteed-cache.yaml
List RmdWorkloads
kubectl get rmdworkloads
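Illustrative output after creating the workload above (the AGE column will differ):

NAME                           AGE
rmdworkload-guaranteed-cache   30s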
Display a particular RmdWorkload:
kubectl describe rmdworkload rmdworkload-guaranteed-cache
Name:         rmdworkload-guaranteed-cache
Namespace:    default
API Version:  intel.com/v1alpha1
Kind:         RmdWorkload
Spec:
  Cache:
    Max:  2
    Min:  2
  Core Ids:
    0
    1
    2
    3
  Nodes:
    worker-node-1
    worker-node-2
Status:
  Workload States:
    worker-node-1:
      Cos Name:  0_1_2_3-guarantee
      Id:        2
      Response:  Success: 200
      Status:    Successful
    worker-node-2:
      Cos Name:  0_1_2_3-guarantee
      Id:        2
      Response:  Success: 200
      Status:    Successful
This displays the RmdWorkload object including the spec as defined above and the status of the workload. Here, the status shows that this workload was configured successfully on nodes "worker-node-1" and "worker-node-2".
Delete RmdWorkload
When the user deletes an RmdWorkload object, a delete request is sent to the RMD API on every RMD instance on which that RmdWorkload is configured.
kubectl delete rmdworkload rmdworkload-guaranteed-cache
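On each node in the workload's nodes field, this is roughly equivalent to a DELETE request against the RMD API using the workload ID reported in the status above (hostname and port are placeholders):

$ curl --request DELETE https://hostname:port/v1/workloads/2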
Note: If the user only wishes to delete the RmdWorkload from a specific node, that node should be removed from the RmdWorkload spec's "nodes" field and the updated RmdWorkload object then re-applied:
kubectl apply -f samples/rmd-workload-guaranteed-cache.yaml
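For example, to remove the workload from "worker-node-2" only, the spec's nodes field would be reduced to:

nodes: ["worker-node-1"]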
RmdNodeState
The RmdNodeState custom resource is created for each node in the cluster which has RMD running. The purpose of this object is to allow the user to view all running workloads on a particular node at any given time.
Each RmdNodeState object will be named according to its corresponding node (i.e. rmd-node-state-<node-name>).
List all RmdNodeStates on the cluster
kubectl get rmdnodestates
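Illustrative output for a two-node cluster (the AGE column will differ):

NAME                           AGE
rmd-node-state-worker-node-1   2d
rmd-node-state-worker-node-2   2d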
Display a particular RmdNodeState, for example that of worker-node-1:
kubectl describe rmdnodestate rmd-node-state-worker-node-1
Name:         rmd-node-state-worker-node-1
Namespace:    default
API Version:  intel.com/v1alpha1
Kind:         RmdNodeState
Spec:
  Node:      worker-node-1
  Node UID:  75d03574-6991-4292-8f16-af43a8bfa9a6
Status:
  Workloads:
    rmdworkload-guaranteed-cache:
      Cache Max:  2
      Cache Min:  2
      Core IDs:   0,1,2,3
      Cos Name:   0_1_2_3-guarantee
      ID:         1
      Origin:     REST
      Status:     Successful
    rmdworkload-guaranteed-cache-pstate:
      Cache Max:  2
      Cache Min:  2
      Core IDs:   4,5,6,7
      Cos Name:   4_5_6_7-guarantee
      ID:         2
      Origin:     REST
      Status:     Successful
This example displays the RmdNodeState for worker-node-1. It shows that this node currently has two RMD workloads configured successfully.
Pod Requesting Cache Ways
It is also possible for the operator to create an RmdWorkload automatically by interpreting resource requests and annotations in the pod spec.
Warning: Automatic creation of workloads may be unstable and is not recommended in production for the RMD Operator v0.1. However, testing and feedback is welcomed to help stabilize this approach for future releases.
Under this approach, the user creates a pod with a container requesting exclusive CPUs from the Kubelet CPU Manager and available cache ways. The pod must also contain RMD specific pod annotations to describe the desired RmdWorkload.
It is then the responsibility of the operator and the node agent to do the following:
- Extract the RMD related data passed to the pod spec by the user.
- Discover which CPUs have been allocated to the container by the CPU Manager.
- Create the RmdWorkload object based on this information.
The following criteria must be met in order for the operator to successfully create an RmdWorkload for a container based on the pod spec.
- The container must request the extended resource intel.com/l3_cache_ways.
- The container must also request exclusive CPUs from CPU Manager.
- Pod annotations pertaining to the container requesting cache ways must be prefixed with that container's name. See example and table below.
Example
See samples/pod-guaranteed-cache.yaml:

apiVersion: v1
kind: Pod
metadata:
  generateName: guaranteed-cache-pod-
  labels:
    name: nginx
  annotations:
    nginx1_cache_min: "2"
spec:
  containers:
  - name: nginx1
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: 3
        intel.com/l3_cache_ways: 2
      limits:
        memory: "64Mi"
        cpu: 3
        intel.com/l3_cache_ways: 2
This pod spec has one container requesting 3 exclusive CPUs and 2 cache ways. The number of cache ways requested is also interpreted as the value for max cache for the RmdWorkload. The min cache value is specified in the pod annotations. The naming convention for RMD workload related annotations must follow the table below.
Pod Annotations Naming Convention
Note: Annotations must be prefixed with the relevant container name as shown below.
| Specification      | Container Name | Required Annotation Name |
|--------------------|----------------|--------------------------|
| Min Cache          | nginx1         | nginx1_cache_min         |
| Policy             | nginx1         | nginx1_policy            |
| P-State Ratio      | nginx1         | nginx1_pstate_ratio      |
| P-State Monitoring | nginx1         | nginx1_pstate_monitoring |
Failure to follow the provided annotation naming convention will result in failure to create the desired workload.
Create Pod
kubectl create -f samples/pod-guaranteed-cache.yaml
Display RmdWorkload
If successful, the RmdWorkload will be created following the naming convention rmd-workload-<pod-name>:
kubectl describe rmdworkload rmd-workload-guaranteed-cache-pod-86676
Name:         rmd-workload-guaranteed-cache-pod-86676
Namespace:    default
API Version:  intel.com/v1alpha1
Kind:         RmdWorkload
Spec:
  Cache:
    Max:  2
    Min:  2
  Core Ids:
    1
    2
    49
  Nodes:
    worker-node-1
  Policy:
  Pstate:
    Monitoring:
    Ratio:
Status:
  Workload States:
    worker-node-1:
      Cos Name:  1_2_49-guarantee
      Id:        3
      Response:  Success: 200
      Status:    Successful
This output displays the RmdWorkload which has been created successfully based on the pod spec created above.
Note that CPUs 1, 2, and 49 have been allocated to the container by the CPU Manager. As this RmdWorkload was created automatically via the pod spec, the user has no control over which CPUs are used by the container.
In order to explicitly define which CPUs are to be allocated cache ways, the RmdWorkload must be created directly via the RmdWorkload spec and not the pod spec.
Delete Pod and RmdWorkload
When an RmdWorkload is created by the operator based on a pod spec, that pod object becomes the owner of the RmdWorkload object it creates. Therefore when a pod that owns an RmdWorkload is deleted, its RmdWorkload child is automatically garbage collected and thus removed from RMD.
kubectl delete pod rmd-workload-guaranteed-cache-pod-86676
Limitations in Creating RmdWorkloads via Pod Spec
- Only one container per pod may request L3 cache ways.
- Automatic configuration is only achievable with the native Kubernetes CPU Manager static policy (see the kubelet configuration sketch after this list).
- The user has no control over which CPUs are configured with the automatically created RmdWorkload policy as the CPU Manager is in charge of CPU allocation.
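The static policy mentioned above is enabled through the kubelet configuration. A minimal sketch, assuming the kubelet is configured via a KubeletConfiguration file; the reserved CPU values are illustrative:

kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
# Exclusive CPU allocation requires the static CPU Manager policy.
cpuManagerPolicy: static
# The static policy requires a non-zero CPU reservation for system daemons;
# the values below are illustrative.
kubeReserved:
  cpu: "1"
systemReserved:
  cpu: "1"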
Creating an RmdWorkload automatically via a pod spec is far less reliable than creating directly via an RmdWorkload spec. This is because the user no longer has the ability to explicitly define the specific CPUs on which the RmdWorkload will ultimately be configured.
CPU allocation for containers is the responsibility of the CPU Manager in Kubelet. As a result, the RmdWorkload will only be created after the pod is admitted. Once the RmdWorkload is created by the operator, the RmdWorkload information is sent to RMD in the form of an HTTPS post request.
Should the POST request to RMD fail at this point for any reason, the operator will then terminate the pod and, by association, the RmdWorkload.
To discover why the pod was terminated by the operator, it is necessary to check the operator pod's logs.
Example
kubectl logs intel-rmd-operator-6464fcfb94-4cvqn | grep guaranteed-cache-pod-9wbk9 -A 1
Example Output
{"level":"info","ts":1591601591.2043474,"logger":"controller_rmdworkload","msg":"Workload not found on RMD instance, create.","Request.Namespace":"default","Request.Name":"rmd-workload-guaranteed-cache-pod-2dnwh"} {"level":"error","ts":1591601591.2067816,"logger":"controller_rmdworkload.postWorkload","msg":"Failed to post workload to RMD","Response:":"Fail: Failed to validate workload. Reason: Workload validation in database failed. Details: CPU list 6 has been assigned\n","error":"Response status code error"...
Workflows
Direct Configuration: (workflow diagram)
Automatic Configuration: (workflow diagram)