intel-resource-drivers-for-kubernetes

Version: v0.4.0
Published: Apr 1, 2024 License: Apache-2.0

Intel GPU resource driver

CAUTION: This is beta, non-production software. Do not use it on production clusters.

Glossary

About resource driver

The Intel GPU resource driver is a better alternative to the Intel GPU device plugin. It facilitates workload offloading by providing GPU access on Kubernetes cluster worker nodes.

Supported GPU devices (with the Linux kernel Intel i915 GPU driver):

  • Intel® Data Center GPU Max Series
  • Intel® Data Center GPU Flex Series
  • Intel® Arc A-Series
  • Intel® Iris® Xe MAX
  • Intel® Integrated graphics

About Dynamic Resource Allocation

Dynamic Resource Allocation (DRA) is a resource management framework in Kubernetes (1.26+) that lets vendor-provided resource drivers (typically a controller and a node agent / kubelet plugin) manage special cluster resources (typically hardware accelerators) in a common way.

Resource drivers handle discovery, allocation, and accounting of specific resources. They also prepare those resources for a Pod before it starts, and clean up after the Pod has completed successfully and the resource is no longer needed. More information is available in the KEP.

The Intel GPU resource driver consists of a controller and a kubelet plugin. The controller makes allocation decisions, and the kubelet plugin ensures that the allocated GPUs and SR-IOV Virtual Functions are prepared and available for Pods.
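
The split of responsibilities described above can be illustrated with a toy model. This is only a conceptual sketch, not the driver's code: the types `gpu`, `controller`, and `kubeletPlugin`, their methods, and the `example.com/gpu=<uid>` device name are all hypothetical, and the real driver talks to the Kubernetes API rather than to in-memory structs.

```go
package main

import (
	"errors"
	"fmt"
)

// gpu is a hypothetical inventory record for one discovered device.
type gpu struct {
	UID       string
	Node      string
	ClaimedBy string // empty while the device is free
}

// controller models the component that makes allocation decisions.
type controller struct {
	inventory []*gpu
}

// allocate picks the first free GPU and records which claim owns it.
func (c *controller) allocate(claim string) (*gpu, error) {
	for _, g := range c.inventory {
		if g.ClaimedBy == "" {
			g.ClaimedBy = claim
			return g, nil
		}
	}
	return nil, errors.New("no free GPU available")
}

// deallocate releases every GPU owned by the claim (accounting cleanup).
func (c *controller) deallocate(claim string) {
	for _, g := range c.inventory {
		if g.ClaimedBy == claim {
			g.ClaimedBy = ""
		}
	}
}

// kubeletPlugin models the node-local component that prepares devices.
type kubeletPlugin struct {
	node     string
	prepared map[string]string // claim -> device name handed to the runtime
}

// prepare makes an allocated GPU available before the Pod starts and
// returns a CDI-style device name (hypothetical vendor and class).
func (p *kubeletPlugin) prepare(claim string, g *gpu) (string, error) {
	if g.Node != p.node {
		return "", fmt.Errorf("GPU %s is on node %s, not %s", g.UID, g.Node, p.node)
	}
	device := "example.com/gpu=" + g.UID
	p.prepared[claim] = device
	return device, nil
}

// unprepare cleans up once the Pod no longer needs the device.
func (p *kubeletPlugin) unprepare(claim string) {
	delete(p.prepared, claim)
}

func main() {
	ctrl := &controller{inventory: []*gpu{{UID: "gpu-0", Node: "node-a"}}}
	plugin := &kubeletPlugin{node: "node-a", prepared: map[string]string{}}

	g, err := ctrl.allocate("claim-1")
	if err != nil {
		fmt.Println("allocation failed:", err)
		return
	}
	device, err := plugin.prepare("claim-1", g)
	if err != nil {
		fmt.Println("prepare failed:", err)
		return
	}
	fmt.Println("prepared device:", device)

	// After the Pod finishes, both sides release their state.
	plugin.unprepare("claim-1")
	ctrl.deallocate("claim-1")
}
```

The point of the sketch is the ordering: allocation is decided centrally, preparation happens on the node that owns the device, and cleanup undoes both steps.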

Requirements

  • Kubernetes 1.28+, with the DynamicResourceAllocation feature gate enabled, and other cluster parameters
  • The container runtime needs to support CDI:
    • CRI-O v1.23.0 or later
    • containerd v1.7 or later (any release candidate will do)
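
The minimum versions above can be checked programmatically. The following is a minimal sketch using only the Go standard library; the helper names `parseVersion` and `atLeast` and the `minimums` map are hypothetical and not part of the driver. Pre-release suffixes are deliberately ignored, on the assumption that this matches the note that any containerd v1.7 release candidate is acceptable.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns strings such as "v1.28.3" or "1.7" into numeric parts.
// Pre-release and build suffixes (for example "-rc.1") are dropped.
func parseVersion(v string) ([]int, error) {
	v = strings.TrimPrefix(v, "v")
	if i := strings.IndexAny(v, "-+"); i >= 0 {
		v = v[:i]
	}
	fields := strings.Split(v, ".")
	parts := make([]int, 0, len(fields))
	for _, f := range fields {
		n, err := strconv.Atoi(f)
		if err != nil {
			return nil, fmt.Errorf("invalid version component %q: %w", f, err)
		}
		parts = append(parts, n)
	}
	return parts, nil
}

// atLeast reports whether version have is greater than or equal to want.
// Missing components are treated as zero, so "1.7" equals "1.7.0".
func atLeast(have, want string) (bool, error) {
	h, err := parseVersion(have)
	if err != nil {
		return false, err
	}
	w, err := parseVersion(want)
	if err != nil {
		return false, err
	}
	for i := 0; i < len(h) || i < len(w); i++ {
		var a, b int
		if i < len(h) {
			a = h[i]
		}
		if i < len(w) {
			b = w[i]
		}
		if a != b {
			return a > b, nil
		}
	}
	return true, nil
}

// minimums mirrors the requirements list in this README.
var minimums = map[string]string{
	"kubernetes": "1.28",
	"cri-o":      "1.23.0",
	"containerd": "1.7",
}

func main() {
	// Example inputs only; real values would come from the cluster.
	checks := []struct{ component, version string }{
		{"kubernetes", "v1.29.2"},
		{"containerd", "v1.7.0-rc.1"},
		{"cri-o", "1.22.5"},
	}
	for _, c := range checks {
		ok, err := atLeast(c.version, minimums[c.component])
		if err != nil {
			fmt.Printf("%s: %v\n", c.component, err)
			continue
		}
		fmt.Printf("%s %s meets minimum %s: %t\n", c.component, c.version, minimums[c.component], ok)
	}
}
```

A check like this does not verify that the DynamicResourceAllocation feature gate is enabled or that CDI is switched on in the runtime configuration; those still need to be confirmed separately.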

Supported Kubernetes Versions

Supported Kubernetes versions are listed below:

| Branch | Kubernetes branch/version | Status |
|---|---|---|
| v0.1.0-beta | Kubernetes 1.26 branch v1.26.x | unsupported |
| v0.1.1-beta | Kubernetes 1.27 branch v1.27.x | unsupported |
| v0.2.0 | Kubernetes 1.28 branch v1.28.x | unsupported |
| v0.3.0 | Kubernetes 1.28+ | supported |
| v0.4.0 | Kubernetes 1.28+ | supported |
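
For tooling that needs to gate on this support matrix, the rows above can be encoded as data. This is a small sketch; the `release` type, the `supportMatrix` variable, and the `lookup` helper are hypothetical names chosen for illustration, and the values are copied from the table as published for v0.4.0.

```go
package main

import "fmt"

// release describes one row of the support matrix.
type release struct {
	Driver     string // driver branch / release
	Kubernetes string // Kubernetes branch or version it targets
	Supported  bool
}

// supportMatrix mirrors the table in this README.
var supportMatrix = []release{
	{"v0.1.0-beta", "1.26 (branch v1.26.x)", false},
	{"v0.1.1-beta", "1.27 (branch v1.27.x)", false},
	{"v0.2.0", "1.28 (branch v1.28.x)", false},
	{"v0.3.0", "1.28+", true},
	{"v0.4.0", "1.28+", true},
}

// lookup returns the row for a driver release and whether it was found.
func lookup(driver string) (release, bool) {
	for _, r := range supportMatrix {
		if r.Driver == driver {
			return r, true
		}
	}
	return release{}, false
}

func main() {
	for _, v := range []string{"v0.2.0", "v0.4.0", "v9.9.9"} {
		r, ok := lookup(v)
		if !ok {
			fmt.Printf("%s: not in the support matrix\n", v)
			continue
		}
		fmt.Printf("%s: Kubernetes %s, supported=%t\n", r.Driver, r.Kubernetes, r.Supported)
	}
}
```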

Documentation

Release process

The project's release cadence is quarterly. During the release process, an issue based on the release task template is created on GitHub to track progress.

Once the content is available in the main branch and validation passes, a release branch is created (e.g. release-v0.2.0). The HEAD of the release branch is also tagged with the corresponding tag (e.g. v0.2.0).

During release creation, the project's documentation, deployment files, and similar artifacts are updated to point to the newly created version.

Patch releases (e.g. 0.2.1) are made as needed, when there are security issues or minor fixes for a specific supported version. Fixes are always cherry-picked from the main branch to the release branches.

Directories

| Path | Synopsis |
|---|---|
| cmd | |
| pkg | |
| intel.com/resource/gpu/clientset/versioned/fake | This package has the automatically generated fake clientset. |
| intel.com/resource/gpu/clientset/versioned/scheme | This package contains the scheme of the automatically generated clientset. |
| intel.com/resource/gpu/clientset/versioned/typed/gpu/v1alpha2 | This package has the automatically generated typed clients. |
| intel.com/resource/gpu/clientset/versioned/typed/gpu/v1alpha2/fake | Package fake has the automatically generated clients. |
