pci-pt-operator

Version: v0.0.0-...-1a9a9d0 (latest) | Published: Aug 3, 2022 | License: Apache-2.0 | Imports: 12 | Imported by: 0

Repository

github.com/harvester/pci-pt-operator

README

PCI Passthrough Operator

PCI Passthrough Operator is a Kubernetes operator that:

Discovers PCI Devices for nodes in your cluster and
Allows users to prepare devices for PCI Passthrough, for use with KubeVirt-managed virtual machines.

API

This operator introduces these CRDs:

PCIDevice
PCIDeviceClaim

PCIDevice

This custom resource represents PCI Devices on the host. The motivation behind getting a list of PCIDevice objects for a node is to have a cloud-native equivalent to the lspci command.

For example, suppose I have a 3-node cluster:

NAME     STATUS   ROLES                       AGE   VERSION
node1    Ready    control-plane,etcd,master   26h   v1.24.3+k3s1
node2    Ready    control-plane,etcd,master   26h   v1.24.3+k3s1
node3    Ready    control-plane,etcd,master   26h   v1.24.3+k3s1

If I wanted to see which PCI devices were on node1, I would have to ssh in and run lspci:

user@host % ssh node1
user@node1 ~$ lspci
00:1c.0 PCI bridge: Intel Corporation Device 06b8 (rev f0)
00:1c.7 PCI bridge: Intel Corporation Device 06bf (rev f0)
00:1d.0 PCI bridge: Intel Corporation Comet Lake PCI Express Root Port #9 (rev f0)
00:1f.0 ISA bridge: Intel Corporation Device 068e

But as more nodes are added to the cluster, this kind of manual work gets tedious. The solution is to have a DaemonSet that runs lspci on each node and then synchronizes the results with the rest of the cluster.
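
As a rough illustration of the per-node discovery step, here is a minimal Go sketch that parses the default lspci output shown above into records. The Device type and the parseLspci function are hypothetical names made up for this example; they are not taken from the operator's source, which may discover devices in a different way.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Device is a hypothetical in-memory record for one line of lspci output.
type Device struct {
	Address     string // e.g. "00:1c.0"
	Class       string // e.g. "PCI bridge"
	Description string // e.g. "Intel Corporation Device 06b8 (rev f0)"
}

// parseLspci parses the default (non-verbose) lspci format,
// "<address> <class>: <description>". Lines that do not match are skipped.
func parseLspci(out string) []Device {
	var devices []Device
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		addr, rest, ok := strings.Cut(line, " ")
		if !ok {
			continue
		}
		class, desc, ok := strings.Cut(rest, ": ")
		if !ok {
			continue
		}
		devices = append(devices, Device{Address: addr, Class: class, Description: desc})
	}
	return devices
}

func main() {
	out := "00:1c.0 PCI bridge: Intel Corporation Device 06b8 (rev f0)\n" +
		"00:1f.0 ISA bridge: Intel Corporation Device 068e\n"
	for _, d := range parseLspci(out) {
		fmt.Printf("%s | %s | %s\n", d.Address, d.Class, d.Description)
	}
}
```

A daemon on each node could run a parser like this periodically and publish one PCIDevice object per record.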

CRD

The PCIDevice CR looks like this:

apiVersion: devices.harvesterhci.io/v1beta1
kind: PCIDevice
metadata:
  name: pcidevice-sample
status:
  address: "00:1f.6"
  vendorId: "8086"
  deviceId: "0d4c"
  node:
    systemUUID: "30363150-3530-584d-5132-303730435a33"
    name: "titan"
  description: "Ethernet controller: Intel Corporation Ethernet Connection (11) I219-LM"
  kernelDriverInUse: "e1000e"
  kernelModules:
  - "e1000e"

PCIDeviceClaim

This custom resource stores a request to prepare a device for PCI Passthrough. It carries the device's pciAddress and an identifier for the node (shown as nodeName in the sample below), since each request is unique to a device on a particular node.

CRD

The PCIDeviceClaim CR looks like this:

apiVersion: devices.harvesterhci.io/v1beta1
kind: PCIDeviceClaim
metadata:
  name: pcideviceclaim-sample
spec:
  pciAddress: "00:1f.6"
  nodeName:  "titan"
status:
  result: Succeeded

A PCIDeviceClaim is created with the PCI address of the device that the user wants to prepare for PCI Passthrough. status.result is set to InProgress while preparation is under way, and then to either Succeeded or Failed, depending on whether the device was prepared successfully and is now bound to the vfio-pci driver.

Controllers

A DaemonSet runs the PCIDevice controller on each node. The controller reconciles the stored list of PCI devices for that node against the devices actually present on it.
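
Reconciling a stored list against the actual one boils down to a three-way diff. The following Go sketch shows the idea under simplifying assumptions: each device is reduced to a map entry from PCI address to driver name, and the diff function is a hypothetical helper, not the controller's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// diff compares the stored devices with the ones actually present on a node.
// Both maps are keyed by PCI address and hold the driver currently in use.
// It returns the addresses to create, update and remove, each sorted.
func diff(stored, actual map[string]string) (create, update, remove []string) {
	for addr, drv := range actual {
		old, ok := stored[addr]
		switch {
		case !ok:
			create = append(create, addr) // present on the node, not yet stored
		case old != drv:
			update = append(update, addr) // stored, but the driver changed
		}
	}
	for addr := range stored {
		if _, ok := actual[addr]; !ok {
			remove = append(remove, addr) // stored, but gone from the node
		}
	}
	sort.Strings(create)
	sort.Strings(update)
	sort.Strings(remove)
	return create, update, remove
}

func main() {
	stored := map[string]string{"00:1c.0": "pcieport", "00:1f.6": "e1000e", "00:02.0": "i915"}
	actual := map[string]string{"00:1c.0": "pcieport", "00:1f.6": "vfio-pci", "00:14.0": "xhci_hcd"}
	c, u, r := diff(stored, actual)
	fmt.Println("create:", c, "update:", u, "remove:", r)
}
```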

The PCIDeviceClaim controller will process the requests by attempting to set up devices for PCI Passthrough. The steps involved are:

Load vfio-pci kernel module
Unbind current driver from device
Create a driver_override for the device
Bind the vfio-pci driver to the device
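
On Linux, the last three steps are typically done by writing to files under /sys/bus/pci. The Go sketch below builds that sequence of writes as a plan, so it can be inspected without touching the host. The names sysfsWrite, vfioBindPlan and apply are hypothetical, and the sketch assumes the full domain-qualified address (for example 0000:00:1f.6) that sysfs uses. Loading the module (step 1, for example with modprobe vfio-pci) is left out. This is not necessarily how the operator implements the steps.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// sysfsWrite is one "write Value to Path" operation.
type sysfsWrite struct {
	Path  string
	Value string
}

// vfioBindPlan returns the ordered sysfs writes that unbind the current
// driver, set driver_override, and bind vfio-pci for the device at addr.
// sysfsRoot is normally "/sys"; it is a parameter so the plan can be tested.
func vfioBindPlan(sysfsRoot, addr string) []sysfsWrite {
	dev := filepath.Join(sysfsRoot, "bus/pci/devices", addr)
	return []sysfsWrite{
		{Path: filepath.Join(dev, "driver/unbind"), Value: addr},
		{Path: filepath.Join(dev, "driver_override"), Value: "vfio-pci"},
		{Path: filepath.Join(sysfsRoot, "bus/pci/drivers/vfio-pci/bind"), Value: addr},
	}
}

// apply performs the writes in order and stops at the first error.
// It needs root privileges on a real host and is not called in main.
func apply(plan []sysfsWrite) error {
	for _, w := range plan {
		if err := os.WriteFile(w.Path, []byte(w.Value), 0o200); err != nil {
			return fmt.Errorf("write %q to %s: %w", w.Value, w.Path, err)
		}
	}
	return nil
}

func main() {
	// Dry run: print the plan instead of applying it.
	for _, w := range vfioBindPlan("/sys", "0000:00:1f.6") {
		fmt.Printf("echo %s > %s\n", w.Value, w.Path)
	}
}
```

Keeping the plan separate from its execution makes the ordering easy to review and lets a controller log exactly what it is about to do.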

Once the device is confirmed to have been bound to vfio-pci, the PCIDeviceClaim controller will delete the request.

The PCIDevice controller will pick up the newly active driver automatically, as part of its normal operation.

Alternatives considered

Node Feature Discovery

NFD detects all kinds of features, like CPU features, USB devices, PCI devices, etc. It needs to be configured, and the output is a node label that tells whether a given device is present or not.

This only detects the presence or absence of a device, not how many of them there are.

  "feature.node.kubernetes.io/pci-<device label>.present": "true",

Another reason not to use these simple labels is that we want our customers to be able to set custom RBAC rules that restrict who can use which device in the cluster. We can do that with a custom PCIDevice CRD, but it's not clear how to do that with node labels.

Source Files

main.go

Directories

Path	Synopsis
cmd
pci-devices-daemon
controllers
pkg
apis/pcidevices/v1beta1
client/clientset/versioned	This package has the automatically generated clientset.
client/clientset/versioned/fake	This package has the automatically generated fake clientset.
client/clientset/versioned/scheme	This package contains the scheme of the automatically generated clientset.
client/clientset/versioned/typed/devices.harvesterhci.io/v1beta1	This package has the automatically generated typed clients.
client/clientset/versioned/typed/devices.harvesterhci.io/v1beta1/fake	Package fake has the automatically generated clients.
daemon	An instance of PCI Devices Daemon will run on every node.