node-disk-manager
Node Disk Manager helps manage host disks, including disk partitioning and file system formatting.
Building
make
Running
./bin/node-disk-manager
Features
- Disk provisioning as Longhorn disks with a simple boolean.
- Disk formatting if needed with a simple boolean.
- Disk discovery, including existing block devices and hot-plugged disks.
- Supports multiple storage controllers (IDE/SATA/SCSI/Virtio).
- Supports virtual disks (a WWN on the disk is required for unique identification).
- Device mapper and LVM are not yet supported.
- The behaviour of multipath devices is undefined.
Architecture
The Node Disk Manager (a.k.a. NDM) is a simple Kubernetes controller,
following the famous controller pattern. It leverages Rancher's wrangler
framework to construct a controller.
NDM is a single binary built with Golang and designed as a Kubernetes DaemonSet.
You can find more information about how NDM is shipped with Harvester in this
Helm chart definition.
NDM has two main functionalities: disk discovery and disk provisioning. Each
is handled by dedicated components in this project. We'll discuss each topic
separately later. First, let us learn about the custom resource for NDM:
blockdevices.
blockdevices Custom Resource
A blockdevice is a Kubernetes custom resource (CR) that represents a
block device on a node. A blockdevice CR records lower-level information
about the block device from the operating system, for example, file system
status, mount point, and UUIDs. These details are all stored in
status.deviceStatus.
The name of a blockdevice is a global identifier across all nodes in the
cluster. For now, we recommend that any disk you want to provision have at
least a WWN. The WWN helps the system identify the blockdevice resource
globally and link it to the real block device in the operating system.
Besides the name field, the most important fields to know are
spec.fileSystem.provisioned and spec.fileSystem.forceFormatted. The former
indicates that the user expects the block device to be provisioned as a
Longhorn disk for further use. The latter indicates that NDM should format
the disk if it has not been formatted yet.
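The fields described above can be pictured as plain Go structs. This is only a sketch of the shape: the type names, the fields inside DeviceStatus, and the WantsProvisioning helper are assumptions for illustration and are not the project's actual API types.

```go
package main

import "fmt"

// FileSystemSpec mirrors the two spec.fileSystem fields named in this README.
// The Go type and field names are hypothetical.
type FileSystemSpec struct {
	Provisioned    bool // user expects the device to become a Longhorn disk
	ForceFormatted bool // NDM should format the device if not yet formatted
}

// DeviceStatus stands in for status.deviceStatus. The three fields are
// assumptions based on the examples given in the text.
type DeviceStatus struct {
	FileSystemType string
	MountPoint     string
	UUID           string
}

// BlockDevice is a simplified stand-in for the blockdevice CR.
type BlockDevice struct {
	Name           string // global identifier across the cluster
	FileSystem     FileSystemSpec
	DeviceStatus   DeviceStatus
	ProvisionPhase string // stands in for status.provisionPhase
}

// WantsProvisioning reports whether the user asked for Longhorn provisioning.
func WantsProvisioning(bd BlockDevice) bool {
	return bd.FileSystem.Provisioned
}

func main() {
	bd := BlockDevice{
		Name:       "example-device",
		FileSystem: FileSystemSpec{Provisioned: true, ForceFormatted: true},
	}
	fmt.Println(bd.Name, WantsProvisioning(bd))
}
```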
Disk Discovery
As a DaemonSet workload, each NDM instance takes charge of the disks on its own
node. Two components collect information about the disks on the node and
create, update, or delete the corresponding blockdevice CRs.
The first is the scanner. It scans all supported block devices on the system,
creates a blockdevice CR for each device that does not have one yet, and
deletes the CR of any device that has been removed from the system. For block
devices that need an update, it simply enqueues the blockdevice CR and lets
the blockdevice controller handle the update path, which prevents race
conditions. The scanner also rescans the system periodically so that the
controller can update device information when needed.
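A scan pass of this kind can be sketched as a diff between the devices found on the node and the CRs that already exist. The Plan type and PlanScan function below are hypothetical names, and the sketch simplifies by enqueueing every device that already has a CR rather than only those that changed.

```go
package main

import "fmt"

// Plan lists what one scan pass would do. All names here are hypothetical.
type Plan struct {
	Create  []string // devices on the system with no CR yet
	Delete  []string // CRs whose device is gone from the system
	Enqueue []string // existing CRs handed to the controller for update
}

// PlanScan compares device names found on the node with existing CR names.
func PlanScan(found, existing []string) Plan {
	foundSet := make(map[string]bool, len(found))
	for _, name := range found {
		foundSet[name] = true
	}
	existingSet := make(map[string]bool, len(existing))
	for _, name := range existing {
		existingSet[name] = true
	}

	var p Plan
	for _, name := range found {
		if existingSet[name] {
			// The scanner never updates a CR itself; it only enqueues it.
			p.Enqueue = append(p.Enqueue, name)
		} else {
			p.Create = append(p.Create, name)
		}
	}
	for _, name := range existing {
		if !foundSet[name] {
			p.Delete = append(p.Delete, name)
		}
	}
	return p
}

func main() {
	p := PlanScan([]string{"disk-a", "disk-b"}, []string{"disk-b", "disk-c"})
	fmt.Printf("create=%v enqueue=%v delete=%v\n", p.Create, p.Enqueue, p.Delete)
}
```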
The other key component is udev, which uses Linux's dynamic device management
mechanism. As a supplement to the scanner, udev behaves mostly the same way
but responds immediately to hot-plugged devices.
There is also a filter module. It comprises several filter functions, each
with its own predicate, that determine which block devices the scanner and
udev should collect.
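One way to picture such a filter chain is a list of predicate functions applied to each device. The Device and Filter types and the two example predicates are assumptions for this sketch; the project's real filters and their criteria may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Device is a minimal, hypothetical view of a block device.
type Device struct {
	Path string
}

// Filter returns true when the device should be excluded from collection.
type Filter func(Device) bool

// excludeLoop drops loop devices. Illustrative predicate only.
func excludeLoop(d Device) bool {
	return strings.HasPrefix(d.Path, "/dev/loop")
}

// excludeDeviceMapper drops device mapper nodes, which the feature list
// above describes as not yet supported. Illustrative predicate only.
func excludeDeviceMapper(d Device) bool {
	return strings.HasPrefix(d.Path, "/dev/dm-")
}

// Collect keeps the devices that no filter excludes.
func Collect(devs []Device, filters ...Filter) []Device {
	var kept []Device
	for _, d := range devs {
		excluded := false
		for _, f := range filters {
			if f(d) {
				excluded = true
				break
			}
		}
		if !excluded {
			kept = append(kept, d)
		}
	}
	return kept
}

func main() {
	devs := []Device{{"/dev/sda"}, {"/dev/loop0"}, {"/dev/dm-0"}}
	for _, d := range Collect(devs, excludeLoop, excludeDeviceMapper) {
		fmt.Println(d.Path)
	}
}
```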
Disk Provisioning
The NDM controller listens for changes to blockdevice CRs and performs the
corresponding actions, namely:
- Format disk
- Mount/Unmount filesystem
- Provision/Unprovision disk to/from Longhorn
- Update device status details
Which action is actually performed is determined by the combination of
spec.fileSystem, the device's formatting and mounting status, and
status.provisionPhase. The last one indicates whether the block device is
currently used by Longhorn.
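The decision can be thought of as a function from the desired and observed state to a single next action. The sketch below is an assumption about how such a decision might be ordered (format, then mount, then provision, with the reverse for removal); the State type, NextAction function, and action names are hypothetical and do not describe the controller's actual logic.

```go
package main

import "fmt"

// State combines the inputs named in the text. Field names are hypothetical.
type State struct {
	Provisioned    bool // spec.fileSystem.provisioned
	ForceFormatted bool // spec.fileSystem.forceFormatted
	Formatted      bool // device already carries a filesystem
	Mounted        bool // filesystem is currently mounted
	InLonghorn     bool // derived from status.provisionPhase
}

// NextAction picks one action for the current state. The ordering of the
// cases is an assumption made for this sketch.
func NextAction(s State) string {
	switch {
	case s.Provisioned && s.ForceFormatted && !s.Formatted:
		return "format"
	case s.Provisioned && !s.Mounted:
		return "mount"
	case s.Provisioned && !s.InLonghorn:
		return "provision"
	case !s.Provisioned && s.InLonghorn:
		return "unprovision"
	case !s.Provisioned && s.Mounted:
		return "unmount"
	default:
		return "update-status"
	}
}

func main() {
	s := State{Provisioned: true, ForceFormatted: true}
	fmt.Println(NextAction(s))
	s.Formatted = true
	fmt.Println(NextAction(s))
}
```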
To avoid race conditions, the controller must be the only component that
updates an existing blockdevice CR. Other components that need an update must
enqueue the CR instead.
Appendix
We recommend that users test NDM with a SCSI device that has a WWN.
The following sample libvirt XML creates a SCSI device with a WWN.
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/tmp/libvirt_disks/harvester_harvester-node-0-sda.qcow2'/>
  <target dev='sda' bus='scsi'/>
  <wwn>0x5000c50015ac3bd9</wwn>
</disk>
NOTE: If the disk is created without a WWN, NDM uses the filesystem UUID as the unique identifier.
This has some limitations. For example, the UUID is lost if the filesystem metadata is corrupted.
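The fallback described in the note can be sketched as a small helper that prefers the WWN and uses the filesystem UUID only when no WWN is present. The DeviceID name and the empty-string convention for a missing value are assumptions made for illustration.

```go
package main

import "fmt"

// DeviceID picks a unique identifier for a device: the WWN when present,
// otherwise the filesystem UUID. An empty string means "not available".
// The function name and signature are hypothetical.
func DeviceID(wwn, fsUUID string) (id string, ok bool) {
	if wwn != "" {
		return wwn, true
	}
	if fsUUID != "" {
		return fsUUID, true
	}
	// Neither identifier is available, e.g. no WWN and broken metadata.
	return "", false
}

func main() {
	id, ok := DeviceID("0x5000c50015ac3bd9", "")
	fmt.Println(id, ok)
	id, ok = DeviceID("", "")
	fmt.Println(id, ok)
}
```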
License
Copyright (c) 2022 Rancher Labs, Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.