Kubermatic operating-system-manager
Operating System Manager is responsible for creating and managing the required configurations for worker nodes in a Kubernetes cluster. It decouples operating system configurations into dedicated and isolable resources for better modularity and maintainability.
These isolated and extensible resources allow a high degree of customization. This is useful for hybrid, edge, and air-gapped environments.
Configurations for worker nodes comprise of set of scripts used to prepare the node, install packages, configure networking, storage etc. These configurations prepare the nodes for running kubelet
.
Overview
Problem Statement
Machine-Controller is used to manage the worker nodes in KubeOne clusters. It depends on user-data plugins to generate the required configurations for worker nodes. Each operating system requires its own user-data plugin. These configs are then injected into the worker nodes using provisioning utilities such as cloud-init or ignition. Eventually the nodes are bootstrapped.
Over time, it has been observed that this workflow has certain limitations.
Machine Controller Limitations
- Machine Controller expects ALL the supported user-data plugins to exist and be ready. User might only be interested in a subset of the available operating systems. For example, user might only want to work with
ubuntu
.
- The user-data plugins have templates defined in-code. Which is not ideal since code changes are required to update those templates. Then those code changes need to become a part of the subsequent releases for machine-controller and KubeOne. So we need a complete release cycle to ship those changes to customers.
- Managing configs for multiple cloud providers, OS flavors and OS versions, adds a lot of complexity and redundancy in machine-controller.
- Since the templates are defined in-code, there is no way for an end user to customize them to suit their use-cases.
- Each cloud provider sets some sort of limits for the size of
user-data
, machine won't be created in case of non-compliance. For example, at the time of writing this, AWS has set a hard limit of 16KB.
- Better support for air-gapped environments is required.
Operating System Manager was created to overcome these limitations.
Solution
Operating System Manager was created to solve the above mentioned issues. It decouples operating system configurations into dedicated and isolable resources for better modularity and maintainability.
Architecture
OSM introduces the following new resources which are Kubernetes Custom Resource Definitions:
OperatingSystemProfile
A resource that contains scripts for bootstrapping and provisioning the worker nodes, along with information about what operating systems and versions are supported for given scripts. Additionally, OSPs support templating so you can include some information from MachineDeployment or the OSM deployment itself.
Default OSPs for supported operating systems are provided/installed automatically by KubeOne. End users can create custom OSPs as well to fit their own use-cases. OSPs are immutable by design and any modifications to an existing OSP requires a version bump in .spec.version
.
Its dedicated controller runs in the seed cluster, in user cluster namespace, and operates on the OperatingSystemProfile custom resource. It is responsible for installing the default OSPs in user-cluster namespace.
OperatingSystemConfig
Immutable resource that contains the actual configurations that are going to be used to bootstrap and provision the worker nodes. It is a subset of OperatingSystemProfile. OperatingSystemProfile is a template while OperatingSystemConfig is an instance rendered with data from OperatingSystemProfile, MachineDeployment, and flags provided at OSM command-line level.
OperatingSystemConfigs have a 1-to-1 relation with the MachineDeployment. A dedicated controller watches the MachineDeployments and generates the OSCs in kube-system
and secrets in cloud-init-settings
namespaces in the cluster. Machine Controller then waits for the bootstrapping- and provisioning-secrets to become available. Once they are ready, it will extract the configurations from those secrets and pass them as user-data
to the to-be-provisioned machines.
Its dedicated controller runs in the seed cluster, in user cluster namespace, and is responsible for generating the OSCs in seed and secrets in cloud-init-settings
namespace in the user cluster.
For each MachineDeployment we have two types of configurations, which are stored in secrets:
- Bootstrap: Configuration used for initially setting up the machine and fetching the provisioning configuration.
- Provisioning: Configuration with the actual
cloud-config
that is used to provision the worker machine.
Single vs management/worker cluster mode
Conventionally OSM operates within a single cluster and expects all of the required resources like machine controller, MachineDeployments etc. to exist within the same cluster.
Along with that, OSM also supports environments where workloads are divided into management and worker clusters. This is useful since it helps with completely abstracting away OSM from the users of worker cluster; OSM will be running in the management cluster.
To use management/worker cluster mode, simply pass on the kubeconfig for management cluster using kubeconfig
and worker cluster using the worker-cluster-kubeconfig
flags at OSM level. With this topology the OSP and OSC exist within the management cluster while only the bootstrap and provisioning secrets are created in the worker clusters.
Air-gapped Environment
This controller was designed by keeping air-gapped environments in mind. Customers can use their own VM images by creating custom OSP profiles to provision nodes in a cluster that doesn't have outbound internet access.
More work is being done to make it even easier to use OSM in air-gapped environments.
Support
Information about supported OS versions can be found here.
Deploy OSM
- Install cert-manager for generating certificates used by webhooks since they serve using HTTPS
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.7.1/cert-manager.yaml
- Run
kubectl create namespace cloud-init-settings
to create namespace where secrets against OSC are stored
- Run
kubectl apply -f deploy/crd/
to install CRDs
- Run
kubectl apply -f deploy/
to deploy OSM
Development
Local Development
To run OSM locally:
- Either use a kind cluster or actual cluster and make sure that the correct context is loaded
- Run
kubectl apply -f deploy/crds
to install CRDs
- Create relevant OperatingSystemProfile resources. Check sample for reference.
- Run
make run
Testing
Simply run make test
Troubleshooting
If you encounter issues file an issue or talk to us on the #kubermatic channel on the Kubermatic Slack.
Contributing
Thanks for taking the time to join our community and start contributing!
Feedback and discussion are available on the mailing list.
Before you start
Pull requests
- We welcome pull requests. Feel free to dig through the issues and jump in.
Changelog
See the list of releases to find out about feature changes.