kubernetes-controller-sharding

module v0.1.0 · Published: Dec 12, 2023 · License: Apache-2.0

README

Kubernetes Controller Sharding

Horizontally Scalable Kubernetes Controllers 🚀

TL;DR 📖

Make Kubernetes controllers horizontally scalable by distributing reconciliation of API objects across multiple controller instances. Remove the limitation of having only a single active replica (leader) per controller.

See Getting Started With Controller Sharding for a quick start with this project.

About ℹ️

I started this project as part of my Master's studies in Computer Science at the DHBW Center for Advanced Studies (CAS). I completed a study project ("half-time thesis") about this topic. I'm currently working on my Master's thesis to evolve the project based on the first iteration.

This repository contains the implementation belonging to the scientific work: the actual sharding implementation, a sample operator using controller sharding, a monitoring and continuous profiling setup, and some tools for development and evaluation purposes.

Motivation 💡

Typically, Kubernetes controllers use a leader election mechanism to determine a single active controller instance (leader). When deploying multiple instances of the same controller, there will only be one active instance at any given time; the other instances are on standby. This is done to prevent multiple controller instances from performing uncoordinated and conflicting actions (reconciliations) on a single object concurrently.

If the current leader goes down and loses leadership (e.g., network failure, rolling update), another instance takes over leadership and becomes the active instance. Such a setup can be described as an "active-passive HA setup": it minimizes "controller downtime" and facilitates fast fail-overs. However, it cannot be considered "horizontal scaling", as work is not distributed among multiple instances.

This restriction imposes scalability limitations on Kubernetes controllers. That is, the rate of reconciliations, the number of objects, etc. are limited by the size of the machine that the active controller runs on and the network bandwidth it can use. In contrast to usual stateless applications, the throughput of the system cannot be increased by adding more instances (scaling horizontally) but only by using bigger instances (scaling vertically).

Introduction 🚀

This project allows scaling Kubernetes controllers horizontally by removing the restriction of having only one active replica per controller (allows active-active setups). It distributes reconciliation of Kubernetes objects across multiple controller instances, while still ensuring that only a single controller instance acts on a single object at any given time. For this, the project applies proven sharding mechanisms used in distributed databases to Kubernetes controllers.

The project introduces a sharder component that implements sharding in a generic way and can be applied to any Kubernetes controller (independent of the programming language and controller framework used). The sharder component is installed into the cluster along with a ClusterRing custom resource. A ClusterRing declares a virtual ring of sharded controller instances and specifies API resources that should be distributed across shards in the ring. It configures sharding on the cluster-scope level (i.e., objects in all namespaces), hence the ClusterRing name.
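For illustration, a ClusterRing for a hypothetical "my-operator" controller that shards ConfigMaps might be constructed roughly as follows. This is a sketch in Go using an unstructured object; the spec fields shown are assumptions for illustration only, so consult the project's API definitions for the actual schema.

```go
package example

import (
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// NewClusterRing sketches a ClusterRing for an imaginary "my-operator" controller
// that shards ConfigMaps. The spec fields are assumptions for illustration only;
// consult the project's API definitions for the actual schema.
func NewClusterRing() *unstructured.Unstructured {
	return &unstructured.Unstructured{
		Object: map[string]interface{}{
			"apiVersion": "sharding.timebertt.dev/v1alpha1",
			"kind":       "ClusterRing",
			"metadata":   map[string]interface{}{"name": "my-operator"},
			"spec": map[string]interface{}{
				// API resources whose objects should be distributed across the shards in this ring.
				"resources": []interface{}{
					map[string]interface{}{"group": "", "resource": "configmaps"},
				},
			},
		},
	}
}
```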

The watch cache is an expensive part of a controller regarding network transfer, CPU (decoding), and memory (local copy of all objects). When running multiple instances of a controller, the individual instances must thus only watch the subset of objects they are responsible for. Otherwise, the setup would only multiply the resource consumption. The sharder assigns objects to instances via the shard label. Each shard then uses a label selector with its own instance name to watch only the objects that are assigned to it.
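As a minimal sketch of the shard side, restricting all watches of a controller-runtime manager to the objects assigned to this instance could look like the following. This assumes controller-runtime v0.16+ (which provides cache.Options.DefaultLabelSelector); the label key and shard name are placeholders, since the actual shard label key is derived from the ClusterRing by the sharder.

```go
package main

import (
	"os"

	"k8s.io/apimachinery/pkg/labels"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
)

func main() {
	// Placeholder label key and value: the real shard label key is derived from the
	// ClusterRing by the sharder, and the value is this shard instance's name.
	shardSelector := labels.SelectorFromSet(labels.Set{
		"shard.example.io/my-ring": os.Getenv("POD_NAME"),
	})

	// Restrict all watches (and thereby the cache) of this manager to objects
	// that the sharder has assigned to this shard.
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Cache: cache.Options{DefaultLabelSelector: shardSelector},
	})
	if err != nil {
		panic(err)
	}
	_ = mgr // set up controllers and call mgr.Start(ctx) here
}
```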

Alongside the actual sharding implementation, this project contains a setup for simple development, testing, and evaluation of the sharding mechanism. This includes an example operator that uses controller sharding (webhosting-operator). See Getting Started With Controller Sharding for more details.

To support sharding in your Kubernetes controller, only three aspects need to be implemented:

  • announce ring membership and shard health: maintain individual shard Leases instead of performing leader election on a single Lease (see the Lease sketch after this list)
  • only watch, cache, and reconcile objects assigned to the respective shard: add a shard-specific label selector to watches (as in the cache sketch above)
  • acknowledge object movements during rebalancing: remove the drain and shard labels when the drain label is set and stop reconciling the object (see the reconciler sketch below)
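To illustrate the first point, here is a rough sketch of announcing ring membership via a per-instance Lease. The namespace, renew interval, and label key are placeholders; the project defines its own conventions for these.

```go
package example

import (
	"context"
	"os"
	"time"

	coordinationv1 "k8s.io/api/coordination/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// announceMembership creates and periodically renews this shard's own Lease so
// that the sharder can track ring membership and shard health. Namespace, renew
// interval, and label key are placeholders for illustration.
func announceMembership(ctx context.Context, c client.Client, ringName string) error {
	hostname, _ := os.Hostname()
	lease := &coordinationv1.Lease{ObjectMeta: metav1.ObjectMeta{
		Name:      hostname, // one Lease per shard instance instead of a single leader Lease
		Namespace: "default",
	}}

	ticker := time.NewTicker(10 * time.Second)
	defer ticker.Stop()
	for {
		// Create the Lease on the first iteration, renew it on subsequent ones.
		_, err := controllerutil.CreateOrUpdate(ctx, c, lease, func() error {
			lease.Labels = map[string]string{"ring.example.io/name": ringName}
			holder, duration, now := hostname, int32(30), metav1.NewMicroTime(time.Now())
			lease.Spec.HolderIdentity = &holder
			lease.Spec.LeaseDurationSeconds = &duration
			lease.Spec.RenewTime = &now
			return nil
		})
		if err != nil {
			return err
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}
```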

See Implement Sharding in Your Controller for more information and examples.
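To illustrate the last point, a reconciler might acknowledge an object handover roughly like this. The label keys and the reconciled resource are placeholders for illustration; the actual keys are derived from the ClusterRing by the sharder.

```go
package example

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// Placeholder label keys: the actual keys are derived from the ClusterRing by the sharder.
const (
	shardLabel = "shard.example.io/my-ring"
	drainLabel = "drain.example.io/my-ring"
)

// Reconciler is a hypothetical sharded controller reconciling ConfigMaps.
type Reconciler struct {
	client.Client
}

func (r *Reconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	obj := &corev1.ConfigMap{}
	if err := r.Get(ctx, req.NamespacedName, obj); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	if _, drain := obj.Labels[drainLabel]; drain {
		// The sharder requests a handover: remove the drain and shard labels and stop
		// reconciling; the sharder then assigns the object to another available shard.
		patch := client.MergeFrom(obj.DeepCopy())
		delete(obj.Labels, drainLabel)
		delete(obj.Labels, shardLabel)
		return ctrl.Result{}, r.Patch(ctx, obj, patch)
	}

	// ... regular reconciliation of the object goes here ...
	return ctrl.Result{}, nil
}
```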

Design 📐

Sharding Architecture

See Design for more details on the sharding architecture and design decisions.

Discussion 💬

Feel free to contact me on the Kubernetes Slack (get an invitation): @timebertt.

TODO 🧑‍💻

  • implement more tests: unit, integration, e2e tests
  • add ClusterRing API validation
  • implement a custom generic sharding-exporter

Directories

  • cmd
  • hack
  • pkg
      • apis/config/v1alpha1: Package v1alpha1 contains API Schema definitions for the config v1alpha1 API group +kubebuilder:object:generate=true +groupName=config.sharding.timebertt.dev
      • apis/sharding/v1alpha1: Package v1alpha1 contains API Schema definitions for the sharding v1alpha1 API group +kubebuilder:object:generate=true +groupName=sharding.timebertt.dev
      • sharding/leases: Package leases implements logic for determining the state of shards based on their membership Lease object.
