lws

Module version: v0.4.1
Published: Oct 18, 2024 License: Apache-2.0

README

The LeaderWorkerSet API (LWS)

LeaderWorkerSet: An API for deploying a group of pods as a unit of replication. It aims to address common deployment patterns of AI/ML inference workloads, especially multi-host inference workloads where the LLM is sharded and run across multiple devices on multiple nodes. The initial design and proposal can be found at: http://bit.ly/k8s-LWS.

Conceptual view

[image: conceptual view of LeaderWorkerSet]

Feature overview

  • Group of pods as a unit: Supports a tightly managed group of pods that represents a “super pod”.
    • Unique pod identity: Each pod in the group has a unique index from 0 to n-1.
    • Parallel creation: Pods in the group share the same lifecycle and are created in parallel.
  • Dual templates, one for the leader and one for the workers: A replica is a group consisting of a single leader and a set of workers. You specify a template for the workers and can optionally specify a second one for the leader pod.
  • Multiple groups with identical specifications: Supports creating multiple “replicas” of the group described above. Each group is a single unit for rolling updates and scaling, and maps to a single exclusive topology for placement.
  • A scale subresource: A scale endpoint is exposed so that HPA can dynamically scale the number of replicas (that is, the number of groups).
  • Rollout and rolling update: Supports rollouts and rolling updates at the group level: groups are upgraded one by one, each as a unit (i.e., the pods within a group are updated together).
  • Topology-aware placement: Opt-in support for co-locating the pods of a group in the same topology.
  • All-or-nothing restart for failure handling: Opt-in support for recreating all pods in the group if one pod in the group fails or one container in its pods restarts.
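
To make the group model above more concrete, here is a minimal, standard-library-only Go sketch of one replica: a leader plus workers with unique indices 0 to n-1, and an opt-in all-or-nothing restart decision. It does not use the LWS API or controller code. The `Pod` and `Group` types, the `NewGroup` and `PodsToRecreate` helpers, the `<set>-<replica>-<index>` name format, and the choice of index 0 as the leader are all assumptions made for illustration.

```go
package main

import "fmt"

// Pod is a simplified stand-in for a Kubernetes pod inside one group.
type Pod struct {
	Name   string
	Index  int  // unique index within the group, 0..size-1
	Leader bool // assumption of this sketch: index 0 is the leader
}

// Group models one replica: a single leader plus size-1 workers.
type Group struct {
	Replica int
	Pods    []Pod
}

// NewGroup builds a group of `size` pods with indices 0..size-1.
// The "<set>-<replica>-<index>" name format is hypothetical.
func NewGroup(set string, replica, size int) Group {
	g := Group{Replica: replica}
	for i := 0; i < size; i++ {
		g.Pods = append(g.Pods, Pod{
			Name:   fmt.Sprintf("%s-%d-%d", set, replica, i),
			Index:  i,
			Leader: i == 0,
		})
	}
	return g
}

// PodsToRecreate returns the names of the pods to recreate when the pod
// at failedIndex fails. With allOrNothing set, every pod in the group is
// recreated; otherwise only the failed pod is.
func (g Group) PodsToRecreate(failedIndex int, allOrNothing bool) []string {
	var names []string
	for _, p := range g.Pods {
		if allOrNothing || p.Index == failedIndex {
			names = append(names, p.Name)
		}
	}
	return names
}

func main() {
	g := NewGroup("demo", 0, 4)
	for _, p := range g.Pods {
		fmt.Println(p.Name, "leader:", p.Leader)
	}
	fmt.Println("default restart:", g.PodsToRecreate(2, false))
	fmt.Println("all-or-nothing restart:", g.PodsToRecreate(2, true))
}
```

The sketch treats the group, not the individual pod, as the unit that the restart decision is made on, which mirrors how the feature list describes rolling updates and failure handling at the group level.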

Installation

Read the installation guide to learn more.

Examples

Read the examples to learn more.

Community, discussion, contribution, and support

Learn how to engage with the Kubernetes community on the community page.

You can reach the maintainers of this project at:

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.

Directories

  • api/leaderworkerset/v1: Package v1 contains API Schema definitions for the leaderworkerset v1 API group (+kubebuilder:object:generate=true, +groupName=leaderworkerset.x-k8s.io).
  • client-go/clientset/versioned/fake: This package has the automatically generated fake clientset.
  • client-go/clientset/versioned/scheme: This package contains the scheme of the automatically generated clientset.
  • client-go/clientset/versioned/typed/leaderworkerset/v1: This package has the automatically generated typed clients.
  • client-go/clientset/versioned/typed/leaderworkerset/v1/fake: Package fake has the automatically generated clients.
  • docs
  • hack
  • pkg
  • test
