auto-node-repair

module
v0.0.0-...-73e42e7 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 4, 2020 License: Apache-2.0

README

Auto Node Repair

  • Kubernetes controller to repair nodes which are NotReady by replacing them with new fresh nodes. This is done by manipulating AutoScalingGroups to repair the nodes.
  • Currently supports only AWS cloud provider.

Archived component no longer maintained

  • This component is not used by gardener anymore and no longer maintained. It was archived in the gardener-attic.
How does it work ?
  • Control loop for each Auto Scaling Group configured for a shoot cluster :
    • Identify Nodes which are NotReady since configurable amount of time (~10 minutes).
    • Create new nodes and wait until they are Ready
    • Cordon and drain all NotReady nodes.
    • Delete the NotReady nodes.
  • Apply this approach for each ASG in a shoot cluster one by one.
  • For a given ASG, create excess Nodes in parallel but cordon, drain and delete Nodes one by one.
  • If ASG does not have sufficient capacity for excess Nodes, first delete the NotReady nodes then create new one.
Make commands
Command Implication
Make compile Build the go code locally
Make release Deploy image into Gcloud
Steps for development

Use the deploy/kubernetes/deployment.yaml to deploy the auto-node-repair into the cluster. Refer to this file for more details.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL