state

package
v1.1.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 7, 2024 License: Apache-2.0 Imports: 34 Imported by: 1

Documentation

Index

Constants

This section is empty.

Variables

View Source
var (
	ClusterStateNodesCount = opmetrics.NewPrometheusGauge(
		crmetrics.Registry,
		prometheus.GaugeOpts{
			Namespace: metrics.Namespace,
			Subsystem: stateSubsystem,
			Name:      "node_count",
			Help:      "Current count of nodes in cluster state",
		},
		[]string{},
	)
	ClusterStateSynced = opmetrics.NewPrometheusGauge(
		crmetrics.Registry,
		prometheus.GaugeOpts{
			Namespace: metrics.Namespace,
			Subsystem: stateSubsystem,
			Name:      "synced",
			Help:      "Returns 1 if cluster state is synced and 0 otherwise. Synced checks that nodeclaims and nodes that are stored in the APIServer have the same representation as Karpenter's cluster state",
		},
		[]string{},
	)
	ClusterStateUnsyncedTimeSeconds = opmetrics.NewPrometheusGauge(
		crmetrics.Registry,
		prometheus.GaugeOpts{
			Namespace: metrics.Namespace,
			Subsystem: stateSubsystem,
			Name:      "unsynced_time_seconds",
			Help:      "The time for which cluster state is not synced",
		},
		[]string{},
	)
	PodSchedulingDecisionSeconds = opmetrics.NewPrometheusHistogram(
		crmetrics.Registry,
		prometheus.HistogramOpts{
			Namespace: metrics.Namespace,
			Subsystem: metrics.PodSubsystem,
			Name:      "scheduling_decision_duration_seconds",
			Help:      "The time it takes for Karpenter to first try to schedule a pod after it's been seen.",
			Buckets:   metrics.DurationBuckets(),
		},
		[]string{},
	)
)

Functions

func IsPodBlockEvictionError added in v1.0.0

func IsPodBlockEvictionError(err error) bool

func RequireNoScheduleTaint

func RequireNoScheduleTaint(ctx context.Context, kubeClient client.Client, addTaint bool, nodes ...*StateNode) error

RequireNoScheduleTaint will add/remove the karpenter.sh/disruption:NoSchedule taint from the candidates. This is used to enforce no taints at the beginning of disruption, and to add/remove taints while executing a disruption action. nolint:gocyclo

Types

type Cluster

type Cluster struct {
	// contains filtered or unexported fields
}

Cluster maintains cluster state that is often needed but expensive to compute.

func NewCluster

func NewCluster(clk clock.Clock, client client.Client, cloudProvider cloudprovider.CloudProvider) *Cluster

func (*Cluster) AckPods added in v1.1.0

func (c *Cluster) AckPods(pods ...*corev1.Pod)

AckPods marks the pod as acknowledged for scheduling from the provisioner. This is only done once per-pod.

func (*Cluster) ClearPodSchedulingMappings added in v1.1.0

func (c *Cluster) ClearPodSchedulingMappings(podKey types.NamespacedName)

func (*Cluster) ConsolidationState

func (c *Cluster) ConsolidationState() time.Time

ConsolidationState returns a timestamp of the last time that the cluster state with respect to consolidation changed. If nothing changes, this timestamp resets after five minutes to force watchers that use this to defer work to occasionally revalidate that nothing external (e.g. an instance type becoming available) has changed that now makes it possible for them to operate. Time was chosen as the type here as it allows comparisons using the built-in monotonic clock.

func (*Cluster) DeleteDaemonSet

func (c *Cluster) DeleteDaemonSet(key types.NamespacedName)

func (*Cluster) DeleteNode

func (c *Cluster) DeleteNode(name string)

func (*Cluster) DeleteNodeClaim

func (c *Cluster) DeleteNodeClaim(name string)

func (*Cluster) DeletePod

func (c *Cluster) DeletePod(podKey types.NamespacedName)

func (*Cluster) ForEachNode

func (c *Cluster) ForEachNode(f func(n *StateNode) bool)

ForEachNode calls the supplied function once per node object that is being tracked. It is not safe to store the state.StateNode object, it should be only accessed from within the function provided to this method.

func (*Cluster) ForPodsWithAntiAffinity

func (c *Cluster) ForPodsWithAntiAffinity(fn func(p *corev1.Pod, n *corev1.Node) bool)

ForPodsWithAntiAffinity calls the supplied function once for each pod with required anti affinity terms that is currently bound to a node. The pod returned may not be up-to-date with respect to status, however since the anti-affinity terms can't be modified, they will be correct.

func (*Cluster) GetDaemonSetPod

func (c *Cluster) GetDaemonSetPod(daemonset *appsv1.DaemonSet) *corev1.Pod

func (*Cluster) IsNodeNominated

func (c *Cluster) IsNodeNominated(providerID string) bool

IsNodeNominated returns true if the given node was expected to have a pod bound to it during a recent scheduling batch

func (*Cluster) MarkForDeletion

func (c *Cluster) MarkForDeletion(providerIDs ...string)

TODO remove this when v1alpha5 APIs are deprecated. With v1 APIs Karpenter relies on the existence of the karpenter.sh/disruption taint to know when a node is marked for deletion. MarkForDeletion marks the node as pending deletion in the internal cluster state

func (*Cluster) MarkPodSchedulingDecisions added in v1.1.0

func (c *Cluster) MarkPodSchedulingDecisions(podErrors map[*corev1.Pod]error, pods ...*corev1.Pod)

MarkPodSchedulingDecisions keeps track of when we first tried to schedule a pod to a node. This also marks when the pod is first seen as schedulable for pod metrics. We'll only emit a metric for a pod if we haven't done it before.

func (*Cluster) MarkUnconsolidated

func (c *Cluster) MarkUnconsolidated() time.Time

MarkUnconsolidated marks the cluster state as being unconsolidated. This should be called in any situation where something in the cluster has changed such that the cluster may have moved from a non-consolidatable to a consolidatable state.

func (*Cluster) Nodes

func (c *Cluster) Nodes() StateNodes

Nodes creates a DeepCopy of all state nodes. NOTE: This is very inefficient so this should only be used when DeepCopying is absolutely necessary

func (*Cluster) NominateNodeForPod

func (c *Cluster) NominateNodeForPod(ctx context.Context, providerID string)

NominateNodeForPod records that a node was the target of a pending pod during a scheduling batch

func (*Cluster) PodAckTime added in v1.1.0

func (c *Cluster) PodAckTime(podKey types.NamespacedName) time.Time

PodAckTime will return the time the pod was first seen in our cache.

func (*Cluster) PodSchedulingDecisionTime added in v1.1.0

func (c *Cluster) PodSchedulingDecisionTime(podKey types.NamespacedName) time.Time

PodSchedulingDecisionTime returns when Karpenter first decided if a pod could schedule a pod in scheduling simulations. This returns 0, false if Karpenter never made a decision on the pod.

func (*Cluster) PodSchedulingSuccessTime added in v1.1.0

func (c *Cluster) PodSchedulingSuccessTime(podKey types.NamespacedName) time.Time

PodSchedulingSuccessTime returns when Karpenter first thought it could schedule a pod in its scheduling simulation. This returns 0, false if the pod was never considered in scheduling as a pending pod.

func (*Cluster) Reset

func (c *Cluster) Reset()

Reset the cluster state for unit testing

func (*Cluster) Synced

func (c *Cluster) Synced(ctx context.Context) (synced bool)

Synced validates that the NodeClaims and the Nodes that are stored in the apiserver have the same representation in the cluster state. This is to ensure that our view of the cluster is as close to correct as it can be when we begin to perform operations utilizing the cluster state as our source of truth

func (*Cluster) UnmarkForDeletion

func (c *Cluster) UnmarkForDeletion(providerIDs ...string)

TODO remove this when v1alpha5 APIs are deprecated. With v1 APIs Karpenter relies on the existence of the karpenter.sh/disruption taint to know when a node is marked for deletion. UnmarkForDeletion removes the marking on the node as a node the controller intends to delete

func (*Cluster) UpdateDaemonSet

func (c *Cluster) UpdateDaemonSet(ctx context.Context, daemonset *appsv1.DaemonSet) error

func (*Cluster) UpdateNode

func (c *Cluster) UpdateNode(ctx context.Context, node *corev1.Node) error

func (*Cluster) UpdateNodeClaim

func (c *Cluster) UpdateNodeClaim(nodeClaim *v1.NodeClaim)

func (*Cluster) UpdatePod

func (c *Cluster) UpdatePod(ctx context.Context, pod *corev1.Pod) error

type PodBlockEvictionError added in v1.0.0

type PodBlockEvictionError struct {
	// contains filtered or unexported fields
}

func NewPodBlockEvictionError added in v1.0.0

func NewPodBlockEvictionError(err error) *PodBlockEvictionError

type StateNode

type StateNode struct {
	Node      *corev1.Node
	NodeClaim *v1.NodeClaim
	// contains filtered or unexported fields
}

StateNode is a cached version of a node in the cluster that maintains state which is expensive to compute every time it's needed. This currently contains node utilization across all the allocatable resources, but will soon be used to compute topology information. +k8s:deepcopy-gen=true nolint: revive

func NewNode

func NewNode() *StateNode

func (*StateNode) Allocatable

func (in *StateNode) Allocatable() corev1.ResourceList

func (*StateNode) Annotations

func (in *StateNode) Annotations() map[string]string

func (*StateNode) Available

func (in *StateNode) Available() corev1.ResourceList

Available is allocatable minus anything allocated to pods.

func (*StateNode) Capacity

func (in *StateNode) Capacity() corev1.ResourceList

func (*StateNode) DaemonSetLimits

func (in *StateNode) DaemonSetLimits() corev1.ResourceList

func (*StateNode) DaemonSetRequests

func (in *StateNode) DaemonSetRequests() corev1.ResourceList

func (*StateNode) DeepCopy

func (in *StateNode) DeepCopy() *StateNode

DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new StateNode.

func (*StateNode) DeepCopyInto

func (in *StateNode) DeepCopyInto(out *StateNode)

DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.

func (*StateNode) HostName

func (in *StateNode) HostName() string

func (*StateNode) HostPortUsage

func (in *StateNode) HostPortUsage() *scheduling.HostPortUsage

func (*StateNode) Initialized

func (in *StateNode) Initialized() bool

func (*StateNode) Labels

func (in *StateNode) Labels() map[string]string

func (*StateNode) Managed

func (in *StateNode) Managed() bool

func (*StateNode) MarkedForDeletion

func (in *StateNode) MarkedForDeletion() bool

func (*StateNode) Name

func (in *StateNode) Name() string

func (*StateNode) Nominate

func (in *StateNode) Nominate(ctx context.Context)

func (*StateNode) Nominated

func (in *StateNode) Nominated() bool

func (*StateNode) PodLimits

func (in *StateNode) PodLimits() corev1.ResourceList

func (*StateNode) PodRequests

func (in *StateNode) PodRequests() corev1.ResourceList

func (*StateNode) Pods

func (in *StateNode) Pods(ctx context.Context, kubeClient client.Client) ([]*corev1.Pod, error)

Pods gets the pods assigned to the Node based on the kubernetes api-server bindings

func (*StateNode) ProviderID

func (in *StateNode) ProviderID() string

ProviderID is the key that is used to map this StateNode If the Node and NodeClaim have a providerID, this should map to a real providerID If the Node does not have a providerID, this will map to the node name

func (*StateNode) Registered

func (in *StateNode) Registered() bool

func (*StateNode) ReschedulablePods added in v0.34.0

func (in *StateNode) ReschedulablePods(ctx context.Context, kubeClient client.Client) ([]*corev1.Pod, error)

ReschedulablePods gets the pods assigned to the Node that are reschedulable based on the kubernetes api-server bindings

func (*StateNode) Taints

func (in *StateNode) Taints() []corev1.Taint

func (*StateNode) ValidateNodeDisruptable added in v1.0.0

func (in *StateNode) ValidateNodeDisruptable(ctx context.Context, kubeClient client.Client) error

ValidateNodeDisruptable returns an error if the StateNode cannot be disrupted This checks all associated StateNode internals, node labels, and do-not-disrupt annotations on the node. ValidateNodeDisruptable takes in a recorder to emit events on the nodeclaims when the state node is not a candidate

func (*StateNode) ValidatePodsDisruptable added in v1.0.0

func (in *StateNode) ValidatePodsDisruptable(ctx context.Context, kubeClient client.Client, pdbs pdb.Limits) ([]*corev1.Pod, error)

ValidatePodDisruptable returns an error if the StateNode contains a pod that cannot be disrupted This checks associated PDBs and do-not-disrupt annotations for each pod on the node. ValidatePodDisruptable takes in a recorder to emit events on the nodeclaims when the state node is not a candidate

func (*StateNode) VolumeUsage

func (in *StateNode) VolumeUsage() *scheduling.VolumeUsage

type StateNodes

type StateNodes []*StateNode

StateNodes is a typed version of a list of *Node nolint: revive

func (StateNodes) Active

func (n StateNodes) Active() StateNodes

Active filters StateNodes that are not in a MarkedForDeletion state

func (StateNodes) Deleting

func (n StateNodes) Deleting() StateNodes

Deleting filters StateNodes that are in a MarkedForDeletion state

func (StateNodes) Pods

func (n StateNodes) Pods(ctx context.Context, kubeClient client.Client) ([]*corev1.Pod, error)

Pods gets the pods assigned to all StateNodes based on the kubernetes api-server bindings

func (StateNodes) ReschedulablePods added in v0.34.0

func (n StateNodes) ReschedulablePods(ctx context.Context, kubeClient client.Client) ([]*corev1.Pod, error)

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL