cache

package

Version: v0.0.0-...-6111fc0
Published: Dec 2, 2024
License: Apache-2.0
Imports: 30
Imported by: 0

Repository

github.com/kubewharf/godel-scheduler

Documentation ¶

Index ¶

Constants
type ClusterCollectable
- func NewEmptyClusterCollectable(owner string) *ClusterCollectable
- func (c ClusterCollectable) UpdateMetrics()
type Collectable
type NodeCollectable
- func Clone(c NodeCollectable) NodeCollectable
type SchedulerCache
- func New(handler commoncache.CacheHandler) SchedulerCache
type Snapshot
- func NewEmptySnapshot(handler commoncache.CacheHandler) *Snapshot

Constants ¶

const (
	Guaranteed string = "guaranteed"
	BestEffort string = "besteffort"
)

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type ClusterCollectable ¶

type ClusterCollectable struct {
	generationstore.RawStore
	// contains filtered or unexported fields
}

ClusterCollectable aggregates all the collected resource metrics and flushes them to Prometheus.

func NewEmptyClusterCollectable ¶

func NewEmptyClusterCollectable(owner string) *ClusterCollectable

func (ClusterCollectable) UpdateMetrics ¶

func (c ClusterCollectable) UpdateMetrics()

type Collectable ¶

type Collectable interface {
	GetGuaranteedAllocatable() *api.Resource
	GetGuaranteedRequested() *api.Resource
	GetBestEffortAllocatable() *api.Resource
	GetBestEffortRequested() *api.Resource
	GetNode() *v1.Node
}

Collectable defines the node resources required for metrics collection. Currently only Guaranteed and BestEffort are supported. It is intended to be a subset of framework.NodeInfo.
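
One way such an interface could feed cluster-level totals is sketched below. This is a minimal, self-contained illustration, not the package's implementation: `resource`, `collectable`, `fakeNode`, `totals`, `sum`, and `aggregate` are hypothetical names, `resource` stands in for api.Resource, and GetNode is left out to avoid Kubernetes imports. How ClusterCollectable really aggregates is not shown on this page.

```go
package main

import "fmt"

// QoS labels, matching the constants exported by this package.
const (
	Guaranteed = "guaranteed"
	BestEffort = "besteffort"
)

// resource is a hypothetical, simplified stand-in for api.Resource.
type resource struct {
	MilliCPU int64
	Memory   int64
}

// collectable mirrors the shape of Collectable with the simplified type.
// GetNode is omitted so the sketch needs no Kubernetes imports.
type collectable interface {
	GetGuaranteedAllocatable() *resource
	GetGuaranteedRequested() *resource
	GetBestEffortAllocatable() *resource
	GetBestEffortRequested() *resource
}

// fakeNode is a hypothetical node that reports fixed values.
type fakeNode struct {
	gtAlloc, gtReq, beAlloc, beReq resource
}

func (n *fakeNode) GetGuaranteedAllocatable() *resource { return &n.gtAlloc }
func (n *fakeNode) GetGuaranteedRequested() *resource   { return &n.gtReq }
func (n *fakeNode) GetBestEffortAllocatable() *resource { return &n.beAlloc }
func (n *fakeNode) GetBestEffortRequested() *resource   { return &n.beReq }

// totals holds cluster-wide sums keyed by QoS label.
type totals struct {
	Allocatable map[string]resource
	Requested   map[string]resource
}

// sum returns the field-wise sum of a and b; a nil b counts as zero.
func sum(a resource, b *resource) resource {
	if b == nil {
		return a
	}
	return resource{MilliCPU: a.MilliCPU + b.MilliCPU, Memory: a.Memory + b.Memory}
}

// aggregate folds every node's per-QoS figures into cluster totals.
func aggregate(nodes []collectable) totals {
	t := totals{Allocatable: map[string]resource{}, Requested: map[string]resource{}}
	for _, n := range nodes {
		t.Allocatable[Guaranteed] = sum(t.Allocatable[Guaranteed], n.GetGuaranteedAllocatable())
		t.Requested[Guaranteed] = sum(t.Requested[Guaranteed], n.GetGuaranteedRequested())
		t.Allocatable[BestEffort] = sum(t.Allocatable[BestEffort], n.GetBestEffortAllocatable())
		t.Requested[BestEffort] = sum(t.Requested[BestEffort], n.GetBestEffortRequested())
	}
	return t
}

func main() {
	cluster := aggregate([]collectable{
		&fakeNode{gtAlloc: resource{MilliCPU: 4000}, gtReq: resource{MilliCPU: 1000}},
		&fakeNode{gtAlloc: resource{MilliCPU: 2000}, beReq: resource{MilliCPU: 500}},
	})
	fmt.Println("guaranteed milliCPU requested:", cluster.Requested[Guaranteed].MilliCPU,
		"allocatable:", cluster.Allocatable[Guaranteed].MilliCPU)
}
```

Keeping the interface this narrow means a metrics collector only depends on four resource getters rather than on the whole of framework.NodeInfo.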

type NodeCollectable ¶

type NodeCollectable interface {
	Collectable
	generationstore.StoredObj
}

NodeCollectable combines Collectable and generationstore.StoredObj.

func Clone ¶

func Clone(c NodeCollectable) NodeCollectable

type SchedulerCache ¶

type SchedulerCache interface {
	commoncache.ClusterEventsHandler

	// Dump takes a snapshot of the current cache. This is used for debugging
	// purposes only and shouldn't be confused with UpdateSnapshot function.
	// This method is expensive, and should be only used in non-critical path.
	Dump() *commoncache.Dump

	// PodCount returns the number of pods in the cache (including those from deleted nodes).
	PodCount() (int, error)

	// GetPod returns the pod from the cache with the same namespace and the
	// same name of the specified pod.
	GetPod(pod *v1.Pod) (*v1.Pod, error)

	// IsAssumedPod returns true if the pod is assumed and not expired.
	IsAssumedPod(pod *v1.Pod) (bool, error)

	// IsCachedPod returns true if this pod is cached by one of the following situations:
	//  1. in memory scheduling decisions made by the current scheduler.
	//  2. `assumed` pod state made by any scheduler.
	//  3. `bound` pod state made by any scheduler.
	IsCachedPod(pod *v1.Pod) (bool, error)

	// SetNodeInPartition sets node in partition of scheduler
	SetNodeInPartition(nodeName string) error

	// SetNodeOutOfPartition sets node out of partition of scheduler
	SetNodeOutOfPartition(nodeName string) error

	// NodeInThisPartition returns whether this node is in this scheduler's partition
	NodeInThisPartition(nodeName string) bool

	// UpdateSnapshot updates the passed infoSnapshot to the current contents of SchedulerCache.
	// The node info contains aggregated information of pods scheduled (including assumed to be)
	// on this node.
	// The snapshot only includes Nodes that are not deleted at the time this function is called.
	// nodeinfo.GetNode() is guaranteed to be not nil for all the nodes in the snapshot.
	UpdateSnapshot(snapshot *Snapshot) error

	// GetUnitStatus returns the UnitStatus for the given unit key.
	GetUnitStatus(unitKey string) unitstatus.UnitStatus
	// GetUnitSchedulingStatus gets the scheduling status of the unit.
	GetUnitSchedulingStatus(unitKey string) unitstatus.SchedulingStatus
	// SetUnitSchedulingStatus sets the scheduling status of the unit.
	SetUnitSchedulingStatus(unitKey string, status unitstatus.SchedulingStatus)

	// AssumePod assumes a pod scheduled and aggregates the pod's information into its node.
	// The implementation also decides the policy to expire pod before being confirmed (receiving Add event).
	// After expiration, its information would be subtracted.
	AssumePod(podInfo *framework.CachePodInfo) error
	// ForgetPod removes an assumed pod from cache.
	ForgetPod(podInfo *framework.CachePodInfo) error
	// FinishReserving signals that cache for assumed pod can be expired
	FinishReserving(pod *v1.Pod) error

	// ScrapeCollectable updates store with cache.nodeStore incrementally
	ScrapeCollectable(store generationstore.RawStore)

	// for resource reservation.
	AddReservation(request *schedulingv1a1.Reservation) error
	UpdateReservation(oldRequest, newRequest *schedulingv1a1.Reservation) error
	DeleteReservation(request *schedulingv1a1.Reservation) error
}

SchedulerCache collects pods' information and provides node-level aggregated information. It is intended to let a generic scheduler do efficient lookups. SchedulerCache's operations are pod-centric: it performs incremental updates based on pod events. Pod events are sent over the network, and delivery of every event is not guaranteed. A Reflector is used to list and watch from the remote end; when the Reflector is slow it may relist, which can lead to missed events.

State Machine of a pod's events in scheduler's cache:

   +-------------------------------------------+  +----+
   |                            Add            |  |    |
   |                                           |  |    | Update
   +      Assume                Add            v  v    |
Initial +--------> Assumed +------------+---> Added <--+
   ^                +   +               |       +
   |                |   |               |       |
   |                |   |           Add |       | Remove
   |                |   |               |       |
   |                |   |               +       |
   +----------------+   +-----------> Expired   +----> Deleted
         Forget             Expire

Note that an assumed pod can expire: if no Add event confirming it has been received for a while, something may have gone wrong, and the pod should no longer be kept in the cache.

Note that "Initial", "Expired", and "Deleted" pods do not actually exist in the cache. Based on existing use cases, the following assumptions are made:

- No pod is assumed twice.
- A pod can be added without going through the scheduler. In this case, the cache sees an Add event but no Assume event.
- If a pod was not added, it will not be removed or updated.
- Both "Expired" and "Deleted" are valid end states. When problems occur, e.g. a network issue, a pod may have changed its state (e.g. been added and deleted) without a notification being delivered to the cache.
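
The state machine and its assumptions can be written down as a transition table. The sketch below is only an illustration of the diagram, not code from this package: the type names `podState` and `podEvent`, the `transitions` table, and the helpers `next` and `replay` are all hypothetical.

```go
package main

import "fmt"

type podState string
type podEvent string

const (
	Initial podState = "Initial"
	Assumed podState = "Assumed"
	Added   podState = "Added"
	Expired podState = "Expired"
	Deleted podState = "Deleted"
)

const (
	Assume podEvent = "Assume"
	Add    podEvent = "Add"
	Update podEvent = "Update"
	Remove podEvent = "Remove"
	Forget podEvent = "Forget"
	Expire podEvent = "Expire"
)

// transitions encodes the arrows of the diagram: state -> event -> next state.
// Deleted has no outgoing arrows, so it has no entry.
var transitions = map[podState]map[podEvent]podState{
	Initial: {Assume: Assumed, Add: Added},
	Assumed: {Add: Added, Forget: Initial, Expire: Expired},
	Added:   {Update: Added, Remove: Deleted},
	Expired: {Add: Added},
}

// next returns the state reached from s on event e, or an error when the
// diagram has no such arrow (for example, assuming a pod twice).
func next(s podState, e podEvent) (podState, error) {
	if to, ok := transitions[s][e]; ok {
		return to, nil
	}
	return s, fmt.Errorf("invalid transition: %s in state %s", e, s)
}

// replay applies events in order, starting from Initial.
func replay(events ...podEvent) (podState, error) {
	s := Initial
	for _, e := range events {
		var err error
		if s, err = next(s, e); err != nil {
			return s, err
		}
	}
	return s, nil
}

func main() {
	// A pod scheduled by this scheduler, confirmed, then removed.
	fmt.Println(replay(Assume, Add, Update, Remove))
	// A pod added without going through the scheduler.
	fmt.Println(replay(Add))
	// Violates "no pod would be assumed twice".
	fmt.Println(replay(Assume, Assume))
}
```

A table like this makes the listed assumptions checkable: any event sequence that the diagram does not allow, such as Remove before Add, surfaces as an error instead of silently corrupting state.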

func New ¶

func New(handler commoncache.CacheHandler) SchedulerCache

New returns a SchedulerCache implementation. It automatically starts a goroutine that manages the expiration of assumed pods. The settings this comment refers to as "ttl" (how long an assumed pod is kept before it expires), "stop" (the channel whose closing stops the background goroutine), and "schedulerName" (which identifies the scheduler) are not parameters of New as declared above.
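
The pattern described here, a background goroutine that expires assumed pods after a ttl and exits when a stop channel is closed, can be sketched as follows. Everything in it is an assumption made for illustration: `expiringCache`, `assumedPod`, `newExpiringCache`, `demoTTL`, `demoPeriod`, and `at` are hypothetical names, and the rule that a pod only becomes eligible for expiry after reserving has finished is inferred from the FinishReserving comment rather than confirmed by the source.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Demo values; the real ttl and cleanup period are not given on this page.
const (
	demoTTL    = 30 * time.Second
	demoPeriod = time.Hour
)

// at converts seconds since the Unix epoch to a time.Time, for readable demos.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

// assumedPod tracks one assumed pod. It may only expire once reserving has
// finished, which is when its deadline is set.
type assumedPod struct {
	reserved bool
	deadline time.Time
}

type expiringCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	assumed map[string]*assumedPod
}

// newExpiringCache starts a goroutine that runs cleanup every period until
// stop is closed.
func newExpiringCache(ttl, period time.Duration, stop <-chan struct{}) *expiringCache {
	c := &expiringCache{ttl: ttl, assumed: map[string]*assumedPod{}}
	go func() {
		ticker := time.NewTicker(period)
		defer ticker.Stop()
		for {
			select {
			case now := <-ticker.C:
				c.cleanup(now)
			case <-stop:
				return
			}
		}
	}()
	return c
}

func (c *expiringCache) assume(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.assumed[key] = &assumedPod{}
}

// finishReserving marks the pod as eligible for expiry at now+ttl.
func (c *expiringCache) finishReserving(key string, now time.Time) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if p, ok := c.assumed[key]; ok {
		p.reserved = true
		p.deadline = now.Add(c.ttl)
	}
}

// cleanup drops every reserved pod whose deadline has passed and returns the
// keys it removed.
func (c *expiringCache) cleanup(now time.Time) []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	var expired []string
	for key, p := range c.assumed {
		if p.reserved && now.After(p.deadline) {
			delete(c.assumed, key)
			expired = append(expired, key)
		}
	}
	return expired
}

func (c *expiringCache) isAssumed(key string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	_, ok := c.assumed[key]
	return ok
}

func main() {
	stop := make(chan struct{})
	defer close(stop)
	c := newExpiringCache(demoTTL, demoPeriod, stop)
	c.assume("default/web-0")
	c.finishReserving("default/web-0", at(0))
	fmt.Println("expired:", c.cleanup(at(31)))
}
```

Passing the current time into cleanup, rather than calling time.Now inside it, keeps the expiry rule testable without sleeping.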

type Snapshot ¶

type Snapshot struct {
	commonstore.CommonStoresSwitch
	// contains filtered or unexported fields
}

Snapshot is a snapshot of cache NodeInfo and NodeTree order. The scheduler takes a snapshot at the beginning of each scheduling cycle and uses it for its operations in that cycle.

Note: Snapshot operations are lock-free. The premise for removing the lock is that, even if read operations run concurrently, write operations (AssumePod/ForgetPod/AddOneVictim) are always serial.
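
Refreshing a snapshot at the start of every cycle is cheap when only changed nodes are copied. The sketch below shows one common way to do that with a generation counter. It is an assumption that the package works this way: the generationstore import hints at it, but the mechanism is not documented on this page, and `cache`, `snapshot`, `nodeInfo`, `touch`, `remove`, and `updateSnapshot` are hypothetical names.

```go
package main

import "fmt"

// nodeInfo is a hypothetical, simplified per-node record.
type nodeInfo struct {
	name       string
	pods       int
	generation int64 // bumped on every change to this node
}

// cache owns the live node records and a global generation counter.
type cache struct {
	nodes   map[string]*nodeInfo
	nextGen int64
}

func newCache() *cache { return &cache{nodes: map[string]*nodeInfo{}} }

// touch creates or updates a node and stamps it with a fresh generation.
func (c *cache) touch(name string, pods int) {
	c.nextGen++
	n, ok := c.nodes[name]
	if !ok {
		n = &nodeInfo{name: name}
		c.nodes[name] = n
	}
	n.pods = pods
	n.generation = c.nextGen
}

func (c *cache) remove(name string) { delete(c.nodes, name) }

// snapshot holds copies of node records plus the generation it was last
// synchronized to.
type snapshot struct {
	nodes      map[string]nodeInfo
	generation int64
}

func newSnapshot() *snapshot { return &snapshot{nodes: map[string]nodeInfo{}} }

// updateSnapshot copies only nodes changed since the snapshot's generation,
// drops nodes that no longer exist, and returns how many records it copied.
func (c *cache) updateSnapshot(s *snapshot) int {
	copied := 0
	for name, n := range c.nodes {
		if n.generation > s.generation {
			s.nodes[name] = *n
			copied++
		}
	}
	for name := range s.nodes {
		if _, ok := c.nodes[name]; !ok {
			delete(s.nodes, name)
		}
	}
	s.generation = c.nextGen
	return copied
}

func main() {
	c, s := newCache(), newSnapshot()
	c.touch("node-a", 1)
	c.touch("node-b", 2)
	fmt.Println("first cycle copied:", c.updateSnapshot(s))
	c.touch("node-b", 3)
	fmt.Println("second cycle copied:", c.updateSnapshot(s))
}
```

Because the snapshot stores copies rather than pointers into the cache, a scheduling cycle can read it without holding the cache's lock, which fits the lock-free note above.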

func NewEmptySnapshot ¶

func NewEmptySnapshot(handler commoncache.CacheHandler) *Snapshot

NewEmptySnapshot initializes a Snapshot struct and returns it.

func (*Snapshot) AssumePod ¶

func (s *Snapshot) AssumePod(podInfo *framework.CachePodInfo) error

AssumePod adds the pod to the snapshot and removes its victims.

func (*Snapshot) FindStore ¶

func (s *Snapshot) FindStore(storeName commonstore.StoreName) commonstore.Store

func (*Snapshot) ForgetPod ¶

func (s *Snapshot) ForgetPod(podInfo *framework.CachePodInfo) error

ForgetPod removes the pod from the snapshot and adds its victims back.
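
The symmetry between AssumePod and ForgetPod can be illustrated with a small accounting sketch. It rests on assumptions: the fields of framework.CachePodInfo are not shown on this page, so `cachePodInfo` with its `pod`, `node`, and `victims` fields is a hypothetical stand-in, as are `nodeState`, `snapshot`, `assumePod`, and `forgetPod`.

```go
package main

import "fmt"

type pod struct {
	name     string
	milliCPU int64
}

// cachePodInfo is a hypothetical stand-in for framework.CachePodInfo.
type cachePodInfo struct {
	pod     pod
	node    string
	victims []pod // pods preempted to make room for pod
}

// nodeState tracks the pods on one node and their total CPU request.
type nodeState struct {
	pods      map[string]pod
	requested int64
}

// add places p on the node, replacing any pod with the same name.
func (n *nodeState) add(p pod) {
	if old, ok := n.pods[p.name]; ok {
		n.requested -= old.milliCPU
	}
	n.pods[p.name] = p
	n.requested += p.milliCPU
}

// remove takes the named pod off the node; unknown names are ignored.
func (n *nodeState) remove(name string) {
	if old, ok := n.pods[name]; ok {
		n.requested -= old.milliCPU
		delete(n.pods, name)
	}
}

type snapshot struct {
	nodes map[string]*nodeState
}

// assumePod removes the victims and then adds the pod to its node.
func (s *snapshot) assumePod(info cachePodInfo) error {
	n, ok := s.nodes[info.node]
	if !ok {
		return fmt.Errorf("node %q not in snapshot", info.node)
	}
	for _, v := range info.victims {
		n.remove(v.name)
	}
	n.add(info.pod)
	return nil
}

// forgetPod undoes assumePod: it removes the pod and adds the victims back.
func (s *snapshot) forgetPod(info cachePodInfo) error {
	n, ok := s.nodes[info.node]
	if !ok {
		return fmt.Errorf("node %q not in snapshot", info.node)
	}
	n.remove(info.pod.name)
	for _, v := range info.victims {
		n.add(v)
	}
	return nil
}

func main() {
	low := pod{name: "low", milliCPU: 3000}
	s := &snapshot{nodes: map[string]*nodeState{"node-a": {pods: map[string]pod{}}}}
	s.nodes["node-a"].add(low)

	info := cachePodInfo{pod: pod{name: "high", milliCPU: 2000}, node: "node-a", victims: []pod{low}}
	fmt.Println(s.assumePod(info), s.nodes["node-a"].requested)
	fmt.Println(s.forgetPod(info), s.nodes["node-a"].requested)
}
```

Both operations mutate the same per-node record, which is why the write path has to be serial for the lock-free premise to hold.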

func (*Snapshot) Get ¶

func (s *Snapshot) Get(nodeName string) (framework.NodeInfo, error)

Get returns the NodeInfo of the given node name.

func (*Snapshot) GetNodeInfo ¶

func (s *Snapshot) GetNodeInfo(nodeName string) framework.NodeInfo

GetNodeInfo returns a NodeInfo according to the nodeName.

func (*Snapshot) HavePodsWithAffinityList ¶

func (s *Snapshot) HavePodsWithAffinityList() []framework.NodeInfo

HavePodsWithAffinityList returns the list of nodes with at least one pod with inter-pod affinity.

func (*Snapshot) HavePodsWithRequiredAntiAffinityList ¶

func (s *Snapshot) HavePodsWithRequiredAntiAffinityList() []framework.NodeInfo

HavePodsWithRequiredAntiAffinityList returns the list of nodes with at least one pod with required inter-pod anti-affinity.

func (*Snapshot) InPartitionList ¶

func (s *Snapshot) InPartitionList() []framework.NodeInfo

InPartitionList returns the list of nodes that are in the scheduler's partition.
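
How the partition setters on SchedulerCache relate to the in-partition and out-of-partition lists on Snapshot can be sketched with a simple membership map. This is a hypothetical model, not the package's data structure: `partitionCache` and its methods are illustrative names, the lists hold node names instead of framework.NodeInfo values, and returning an error for an unknown node is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// partitionCache records, for each known node, whether it belongs to this
// scheduler's partition.
type partitionCache struct {
	inPartition map[string]bool
}

// newPartitionCache registers the given nodes, all initially out of partition.
func newPartitionCache(nodes ...string) *partitionCache {
	c := &partitionCache{inPartition: map[string]bool{}}
	for _, name := range nodes {
		c.inPartition[name] = false
	}
	return c
}

func (c *partitionCache) set(name string, in bool) error {
	if _, ok := c.inPartition[name]; !ok {
		return fmt.Errorf("node %q not found", name)
	}
	c.inPartition[name] = in
	return nil
}

func (c *partitionCache) setNodeInPartition(name string) error    { return c.set(name, true) }
func (c *partitionCache) setNodeOutOfPartition(name string) error { return c.set(name, false) }

// nodeInThisPartition reports false for unknown nodes.
func (c *partitionCache) nodeInThisPartition(name string) bool { return c.inPartition[name] }

// list returns the sorted names of nodes whose membership equals in.
func (c *partitionCache) list(in bool) []string {
	var names []string
	for name, member := range c.inPartition {
		if member == in {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func (c *partitionCache) inPartitionList() []string    { return c.list(true) }
func (c *partitionCache) outOfPartitionList() []string { return c.list(false) }

func main() {
	c := newPartitionCache("node-a", "node-b", "node-c")
	if err := c.setNodeInPartition("node-b"); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println("in:", c.inPartitionList())
	fmt.Println("out:", c.outOfPartitionList())
}
```

Every known node appears in exactly one of the two lists, so together they always cover the full node set.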

func (*Snapshot) Len ¶

func (s *Snapshot) Len() int

func (*Snapshot) List ¶

func (s *Snapshot) List() []framework.NodeInfo

List returns the list of nodes in the snapshot.

func (*Snapshot) MakeBasicNodeGroup ¶

func (s *Snapshot) MakeBasicNodeGroup() framework.NodeGroup

func (*Snapshot) NodeInfos ¶

func (s *Snapshot) NodeInfos() framework.NodeInfoLister

NodeInfos returns a NodeInfoLister.

func (*Snapshot) NumNodes ¶

func (s *Snapshot) NumNodes() int

NumNodes returns the number of nodes in the snapshot.

func (*Snapshot) OutOfPartitionList ¶

func (s *Snapshot) OutOfPartitionList() []framework.NodeInfo

OutOfPartitionList returns the list of nodes that are out of the scheduler's partition.

Directories ¶

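Path	Synopsis
commonstores
commonstores/_example_store
commonstores/load_aware_store
commonstores/movement_store
commonstores/node_store
commonstores/pdb_store
commonstores/pod_store
commonstores/podgroup_store
commonstores/preemption_store
commonstores/reservation_store
commonstores/unit_status_store
debugger
fake
isolatedcache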
