scheduler

package
v0.0.0-...-bd24502 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 26, 2024 License: Apache-2.0 Imports: 28 Imported by: 0

Documentation

Index

Constants

View Source
const (
	DefaultCleanRootInterval        = 10000 * time.Millisecond // sleep between queue removal checks
	DefaultCleanExpiredAppsInterval = 24 * time.Hour           // sleep between apps removal checks
)

Variables

This section is empty.

Functions

func CreateCheckInfo

func CreateCheckInfo(succeeded bool, name, description, message string) dao.HealthCheckInfo

func GetSchedulerHealthStatus

func GetSchedulerHealthStatus(metrics *metrics.SchedulerMetrics, schedulerContext *ClusterContext) dao.SchedulerHealthDAOInfo

Types

type ClusterContext

type ClusterContext struct {
	locking.RWMutex
	// contains filtered or unexported fields
}

func NewClusterContext

func NewClusterContext(rmID, policyGroup string, config []byte) (*ClusterContext, error)

Create a new cluster context to be used outside of the event system. test only

func (*ClusterContext) GetApplication

func (cc *ClusterContext) GetApplication(appID, partitionName string) *objects.Application

Get the scheduling application based on the ID from the partition. Returns nil if the partition or app cannot be found. Visible for tests

func (*ClusterContext) GetLastHealthCheckResult

func (cc *ClusterContext) GetLastHealthCheckResult() *dao.SchedulerHealthDAOInfo

func (*ClusterContext) GetNode

func (cc *ClusterContext) GetNode(nodeID, partitionName string) *objects.Node

Get a scheduling node based on its name from the partition. Returns nil if the partition or node cannot be found. Visible for tests

func (*ClusterContext) GetPartition

func (cc *ClusterContext) GetPartition(partitionName string) *PartitionContext

func (*ClusterContext) GetPartitionMapClone

func (cc *ClusterContext) GetPartitionMapClone() map[string]*PartitionContext

func (*ClusterContext) GetPartitionWithoutClusterID

func (cc *ClusterContext) GetPartitionWithoutClusterID(partitionName string) *PartitionContext

func (*ClusterContext) GetPolicyGroup

func (cc *ClusterContext) GetPolicyGroup() string

Get the config name.

func (*ClusterContext) GetQueue

func (cc *ClusterContext) GetQueue(queueName string, partitionName string) *objects.Queue

Get the scheduling queue based on the queue path name from the partition. Returns nil if the partition or queue cannot be found. Visible for tests

func (*ClusterContext) GetRMInfoMapClone

func (cc *ClusterContext) GetRMInfoMapClone() map[string]*RMInformation

func (*ClusterContext) GetStartTime

func (cc *ClusterContext) GetStartTime() time.Time

func (*ClusterContext) GetUUID

func (cc *ClusterContext) GetUUID() string

func (*ClusterContext) NeedPreemption

func (cc *ClusterContext) NeedPreemption() bool

func (*ClusterContext) SetLastHealthCheckResult

func (cc *ClusterContext) SetLastHealthCheckResult(c *dao.SchedulerHealthDAOInfo)

func (*ClusterContext) SetRMInfo

func (cc *ClusterContext) SetRMInfo(rmID string, rmBuildInformation map[string]string)

func (*ClusterContext) Stop

func (cc *ClusterContext) Stop()

func (*ClusterContext) UpdateRMSchedulerConfig

func (cc *ClusterContext) UpdateRMSchedulerConfig(rmID string, config []byte) error

Locked version of the configuration update called outside of event system. Updates the current config via the config loader. Used in test only, normal updates use the internal call

type HealthChecker

type HealthChecker struct {
	locking.RWMutex
	// contains filtered or unexported fields
}

func NewHealthChecker

func NewHealthChecker(schedulerContext *ClusterContext) *HealthChecker

func (*HealthChecker) GetPeriod

func (c *HealthChecker) GetPeriod() time.Duration

func (*HealthChecker) IsEnabled

func (c *HealthChecker) IsEnabled() bool

func (*HealthChecker) Restart

func (c *HealthChecker) Restart()

func (*HealthChecker) Start

func (c *HealthChecker) Start()

Start executes healthCheck service in the background

func (*HealthChecker) Stop

func (c *HealthChecker) Stop()

type PartitionContext

type PartitionContext struct {
	ID   string
	RmID string // the RM the partition belongs to
	Name string // name of the partition

	// The partition write lock must not be held while manipulating an application.
	// Scheduling is running continuously as a lock free background task. Scheduling an application
	// acquires a write lock of the application object. While holding the write lock a list of nodes is
	// requested from the partition. This requires a read lock on the partition.
	// If the partition write lock is held while manipulating an application a dead lock could occur.
	// Since application objects handle their own locks there is no requirement to hold the partition lock
	// while manipulating the application.
	// Similarly adding, updating or removing a node or a queue should only hold the partition write lock
	// while manipulating the partition information not while manipulating the underlying objects.
	locking.RWMutex
	// contains filtered or unexported fields
}

func (*PartitionContext) AddApplication

func (pc *PartitionContext) AddApplication(app *objects.Application) error

AddApplication adds a new application to the partition. Runs the placement rules for the queue resolution. Creates a new dynamic queue if the queue does not yet exists. NOTE: this is a lock free call. It must NOT be called holding the PartitionContext lock.

func (*PartitionContext) AddNode

func (pc *PartitionContext) AddNode(node *objects.Node) error

AddNode adds the node to the partition. Updates the partition and root queue resources if the node is added successfully to the partition. NOTE: this is a lock free call. It must NOT be called holding the PartitionContext lock.

func (*PartitionContext) AddRejectedApplication

func (pc *PartitionContext) AddRejectedApplication(rejectedApplication *objects.Application, rejectedMessage string)

func (*PartitionContext) GetAllocatedResource

func (pc *PartitionContext) GetAllocatedResource() *resources.Resource

func (*PartitionContext) GetApplication

func (pc *PartitionContext) GetApplication(appID string) *objects.Application

func (*PartitionContext) GetApplications

func (pc *PartitionContext) GetApplications() []*objects.Application

GetApplications returns a slice of the current applications tracked by the partition.

func (*PartitionContext) GetCompletedApplications

func (pc *PartitionContext) GetCompletedApplications() []*objects.Application

GetCompletedApplications returns a slice of the completed applications tracked by the partition.

func (*PartitionContext) GetCurrentState

func (pc *PartitionContext) GetCurrentState() string

func (*PartitionContext) GetFullNodeIterator

func (pc *PartitionContext) GetFullNodeIterator() objects.NodeIterator

Create an ordered node iterator based on the node sort policy set for this partition. The iterator is nil if there are no nodes available.

func (*PartitionContext) GetNode

func (pc *PartitionContext) GetNode(nodeID string) *objects.Node

Get a node from the partition by nodeID.

func (*PartitionContext) GetNodeIterator

func (pc *PartitionContext) GetNodeIterator() objects.NodeIterator

Create an ordered node iterator based on the node sort policy set for this partition. The iterator is nil if there are no unreserved nodes available.

func (*PartitionContext) GetNodeSortingPolicyType

func (pc *PartitionContext) GetNodeSortingPolicyType() policies.SortingPolicy

func (*PartitionContext) GetNodeSortingResourceWeights

func (pc *PartitionContext) GetNodeSortingResourceWeights() map[string]float64

func (*PartitionContext) GetNodes

func (pc *PartitionContext) GetNodes() []*objects.Node

GetNodes returns a slice of all nodes unfiltered from the iterator

func (*PartitionContext) GetPartitionQueues

func (pc *PartitionContext) GetPartitionQueues() dao.PartitionQueueDAOInfo

GetPartitionQueues builds the queue info for the whole queue structure to pass to the webservice

func (*PartitionContext) GetPlacementRules

func (pc *PartitionContext) GetPlacementRules() []*dao.RuleDAO

GetPlacementRules returns the current active rule set as dao to expose to the webservice

func (*PartitionContext) GetQueue

func (pc *PartitionContext) GetQueue(name string) *objects.Queue

GetQueue returns queue from the structure based on the fully qualified name. Wrapper around the unlocked version getQueueInternal() Visible by tests

func (*PartitionContext) GetRejectedApplications

func (pc *PartitionContext) GetRejectedApplications() []*objects.Application

GetRejectedApplications returns a slice of the rejected applications tracked by the partition.

func (*PartitionContext) GetStateTime

func (pc *PartitionContext) GetStateTime() time.Time

func (*PartitionContext) GetTotalAllocationCount

func (pc *PartitionContext) GetTotalAllocationCount() int

func (*PartitionContext) GetTotalNodeCount

func (pc *PartitionContext) GetTotalNodeCount() int

func (*PartitionContext) GetTotalPartitionResource

func (pc *PartitionContext) GetTotalPartitionResource() *resources.Resource

func (*PartitionContext) UpdateAllocation

func (pc *PartitionContext) UpdateAllocation(alloc *objects.Allocation) (requestCreated bool, allocCreated bool, err error)

UpdateAllocation adds or updates an Allocation. If the Allocation has no NodeID specified, it is considered a pending allocation and processed appropriate. This call is idempotent, and can be called multiple times with the same allocation (such as on change updates from the shim) Upon successfully processing, two flags are returned: requestCreated (if a new request was added) and allocCreated (if an allocation was satisifed). This can be used by callers that need this information to take further action. NOTE: this is a lock free call. It must NOT be called holding the PartitionContext lock.

type RMInformation

type RMInformation struct {
	RMBuildInformation map[string]string
}

type Scheduler

type Scheduler struct {
	// contains filtered or unexported fields
}

Main Scheduler service that starts the needed sub services

func NewScheduler

func NewScheduler() *Scheduler

func (*Scheduler) GetClusterContext

func (s *Scheduler) GetClusterContext() *ClusterContext

Visible by tests

func (*Scheduler) HandleEvent

func (s *Scheduler) HandleEvent(ev interface{})

Implement methods for Scheduler events

func (*Scheduler) MultiStepSchedule

func (s *Scheduler) MultiStepSchedule(nAlloc int)

The scheduler for testing which runs nAlloc times the normal schedule routine. Visible by tests

func (*Scheduler) StartService

func (s *Scheduler) StartService(handlers handler.EventHandlers, manualSchedule bool)

Start service

func (*Scheduler) Stop

func (s *Scheduler) Stop()

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL