Documentation ¶
Overview ¶
Package respool is responsible for
- Building and maintaining the in-memory resource pool tree and performing operations on it.
- Providing API handlers to operate on the resource pool hierarchy.
- Performing admission control of gangs in the resource pool.
Each leaf resource pool maintains 4 queues:
1. Pending queue: contains all the incoming/enqueued tasks of the resource pool.
2. Controller queue: contains controller tasks, which are transferred from the pending queue if they can't be admitted.
3. Non-Preemptible queue: non-preemptible tasks are moved to this queue if they can't be admitted.
4. Revocable queue: revocable tasks are moved to this queue if they can't be admitted. The primary goal of the revocable queue is to unblock non-revocable tasks from being admitted.
Admission cycle ¶
Each resource pool, leaf or non-leaf, maintains an in-memory resource entitlement for non-revocable and revocable tasks. Entitlement is calculated outside this package; the resource pool exposes the API needed to calculate it. Entitlement is calculated based on the resource pool's limit, reservation, slack limit, demand (resources required for pending tasks) and allocation (resources of already admitted tasks).
The admission cycle iterates over the 4 queues mentioned above in the following sequence for each leaf resource pool:
1) Non-Preemptible
2) Controller
3) Revocable
4) Pending
For gangs dequeued from each queue, admission control validates 3 conditions before admitting the task:
1) There is sufficient entitlement based on the task type (revocable or non-revocable).
2) Non-preemptible tasks do not use more resources than the resource pool reservation.
3) Controller tasks do not use more resources than the controller limit.
Below is an example of how admission control works:
1. The resource pool first dequeues gangs from the non-preemptible queue and tries to admit as many non-preemptible tasks (which are also non-revocable) as it can, up to the resource pool reservation.
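The three conditions above can be sketched as a single predicate. This is a minimal illustration, not the package's implementation: `Resources`, `Task`, `PoolState`, `examplePool` and `CanAdmit` are hypothetical stand-ins invented for this example, and the way allocations are tracked per task type is an assumption.

```go
package main

import "fmt"

// Resources is a hypothetical stand-in for the package's resource vector.
type Resources struct {
	CPU, Mem, Disk, GPU float64
}

// Add returns the element-wise sum of r and o.
func (r Resources) Add(o Resources) Resources {
	return Resources{r.CPU + o.CPU, r.Mem + o.Mem, r.Disk + o.Disk, r.GPU + o.GPU}
}

// FitsIn reports whether every dimension of r is at most that of limit.
func (r Resources) FitsIn(limit Resources) bool {
	return r.CPU <= limit.CPU && r.Mem <= limit.Mem &&
		r.Disk <= limit.Disk && r.GPU <= limit.GPU
}

// Task is a hypothetical view of what admission control inspects.
type Task struct {
	Need           Resources
	Revocable      bool
	NonPreemptible bool
	Controller     bool
}

// PoolState holds hypothetical limits and current allocations of one pool.
type PoolState struct {
	SlackEntitlement, NonSlackEntitlement Resources
	Reservation, ControllerLimit          Resources
	SlackAlloc, NonSlackAlloc             Resources
	NonPreemptibleAlloc, ControllerAlloc  Resources
}

// examplePool returns a pool using the numbers from the walkthrough.
func examplePool() PoolState {
	return PoolState{
		SlackEntitlement:    Resources{30, 300, 300, 3},
		NonSlackEntitlement: Resources{100, 1000, 1000, 10},
		Reservation:         Resources{100, 1000, 1000, 10},
		ControllerLimit:     Resources{10, 100, 100, 1},
	}
}

// CanAdmit applies the three admission conditions in order.
func CanAdmit(p PoolState, t Task) bool {
	// 1) Sufficient entitlement, chosen by task type.
	if t.Revocable {
		if !p.SlackAlloc.Add(t.Need).FitsIn(p.SlackEntitlement) {
			return false
		}
	} else if !p.NonSlackAlloc.Add(t.Need).FitsIn(p.NonSlackEntitlement) {
		return false
	}
	// 2) Non-preemptible tasks stay within the reservation.
	if t.NonPreemptible && !p.NonPreemptibleAlloc.Add(t.Need).FitsIn(p.Reservation) {
		return false
	}
	// 3) Controller tasks stay within the controller limit.
	if t.Controller && !p.ControllerAlloc.Add(t.Need).FitsIn(p.ControllerLimit) {
		return false
	}
	return true
}

func main() {
	p := examplePool()
	need := Resources{10, 100, 100, 1}
	fmt.Println(CanAdmit(p, Task{Need: need, Controller: true}))
	p.ControllerAlloc = need // the controller limit is now fully used
	fmt.Println(CanAdmit(p, Task{Need: need, Controller: true}))
}
```

A task may trip more than one condition; the sketch simply returns at the first one that fails.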
Resource pool reservation:
cpu:100 mem:1000 disk:1000 gpu:10
Task requirement:
cpu:10 mem:100 disk:100 gpu:1
10 non-preemptible tasks can be admitted if none is already running. If some resources are already allocated to non-preemptible tasks, then fewer will be admitted.
If resources are allocated to revocable tasks, then they will be preempted for non-preemptible tasks.
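The arithmetic behind "10 tasks are admittable" can be checked with a short sketch. `Resources` and `maxAdmittable` are hypothetical names used only for this illustration; the count is the minimum, over all resource dimensions, of the free reservation divided by the per-task requirement.

```go
package main

import (
	"fmt"
	"math"
)

// Resources is a hypothetical stand-in for the package's resource vector.
type Resources struct {
	CPU, Mem, Disk, GPU float64
}

// maxAdmittable returns how many identical tasks fit into the part of the
// reservation that non-preemptible tasks have not already taken.
func maxAdmittable(reservation, allocated, task Resources) int {
	dims := [][3]float64{
		{reservation.CPU, allocated.CPU, task.CPU},
		{reservation.Mem, allocated.Mem, task.Mem},
		{reservation.Disk, allocated.Disk, task.Disk},
		{reservation.GPU, allocated.GPU, task.GPU},
	}
	n := math.Inf(1)
	for _, d := range dims {
		if d[2] <= 0 {
			continue // the task does not request this dimension
		}
		free := math.Max(d[0]-d[1], 0)
		n = math.Min(n, math.Floor(free/d[2]))
	}
	if math.IsInf(n, 1) {
		return 0 // a task that requests nothing is not counted in this sketch
	}
	return int(n)
}

func main() {
	reservation := Resources{CPU: 100, Mem: 1000, Disk: 1000, GPU: 10}
	task := Resources{CPU: 10, Mem: 100, Disk: 100, GPU: 1}

	// Nothing running yet.
	fmt.Println(maxAdmittable(reservation, Resources{}, task)) // 10

	// Three such tasks are already allocated.
	used := Resources{CPU: 30, Mem: 300, Disk: 300, GPU: 3}
	fmt.Println(maxAdmittable(reservation, used, task)) // 7
}
```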
2. The resource pool then dequeues gangs from the controller queue and tries to admit as many controller tasks as it can. This is decided by max_controller_percentage, which is defined as the maximum resources the controller tasks can use, as a percentage of the resource pool's reservation. For example, if max_controller_percentage=10 and the resource pool's reservation is:
cpu:100 mem:1000 disk:1000 gpu:10
Then the maximum resources the controller tasks can use is 10% of the reservation, i.e.:
cpu:10 mem:100 disk:100 gpu:1
If these resources are not used by the controller tasks, they will be given to other tasks. Once a controller task can be admitted, it is removed from the controller queue and its resources are added to the controller allocation. Once we have exhausted all the controller tasks which can be admitted, we move on to the next phase.
3. The resource pool then dequeues from the revocable queue; revocable tasks are admitted using slack entitlement. Resources available for revocable tasks are constrained by the slack limit. The behavior is somewhat comparable to the controller limit.
Resource pool reservation:
cpu:100 mem:1000 disk:1000 gpu:10
Slack limit = 30%, i.e. the maximum resources available for revocable tasks are 30% of the reservation:
cpu:30 mem:300 disk:300 gpu:3
The slack limit is enforced to prevent revocable tasks from hogging all resources in the resource pool. Note that the slack limit does not guarantee resources to revocable tasks, as they are best effort and have lower preference than non-revocable tasks.
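Both the controller limit and the slack limit are described above as a percentage of the reservation. A minimal sketch of that calculation, where `Resources` and `percentOf` are hypothetical names and applying the percentage to every dimension simply follows the numbers in the example:

```go
package main

import "fmt"

// Resources is a hypothetical stand-in for the package's resource vector.
type Resources struct {
	CPU, Mem, Disk, GPU float64
}

// percentOf scales every dimension of the reservation by pct percent.
func percentOf(reservation Resources, pct float64) Resources {
	scale := func(v float64) float64 { return v * pct / 100 }
	return Resources{
		CPU:  scale(reservation.CPU),
		Mem:  scale(reservation.Mem),
		Disk: scale(reservation.Disk),
		GPU:  scale(reservation.GPU),
	}
}

func main() {
	reservation := Resources{CPU: 100, Mem: 1000, Disk: 1000, GPU: 10}

	// max_controller_percentage = 10
	fmt.Printf("controller limit: %+v\n", percentOf(reservation, 10))

	// slack limit = 30%
	fmt.Printf("slack limit:      %+v\n", percentOf(reservation, 30))
}
```

Both values are upper bounds only: unused controller resources go to other tasks, and the slack limit does not reserve anything for revocable tasks.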
4. We now look at the pending queue to admit the rest of the tasks. The resource pool keeps dequeueing gangs until it reaches a gang which can't be admitted.
- At this point, if the task is non-preemptible, it is moved to the non-preemptible queue.
- If the task is a controller task, it is moved to the controller queue.
- Similarly, if the task is revocable, it is moved to the revocable queue, to unblock non-revocable tasks.
- Once it reaches a task which can't be admitted and is *not* a controller, non-preemptible or revocable task, it stops the admission control and returns the list of gangs to the scheduler.
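The pending-queue step can be sketched as a loop that admits, parks, or stops. This is an illustration under simplifying assumptions: `Gang` and `drainPending` are hypothetical, and the `Admissible` flag stands in for the real admission check, which would depend on the allocations made so far.

```go
package main

import "fmt"

// QueueType mirrors the queue constants documented for this package.
type QueueType int

const (
	PendingQueue QueueType = iota + 1
	ControllerQueue
	NonPreemptibleQueue
	RevocableQueue
)

// Gang is a hypothetical stand-in for a gang of tasks.
type Gang struct {
	Name       string
	Kind       QueueType // PendingQueue marks an ordinary gang
	Admissible bool      // assumed to be precomputed for this sketch
}

// drainPending walks the pending queue in order. Admissible gangs are
// admitted. Inadmissible controller, non-preemptible and revocable gangs are
// parked in their own queue so they do not block the gangs behind them. The
// first inadmissible ordinary gang stops the walk.
func drainPending(pending []Gang) (admitted []Gang, parked map[QueueType][]Gang, remaining []Gang) {
	parked = make(map[QueueType][]Gang)
	for i, g := range pending {
		switch {
		case g.Admissible:
			admitted = append(admitted, g)
		case g.Kind != PendingQueue:
			parked[g.Kind] = append(parked[g.Kind], g)
		default:
			return admitted, parked, pending[i:]
		}
	}
	return admitted, parked, nil
}

func main() {
	admitted, parked, remaining := drainPending([]Gang{
		{"a", PendingQueue, true},
		{"b", ControllerQueue, false},
		{"c", RevocableQueue, false},
		{"d", PendingQueue, false},
		{"e", PendingQueue, true},
	})
	fmt.Println("admitted: ", admitted)
	fmt.Println("parked:   ", parked)
	fmt.Println("remaining:", remaining)
}
```

In the sample run, gang "e" stays in the pending queue even though it would fit, because the walk stops at "d".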
Index ¶
- Constants
- func ValidateChildrenReservations(resTree Tree, resourcePoolConfigData ResourcePoolConfigData) error
- func ValidateControllerLimit(_ Tree, resourcePoolConfigData ResourcePoolConfigData) error
- func ValidateCycle(_ Tree, resourcePoolConfigData ResourcePoolConfigData) error
- func ValidateParent(resTree Tree, resourcePoolConfigData ResourcePoolConfigData) error
- func ValidateResourcePool(_ Tree, resourcePoolConfigData ResourcePoolConfigData) error
- func ValidateResourcePoolPath(_ Tree, resourcePoolConfigData ResourcePoolConfigData) error
- func ValidateSiblings(resTree Tree, resourcePoolConfigData ResourcePoolConfigData) error
- type Metrics
- type QueueType
- type ResPool
- type ResourcePoolConfigData
- type ResourcePoolConfigValidatorFunc
- type Tree
- type Validator
Constants ¶
const (
	// ResourcePoolPathDelimiter is the delimiter for the resource pool path
	ResourcePoolPathDelimiter = "/"
	// DefaultResPoolSchedulingPolicy is the default scheduling policy for respool
	DefaultResPoolSchedulingPolicy = respool.SchedulingPolicy_PriorityFIFO
)
Variables ¶
This section is empty.
Functions ¶
func ValidateChildrenReservations ¶
func ValidateChildrenReservations(resTree Tree, resourcePoolConfigData ResourcePoolConfigData) error
ValidateChildrenReservations validates all child reservations against their parent
func ValidateControllerLimit ¶
func ValidateControllerLimit(_ Tree, resourcePoolConfigData ResourcePoolConfigData) error
ValidateControllerLimit validates the controller limit
func ValidateCycle ¶
func ValidateCycle(_ Tree, resourcePoolConfigData ResourcePoolConfigData) error
ValidateCycle checks whether adding/updating the current pool would result in a cycle
func ValidateParent ¶
func ValidateParent(resTree Tree, resourcePoolConfigData ResourcePoolConfigData) error
ValidateParent validates the {current} resource pool against its {parent}
func ValidateResourcePool ¶
func ValidateResourcePool(_ Tree, resourcePoolConfigData ResourcePoolConfigData) error
ValidateResourcePool checks whether the resource configurations are correct
func ValidateResourcePoolPath ¶
func ValidateResourcePoolPath(_ Tree, resourcePoolConfigData ResourcePoolConfigData) error
ValidateResourcePoolPath validates the resource pool path
func ValidateSiblings ¶
func ValidateSiblings(resTree Tree, resourcePoolConfigData ResourcePoolConfigData) error
ValidateSiblings validates the resource pool name is unique amongst its siblings
Types ¶
type Metrics ¶
type Metrics struct {
	APICreateResourcePool tally.Counter
	CreateResourcePoolSuccess tally.Counter
	CreateResourcePoolFail tally.Counter
	CreateResourcePoolRollbackFail tally.Counter

	APIGetResourcePool tally.Counter
	GetResourcePoolSuccess tally.Counter
	GetResourcePoolFail tally.Counter

	APILookupResourcePoolID tally.Counter
	LookupResourcePoolIDSuccess tally.Counter
	LookupResourcePoolIDFail tally.Counter

	APIUpdateResourcePool tally.Counter
	UpdateResourcePoolSuccess tally.Counter
	UpdateResourcePoolFail tally.Counter
	UpdateResourcePoolRollbackFail tally.Counter

	APIDeleteResourcePool tally.Counter
	DeleteResourcePoolSuccess tally.Counter
	DeleteResourcePoolFail tally.Counter

	APIQueryResourcePools tally.Counter
	QueryResourcePoolsSuccess tally.Counter
	QueryResourcePoolsFail tally.Counter

	PendingQueueSize tally.Gauge
	RevocableQueueSize tally.Gauge
	ControllerQueueSize tally.Gauge
	NPQueueSize tally.Gauge

	TotalAllocation scalar.GaugeMaps
	NonPreemptibleAllocation scalar.GaugeMaps
	NonSlackAllocation scalar.GaugeMaps
	SlackAllocation scalar.GaugeMaps
	ControllerAllocation scalar.GaugeMaps

	TotalEntitlement scalar.GaugeMaps
	NonSlackEntitlement scalar.GaugeMaps
	SlackEntitlement scalar.GaugeMaps

	NonSlackAvailable scalar.GaugeMaps
	SlackAvailable scalar.GaugeMaps

	Demand scalar.GaugeMaps
	SlackDemand scalar.GaugeMaps

	ResourcePoolReservation scalar.GaugeMaps
	ResourcePoolLimit scalar.GaugeMaps
	ControllerLimit scalar.GaugeMaps
	SlackLimit scalar.GaugeMaps
}
Metrics is a placeholder for all metrics in respool.
func NewMetrics ¶
NewMetrics returns a new instance of respool.Metrics.
type QueueType ¶
type QueueType int
QueueType defines the different queues of the resource pool from which the gangs are admitted.
const (
	// PendingQueue is the default queue for all incoming gangs
	PendingQueue QueueType = iota + 1
	// ControllerQueue is the queue for controller gangs
	ControllerQueue
	// NonPreemptibleQueue is the queue for non preemptible gangs
	NonPreemptibleQueue
	// RevocableQueue is the queue for revocable gangs
	// which are scheduled using slack resources [cpus]
	RevocableQueue
)
type ResPool ¶
type ResPool interface {
	// Returns the config of the resource pool.
	ResourcePoolConfig() *respool.ResourcePoolConfig
	// Sets the resource pool config.
	SetResourcePoolConfig(*respool.ResourcePoolConfig)

	// Returns a map of resources and its resource config.
	Resources() map[string]*respool.ResourceConfig

	// Converts to resource pool info.
	ToResourcePoolInfo() *respool.ResourcePoolInfo

	// Aggregates the child reservations by resource type.
	AggregatedChildrenReservations() (map[string]float64, error)

	// Enqueues gang (task list) into resource pool pending queue.
	EnqueueGang(gang *resmgrsvc.Gang) error
	// Dequeues gangs (task list) from the resource pool.
	DequeueGangs(int) ([]*resmgrsvc.Gang, error)
	// PeekGangs returns a list of gangs from the resource pool's queue based
	// on the queue type. limit determines the max number of gangs to be
	// returned.
	PeekGangs(qt QueueType, limit uint32) ([]*resmgrsvc.Gang, error)

	// SetEntitlement sets the entitlement of non-revocable resources
	// for non-revocable tasks + revocable tasks for this resource pool.
	SetEntitlement(res *scalar.Resources)
	// SetSlackEntitlement sets the entitlement of revocable cpus
	// + non-revocable resources [mem, disk, gpu] for the resource pool
	SetSlackEntitlement(res *scalar.Resources)
	// SetNonSlackEntitlement sets the entitlement of non-revocable resources
	// for non-revocable tasks for this resource pool
	SetNonSlackEntitlement(res *scalar.Resources)

	// GetEntitlement returns the total entitlement of non-revocable resources
	// for this resource pool.
	GetEntitlement() *scalar.Resources
	// GetSlackEntitlement returns the entitlement for revocable tasks.
	GetSlackEntitlement() *scalar.Resources
	// GetNonSlackEntitlement returns the entitlement for non-revocable tasks.
	GetNonSlackEntitlement() *scalar.Resources

	// AddToAllocation adds resources to current allocation
	// for the resource pool.
	AddToAllocation(*scalar.Allocation) error
	// SubtractFromAllocation recaptures the resources from task.
	SubtractFromAllocation(*scalar.Allocation) error

	// GetTotalAllocatedResources returns the total resource allocation for the resource
	// pool.
	GetTotalAllocatedResources() *scalar.Resources
	// GetSlackAllocatedResources returns the slack allocation for the resource pool.
	GetSlackAllocatedResources() *scalar.Resources
	// GetNonSlackAllocatedResources returns resources allocated to non-revocable tasks.
	GetNonSlackAllocatedResources() *scalar.Resources

	// CalculateTotalAllocatedResources calculates the total allocation recursively for
	// all the children.
	CalculateTotalAllocatedResources() *scalar.Resources
	// SetTotalAllocatedResources sets the resource allocation for the resource pool.
	// Used by tests.
	SetTotalAllocatedResources(resources *scalar.Resources)

	// AddToDemand adds resources to current demand
	// for the resource pool.
	AddToDemand(res *scalar.Resources) error
	// SubtractFromDemand subtracts resources from current demand
	// for the resource pool.
	SubtractFromDemand(res *scalar.Resources) error
	// GetDemand returns the resource demand for the resource pool.
	GetDemand() *scalar.Resources
	// CalculateDemand calculates the resource demand
	// for the resource pool recursively for the subtree.
	CalculateDemand() *scalar.Resources

	// AddToSlackDemand adds resources to slack demand
	// for the resource pool.
	AddToSlackDemand(res *scalar.Resources) error
	// SubtractFromSlackDemand subtracts resources from slack demand
	// for the resource pool.
	SubtractFromSlackDemand(res *scalar.Resources) error
	// GetSlackDemand returns the slack resource demand for the resource pool.
	GetSlackDemand() *scalar.Resources
	// CalculateSlackDemand calculates the slack resource demand
	// for the resource pool recursively for the subtree.
	CalculateSlackDemand() *scalar.Resources

	// GetSlackLimit returns the limit of resources [mem,disk]
	// that can be used by revocable tasks.
	GetSlackLimit() *scalar.Resources

	// AddInvalidTask will add the killed tasks to respool which can be
	// discarded asynchronously while scheduling.
	AddInvalidTask(task *peloton.TaskID)

	// UpdateResourceMetrics updates metrics for this resource pool
	// on each entitlement cycle calculation (15s)
	UpdateResourceMetrics()
	// contains filtered or unexported methods
}
ResPool is a node in a resource pool hierarchy.
func NewRespool ¶
func NewRespool( scope tally.Scope, id string, parent ResPool, config *respool.ResourcePoolConfig, preemptionConfig rc.PreemptionConfig) (ResPool, error)
NewRespool initializes the resource pool node and returns it.
type ResourcePoolConfigData ¶
type ResourcePoolConfigData struct {
	ID *peloton.ResourcePoolID // Resource Pool ID
	Path *respool.ResourcePoolPath // Resource Pool path
	ResourcePoolConfig *respool.ResourcePoolConfig // Resource Pool Configuration
}
ResourcePoolConfigData holds the data that needs to be validated
type ResourcePoolConfigValidatorFunc ¶
type ResourcePoolConfigValidatorFunc func(resTree Tree, resourcePoolConfigData ResourcePoolConfigData) error
ResourcePoolConfigValidatorFunc is the validator function type used for registering custom validators
type Tree ¶
type Tree interface {
	// Start initializes the respool tree by loading all resource pools
	// from DB. This should be called when a
	// resource manager gains the leadership.
	Start() error

	// Stop resets a respool tree when a resource manager lost the
	// leadership.
	Stop() error

	// Get returns a respool node by the given ID
	Get(ID *peloton.ResourcePoolID) (ResPool, error)

	// GetByPath returns the respool node by the given path
	GetByPath(path *respool.ResourcePoolPath) (ResPool, error)

	// GetAllNodes returns all respool nodes or all leaf respool nodes.
	GetAllNodes(leafOnly bool) *list.List

	// Upsert add/update a resource pool poolConfig to the tree
	Upsert(ID *peloton.ResourcePoolID, resPoolConfig *respool.ResourcePoolConfig) error

	// UpdatedChannel is written to whenever the resource tree is changed. There
	// may be only one event for multiple updates.
	// TODO: Redo package imports, such that a method Calculator.SuggestRefresh
	// can be used instead, without breaking circular imports.
	UpdatedChannel() <-chan struct{}

	// Delete deletes the resource pool from the tree
	Delete(ID *peloton.ResourcePoolID) error
}
Tree defines the interface for a Resource Pool Tree
func NewTree ¶
func NewTree( scope tally.Scope, respoolOps ormobjects.ResPoolOps, jobStore storage.JobStore, taskStore storage.TaskStore, cfg rc.PreemptionConfig, ) Tree
NewTree initializes the respool tree
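The UpdatedChannel comment notes that several updates may produce only one event. One common Go idiom with that behavior is a channel of capacity 1 combined with a non-blocking send. The sketch below shows the idiom only; `notifier`, `newNotifier` and `notify` are hypothetical names, and whether the tree is implemented this way is an assumption.

```go
package main

import "fmt"

// notifier is a hypothetical helper that coalesces change notifications.
type notifier struct {
	ch chan struct{}
}

func newNotifier() *notifier {
	// Capacity 1: at most one event can be pending at any time.
	return &notifier{ch: make(chan struct{}, 1)}
}

// notify records that something changed and never blocks the caller.
func (n *notifier) notify() {
	select {
	case n.ch <- struct{}{}:
	default:
		// An event is already pending, so this update is folded into it.
	}
}

// UpdatedChannel exposes the receive side only.
func (n *notifier) UpdatedChannel() <-chan struct{} {
	return n.ch
}

func main() {
	n := newNotifier()
	for i := 0; i < 3; i++ {
		n.notify() // three updates in a row
	}
	<-n.UpdatedChannel() // the consumer wakes up once
	fmt.Println("events still pending:", len(n.ch))
}
```

A consumer of such a channel should treat an event as "something changed, re-read the tree" rather than as a count of changes.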
type Validator ¶
type Validator interface {
	// Validate validates the resource pool config
	Validate(data ResourcePoolConfigData) error
	// Register registers a slice of Validator functions
	Register(validatorFuncs []ResourcePoolConfigValidatorFunc) (Validator, error)
}
Validator performs validations on the resource pool config
func NewResourcePoolConfigValidator ¶
NewResourcePoolConfigValidator returns a new resource pool config validator
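To illustrate how a validator built from registered functions might be used, here is a self-contained sketch. It does not use the package's types: `ConfigData`, `ValidatorFunc`, `chain`, `checkName` and `checkControllerPercent` are hypothetical, the Tree argument is omitted, and both the append-on-register behavior and stopping at the first error are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// ConfigData is a hypothetical stand-in for the data being validated.
type ConfigData struct {
	Name                   string
	ControllerLimitPercent float64
}

// ValidatorFunc is a simplified, hypothetical validator function shape.
type ValidatorFunc func(data ConfigData) error

// chain runs its registered functions in registration order.
type chain struct {
	funcs []ValidatorFunc
}

// Register appends the given functions and returns the chain for reuse.
func (c *chain) Register(fs []ValidatorFunc) *chain {
	c.funcs = append(c.funcs, fs...)
	return c
}

// Validate returns the first error reported by any registered function.
func (c *chain) Validate(data ConfigData) error {
	for _, f := range c.funcs {
		if err := f(data); err != nil {
			return err
		}
	}
	return nil
}

// checkName is a hypothetical custom validator.
func checkName(d ConfigData) error {
	if d.Name == "" {
		return errors.New("resource pool name is empty")
	}
	return nil
}

// checkControllerPercent is a hypothetical custom validator.
func checkControllerPercent(d ConfigData) error {
	if d.ControllerLimitPercent < 0 || d.ControllerLimitPercent > 100 {
		return fmt.Errorf("controller limit %v%% is outside [0, 100]",
			d.ControllerLimitPercent)
	}
	return nil
}

func main() {
	v := (&chain{}).Register([]ValidatorFunc{checkName, checkControllerPercent})
	fmt.Println(v.Validate(ConfigData{Name: "infra", ControllerLimitPercent: 10}))
	fmt.Println(v.Validate(ConfigData{Name: "infra", ControllerLimitPercent: 150}))
}
```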