respool

package
v0.0.0-...-c0686e8 Latest
Warning

This package is not in the latest version of its module.

Published: Dec 10, 2022 License: Apache-2.0 Imports: 20 Imported by: 0

Documentation

Overview

Package respool is responsible for

  1. Building and maintaining the in-memory resource pool tree and performing operations on it.
  2. Providing API handlers to operate on the resource pool hierarchy.
  3. Performing admission control of gangs in the resource pool.

Each leaf resource pool maintains 4 queues:

  1. Pending queue: contains all the incoming/enqueued tasks for the resource pool.
  2. Controller queue: contains controller tasks which are transferred from the pending queue if they can't be admitted.
  3. Non-Preemptible queue: non-preemptible tasks are moved to this queue if they are not admissible.
  4. Revocable queue: revocable tasks that cannot be admitted are moved to this queue. The primary goal of the revocable queue is to unblock non-revocable tasks from being admitted.

Admission cycle

Each resource pool, leaf or non-leaf, maintains an in-memory resource entitlement for non-revocable and revocable tasks. Entitlement is calculated outside this package; the resource pool exposes an API to calculate it. Entitlement is calculated based on the resource pool's limit, reservation, slack limit, demand (resources required for pending tasks) and allocation (already admitted tasks).

The admission cycle iterates over the 4 queues mentioned above in the following sequence for each leaf resource pool:

  1. Non-Preemptible
  2. Controller
  3. Revocable
  4. Pending

For gangs dequeued from each queue, admission control validates 3 conditions before admitting the task:

  1. There is sufficient entitlement based on the task type (revocable or non-revocable).
  2. Non-preemptible tasks do not use more resources than the resource pool reservation.
  3. Controller tasks do not use more resources than the controller limit.
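The three conditions above can be sketched as a single predicate. This is a minimal illustration and not the package's implementation: the `Resources`, `Task` and `Pool` types and the `canAdmit` helper are hypothetical stand-ins introduced here, and the check is done per task rather than per gang for brevity.

```go
package main

import "fmt"

// Resources is a simplified, hypothetical stand-in for the package's scalar resources.
type Resources struct{ CPU, Mem, Disk, GPU float64 }

// Add returns the element-wise sum of r and o.
func (r Resources) Add(o Resources) Resources {
	return Resources{r.CPU + o.CPU, r.Mem + o.Mem, r.Disk + o.Disk, r.GPU + o.GPU}
}

// FitsIn reports whether every dimension of r is at most the limit.
func (r Resources) FitsIn(limit Resources) bool {
	return r.CPU <= limit.CPU && r.Mem <= limit.Mem &&
		r.Disk <= limit.Disk && r.GPU <= limit.GPU
}

// Task and Pool are hypothetical types used only for this sketch.
type Task struct {
	Need                                  Resources
	Revocable, NonPreemptible, Controller bool
}

type Pool struct {
	SlackEntitlement, NonSlackEntitlement Resources
	Reservation, ControllerLimit          Resources
	SlackAlloc, NonSlackAlloc             Resources
	NonPreemptibleAlloc, ControllerAlloc  Resources
}

// canAdmit applies the three documented conditions in order.
func canAdmit(p Pool, t Task) bool {
	// 1. Sufficient entitlement for the task type.
	if t.Revocable {
		if !p.SlackAlloc.Add(t.Need).FitsIn(p.SlackEntitlement) {
			return false
		}
	} else if !p.NonSlackAlloc.Add(t.Need).FitsIn(p.NonSlackEntitlement) {
		return false
	}
	// 2. Non-preemptible tasks stay within the reservation.
	if t.NonPreemptible && !p.NonPreemptibleAlloc.Add(t.Need).FitsIn(p.Reservation) {
		return false
	}
	// 3. Controller tasks stay within the controller limit.
	if t.Controller && !p.ControllerAlloc.Add(t.Need).FitsIn(p.ControllerLimit) {
		return false
	}
	return true
}

func main() {
	full := Resources{CPU: 100, Mem: 1000, Disk: 1000, GPU: 10}
	tenth := Resources{CPU: 10, Mem: 100, Disk: 100, GPU: 1}
	p := Pool{NonSlackEntitlement: full, Reservation: full, ControllerLimit: tenth}
	task := Task{Need: tenth, Controller: true}

	fmt.Println(canAdmit(p, task)) // the first controller task fits
	p.ControllerAlloc, p.NonSlackAlloc = task.Need, task.Need
	fmt.Println(canAdmit(p, task)) // a second one would exceed the controller limit
}
```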

Below is an example of how admission control works:

1. The resource pool first dequeues gangs from the non-preemptible queue and tries to admit as many non-preemptible tasks (which are also non-revocable) as it can, up to the resource pool reservation.

Resource pool reservation:

cpu:100
mem:1000
disk:1000
gpu:10

Task requirement:

cpu:10
mem:100
disk:100
gpu:1

10 non-preemptible tasks are admittable, if none is already running. If some resources are already allocated to non-preemptible tasks then fewer will be admitted.

If resources are allocated to revocable tasks, then they will be preempted for non-preemptible tasks.
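The arithmetic behind "10 non-preemptible tasks" can be checked with a short sketch. The `Resources` array type and the `maxAdmittable` helper are hypothetical names introduced for illustration; the count is the smallest per-resource quotient of free reservation over task requirement.

```go
package main

import (
	"fmt"
	"math"
)

// Resources holds cpu, mem, disk and gpu, in that order (hypothetical type).
type Resources [4]float64

// maxAdmittable returns how many identical tasks fit in the part of the
// reservation that is not already allocated to non-preemptible tasks.
func maxAdmittable(reservation, allocated, task Resources) int {
	count := -1
	for i := range task {
		if task[i] <= 0 {
			continue // this dimension does not constrain the task
		}
		free := math.Max(reservation[i]-allocated[i], 0)
		n := int(math.Floor(free / task[i]))
		if count < 0 || n < count {
			count = n
		}
	}
	if count < 0 {
		return 0 // a task with no requirements is not counted in this sketch
	}
	return count
}

func main() {
	reservation := Resources{100, 1000, 1000, 10}
	task := Resources{10, 100, 100, 1}

	fmt.Println(maxAdmittable(reservation, Resources{}, task))               // 10
	fmt.Println(maxAdmittable(reservation, Resources{30, 300, 300, 3}, task)) // 7
}
```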

2. The resource pool then dequeues gangs from the controller queue and tries to admit as many controller tasks as it can. This is decided by max_controller_percentage, which is defined as the maximum resources the controller tasks can use, as a percentage of the resource pool's reservation. For example, if max_controller_percentage=10 and the resource pool's reservation is:

cpu:100
mem:1000
disk:1000
gpu:10

Then the maximum resources the controller tasks can use is 10% of the reservation, i.e.:

cpu:10
mem:100
disk:100
gpu:1

If these resources are not used by the controller tasks they will be given to other tasks. Once a controller task can be admitted, it is removed from the controller queue and its resources are added to the controller allocation. Once we have exhausted all the controller tasks which can be admitted, we move on to the next phase.
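The controller limit calculation can be sketched as follows. The `Resources` type and the `controllerLimit` helper are hypothetical; only the percentage rule comes from the description above.

```go
package main

import "fmt"

// Resources is a simplified, hypothetical stand-in for the package's scalar resources.
type Resources struct{ CPU, Mem, Disk, GPU float64 }

// controllerLimit scales every dimension of the reservation by
// maxControllerPercentage (expressed as a number between 0 and 100).
func controllerLimit(reservation Resources, maxControllerPercentage float64) Resources {
	scale := func(v float64) float64 { return v * maxControllerPercentage / 100 }
	return Resources{
		CPU:  scale(reservation.CPU),
		Mem:  scale(reservation.Mem),
		Disk: scale(reservation.Disk),
		GPU:  scale(reservation.GPU),
	}
}

func main() {
	reservation := Resources{CPU: 100, Mem: 1000, Disk: 1000, GPU: 10}
	fmt.Println(controllerLimit(reservation, 10)) // {10 100 100 1}
}
```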

3. The resource pool then dequeues from the revocable queue. Revocable tasks are admitted using slack entitlement, and the resources available to them are constrained by the slack limit. The behavior is somewhat comparable to the controller limit.

Resource pool reservation:

cpu:100
mem:1000
disk:1000
gpu:10

Slack limit = 30%, i.e. the maximum resources available for revocable tasks are 30% of the reservation:

cpu:30
mem:300
disk:300
gpu:3

The slack limit is enforced to prevent revocable tasks from hogging all resources in the resource pool. The slack limit does not guarantee resources to revocable tasks, as they are best effort and have lower preference than non-revocable tasks.
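One way to picture "a cap, not a guarantee" is an element-wise minimum between whatever slack entitlement has been proposed and the slack limit. This capping step is an assumption made for illustration, not a statement of how the package computes slack entitlement; `Resources` and `capBySlackLimit` are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
)

// Resources is a simplified, hypothetical stand-in for the package's scalar resources.
type Resources struct{ CPU, Mem, Disk, GPU float64 }

// capBySlackLimit returns, per resource, the smaller of the proposed slack
// entitlement and the slack limit. Values below the limit pass through
// unchanged, which is why the limit is not a guarantee.
func capBySlackLimit(proposed, limit Resources) Resources {
	return Resources{
		CPU:  math.Min(proposed.CPU, limit.CPU),
		Mem:  math.Min(proposed.Mem, limit.Mem),
		Disk: math.Min(proposed.Disk, limit.Disk),
		GPU:  math.Min(proposed.GPU, limit.GPU),
	}
}

func main() {
	limit := Resources{CPU: 30, Mem: 300, Disk: 300, GPU: 3}
	proposed := Resources{CPU: 50, Mem: 200, Disk: 300, GPU: 0}
	fmt.Println(capBySlackLimit(proposed, limit)) // {30 200 300 0}
}
```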

4. We now look at the pending queue to admit the rest of the tasks. The resource pool keeps dequeuing gangs until it reaches a gang which can't be admitted.

  - At this point, if the task is non-preemptible, it is moved to the non-preemptible queue.
  - If the task is a controller task, it is moved to the controller queue.
  - Similarly, if the task is revocable, it is moved to the revocable queue, to unblock non-revocable tasks.
  - Once it reaches a task which can't be admitted and is *not* a controller, non-preemptible or revocable task, it stops the admission control and returns the list of gangs to the scheduler.
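The pending-queue phase can be sketched as a loop over a slice. The `Gang` and `Pool` types, the `admitPending` method and the `canAdmit` callback are hypothetical, and the precedence between the three special task kinds when several flags are set is an assumption.

```go
package main

import "fmt"

// Gang is a hypothetical, flattened view of a gang for this sketch.
type Gang struct {
	Name                                  string
	Controller, NonPreemptible, Revocable bool
}

// Pool holds the four queues as plain slices (hypothetical layout).
type Pool struct {
	Pending, Controller, NonPreemptible, Revocable []Gang
}

// admitPending dequeues from the pending queue until it hits a gang that is
// neither admissible nor eligible for one of the side queues.
func (p *Pool) admitPending(canAdmit func(Gang) bool) []Gang {
	var admitted []Gang
	for len(p.Pending) > 0 {
		g := p.Pending[0]
		if canAdmit(g) {
			p.Pending = p.Pending[1:]
			admitted = append(admitted, g)
			continue
		}
		switch {
		case g.NonPreemptible:
			p.NonPreemptible = append(p.NonPreemptible, g)
		case g.Controller:
			p.Controller = append(p.Controller, g)
		case g.Revocable:
			p.Revocable = append(p.Revocable, g)
		default:
			// A regular gang blocks the head of the queue: stop here.
			return admitted
		}
		p.Pending = p.Pending[1:] // the gang was moved to a side queue
	}
	return admitted
}

func main() {
	p := &Pool{Pending: []Gang{
		{Name: "a"}, {Name: "c", Controller: true}, {Name: "b"}, {Name: "d"}, {Name: "e"},
	}}
	fits := map[string]bool{"a": true, "b": true, "e": true}
	admitted := p.admitPending(func(g Gang) bool { return fits[g.Name] })
	fmt.Println(len(admitted), len(p.Controller), len(p.Pending))
}
```

In this run, "c" is parked in the controller queue so that "b" can still be admitted, while "d" stops the cycle and stays at the head of the pending queue together with "e".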

Constants

const (
	// ResourcePoolPathDelimiter is the delimiter for the resource pool path
	ResourcePoolPathDelimiter = "/"

	// DefaultResPoolSchedulingPolicy is the default scheduling policy for respool
	DefaultResPoolSchedulingPolicy = respool.SchedulingPolicy_PriorityFIFO
)
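As an illustration of how the path delimiter might be used, the sketch below splits a resource pool path into its node names. The `splitPath` helper and the example path are hypothetical; only the delimiter value comes from the constant above.

```go
package main

import (
	"fmt"
	"strings"
)

// ResourcePoolPathDelimiter mirrors the documented constant.
const ResourcePoolPathDelimiter = "/"

// splitPath returns the node names along a path such as "/infra/compute".
// The root path "/" yields no names.
func splitPath(path string) []string {
	trimmed := strings.Trim(path, ResourcePoolPathDelimiter)
	if trimmed == "" {
		return nil
	}
	return strings.Split(trimmed, ResourcePoolPathDelimiter)
}

func main() {
	fmt.Println(splitPath("/infra/compute")) // [infra compute]
	fmt.Println(len(splitPath("/")))         // 0
}
```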

Variables

This section is empty.

Functions

func ValidateChildrenReservations

func ValidateChildrenReservations(resTree Tree, resourcePoolConfigData ResourcePoolConfigData) error

ValidateChildrenReservations validates all child reservations against their parent

func ValidateControllerLimit

func ValidateControllerLimit(_ Tree,
	resourcePoolConfigData ResourcePoolConfigData) error

ValidateControllerLimit validates the controller limit

func ValidateCycle

func ValidateCycle(_ Tree,
	resourcePoolConfigData ResourcePoolConfigData) error

ValidateCycle checks whether adding/updating the current pool would result in a cycle

func ValidateParent

func ValidateParent(resTree Tree, resourcePoolConfigData ResourcePoolConfigData) error

ValidateParent validates the {current} resource pool against its {parent}

func ValidateResourcePool

func ValidateResourcePool(_ Tree,
	resourcePoolConfigData ResourcePoolConfigData) error

ValidateResourcePool validates that the resource configurations are correct

func ValidateResourcePoolPath

func ValidateResourcePoolPath(_ Tree,
	resourcePoolConfigData ResourcePoolConfigData) error

ValidateResourcePoolPath validates the resource pool path

func ValidateSiblings

func ValidateSiblings(resTree Tree, resourcePoolConfigData ResourcePoolConfigData) error

ValidateSiblings validates the resource pool name is unique amongst its siblings

Types

type Metrics

type Metrics struct {
	APICreateResourcePool          tally.Counter
	CreateResourcePoolSuccess      tally.Counter
	CreateResourcePoolFail         tally.Counter
	CreateResourcePoolRollbackFail tally.Counter

	APIGetResourcePool     tally.Counter
	GetResourcePoolSuccess tally.Counter
	GetResourcePoolFail    tally.Counter

	APILookupResourcePoolID     tally.Counter
	LookupResourcePoolIDSuccess tally.Counter
	LookupResourcePoolIDFail    tally.Counter

	APIUpdateResourcePool          tally.Counter
	UpdateResourcePoolSuccess      tally.Counter
	UpdateResourcePoolFail         tally.Counter
	UpdateResourcePoolRollbackFail tally.Counter

	APIDeleteResourcePool     tally.Counter
	DeleteResourcePoolSuccess tally.Counter
	DeleteResourcePoolFail    tally.Counter

	APIQueryResourcePools     tally.Counter
	QueryResourcePoolsSuccess tally.Counter
	QueryResourcePoolsFail    tally.Counter

	PendingQueueSize    tally.Gauge
	RevocableQueueSize  tally.Gauge
	ControllerQueueSize tally.Gauge
	NPQueueSize         tally.Gauge

	TotalAllocation          scalar.GaugeMaps
	NonPreemptibleAllocation scalar.GaugeMaps
	NonSlackAllocation       scalar.GaugeMaps
	SlackAllocation          scalar.GaugeMaps
	ControllerAllocation     scalar.GaugeMaps

	TotalEntitlement    scalar.GaugeMaps
	NonSlackEntitlement scalar.GaugeMaps
	SlackEntitlement    scalar.GaugeMaps

	NonSlackAvailable scalar.GaugeMaps
	SlackAvailable    scalar.GaugeMaps

	Demand      scalar.GaugeMaps
	SlackDemand scalar.GaugeMaps

	ResourcePoolReservation scalar.GaugeMaps
	ResourcePoolLimit       scalar.GaugeMaps
	ResourcePoolShare       scalar.GaugeMaps

	ControllerLimit scalar.GaugeMaps
	SlackLimit      scalar.GaugeMaps
}

Metrics is a placeholder for all metrics in respool.

func NewMetrics

func NewMetrics(scope tally.Scope) *Metrics

NewMetrics returns a new instance of respool.Metrics.

type QueueType

type QueueType int

QueueType defines the different queues of the resource pool from which the gangs are admitted.

const (
	// PendingQueue is the default queue for all incoming gangs
	PendingQueue QueueType = iota + 1
	// ControllerQueue is the queue for controller gangs
	ControllerQueue
	// NonPreemptibleQueue is the queue for non preemptible gangs
	NonPreemptibleQueue
	// RevocableQueue is the queue for revocable gangs
	// which are scheduled using slack resources [cpus]
	RevocableQueue
)

func (QueueType) String

func (qt QueueType) String() string

String returns the queue name
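A sketch of how such a String method is commonly written is shown below. The constant declarations mirror the ones above, but the returned names are illustrative assumptions; the documentation does not state the actual strings. Because the constants start at `iota + 1`, the zero value of `QueueType` is not a valid queue and falls through to the default case.

```go
package main

import "fmt"

// QueueType mirrors the documented declaration.
type QueueType int

const (
	PendingQueue QueueType = iota + 1
	ControllerQueue
	NonPreemptibleQueue
	RevocableQueue
)

// String returns a queue name. The names used here are assumptions.
func (qt QueueType) String() string {
	switch qt {
	case PendingQueue:
		return "pending"
	case ControllerQueue:
		return "controller"
	case NonPreemptibleQueue:
		return "non-preemptible"
	case RevocableQueue:
		return "revocable"
	default:
		return "unknown"
	}
}

func main() {
	var unset QueueType
	// fmt uses the String method because QueueType implements fmt.Stringer.
	fmt.Println(PendingQueue, RevocableQueue, unset)
}
```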

type ResPool

type ResPool interface {

	// Returns the config of the resource pool.
	ResourcePoolConfig() *respool.ResourcePoolConfig
	// Sets the resource pool config.
	SetResourcePoolConfig(*respool.ResourcePoolConfig)

	// Returns a map of resources and its resource config.
	Resources() map[string]*respool.ResourceConfig

	// Converts to resource pool info.
	ToResourcePoolInfo() *respool.ResourcePoolInfo

	// Aggregates the child reservations by resource type.
	AggregatedChildrenReservations() (map[string]float64, error)

	// Enqueues gang (task list) into resource pool pending queue.
	EnqueueGang(gang *resmgrsvc.Gang) error
	// Dequeues gangs (task list) from the resource pool.
	DequeueGangs(int) ([]*resmgrsvc.Gang, error)
	// PeekGangs returns a list of gangs from the resource pool's queue based
	// on the queue type. limit determines the max number of gangs to be
	// returned.
	PeekGangs(qt QueueType, limit uint32) ([]*resmgrsvc.Gang, error)

	// SetEntitlement sets the entitlement of non-revocable resources
	// for non-revocable tasks + revocable tasks for this resource pool.
	SetEntitlement(res *scalar.Resources)
	// SetSlackEntitlement sets the entitlement of revocable cpus
	// + non-revocable resources [mem, disk, gpu] for the resource pool
	SetSlackEntitlement(res *scalar.Resources)
	// SetNonSlackEntitlement Sets the entitlement of non-revocable resources
	// for non-revocable tasks for this resource pool
	SetNonSlackEntitlement(res *scalar.Resources)

	// GetEntitlement returns the total entitlement of non-revocable resources
	// for this resource pool.
	GetEntitlement() *scalar.Resources
	// GetSlackEntitlement returns the entitlement for revocable tasks.
	GetSlackEntitlement() *scalar.Resources
	// GetNonSlackEntitlement returns the entitlement for non-revocable tasks.
	GetNonSlackEntitlement() *scalar.Resources

	// AddToAllocation adds resources to current allocation
	// for the resource pool.
	AddToAllocation(*scalar.Allocation) error
	// SubtractFromAllocation recaptures the resources from task.
	SubtractFromAllocation(*scalar.Allocation) error

	// GetTotalAllocatedResources returns the total resource allocation for the resource
	// pool.
	GetTotalAllocatedResources() *scalar.Resources
	// GetSlackAllocatedResources returns the slack allocation for the resource pool.
	GetSlackAllocatedResources() *scalar.Resources
	// GetNonSlackAllocatedResources returns resources allocated to non-revocable tasks.
	GetNonSlackAllocatedResources() *scalar.Resources

	// CalculateTotalAllocatedResources calculates the total allocation recursively for
	// all the children.
	CalculateTotalAllocatedResources() *scalar.Resources
	// SetTotalAllocatedResources sets the total resource allocation for the
	// resource pool. Used by tests.
	SetTotalAllocatedResources(resources *scalar.Resources)

	// AddToDemand adds resources to current demand
	// for the resource pool.
	AddToDemand(res *scalar.Resources) error
	// SubtractFromDemand subtracts resources from current demand
	// for the resource pool.
	SubtractFromDemand(res *scalar.Resources) error
	// GetDemand returns the resource demand for the resource pool.
	GetDemand() *scalar.Resources
	// CalculateDemand calculates the resource demand
	// for the resource pool recursively for the subtree.
	CalculateDemand() *scalar.Resources

	// AddToSlackDemand adds resources to slack demand
	// for the resource pool.
	AddToSlackDemand(res *scalar.Resources) error
	// SubtractFromSlackDemand subtracts resources from slack demand
	// for the resource pool.
	SubtractFromSlackDemand(res *scalar.Resources) error
	// GetSlackDemand returns the slack resource demand for the resource pool.
	GetSlackDemand() *scalar.Resources
	// CalculateSlackDemand calculates the slack resource demand
	// for the resource pool recursively for the subtree.
	CalculateSlackDemand() *scalar.Resources

	// GetSlackLimit returns the limit of resources [mem,disk] that
	// can be used by revocable tasks.
	GetSlackLimit() *scalar.Resources

	// AddInvalidTask adds killed tasks to the respool so that they can be
	// discarded asynchronously while scheduling.
	AddInvalidTask(task *peloton.TaskID)

	// UpdateResourceMetrics updates metrics for this resource pool
	// on each entitlement cycle calculation (15s)
	UpdateResourceMetrics()
	// contains filtered or unexported methods
}

ResPool is a node in a resource pool hierarchy.
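To illustrate the shape of the value returned by AggregatedChildrenReservations, the sketch below sums per-resource reservations across children into a `map[string]float64`. The `aggregateReservations` helper and its input representation are hypothetical simplifications; the real method works on the pool's children and also returns an error.

```go
package main

import "fmt"

// aggregateReservations sums the reservation of each resource kind
// (for example "cpu" or "memory") over all children.
func aggregateReservations(children []map[string]float64) map[string]float64 {
	total := make(map[string]float64)
	for _, child := range children {
		for kind, reservation := range child {
			total[kind] += reservation
		}
	}
	return total
}

func main() {
	children := []map[string]float64{
		{"cpu": 40, "memory": 400},
		{"cpu": 25, "memory": 100, "gpu": 2},
	}
	total := aggregateReservations(children)
	fmt.Println(total["cpu"], total["memory"], total["gpu"]) // 65 500 2
}
```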

func NewRespool

func NewRespool(
	scope tally.Scope,
	id string,
	parent ResPool,
	config *respool.ResourcePoolConfig,
	preemptionConfig rc.PreemptionConfig) (ResPool, error)

NewRespool initializes the resource pool node and returns it.

type ResourcePoolConfigData

type ResourcePoolConfigData struct {
	ID                 *peloton.ResourcePoolID     // Resource Pool ID
	Path               *respool.ResourcePoolPath   // Resource Pool path
	ResourcePoolConfig *respool.ResourcePoolConfig // Resource Pool Configuration
}

ResourcePoolConfigData holds the data that needs to be validated

type ResourcePoolConfigValidatorFunc

type ResourcePoolConfigValidatorFunc func(resTree Tree, resourcePoolConfigData ResourcePoolConfigData) error

ResourcePoolConfigValidatorFunc is the validator function type used for registering custom validators

type Tree

type Tree interface {
	// Start initializes the respool tree by loading all resource pools
	// from DB. This should be called when a
	// resource manager gains the leadership.
	Start() error

	// Stop resets a respool tree when a resource manager loses
	// leadership.
	Stop() error

	// Get returns a respool node by the given ID
	Get(ID *peloton.ResourcePoolID) (ResPool, error)

	// GetByPath returns the respool node by the given path
	GetByPath(path *respool.ResourcePoolPath) (ResPool, error)

	// GetAllNodes returns all respool nodes or all leaf respool nodes.
	GetAllNodes(leafOnly bool) *list.List

	// Upsert adds or updates a resource pool config in the tree
	Upsert(ID *peloton.ResourcePoolID, resPoolConfig *respool.ResourcePoolConfig) error

	// UpdatedChannel is written to whenever the resource tree is changed. There
	// may be only one event for multiple updates.
	// TODO: Redo package imports, such that a method Calculator.SuggestRefresh
	// can be used instead, without breaking circular imports.
	UpdatedChannel() <-chan struct{}

	// Delete deletes the resource pool from the tree
	Delete(ID *peloton.ResourcePoolID) error
}

Tree defines the interface for a Resource Pool Tree

func NewTree

func NewTree(
	scope tally.Scope,
	respoolOps ormobjects.ResPoolOps,
	jobStore storage.JobStore,
	taskStore storage.TaskStore,
	cfg rc.PreemptionConfig,
) Tree

NewTree initializes the respool tree

type Validator

type Validator interface {
	// Validate validates the resource pool config
	Validate(data ResourcePoolConfigData) error
	// Register registers a slice of validator functions
	Register(validatorFuncs []ResourcePoolConfigValidatorFunc) (Validator, error)
}

Validator performs validations on the resource pool config

func NewResourcePoolConfigValidator

func NewResourcePoolConfigValidator(rTree Tree) (Validator, error)

NewResourcePoolConfigValidator returns a new resource pool config validator
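The validator pattern described in this section (a set of registered functions, each receiving the tree and the config data) can be sketched with simplified types. Everything below is hypothetical: `Tree`, `ConfigData`, `ValidatorFunc` and `validator` are reduced stand-ins, and it is an assumption that Register replaces the current functions and that Validate stops at the first error.

```go
package main

import "fmt"

// Tree maps a parent name to its child names (hypothetical simplification).
type Tree map[string][]string

// ConfigData is a reduced stand-in for ResourcePoolConfigData.
type ConfigData struct {
	Name, Parent string
}

// ValidatorFunc mirrors the shape of ResourcePoolConfigValidatorFunc.
type ValidatorFunc func(t Tree, d ConfigData) error

type validator struct {
	tree  Tree
	funcs []ValidatorFunc
}

// Register sets the functions to run. Replacing, rather than appending,
// is an assumption of this sketch.
func (v *validator) Register(funcs []ValidatorFunc) *validator {
	v.funcs = funcs
	return v
}

// Validate runs every registered function and returns the first error.
func (v *validator) Validate(d ConfigData) error {
	for _, f := range v.funcs {
		if err := f(v.tree, d); err != nil {
			return err
		}
	}
	return nil
}

// validateSiblings rejects a name already used under the same parent.
func validateSiblings(t Tree, d ConfigData) error {
	for _, name := range t[d.Parent] {
		if name == d.Name {
			return fmt.Errorf("resource pool %q already exists under %q", d.Name, d.Parent)
		}
	}
	return nil
}

func main() {
	tree := Tree{"root": {"infra", "batch"}}
	v := (&validator{tree: tree}).Register([]ValidatorFunc{validateSiblings})

	fmt.Println(v.Validate(ConfigData{Name: "infra", Parent: "root"}))
	fmt.Println(v.Validate(ConfigData{Name: "web", Parent: "root"}))
}
```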
