Documentation
Functions ¶
func AddResourceList ¶
func AddResourceList(a v1.ResourceList, b v1.ResourceList)
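AddResourceList adds the quantities in b to the corresponding entries of a. The function returns nothing, so the sums are accumulated in a. v1.ResourceList is defined as map[v1.ResourceName]resource.Quantity, where v1.ResourceName is a string type.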
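The in-place accumulation can be illustrated without the Kubernetes dependencies. The following is a minimal sketch, not the package's implementation: `quantity` and `resourceList` are hypothetical stand-ins for `resource.Quantity` and `v1.ResourceList`, and creating entries for keys that exist only in b is an assumption of the sketch.

```go
package main

import "fmt"

// quantity is a hypothetical stand-in for resource.Quantity,
// holding a plain integer amount (for example, CPU in milli units).
type quantity int64

// resourceList is a hypothetical stand-in for v1.ResourceList.
type resourceList map[string]quantity

// addResourceList adds every entry of b to the matching entry of a.
// Maps are reference types in Go, so the caller sees the updated a.
// Assumption: keys present only in b are created in a.
func addResourceList(a, b resourceList) {
	for name, q := range b {
		a[name] += q
	}
}

func main() {
	total := resourceList{"cpu": 500, "memory": 1024}
	addResourceList(total, resourceList{"cpu": 250, "nvidia.com/gpu": 1})
	fmt.Println(total["cpu"], total["memory"], total["nvidia.com/gpu"])
}
```

Because the first argument is mutated, a caller that needs the original values would have to copy the map before calling.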
func WithMaxLoadDesired ¶
func WithMaxLoadDesired(maxLoadDesired float64) func(as *Autoscaler)
WithMaxLoadDesired returns an option, intended to be passed to NewAutoscaler, that initializes the Autoscaler with the given maxLoadDesired.
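The signature `func(as *Autoscaler)` follows the functional-options pattern. A minimal sketch of how such an option might be applied is shown below; `autoscaler`, `newAutoscaler`, and the default value of 1.0 are hypothetical, since the real struct's fields are unexported and the real constructor also takes a Kubernetes client and a job map.

```go
package main

import "fmt"

// autoscaler is a hypothetical stand-in for Autoscaler; the real
// struct's fields are unexported and not shown in this documentation.
type autoscaler struct {
	maxLoadDesired float64
}

// withMaxLoadDesired returns an option that sets maxLoadDesired.
func withMaxLoadDesired(maxLoadDesired float64) func(*autoscaler) {
	return func(as *autoscaler) {
		as.maxLoadDesired = maxLoadDesired
	}
}

// newAutoscaler starts from a default value and applies the options
// in order. The default of 1.0 is an assumption made for this sketch.
func newAutoscaler(options ...func(*autoscaler)) *autoscaler {
	as := &autoscaler{maxLoadDesired: 1.0}
	for _, opt := range options {
		opt(as)
	}
	return as
}

func main() {
	fmt.Println(newAutoscaler().maxLoadDesired)
	fmt.Println(newAutoscaler(withMaxLoadDesired(0.9)).maxLoadDesired)
}
```

The pattern lets a constructor gain new settings later without changing its required parameters.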
Types ¶
type Autoscaler ¶
type Autoscaler struct {
// contains filtered or unexported fields
}
Autoscaler launches and scales the training jobs.
func NewAutoscaler ¶
func NewAutoscaler(kubeClient kubernetes.Interface, jobUpdater *sync.Map, options ...func(*Autoscaler)) *Autoscaler
NewAutoscaler creates a new Autoscaler.
func (*Autoscaler) InquiryResource ¶
func (a *Autoscaler) InquiryResource() (ClusterResource, error)
InquiryResource returns the idle and total resources of the k8s cluster.
func (*Autoscaler) Run ¶
func (a *Autoscaler) Run()
Run monitors the cluster resources and training jobs in a loop and scales the training jobs according to the available cluster resources.
type ClusterResource ¶
type ClusterResource struct {
	NodeCount int // The total number of nodes in the cluster.

	// Each Kubernetes job could require some number of GPUs in
	// the range of [request, limit].
	GPURequest int // \sum_job num_gpu_request(job)
	GPULimit   int // \sum_job num_gpu_limit(job)
	GPUTotal   int // The total number of GPUs in the cluster.

	// Each Kubernetes job could require some CPU timeslices in
	// the unit of *milli*.
	CPURequestMilli int64 // \sum_job cpu_request_in_milli(job)
	CPULimitMilli   int64 // \sum_job cpu_limit_in_milli(job)
	CPUTotalMilli   int64 // The total amount of CPUs in the cluster in milli.

	// Each Kubernetes job could require some amount of memory in
	// the unit of *mega*.
	MemoryRequestMega int64 // \sum_job memory_request_in_mega(job)
	MemoryLimitMega   int64 // \sum_job memory_limit_in_mega(job)
	MemoryTotalMega   int64 // The total amount of memory in the cluster in mega.

	Nodes Nodes
}
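ClusterResource summarizes the resources of a cluster: the requests and limits summed over all jobs, and the cluster totals, for GPU, CPU, and memory.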
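One way to read these fields is that idle capacity is the total minus the summed requests. The sketch below illustrates that reading under stated assumptions: `clusterResource` copies only four of the documented fields, and `idleGPU`, `idleCPUMilli`, and `cpuLoad` are hypothetical helpers that are not part of this package.

```go
package main

import "fmt"

// clusterResource copies only the GPU and CPU fields of
// ClusterResource that this sketch needs.
type clusterResource struct {
	GPURequest      int
	GPUTotal        int
	CPURequestMilli int64
	CPUTotalMilli   int64
}

// idleGPU is a hypothetical helper: total GPUs minus requested GPUs.
func idleGPU(r clusterResource) int {
	return r.GPUTotal - r.GPURequest
}

// idleCPUMilli is a hypothetical helper: total CPU minus requested
// CPU, both in milli units.
func idleCPUMilli(r clusterResource) int64 {
	return r.CPUTotalMilli - r.CPURequestMilli
}

// cpuLoad is a hypothetical helper returning requested CPU as a
// fraction of total CPU. It returns 0 when the total is zero.
func cpuLoad(r clusterResource) float64 {
	if r.CPUTotalMilli == 0 {
		return 0
	}
	return float64(r.CPURequestMilli) / float64(r.CPUTotalMilli)
}

func main() {
	r := clusterResource{
		GPURequest:      3,
		GPUTotal:        8,
		CPURequestMilli: 6000,
		CPUTotalMilli:   16000,
	}
	fmt.Println(idleGPU(r), idleCPUMilli(r), cpuLoad(r))
}
```

A load fraction like the one computed by `cpuLoad` is the kind of value a threshold such as maxLoadDesired could be compared against, though this documentation does not state how the Autoscaler uses it.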