gpuallocator

package
v0.6.0
Published: Sep 21, 2023 License: Apache-2.0 Imports: 8 Imported by: 0

Documentation

Types

type Allocator

type Allocator struct {
	GPUs []*Device
	// contains filtered or unexported fields
}

Allocator defines the primary object for allocating and freeing the available GPUs on a node.

func NewAllocator

func NewAllocator(policy Policy) (*Allocator, error)

NewAllocator creates a new Allocator using the given allocation policy.

func NewBestEffortAllocator

func NewBestEffortAllocator() (*Allocator, error)

NewBestEffortAllocator creates a new Allocator using the BestEffort allocation policy.

func NewPhysicalIDAllocator added in v0.3.0

func NewPhysicalIDAllocator() (*Allocator, error)

NewPhysicalIDAllocator creates a new Allocator using the PhysicalID allocation policy.

func NewSimpleAllocator

func NewSimpleAllocator() (*Allocator, error)

NewSimpleAllocator creates a new Allocator using the Simple allocation policy.

func (*Allocator) Allocate

func (a *Allocator) Allocate(num int) []*Device

Allocate a set of 'num' GPUs from the allocator. If 'num' devices cannot be allocated, return an empty slice.

func (*Allocator) AllocateSNV added in v0.5.0

func (a *Allocator) AllocateSNV(num int, partitionGroupPhysIds []int) []*Device

AllocateSNV allocates a set of 'num' GPUs from the allocator, taking the given partitionGroupPhysIds as an additional input. If 'num' devices cannot be allocated, it returns an empty slice.

func (*Allocator) AllocateSpecific

func (a *Allocator) AllocateSpecific(devices ...*Device) error

AllocateSpecific allocates a specific set of GPUs from the allocator. Return an error if any of the specified devices cannot be allocated.

func (*Allocator) Free

func (a *Allocator) Free(devices ...*Device)

Free a set of GPUs back to the allocator.
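
The allocate/free contract described above can be sketched without NVML. The following is a minimal, self-contained model and not the package's implementation: the `device` and `allocator` types and the `allocate`, `allocateSpecific`, and `release` methods are hypothetical stand-ins, and the sketch applies no placement policy beyond "lowest free index first".

```go
package main

import "fmt"

// device is a hypothetical stand-in for the package's Device type.
type device struct {
	Index int
}

// allocator is a simplified, hypothetical model: it only tracks which
// devices are currently free and applies no placement policy.
type allocator struct {
	all  []*device       // every device, in index order
	free map[int]*device // devices not currently allocated, keyed by index
}

func newAllocator(n int) *allocator {
	a := &allocator{free: make(map[int]*device, n)}
	for i := 0; i < n; i++ {
		d := &device{Index: i}
		a.all = append(a.all, d)
		a.free[i] = d
	}
	return a
}

// allocate hands out num free devices in index order. If num devices are
// not available, it returns an empty slice and changes nothing.
func (a *allocator) allocate(num int) []*device {
	if num <= 0 || num > len(a.free) {
		return []*device{}
	}
	out := make([]*device, 0, num)
	for _, d := range a.all {
		if len(out) == num {
			break
		}
		if _, ok := a.free[d.Index]; ok {
			out = append(out, d)
			delete(a.free, d.Index)
		}
	}
	return out
}

// allocateSpecific claims exactly the given devices, or returns an error
// (and claims nothing) if any of them is not free.
func (a *allocator) allocateSpecific(devices ...*device) error {
	for _, d := range devices {
		if _, ok := a.free[d.Index]; !ok {
			return fmt.Errorf("device %d is not available", d.Index)
		}
	}
	for _, d := range devices {
		delete(a.free, d.Index)
	}
	return nil
}

// release returns devices to the free pool.
func (a *allocator) release(devices ...*device) {
	for _, d := range devices {
		a.free[d.Index] = d
	}
}

func main() {
	a := newAllocator(4)
	first := a.allocate(2)
	fmt.Println("allocated:", len(first))
	fmt.Println("over-request:", len(a.allocate(3)))
	fmt.Println("specific:", a.allocateSpecific(first[0]))
	a.release(first...)
	fmt.Println("after free:", len(a.allocate(4)))
}
```

Returning an empty slice rather than an error on a failed request mirrors the documented behavior of Allocate, so callers are expected to check the length of the result.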

type Device

type Device struct {
	*nvml.Device
	Index      int
	Links      map[int][]P2PLink
	PhysicalID int
}

Device represents a GPU device as reported by NVML, including all of its Point-to-Point link information.

func NewDevices added in v0.2.0

func NewDevices() ([]*Device, error)

Create a list of Devices from all available nvml.Devices.

func NewDevicesFrom added in v0.2.0

func NewDevicesFrom(uuids []string) ([]*Device, error)

Create a list of Devices from the specific set of GPU uuids passed in.

func (*Device) Details

func (d *Device) Details() string

Details returns all details of a Device as a multi-line string.

func (*Device) String

func (d *Device) String() string

String returns a compact representation of a Device as a string of its index.

type DeviceSet

type DeviceSet map[string]*Device

DeviceSet is used to hold and manipulate a set of unique GPU devices.

func NewDeviceSet

func NewDeviceSet(devices ...*Device) DeviceSet

NewDeviceSet creates a new DeviceSet.

func (DeviceSet) Contains

func (ds DeviceSet) Contains(device *Device) bool

Contains checks if a device is present in a DeviceSet.

func (DeviceSet) ContainsAll added in v0.2.0

func (ds DeviceSet) ContainsAll(devices []*Device) bool

ContainsAll checks if a list of devices is present in a DeviceSet.

func (DeviceSet) Delete

func (ds DeviceSet) Delete(devices ...*Device)

Delete deletes a list of devices from a DeviceSet.

func (DeviceSet) Insert

func (ds DeviceSet) Insert(devices ...*Device)

Insert inserts a list of devices into a DeviceSet.

func (DeviceSet) PhysicalIDSortedSlice added in v0.3.0

func (ds DeviceSet) PhysicalIDSortedSlice() []*Device

PhysicalIDSortedSlice returns a slice of devices, sorted by device physical ID from a DeviceSet.

func (DeviceSet) PhysicalIDSortedSliceSNV added in v0.5.0

func (ds DeviceSet) PhysicalIDSortedSliceSNV(partitionGroupPhysIds []int) []*Device

PhysicalIDSortedSliceSNV returns a slice of devices from a DeviceSet, sorted by device physical ID, taking the given partitionGroupPhysIds as an additional input.

func (DeviceSet) SortedSlice

func (ds DeviceSet) SortedSlice() []*Device

SortedSlice returns a slice of devices, sorted by device index from a DeviceSet.
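
A map-backed set like DeviceSet can be sketched as follows. This is an illustrative model only: the lowercase `device` and `deviceSet` types are hypothetical stand-ins, and keying the map by a `UUID` field is an assumption (the documentation above only states that the key is a string).

```go
package main

import (
	"fmt"
	"sort"
)

// device is a hypothetical stand-in for the package's Device type.
type device struct {
	UUID  string // assumption: used as the set key
	Index int
}

// deviceSet models a set of unique devices as a map.
type deviceSet map[string]*device

func newDeviceSet(devices ...*device) deviceSet {
	ds := make(deviceSet, len(devices))
	ds.insert(devices...)
	return ds
}

// insert adds devices; inserting the same UUID twice is a no-op.
func (ds deviceSet) insert(devices ...*device) {
	for _, d := range devices {
		ds[d.UUID] = d
	}
}

// remove deletes devices; missing devices are ignored.
func (ds deviceSet) remove(devices ...*device) {
	for _, d := range devices {
		delete(ds, d.UUID)
	}
}

func (ds deviceSet) contains(d *device) bool {
	_, ok := ds[d.UUID]
	return ok
}

func (ds deviceSet) containsAll(devices []*device) bool {
	for _, d := range devices {
		if !ds.contains(d) {
			return false
		}
	}
	return true
}

// sortedSlice returns the members ordered by Index. Map iteration order
// is random in Go, so the explicit sort is what makes output stable.
func (ds deviceSet) sortedSlice() []*device {
	out := make([]*device, 0, len(ds))
	for _, d := range ds {
		out = append(out, d)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Index < out[j].Index })
	return out
}

func main() {
	a := &device{UUID: "GPU-a", Index: 2}
	b := &device{UUID: "GPU-b", Index: 0}
	ds := newDeviceSet(a, b, a)
	for _, d := range ds.sortedSlice() {
		fmt.Println(d.Index, d.UUID)
	}
}
```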

type GPUType

type GPUType int

GPUType represents the valid set of GPU types a Static DGX policy can be created for.

const (
	GPUTypePascal GPUType = iota // Pascal GPUs
	GPUTypeVolta
)

Valid GPUTypes
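
As a reminder of how the `iota` declaration above assigns values, here is a small standalone sketch. The lowercase `gpuType` names mirror the constants above but are local stand-ins, and the `String` method is a hypothetical addition for illustration; the documentation does not list one for GPUType.

```go
package main

import "fmt"

// gpuType is a local stand-in mirroring the GPUType declaration.
type gpuType int

const (
	gpuTypePascal gpuType = iota // first constant in the block, value 0
	gpuTypeVolta                 // repeats the expression, value 1
)

// String is a hypothetical helper, not part of the documented API.
func (t gpuType) String() string {
	switch t {
	case gpuTypePascal:
		return "Pascal"
	case gpuTypeVolta:
		return "Volta"
	default:
		return fmt.Sprintf("gpuType(%d)", int(t))
	}
}

func main() {
	fmt.Println(int(gpuTypePascal), gpuTypePascal)
	fmt.Println(int(gpuTypeVolta), gpuTypeVolta)
}
```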

type P2PLink

type P2PLink struct {
	GPU  *Device
	Type nvml.P2PLinkType
}

P2PLink represents a Point-to-Point link between two GPU devices. The link is between the Device struct this struct is embedded in and the GPU Device contained in the P2PLink struct itself.
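
To make the relationship between a Device's Links map and P2PLink concrete, the sketch below builds a tiny two-GPU topology. It is a model under stated assumptions: `linkType` and its two constants are hypothetical stand-ins for nvml.P2PLinkType, and the map is assumed to be keyed by the peer device's index.

```go
package main

import "fmt"

// linkType is a hypothetical stand-in for nvml.P2PLinkType.
type linkType int

const (
	linkPCIe   linkType = iota // hypothetical link kind
	linkNVLink                 // hypothetical link kind
)

// p2pLink points at the peer device and records the kind of link.
type p2pLink struct {
	GPU  *device
	Type linkType
}

// device holds its links grouped by peer (assumption: keyed by peer index).
type device struct {
	Index int
	Links map[int][]p2pLink
}

func newDevice(index int) *device {
	return &device{Index: index, Links: make(map[int][]p2pLink)}
}

// connect records one link of type t in both directions.
func connect(a, b *device, t linkType) {
	a.Links[b.Index] = append(a.Links[b.Index], p2pLink{GPU: b, Type: t})
	b.Links[a.Index] = append(b.Links[a.Index], p2pLink{GPU: a, Type: t})
}

// countLinks counts links of type t from one device to another.
func countLinks(from, to *device, t linkType) int {
	n := 0
	for _, l := range from.Links[to.Index] {
		if l.Type == t {
			n++
		}
	}
	return n
}

func main() {
	g0, g1 := newDevice(0), newDevice(1)
	connect(g0, g1, linkNVLink)
	connect(g0, g1, linkNVLink)
	connect(g0, g1, linkPCIe)
	fmt.Println("NVLink links 0->1:", countLinks(g0, g1, linkNVLink))
}
```

Storing a slice per peer allows several links between the same pair of GPUs, which is what a topology-aware policy would need in order to rank candidate device sets.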

type Policy

type Policy interface {
	// Allocate is meant to do the heavy-lifting of implementing the actual
	// allocation strategy of the policy. It takes a slice of devices to
	// allocate GPUs from, and an amount 'size' to allocate from that slice. It
	// then returns a subset of devices of length 'size'. If the policy is
	// unable to allocate 'size' GPUs from the slice of input devices, it
	// returns an empty slice.
	Allocate(available []*Device, required []*Device, size int) []*Device
	AllocateSNV(available []*Device, required []*Device, size int, partitionGroupPhysIds []int) []*Device
}

Policy defines an interface for pluggable allocation policies to be added to an Allocator.
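
A custom policy might look roughly like the sketch below. It is self-contained and does not import the package: the local `policy` interface mirrors only the Allocate method shown above (AllocateSNV is omitted), and `firstFitPolicy` is a hypothetical example, not one of the package's built-in policies. The interface comment does not describe `required`, so the sketch assumes it lists devices that must appear in the result.

```go
package main

import "fmt"

// device is a hypothetical stand-in for the package's Device type.
type device struct {
	Index int
}

// policy mirrors only the Allocate method of the Policy interface.
type policy interface {
	Allocate(available []*device, required []*device, size int) []*device
}

// firstFitPolicy is a hypothetical policy: it takes the required devices
// first, then fills up with available devices in the order given.
type firstFitPolicy struct{}

func (firstFitPolicy) Allocate(available, required []*device, size int) []*device {
	if size <= 0 || size > len(available) || len(required) > size {
		return []*device{}
	}
	inAvailable := make(map[int]bool, len(available))
	for _, d := range available {
		inAvailable[d.Index] = true
	}
	chosen := make([]*device, 0, size)
	picked := make(map[int]bool, size)
	// Assumption: every required device must itself be available.
	for _, d := range required {
		if !inAvailable[d.Index] {
			return []*device{}
		}
		if !picked[d.Index] {
			picked[d.Index] = true
			chosen = append(chosen, d)
		}
	}
	for _, d := range available {
		if len(chosen) == size {
			break
		}
		if !picked[d.Index] {
			picked[d.Index] = true
			chosen = append(chosen, d)
		}
	}
	if len(chosen) != size {
		return []*device{}
	}
	return chosen
}

func main() {
	devs := []*device{{0}, {1}, {2}, {3}}
	var p policy = firstFitPolicy{}
	for _, d := range p.Allocate(devs, []*device{devs[2]}, 2) {
		fmt.Println("picked device", d.Index)
	}
}
```

With the real package, a type like this would also need an AllocateSNV method before it could be passed to NewAllocator.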

func NewBestEffortPolicy

func NewBestEffortPolicy() Policy

NewBestEffortPolicy creates a new BestEffortPolicy.

func NewPhysicalIDPolicy added in v0.3.0

func NewPhysicalIDPolicy() Policy

NewPhysicalIDPolicy creates a new physicalIDPolicy.

func NewSimplePolicy

func NewSimplePolicy() Policy

NewSimplePolicy creates a new SimplePolicy.

func NewStaticDGX1Policy

func NewStaticDGX1Policy(gpuType GPUType) Policy

NewStaticDGX1Policy creates a new StaticDGX1Policy for gpuType.

func NewStaticDGX2Policy

func NewStaticDGX2Policy() Policy

NewStaticDGX2Policy creates a new StaticDGX2Policy.
