Documentation ¶
Index ¶
- Constants
- Variables
- func NewNvidiaTree(cfg *config.Config) device.GPUTree
- type DeviceMeta
- type LessFunc
- type LevelMap
- type NvidiaNode
- type NvidiaTree
- func (t *NvidiaTree) Available() int
- func (t *NvidiaTree) Init(input string)
- func (t *NvidiaTree) Leaves() []*NvidiaNode
- func (t *NvidiaTree) MarkFree(node *NvidiaNode, util int64, memory int64)
- func (t *NvidiaTree) MarkOccupied(node *NvidiaNode, util int64, memory int64)
- func (t *NvidiaTree) PrintGraph() string
- func (t *NvidiaTree) Query(name string) *NvidiaNode
- func (t *NvidiaTree) Root() *NvidiaNode
- func (t *NvidiaTree) Total() int
- func (t *NvidiaTree) Update()
- type SchedulerCache
Constants ¶
const (
	// MaxProcess is the maximum number of processes on one device.
	MaxProcess = 64
	// NamePattern is the name pattern of an Nvidia device.
	NamePattern = "/dev/nvidia%d"
	// HundredCore represents 100 virtual cores.
	HundredCore = 100
)
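NamePattern is a fmt-style format string with a single integer verb. As a minimal sketch of how a device path could be built from it, assuming the minor ID is what gets substituted (the helper deviceName is hypothetical and not part of this package):

```go
package main

import "fmt"

// NamePattern mirrors the constant documented above.
const NamePattern = "/dev/nvidia%d"

// deviceName is a hypothetical helper, not part of the package.
// It assumes the device minor ID fills the %d verb.
func deviceName(minor int) string {
	return fmt.Sprintf(NamePattern, minor)
}

func main() {
	fmt.Println(deviceName(0)) // prints /dev/nvidia0
}
```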
Variables ¶
var (
	// ByType compares two NvidiaNode by GpuTopologyLevel
	ByType = func(p1, p2 *NvidiaNode) bool {
		return p1.Type() < p2.Type()
	}
	// ByAvailable compares two NvidiaNode by count of available leaves
	ByAvailable = func(p1, p2 *NvidiaNode) bool {
		return p1.Available() < p2.Available()
	}
	// ByID compares two NvidiaNode by ID
	ByID = func(p1, p2 *NvidiaNode) bool {
		return p1.Meta.ID < p2.Meta.ID
	}
	// ByMemory compares two NvidiaNode by memory already used
	ByMemory = func(p1, p2 *NvidiaNode) bool {
		return p1.Meta.UsedMemory < p2.Meta.UsedMemory
	}
	// ByPids compares two NvidiaNode by length of PIDs running on node
	ByPids = func(p1, p2 *NvidiaNode) bool {
		return len(p1.Meta.Pids) < len(p2.Meta.Pids)
	}
	// ByAllocatableCores compares two NvidiaNode by available cores
	ByAllocatableCores = func(p1, p2 *NvidiaNode) bool {
		return p1.AllocatableMeta.Cores < p2.AllocatableMeta.Cores
	}
	// ByAllocatableMemory compares two NvidiaNode by available memory
	ByAllocatableMemory = func(p1, p2 *NvidiaNode) bool {
		return p1.AllocatableMeta.Memory/types.MemoryBlockSize < p2.AllocatableMeta.Memory/types.MemoryBlockSize
	}
	// PrintSorter is used to sort nodes when printing them out
	PrintSorter = &printSort{
		less: []LessFunc{ByType, ByAvailable, ByID},
	}
)
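PrintSorter holds an ordered list of comparators, which suggests a multi-key sort: a later comparator matters only when all earlier ones tie. The unexported printSort type is not shown on this page, so the following is only a sketch of that technique under stated assumptions. The node type, lessFunc, and sortBy are hypothetical stand-ins, not the package's implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// node is a hypothetical stand-in for NvidiaNode with only the
// fields compared in this example.
type node struct {
	ID        int
	Available int
}

// lessFunc has the same shape as the documented LessFunc.
type lessFunc func(a, b *node) bool

// sortBy sorts nodes using the comparators in priority order:
// a later comparator is consulted only when all earlier ones tie.
func sortBy(nodes []*node, less ...lessFunc) {
	sort.SliceStable(nodes, func(i, j int) bool {
		a, b := nodes[i], nodes[j]
		for _, l := range less {
			switch {
			case l(a, b):
				return true
			case l(b, a):
				return false
			}
		}
		return false // equal under every comparator
	})
}

func main() {
	byAvailable := func(a, b *node) bool { return a.Available < b.Available }
	byID := func(a, b *node) bool { return a.ID < b.ID }

	nodes := []*node{
		{ID: 2, Available: 1},
		{ID: 0, Available: 2},
		{ID: 1, Available: 1},
	}
	sortBy(nodes, byAvailable, byID)
	for _, n := range nodes {
		fmt.Println(n.ID, n.Available)
	}
}
```

With these inputs the nodes with one available leaf come first, ordered by ID (1, then 2), followed by node 0.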
Functions ¶
Types ¶
type DeviceMeta ¶
type DeviceMeta struct {
	ID          int
	MinorID     int
	UsedMemory  uint64
	TotalMemory uint64
	Pids        []uint
	BusId       string
	Utilization uint
	UUID        string
}
DeviceMeta contains the metadata of a GPU device.
type LessFunc ¶
type LessFunc func(p1, p2 *NvidiaNode) bool
LessFunc is a function that compares two NvidiaNodes.
type LevelMap ¶
type LevelMap map[nvml.GpuTopologyLevel][]*NvidiaNode
LevelMap maps each GPU topology level to the NvidiaNodes on that level.
type NvidiaNode ¶
type NvidiaNode struct {
	Meta            DeviceMeta
	AllocatableMeta SchedulerCache
	Parent          *NvidiaNode
	Children        []*NvidiaNode
	Mask            uint32
	// contains filtered or unexported fields
}
NvidiaNode represents a node of an Nvidia GPU tree.
func NewNvidiaNode ¶
func NewNvidiaNode(t *NvidiaTree) *NvidiaNode
NewNvidiaNode returns a new NvidiaNode
func (*NvidiaNode) Available ¶
func (n *NvidiaNode) Available() int
Available returns the count of available leaves of this NvidiaNode.
func (*NvidiaNode) GetAvailableLeaves ¶
func (n *NvidiaNode) GetAvailableLeaves() []*NvidiaNode
GetAvailableLeaves returns the leaves of this NvidiaNode that are available for allocation.
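Collecting the available leaves below a node amounts to a tree walk. The sketch below shows one way to do it with a plain recursive traversal. It is an illustration only: treeNode, its free flag, and availableLeaves are hypothetical, and the real package may well derive availability from the Mask field instead.

```go
package main

import "fmt"

// treeNode is a hypothetical stand-in for NvidiaNode.
type treeNode struct {
	id       int
	free     bool // only meaningful for leaves
	children []*treeNode
}

// availableLeaves returns the free leaves below n in depth-first order.
func availableLeaves(n *treeNode) []*treeNode {
	if len(n.children) == 0 {
		if n.free {
			return []*treeNode{n}
		}
		return nil
	}
	var out []*treeNode
	for _, c := range n.children {
		out = append(out, availableLeaves(c)...)
	}
	return out
}

func main() {
	root := &treeNode{children: []*treeNode{
		{id: 0, free: true},
		{id: 1, free: false},
		{children: []*treeNode{{id: 2, free: true}}},
	}}
	for _, leaf := range availableLeaves(root) {
		fmt.Println(leaf.id)
	}
}
```

In this sketch, the count that Available reports for a node would simply be the length of the returned slice.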
func (*NvidiaNode) MinorName ¶
func (n *NvidiaNode) MinorName() string
MinorName returns MinorID of this NvidiaNode
func (*NvidiaNode) String ¶
func (n *NvidiaNode) String() string
func (*NvidiaNode) Type ¶
func (n *NvidiaNode) Type() int
Type returns the GpuTopologyLevel of this NvidiaNode.
type NvidiaTree ¶
NvidiaTree represents Nvidia GPUs organized as a tree.
func (*NvidiaTree) Available ¶
func (t *NvidiaTree) Available() int
Available returns the number of available leaves of this tree.
func (*NvidiaTree) Init ¶
func (t *NvidiaTree) Init(input string)
Init initializes an NvidiaTree. It tries NVML first and falls back to the input string if parseFromLibrary() fails.
func (*NvidiaTree) Leaves ¶
func (t *NvidiaTree) Leaves() []*NvidiaNode
Leaves returns the leaves of the tree.
func (*NvidiaTree) MarkFree ¶
func (t *NvidiaTree) MarkFree(node *NvidiaNode, util int64, memory int64)
MarkFree updates an NvidiaNode by releasing the requested cores and memory. If the requested cores are fewer than HundredCore, the requested values are added back to the available cores and memory. If the requested cores are HundredCore or more, the available cores and memory are reset to their totals, and the mask of every parent of this node is updated.
func (*NvidiaTree) MarkOccupied ¶
func (t *NvidiaTree) MarkOccupied(node *NvidiaNode, util int64, memory int64)
MarkOccupied updates an NvidiaNode by reserving the requested cores and memory. The mask of every parent of this node is updated. If the requested cores are fewer than HundredCore, the requested values are subtracted from the available cores and memory. If the requested cores are HundredCore or more, the available cores and memory are set to 0.
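The bookkeeping described for MarkOccupied and MarkFree can be sketched as follows. This is a simplified illustration, not the package's code: the cache and gpu types and both methods are hypothetical stand-ins, and the parent mask updates are left out because their details are not documented here.

```go
package main

import "fmt"

// hundredCore mirrors the documented HundredCore constant.
const hundredCore = 100

// cache is a hypothetical stand-in for SchedulerCache.
type cache struct {
	Cores  int64
	Memory int64
}

// gpu is a hypothetical leaf node: its total capacity and what is
// still allocatable.
type gpu struct {
	total       cache
	allocatable cache
}

// markOccupied reserves resources. A request of hundredCore or more
// takes the whole card; a smaller request is subtracted.
func (g *gpu) markOccupied(cores, memory int64) {
	if cores >= hundredCore {
		g.allocatable = cache{}
		return
	}
	g.allocatable.Cores -= cores
	g.allocatable.Memory -= memory
}

// markFree releases resources. A request of hundredCore or more
// restores the whole card; a smaller request is added back.
func (g *gpu) markFree(cores, memory int64) {
	if cores >= hundredCore {
		g.allocatable = g.total
		return
	}
	g.allocatable.Cores += cores
	g.allocatable.Memory += memory
}

func main() {
	full := cache{Cores: 100, Memory: 8192}
	g := &gpu{total: full, allocatable: full}

	g.markOccupied(30, 1024)
	fmt.Println(g.allocatable) // prints {70 7168}

	g.markFree(30, 1024)
	fmt.Println(g.allocatable) // prints {100 8192}

	g.markOccupied(100, 0)
	fmt.Println(g.allocatable) // prints {0 0}
}
```

Note that a whole-card request ignores the memory argument in this sketch, which matches the description of setting both values to 0 or to their totals.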
func (*NvidiaTree) PrintGraph ¶
func (t *NvidiaTree) PrintGraph() string
PrintGraph returns the details of the tree as a string.
func (*NvidiaTree) Query ¶
func (t *NvidiaTree) Query(name string) *NvidiaNode
Query looks up a node by name and returns nil if none is found.
func (*NvidiaTree) Update ¶
func (t *NvidiaTree) Update()
Update refreshes the NvidiaTree with information read from the GPU devices. It returns immediately if no real GPU device is available.
type SchedulerCache ¶
SchedulerCache contains the allocatable resources of a GPU.