utils

package

Version: v0.7.0 (latest) Published: Oct 28, 2024 License: Apache-2.0 Imports: 10 Imported by: 0

Repository

github.com/Azure/karpenter-provider-azure

Documentation ¶

Index ¶

Constants
Variables
func GetAKSGPUImageSHA(size string) string
func GetGPUDriverVersion(size string) string
func GetSubnetResourceID(subscriptionID, resourceGroupName, virtualNetworkName, subnetName string) string
func GetVMName(providerID string) (string, error)
func GetVnetSubnetIDComponents(vnetSubnetID string) (vnetSubnetResource, error)
func ImageReferenceToString(imageRef *armcompute.ImageReference) string
func IsMarinerEnabledGPUSKU(vmSize string) bool
func IsNvidiaEnabledSKU(vmSize string) bool
func IsVMDeleting(vm armcompute.VirtualMachine) bool
func MkVMID(resourceGroupName string, vmName string) string
func PrettySlice[T any](s []T, maxItems int) string
func ResourceIDToProviderID(ctx context.Context, id string) string
func StringMap(list v1.ResourceList) map[string]string
func UseGridDrivers(size string) bool
func WithDefaultFloat64(key string, def float64) float64

Constants ¶

const (
	Nvidia470CudaDriverVersion = "cuda-470.82.01"
	Nvidia550CudaDriverVersion = "cuda-550.54.15"
	Nvidia535GridDriverVersion = "grid-535.161.08"

	// These SHAs will change once we update the aks-gpu images in the aks-gpu repository. We do that fairly rarely at this time,
	// so for now they are kept here and bumped periodically.
	AKSGPUGridSHA = "sha-d1f0ca"
	AKSGPUCudaSHA = "sha-2d4c96"
)

TODO: Get these from agentbaker

Variables ¶

var (
	/* If a new GPU sku becomes available, add a key to this map, but only if you have a confirmation
	   that we have an agreement with NVIDIA for this specific gpu.
	*/
	NvidiaEnabledSKUs = map[string]bool{

		"standard_nv6":      true,
		"standard_nv12":     true,
		"standard_nv12s_v3": true,
		"standard_nv24":     true,
		"standard_nv24s_v3": true,
		"standard_nv24r":    true,
		"standard_nv48s_v3": true,

		"standard_nd6s":   true,
		"standard_nd12s":  true,
		"standard_nd24s":  true,
		"standard_nd24rs": true,

		"standard_nc6s_v2":   true,
		"standard_nc12s_v2":  true,
		"standard_nc24s_v2":  true,
		"standard_nc24rs_v2": true,

		"standard_nc6s_v3":   true,
		"standard_nc12s_v3":  true,
		"standard_nc24s_v3":  true,
		"standard_nc24rs_v3": true,
		"standard_nd40s_v3":  true,
		"standard_nd40rs_v2": true,

		"standard_nc4as_t4_v3":  true,
		"standard_nc8as_t4_v3":  true,
		"standard_nc16as_t4_v3": true,
		"standard_nc64as_t4_v3": true,

		"standard_nd96asr_v4":       true,
		"standard_nd112asr_a100_v4": true,
		"standard_nd120asr_a100_v4": true,

		"standard_nd96amsr_a100_v4":  true,
		"standard_nd112amsr_a100_v4": true,
		"standard_nd120amsr_a100_v4": true,

		"standard_nc24ads_a100_v4": true,
		"standard_nc48ads_a100_v4": true,
		"standard_nc96ads_a100_v4": true,
		"standard_ncads_a100_v4":   true,

		"standard_nc8ads_a10_v4":  true,
		"standard_nc16ads_a10_v4": true,
		"standard_nc32ads_a10_v4": true,

		"standard_nv6ads_a10_v5":   true,
		"standard_nv12ads_a10_v5":  true,
		"standard_nv18ads_a10_v5":  true,
		"standard_nv36ads_a10_v5":  true,
		"standard_nv36adms_a10_v5": true,
		"standard_nv72ads_a10_v5":  true,

		"standard_nd96ams_v4":      true,
		"standard_nd96ams_a100_v4": true,
	}

	// List of GPU SKUs currently enabled and validated for Mariner. Will expand the support
	// to cover other SKUs available in Azure
	MarinerNvidiaEnabledSKUs = map[string]bool{

		"standard_nc6s_v3":   true,
		"standard_nc12s_v3":  true,
		"standard_nc24s_v3":  true,
		"standard_nc24rs_v3": true,
		"standard_nd40s_v3":  true,
		"standard_nd40rs_v2": true,

		"standard_nc4as_t4_v3":  true,
		"standard_nc8as_t4_v3":  true,
		"standard_nc16as_t4_v3": true,
		"standard_nc64as_t4_v3": true,
	}
)

var ConvergedGPUDriverSizes = map[string]bool{
	"standard_nv6ads_a10_v5":   true,
	"standard_nv12ads_a10_v5":  true,
	"standard_nv18ads_a10_v5":  true,
	"standard_nv36ads_a10_v5":  true,
	"standard_nv72ads_a10_v5":  true,
	"standard_nv36adms_a10_v5": true,
	"standard_nc8ads_a10_v4":   true,
	"standard_nc16ads_a10_v4":  true,
	"standard_nc32ads_a10_v4":  true,
}

ConvergedGPUDriverSizes lists the sizes that use a "converged" driver to support both CUDA and GRID workloads.

How do you figure this out? Ask HPC or find out by trial and error: installing vanilla CUDA drivers on these sizes fails with opaque errors. See https://github.com/Azure/azhpc-extensions/blob/daaefd78df6f27012caf30f3b54c3bd6dc437652/NvidiaGPU/resources.json

Functions ¶

func GetAKSGPUImageSHA ¶

func GetAKSGPUImageSHA(size string) string

func GetGPUDriverVersion ¶

func GetGPUDriverVersion(size string) string

NV-series GPUs target graphics workloads, whereas NC-series GPUs target compute. NV sizes typically use GRID rather than CUDA drivers, and CUDA driver installation fails on them. NVv1 seems to run with CUDA; NVv5 requires GRID. NVv3 is untested on AKS, NVv4 uses AMD GPUs and so is not applicable, and NVv2 no longer seems to exist (?).
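The selection between GRID and CUDA drivers can be illustrated with a minimal sketch. It assumes that sizes listed in ConvergedGPUDriverSizes receive the GRID driver and everything else receives the newer CUDA driver; the helper name pickDriver, the lower-casing of the size, and the omission of the 470 CUDA driver are assumptions made for illustration, not the package's actual logic.

```go
package main

import (
	"fmt"
	"strings"
)

// Driver versions copied from the package constants.
const (
	cuda550 = "cuda-550.54.15"
	grid535 = "grid-535.161.08"
)

// convergedSizes is a small subset of ConvergedGPUDriverSizes, for illustration only.
var convergedSizes = map[string]bool{
	"standard_nv36ads_a10_v5": true,
	"standard_nc8ads_a10_v4":  true,
}

// pickDriver is a hypothetical helper, not the package's implementation.
// Assumption: converged sizes get the GRID driver, all other sizes get CUDA 550.
func pickDriver(size string) string {
	if convergedSizes[strings.ToLower(size)] {
		return grid535
	}
	return cuda550
}

func main() {
	fmt.Println(pickDriver("Standard_NV36ads_A10_v5"))
	fmt.Println(pickDriver("standard_nc6s_v3"))
}
```

In real code, call GetGPUDriverVersion and UseGridDrivers directly rather than reimplementing the choice.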

func GetSubnetResourceID ¶ added in v0.4.0

func GetSubnetResourceID(subscriptionID, resourceGroupName, virtualNetworkName, subnetName string) string

GetSubnetResourceID constructs the subnet resource id
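As a rough sketch of what such a resource ID looks like, the example below builds a string in the standard Azure Resource Manager form for a subnet. The helper name subnetResourceID and the sample values are hypothetical; whether GetSubnetResourceID produces exactly this casing is an assumption.

```go
package main

import "fmt"

// subnetResourceID is a hypothetical stand-in for GetSubnetResourceID.
// It follows the usual ARM layout for a subnet nested under a virtual network.
func subnetResourceID(subscriptionID, resourceGroup, vnet, subnet string) string {
	return fmt.Sprintf(
		"/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Network/virtualNetworks/%s/subnets/%s",
		subscriptionID, resourceGroup, vnet, subnet,
	)
}

func main() {
	// Placeholder values; substitute real identifiers.
	fmt.Println(subnetResourceID("sub-id", "my-rg", "my-vnet", "my-subnet"))
}
```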

func GetVMName ¶

func GetVMName(providerID string) (string, error)

GetVMName parses the provider ID stored on the node to get the vmName associated with a node
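A minimal sketch of this kind of parsing follows. It assumes the provider ID ends in a virtualMachines segment followed by the VM name, as Azure node provider IDs commonly do; the helper name vmNameFromProviderID and the error text are hypothetical, and the real function may validate the ID differently.

```go
package main

import (
	"fmt"
	"strings"
)

// vmNameFromProviderID is a hypothetical stand-in for GetVMName.
// Assumption: the ID has the form ".../virtualMachines/<name>".
func vmNameFromProviderID(providerID string) (string, error) {
	parts := strings.Split(providerID, "/")
	n := len(parts)
	// The segment before the name must be "virtualMachines" (case-insensitive).
	if n < 2 || !strings.EqualFold(parts[n-2], "virtualMachines") || parts[n-1] == "" {
		return "", fmt.Errorf("unexpected provider ID %q", providerID)
	}
	return parts[n-1], nil
}

func main() {
	id := "azure:///subscriptions/sub-id/resourceGroups/my-rg/providers/Microsoft.Compute/virtualMachines/aks-node-1"
	name, err := vmNameFromProviderID(id)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
}
```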

func GetVnetSubnetIDComponents ¶ added in v0.4.0

func GetVnetSubnetIDComponents(vnetSubnetID string) (vnetSubnetResource, error)

func ImageReferenceToString ¶ added in v0.7.0

func ImageReferenceToString(imageRef *armcompute.ImageReference) string

func IsMarinerEnabledGPUSKU ¶

func IsMarinerEnabledGPUSKU(vmSize string) bool

IsMarinerEnabledGPUSKU determines if a VM SKU has NVIDIA driver support on Mariner

func IsNvidiaEnabledSKU ¶

func IsNvidiaEnabledSKU(vmSize string) bool

IsNvidiaEnabledSKU determines if a VM SKU has NVIDIA driver support
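Because the keys of NvidiaEnabledSKUs are all lower-case, a lookup presumably has to normalize the VM size first. The sketch below shows that pattern against a small subset of the map; the helper name isNvidiaSKU and the lower-casing step are assumptions, and the real function may apply further normalization.

```go
package main

import (
	"fmt"
	"strings"
)

// nvidiaSKUs is a small subset of NvidiaEnabledSKUs, for illustration only.
var nvidiaSKUs = map[string]bool{
	"standard_nc6s_v3":     true,
	"standard_nc4as_t4_v3": true,
}

// isNvidiaSKU is a hypothetical stand-in for IsNvidiaEnabledSKU.
// Assumption: the size is lower-cased before the map lookup.
func isNvidiaSKU(vmSize string) bool {
	return nvidiaSKUs[strings.ToLower(vmSize)]
}

func main() {
	fmt.Println(isNvidiaSKU("Standard_NC6s_v3"))
	fmt.Println(isNvidiaSKU("Standard_D2s_v3"))
}
```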

func IsVMDeleting ¶ added in v0.7.0

func IsVMDeleting(vm armcompute.VirtualMachine) bool

func MkVMID ¶

func MkVMID(resourceGroupName string, vmName string) string

func PrettySlice ¶ added in v0.7.0

func PrettySlice[T any](s []T, maxItems int) string

PrettySlice renders a slice as a string, truncating it after maxItems elements so that the output does not get too long
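One way such truncation could work is sketched below with a generic helper. The name prettySlice, the ", " separator, and the " and N other(s)" suffix are assumptions for illustration; the exact output format of PrettySlice is not documented here.

```go
package main

import (
	"fmt"
	"strings"
)

// prettySlice is a hypothetical stand-in for PrettySlice.
// It prints at most maxItems elements and summarizes the remainder.
func prettySlice[T any](s []T, maxItems int) string {
	var sb strings.Builder
	for i, elem := range s {
		if i >= maxItems {
			// Summarize everything that was cut off (assumed format).
			fmt.Fprintf(&sb, " and %d other(s)", len(s)-i)
			break
		}
		if i > 0 {
			sb.WriteString(", ")
		}
		fmt.Fprint(&sb, elem)
	}
	return sb.String()
}

func main() {
	fmt.Println(prettySlice([]string{"a", "b", "c", "d"}, 2))
	fmt.Println(prettySlice([]int{1, 2}, 5))
}
```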

func ResourceIDToProviderID ¶

func ResourceIDToProviderID(ctx context.Context, id string) string

func StringMap ¶ added in v0.7.0

func StringMap(list v1.ResourceList) map[string]string

StringMap returns the string map representation of the resource list

func UseGridDrivers ¶ added in v0.7.0

func UseGridDrivers(size string) bool

func WithDefaultFloat64 ¶ added in v0.7.0

func WithDefaultFloat64(key string, def float64) float64

WithDefaultFloat64 returns the float64 value of the supplied environment variable or, if not present, the supplied default value. If the float64 conversion fails, it returns the default.
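The described behavior can be sketched with the standard library alone. The helper name withDefaultFloat64 and the environment variable name EXAMPLE_RATIO are hypothetical; the sketch only mirrors the documented contract.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// withDefaultFloat64 is a hypothetical stand-in for WithDefaultFloat64.
// It falls back to def when the variable is unset or is not a valid float.
func withDefaultFloat64(key string, def float64) float64 {
	val, ok := os.LookupEnv(key)
	if !ok {
		return def
	}
	f, err := strconv.ParseFloat(val, 64)
	if err != nil {
		return def
	}
	return f
}

func main() {
	// EXAMPLE_RATIO is a made-up variable name used only for this demo.
	os.Setenv("EXAMPLE_RATIO", "0.75")
	fmt.Println(withDefaultFloat64("EXAMPLE_RATIO", 1.0))

	os.Setenv("EXAMPLE_RATIO", "not-a-number")
	fmt.Println(withDefaultFloat64("EXAMPLE_RATIO", 1.0))
}
```

Note that with this contract a caller cannot tell a malformed value apart from an unset one.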

Types ¶

This section is empty.

Directories ¶

Path	Synopsis
opts
project
