Documentation
Overview
Package nvidia contains the NVIDIA accelerator components and their query interface.
Directories
| Path | Synopsis |
|---|---|
| | Package badenvs tracks any bad environment variables that are globally set for the NVIDIA GPUs. |
| id | Package id defines the ID for the bad-envs check. |
| | Package clock monitors NVIDIA GPU clock events of all GPUs, such as HW Slowdown events. |
| | Package clockspeed tracks the NVIDIA per-GPU clock speed. |
| | Package ecc tracks the NVIDIA per-GPU ECC errors and other ECC-related information. |
| | Package error implements the NVIDIA GPU driver error detector. |
| sxid | Package sxid tracks the NVIDIA GPU SXid errors by scanning dmesg. |
| sxid/id | Package id provides the nvidia error sxid id component. |
| xid | Package xid tracks the NVIDIA GPU Xid errors by scanning dmesg and using the NVIDIA Management Library (NVML). |
| xid/id | Package id provides the nvidia error xid id component. |
| | Package errorxidsxid implements the NVIDIA GPU driver Xid/SXid error detector. |
| id | Package id is the identifier for the nvidia error xid sxid component. |
| | Package fabricmanager tracks the NVIDIA fabric manager version and whether it is active. |
| | Package gpm tracks the NVIDIA per-GPU GPM metrics. |
| | Package gspfirmwaremode tracks the NVIDIA GSP firmware mode. |
| id | Package id defines the GSP firmware component ID. |
| | Package infiniband monitors the InfiniBand status of the system. |
| id | Package id provides the ID for the NVIDIA InfiniBand component. |
| | Package info provides relatively static information about the NVIDIA accelerator (e.g., GPU product names). |
| | Package memory tracks the NVIDIA per-GPU memory usage. |
| | Package nccl monitors the NCCL status. |
| | Package nvlink monitors the NVIDIA per-GPU NVLink devices. |
| | Package peermem monitors the peermem module status. |
| | Package persistencemode tracks the NVIDIA persistence mode. |
| id | Package id defines the persistence mode component ID. |
| | Package power tracks the NVIDIA per-GPU power usage. |
| | Package processes tracks the NVIDIA per-GPU processes. |
| | Package query implements "nvidia-smi --query" output helpers. |
| fabric-manager-log | Package fabricmanagerlog implements the fabric manager log poller. |
| infiniband | Package infiniband provides utilities to query InfiniBand status. |
| metrics/clock | Package clock provides the NVIDIA clock metrics collection and reporting. |
| metrics/clock-speed | Package clockspeed provides the NVIDIA clock speed metrics collection and reporting. |
| metrics/ecc | Package ecc provides the NVIDIA ECC metrics collection and reporting. |
| metrics/gpm | Package gpm provides the NVIDIA GPM metrics collection and reporting. |
| metrics/memory | Package memory provides the NVIDIA memory metrics collection and reporting. |
| metrics/nvlink | Package nvlink provides the NVIDIA NVLink metrics collection and reporting. |
| metrics/power | Package power provides the NVIDIA power usage metrics collection and reporting. |
| metrics/processes | Package processes provides the NVIDIA processes metrics collection and reporting. |
| metrics/remapped-rows | Package remappedrows provides the NVIDIA row remapping metrics collection and reporting. |
| metrics/temperature | Package temperature provides the NVIDIA temperature metrics collection and reporting. |
| metrics/utilization | Package utilization provides the NVIDIA GPU utilization metrics collection and reporting. |
| nccl | Package nccl contains the implementation of the NCCL (NVIDIA Collective Communications Library) query for NVIDIA GPUs. |
| nvml | Package nvml implements the NVIDIA Management Library (NVML) interface. |
| peermem | Package peermem contains the implementation of the peermem query for NVIDIA GPUs. |
| sxid | Package sxid provides the NVIDIA SXid error details. |
| xid | Package xid provides the NVIDIA Xid error details. |
| xid-sxid-state | Package xidsxidstate provides the persistent storage layer for the nvidia query results. |
| | Package remappedrows tracks the NVIDIA per-GPU remapped rows. |
| | Package temperature tracks the NVIDIA per-GPU temperatures. |
| | Package utilization tracks the NVIDIA per-GPU utilization. |