Documentation
Directories
Path | Synopsis |
---|---|
nvidia | Package nvidia contains the NVIDIA accelerator components and their query interface. |
bad-envs | Package badenvs tracks any bad environment variables that are globally set for the NVIDIA GPUs. |
bad-envs/id | Package id defines the ID for the bad-envs check. |
clock | Package clock monitors NVIDIA GPU clock events of all GPUs, such as HW Slowdown events. |
clock-speed | Package clockspeed tracks the NVIDIA per-GPU clock speed. |
ecc | Package ecc tracks the NVIDIA per-GPU ECC errors and other ECC-related information. |
error | Package error implements the NVIDIA GPU driver error detector. |
error/sxid | Package sxid tracks the NVIDIA GPU SXid errors by scanning dmesg. |
error/xid | Package xid tracks the NVIDIA GPU Xid errors by scanning dmesg and using the NVIDIA Management Library (NVML). |
fabric-manager | Package fabricmanager tracks the NVIDIA fabric manager version and whether it is active. |
gpm | Package gpm tracks the NVIDIA per-GPU GPM metrics. |
infiniband | Package infiniband monitors the InfiniBand status of the system. |
info | Package info provides relatively static information about the NVIDIA accelerator (e.g., GPU product names). |
memory | Package memory tracks the NVIDIA per-GPU memory usage. |
nccl | Package nccl monitors the NCCL status. |
nvlink | Package nvlink monitors the NVIDIA per-GPU NVLink devices. |
peermem | Package peermem monitors the peermem module status. |
persistence-mode | Package persistencemode tracks the NVIDIA persistence mode. |
persistence-mode/id | Package id defines the persistence mode component ID. |
power | Package power tracks the NVIDIA per-GPU power usage. |
processes | Package processes tracks the NVIDIA per-GPU processes. |
query | Package query implements "nvidia-smi --query" output helpers. |
query/fabric-manager-log | Package fabricmanagerlog implements the fabric manager log poller. |
query/metrics/clock | Package clock provides the NVIDIA clock metrics collection and reporting. |
query/metrics/clock-speed | Package clockspeed provides the NVIDIA clock speed metrics collection and reporting. |
query/metrics/ecc | Package ecc provides the NVIDIA ECC metrics collection and reporting. |
query/metrics/gpm | Package gpm provides the NVIDIA GPM metrics collection and reporting. |
query/metrics/memory | Package memory provides the NVIDIA memory metrics collection and reporting. |
query/metrics/nvlink | Package nvlink provides the NVIDIA NVLink metrics collection and reporting. |
query/metrics/power | Package power provides the NVIDIA power usage metrics collection and reporting. |
query/metrics/processes | Package processes provides the NVIDIA processes metrics collection and reporting. |
query/metrics/remapped-rows | Package remappedrows provides the NVIDIA row remapping metrics collection and reporting. |
query/metrics/temperature | Package temperature provides the NVIDIA temperature metrics collection and reporting. |
query/metrics/utilization | Package utilization provides the NVIDIA GPU utilization metrics collection and reporting. |
query/nccl | Package nccl contains the implementation of the NCCL (NVIDIA Collective Communications Library) query for NVIDIA GPUs. |
query/nvml | Package nvml implements the NVIDIA Management Library (NVML) interface. |
query/peermem | Package peermem contains the implementation of the peermem query for NVIDIA GPUs. |
query/sxid | Package sxid provides the NVIDIA SXid error details. |
query/xid | Package xid provides the NVIDIA Xid error details. |
remapped-rows | Package remappedrows tracks the NVIDIA per-GPU remapped rows. |
temperature | Package temperature tracks the NVIDIA per-GPU temperatures. |
utilization | Package utilization tracks the NVIDIA per-GPU utilization. |