xid

package
v0.0.1-alpha8
Warning

This package is not in the latest version of its module.

Published: Sep 2, 2024 License: Apache-2.0 Imports: 17 Imported by: 0

Documentation

Overview

Package xid tracks NVIDIA GPU Xid errors by scanning dmesg and by using the NVIDIA Management Library (NVML). See Xid messages: https://docs.nvidia.com/deploy/gpu-debug-guidelines/index.html#xid-messages.

Constants

const (
	StateNameErrorXid = "error_xid"

	StateKeyErrorXidData           = "data"
	StateKeyErrorXidEncoding       = "encoding"
	StateValueErrorXidEncodingJSON = "json"
)
const (
	EventNameErroXid = "error_xid"

	EventKeyErroXidUnixSeconds    = "unix_seconds"
	EventKeyErroXidData           = "data"
	EventKeyErroXidEncoding       = "encoding"
	EventValueErroXidEncodingJSON = "json"
)
const Name = "accelerator-nvidia-error-xid"

Variables

This section is empty.

Functions

func CreateGet

func CreateGet() query.GetFunc

Do not add a for-loop here: the returned query.GetFunc is already called periodically in a loop by the poller.
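The division of labor described above can be sketched as follows. This is a minimal illustration, not the package's implementation: GetFunc, createGet, poll, and runDemo are hypothetical stand-ins, and the real query.GetFunc signature and poller may differ. The getter performs one check and returns, while the poller owns the loop.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// GetFunc is a hypothetical stand-in for query.GetFunc; the real signature may differ.
type GetFunc func(ctx context.Context) (any, error)

// createGet returns a one-shot getter: it performs a single check and returns.
// It deliberately contains no loop of its own.
func createGet(calls *int) GetFunc {
	return func(ctx context.Context) (any, error) {
		*calls++
		return fmt.Sprintf("check #%d", *calls), nil
	}
}

// poll is a hypothetical poller: it owns the loop and invokes get once per tick
// until it has run n times or the context is canceled.
func poll(ctx context.Context, get GetFunc, interval time.Duration, n int) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for i := 0; i < n; i++ {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			if _, err := get(ctx); err != nil {
				return err
			}
		}
	}
	return nil
}

// runDemo polls three times and reports how often the getter ran.
func runDemo() (int, error) {
	calls := 0
	err := poll(context.Background(), createGet(&calls), 10*time.Millisecond, 3)
	return calls, err
}

func main() {
	calls, err := runDemo()
	if err != nil {
		fmt.Println("poll failed:", err)
		return
	}
	fmt.Println("get was called", calls, "times")
}
```

If the getter looped internally, each tick of the poller would block on it and the two schedules would compete, which is presumably what the note warns against.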

func New

Types

type Config

type Config struct {
	Query query_config.Config `json:"query"`
}

func ParseConfig

func ParseConfig(b any, db *sql.DB) (*Config, error)

func (Config) Validate

func (cfg Config) Validate() error

type NVMLError

type NVMLError struct {
	Xid   uint64 `json:"xid"`
	Error error  `json:"error"`
}

func ParseNVMLErrorJSON

func ParseNVMLErrorJSON(data []byte) (*NVMLError, error)

func ParseNVMLErrorYAML

func ParseNVMLErrorYAML(data []byte) (*NVMLError, error)

func (*NVMLError) JSON

func (nv *NVMLError) JSON() ([]byte, error)

func (*NVMLError) YAML

func (nv *NVMLError) YAML() ([]byte, error)

type Output

type Output struct {
	DmesgErrors  []nvidia_query_xid.DmesgError `json:"dmesg_errors,omitempty"`
	NVMLXidEvent *nvidia_query_nvml.XidEvent   `json:"nvml_xid_event,omitempty"`
}
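
As a rough illustration of how the omitempty tags on Output shape the JSON form, the sketch below mirrors the two field names and tags. The element types dmesgError and xidEvent are hypothetical placeholders; the real nvidia_query_xid.DmesgError and nvidia_query_nvml.XidEvent carry different fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dmesgError and xidEvent are hypothetical placeholders for the real
// nvidia_query_xid.DmesgError and nvidia_query_nvml.XidEvent types.
type dmesgError struct {
	Detail string `json:"detail"`
}

type xidEvent struct {
	Xid uint64 `json:"xid"`
}

// output mirrors the field names and JSON tags of Output.
type output struct {
	DmesgErrors  []dmesgError `json:"dmesg_errors,omitempty"`
	NVMLXidEvent *xidEvent    `json:"nvml_xid_event,omitempty"`
}

// encode marshals o and returns the JSON text, or the error message on failure.
func encode(o output) string {
	b, err := json.Marshal(o)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// A nil slice and a nil pointer are both dropped by omitempty.
	fmt.Println(encode(output{})) // {}

	// Only the populated field appears in the encoded form.
	fmt.Println(encode(output{NVMLXidEvent: &xidEvent{Xid: 79}})) // {"nvml_xid_event":{"xid":79}}
}
```

An Output with no dmesg errors and no NVML event therefore encodes to an empty JSON object under these tags.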

func ParseOutputJSON

func ParseOutputJSON(data []byte) (*Output, error)

func ParseOutputYAML

func ParseOutputYAML(data []byte) (*Output, error)

func ParseStateErrorXid

func ParseStateErrorXid(m map[string]string) (*Output, error)
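
The State* constants suggest that a state map carries an encoded payload under "data" and names its format under "encoding". The sketch below shows one way such a map could be decoded. It is an assumption about the scheme, not the behavior of ParseStateErrorXid itself; parseState, output, and xidEvent are hypothetical, simplified stand-ins.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// These values match the documented State* constants.
const (
	stateKeyData      = "data"     // StateKeyErrorXidData
	stateKeyEncoding  = "encoding" // StateKeyErrorXidEncoding
	stateEncodingJSON = "json"     // StateValueErrorXidEncodingJSON
)

// xidEvent and output are simplified, hypothetical stand-ins for the package's types.
type xidEvent struct {
	Xid uint64 `json:"xid"`
}

type output struct {
	NVMLXidEvent *xidEvent `json:"nvml_xid_event,omitempty"`
}

// parseState is an assumed decoding scheme: it rejects any encoding other
// than JSON, then unmarshals the payload stored under the data key.
func parseState(m map[string]string) (*output, error) {
	if enc := m[stateKeyEncoding]; enc != stateEncodingJSON {
		return nil, fmt.Errorf("unsupported encoding %q", enc)
	}
	o := new(output)
	if err := json.Unmarshal([]byte(m[stateKeyData]), o); err != nil {
		return nil, err
	}
	return o, nil
}

func main() {
	o, err := parseState(map[string]string{
		stateKeyEncoding: stateEncodingJSON,
		stateKeyData:     `{"nvml_xid_event":{"xid":79}}`,
	})
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println("xid:", o.NVMLXidEvent.Xid)
}
```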

func ParseStatesToOutput

func ParseStatesToOutput(states ...components.State) (*Output, error)

func (*Output) Evaluate

func (o *Output) Evaluate() (string, bool, error)

Evaluate returns the reason for the evaluation result and whether the output is healthy.

func (*Output) Events

func (o *Output) Events() []components.Event

func (*Output) JSON

func (o *Output) JSON() ([]byte, error)

func (*Output) States

func (o *Output) States() ([]components.State, error)

func (*Output) YAML

func (o *Output) YAML() ([]byte, error)
