analytics

package
v1.6.0-rc4
Warning

This package is not in the latest version of its module.

Published: Dec 16, 2024 | License: Apache-2.0 | Imports: 17 | Imported by: 0

README

What data is shared by users of Bacalhau?

When a job is submitted or completed, data is collected about it to help track, manage, and optimize its execution.

What information is collected on the bacalhau agent:

  • Node Type: One of: ‘hybrid’, ‘orchestrator’, ‘compute’.
  • Node Version: The version of bacalhau the node is running.
  • Node ID: The identifier of the bacalhau node.
  • Installation ID: The identifier associated with the installation of bacalhau.
  • Instance ID: An anonymous identifier of the bacalhau node.
  • Operating System Type: The name of the operating system the bacalhau node is running on.

What information is collected on job submissions and completions:

  1. Job Identification
    • ID: A unique identifier for the job.
    • Namespace Hash: A hashed version of the job’s namespace, used for grouping related jobs.
    • Name Set: Whether a specific name was set for the job.
    • Type: The type of job you’re running.
    • Count: The number of tasks associated with the job.
    • Labels & Metadata Counts: The number of labels and metadata entries attached to the job.
  2. State and Timing Information (Terminal Jobs Only)
    • State: The current state of the job (e.g., completed, failed).
    • Creation & Modification Times: When the job was created and last modified.
  3. Versioning and Revisions
    • Version & Revision: These fields help track changes to the job’s configuration over time.
  4. Task-Specific Information
    • Task Name Hash: A hashed version of the task name for internal tracking.
    • Task Engine & Publisher Types: The type of engine and publisher used for the task.
    • Environment Variables & Metadata: The number of environment variables and metadata entries tied to the task.
    • Input Source Types: The types of input sources for the task (e.g., file, database).
    • Result Paths Count: The number of result paths generated by the task.
  5. Resource Allocation
    • CPU, Memory, Disk, GPU Usage: The amount of CPU, memory, disk, and GPU resources requested by the task.
    • Network Details: The network type and number of network domains used by the task.
  6. Timeouts
    • Execution Timeout: The maximum allowed time for the task to run.
    • Queue Timeout: The maximum time the task can wait in the queue.
    • Total Timeout: The total allowed time for the job, including both queue and execution time.
  7. Warnings and Errors (Submitted Jobs Only)
    • Any warnings or errors that occurred during the job submission or execution process.
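Several of the fields above are hashes, flags, or counts rather than raw values. The following is a minimal sketch of that idea, not the package's implementation: the README does not say which hash function is used, so SHA-256 is an assumption, and `jobSummary`, `summarize`, and `hashString` are hypothetical names introduced only for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashString returns the hex-encoded SHA-256 digest of s.
// Assumption: SHA-256 is illustrative; the README does not name the hash.
func hashString(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

// jobSummary is a hypothetical, trimmed-down stand-in for a job payload.
type jobSummary struct {
	NameSet         bool
	NamespaceHash   string
	LabelsCount     int
	TaskEnvVarCount int
}

// summarize keeps only a flag, a hash, and counts. The raw name, label
// values, and environment variable values never enter the summary.
func summarize(name, namespace string, labels, env map[string]string) jobSummary {
	return jobSummary{
		NameSet:         name != "", // assumption: "set" means non-empty
		NamespaceHash:   hashString(namespace),
		LabelsCount:     len(labels),
		TaskEnvVarCount: len(env),
	}
}

func main() {
	s := summarize(
		"nightly-report",
		"team-a",
		map[string]string{"env": "prod"},
		map[string]string{"TOKEN": "secret", "MODE": "fast"},
	)
	fmt.Printf("%+v\n", s)
}
```

In this sketch the namespace appears only as a 64-character hex digest, and the labels and environment variables appear only as integers.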

What information is collected on job execution:

When a job is executed, detailed information about the execution process is collected to help monitor and optimize performance, as well as assist with troubleshooting. Here’s a breakdown of what is collected:

  1. Execution Identification
    • Execution ID: A unique identifier for the execution.
    • Job ID: The identifier for the associated job.
    • Evaluation ID: An identifier linking the execution to its evaluation process.
    • Node Name Hash: A hashed version of the name of the node where the execution is running.
    • Namespace Hash: A hashed version of the namespace under which the execution is running.
  2. Execution Metadata
    • Execution Name Set: Whether a specific name was set for the execution.
    • Previous & Next Executions: Links to any preceding or subsequent executions, if applicable.
    • Follow-up Evaluation ID: An identifier for any follow-up evaluations related to the execution.
    • Revision: A version number that tracks changes to the execution configuration over time.
    • Creation & Modification Times: Timestamps indicating when the execution was created and last modified.
  3. Resource Allocation
    • Total CPU Units: The total CPU resources allocated for the execution.
    • Total Memory, Disk, and GPU Usage: The memory, disk space, and GPU resources used by the execution.
  4. Execution States
    • Desired State: The intended state of the execution (e.g., running, completed).
    • Compute State & Message: The actual state of the execution, including any details about its progress or errors.
    • Compute Error Code: An error code related to any issues with the execution's state on the compute node.
  5. Published Results
    • Published Result Type: The type of result produced by the execution, such as output files or data.
  6. Run Command Results
    • Run Output Details: Information about the command’s execution, including:
      • Exit Code: The exit code returned by the executed task (typically 0 for success).
      • Stdout Truncated: Whether stdout was truncated during execution.
      • Stderr Truncated: Whether stderr was truncated during execution.

How do users opt out of sharing data?

To opt out of sharing data, users may run one of the following commands before starting their bacalhau node.

Disable collection via config set

bacalhau config set DisableAnalytics true

Disable collection via environment variable

export BACALHAU_DISABLEANALYTICS=true

Disable collection via editing the config file

echo 'disableanalytics: true' >> ~/.bacalhau/config.yaml

Disable collection via a config flag

bacalhau --config=DisableAnalytics=true <command>

How can users verify they have opted out?

Run the following command and check the reported value:

bacalhau config list | grep disableanalytics

Expected output when collection is disabled:

disableanalytics  true  No description available                                                         

Expected output when collection is enabled:

disableanalytics  false  No description available                                                         
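For deployment tooling written in Go, a pre-flight check on the environment variable can complement the CLI check above. This is only a sketch: `analyticsDisabled` is a hypothetical helper, and parsing the value with `strconv.ParseBool` is an assumption, since bacalhau's own parsing may accept different spellings. It inspects the `BACALHAU_DISABLEANALYTICS` variable only and does not see the config file or the `--config` flag.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Name of the opt-out environment variable shown in the README.
const disableAnalyticsEnv = "BACALHAU_DISABLEANALYTICS"

// analyticsDisabled reports whether the opt-out variable is set to a true
// value. The lookup function is injected so the check can be exercised
// without touching the real process environment.
// Assumption: values are interpreted with strconv.ParseBool.
func analyticsDisabled(lookup func(string) (string, bool)) bool {
	raw, ok := lookup(disableAnalyticsEnv)
	if !ok {
		return false
	}
	disabled, err := strconv.ParseBool(raw)
	return err == nil && disabled
}

func main() {
	if analyticsDisabled(os.LookupEnv) {
		fmt.Println("analytics opt-out is set in this environment")
		return
	}
	fmt.Println("analytics opt-out is not set in this environment")
}
```

The `bacalhau config list` output remains the authoritative check, because it reflects the configuration the node actually resolved.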

Documentation

Constants

const (
	NodeInstallationIDKey = "installation_id"
	NodeInstanceIDKey     = "instance_id"
	NodeIDHashKey         = "node_id_hash"
	NodeTypeKey           = "node_type"
	NodeVersionKey        = "node_version"
)

const ComputeMessageExecutionEventType = "bacalhau.execution_v1.compute_message"

const CreatedExecutionEventType = "bacalhau.execution_v1.create"

const DefaultOtelCollectorEndpoint = "t.bacalhau.org:4317"

const ProviderKey = "bacalhau-analytics"

const SubmitJobEventType = "bacalhau.job_v1.submit"

SubmitJobEventType is the event type for a job that has been submitted to an orchestrator.

const TerminalExecutionEventType = "bacalhau.execution_v1.terminal"

const TerminalJobEventType = "bacalhau.job_v1.terminal"

TerminalJobEventType is the event type for a job that has reached a terminal state.

Variables

This section is empty.

Functions

func EmitEvent

func EmitEvent(ctx context.Context, event *Event)

func SetupAnalyticsProvider

func SetupAnalyticsProvider(ctx context.Context, opts ...Option) error

func ShutdownAnalyticsProvider

func ShutdownAnalyticsProvider(ctx context.Context) error

Types

type Config

type Config struct {
	// contains filtered or unexported fields
}

type Event

type Event struct {
	Type       string
	Properties any
}

func NewComputeMessageExecutionEvent

func NewComputeMessageExecutionEvent(e models.Execution) *Event

func NewCreatedExecutionEvent

func NewCreatedExecutionEvent(e models.Execution) *Event

func NewEvent

func NewEvent(eventType string, properties any) *Event

NewEvent creates a new Event.

func NewJobTerminalEvent

func NewJobTerminalEvent(j models.Job) *Event

func NewTerminalExecutionEvent

func NewTerminalExecutionEvent(e models.Execution) *Event

func (*Event) ToLogRecord

func (e *Event) ToLogRecord() (otellog.Record, error)

ToLogRecord converts an Event to a LogRecord.
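Because `Properties` is typed `any`, a conversion step has to turn an arbitrary payload into key/value data before it can be attached to a log record. The sketch below shows one way this could work, using a JSON round trip. It is not the package's `ToLogRecord` implementation; `event`, `newEvent`, `attributes`, and `computeMessage` are local stand-ins, with field tags copied from `ExecutionComputeMessage` in this listing.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// event is a local stand-in for the package's Event type.
type event struct {
	Type       string
	Properties any
}

// newEvent mirrors the shape of NewEvent(eventType, properties).
func newEvent(eventType string, properties any) *event {
	return &event{Type: eventType, Properties: properties}
}

// attributes flattens Properties into a map by marshalling to JSON and
// unmarshalling into map[string]any. Assumption: this round trip is
// illustrative only; the real conversion may differ.
func (e *event) attributes() (map[string]any, error) {
	raw, err := json.Marshal(e.Properties)
	if err != nil {
		return nil, err
	}
	attrs := map[string]any{}
	if err := json.Unmarshal(raw, &attrs); err != nil {
		return nil, err
	}
	return attrs, nil
}

// computeMessage copies the JSON tags of ExecutionComputeMessage.
type computeMessage struct {
	JobID            string `json:"job_id,omitempty"`
	ExecutionID      string `json:"execution_id,omitempty"`
	ComputeMessage   string `json:"compute_message,omitempty"`
	ComputeErrorCode string `json:"compute_state_error_code,omitempty"`
}

func main() {
	e := newEvent("bacalhau.execution_v1.compute_message",
		computeMessage{JobID: "j-1", ExecutionID: "e-1", ComputeErrorCode: "OOM"})
	attrs, err := e.attributes()
	if err != nil {
		panic(err)
	}
	fmt.Println(e.Type, attrs)
}
```

A round trip like this keeps the event constructor generic: any struct with JSON tags can serve as a payload without the event type knowing its fields.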

type EventType

type EventType string

type ExecutionComputeMessage

type ExecutionComputeMessage struct {
	JobID            string `json:"job_id,omitempty"`
	ExecutionID      string `json:"execution_id,omitempty"`
	ComputeMessage   string `json:"compute_message,omitempty"`
	ComputeErrorCode string `json:"compute_state_error_code,omitempty"`
}

type ExecutionEvent

type ExecutionEvent struct {
	JobID       string `json:"job_id,omitempty"`
	ExecutionID string `json:"execution_id,omitempty"`
	EvalID      string `json:"evaluation_id,omitempty"`

	NameSet       bool   `json:"name_set,omitempty"`
	NodeNameHash  string `json:"node_name_hash,omitempty"`
	NamespaceHash string `json:"namespace_hash,omitempty"`

	Resources map[string]Resource `json:"resources,omitempty"`

	DesiredState          string `json:"desired_state,omitempty"`
	DesiredStateErrorCode string `json:"desired_state_error_code,omitempty"`

	ComputeState          string `json:"compute_state,omitempty"`
	ComputeStateErrorCode string `json:"compute_state_error_code,omitempty"`

	PublishedResultType string `json:"publisher_type,omitempty"`

	RunResultStdoutTruncated bool `json:"run_result_stdout_truncated,omitempty"`
	RunResultStderrTruncated bool `json:"run_result_stderr_truncated,omitempty"`
	RunResultExitCode        int  `json:"run_result_exit_code,omitempty"`

	PreviousExecution string `json:"previous_execution,omitempty"`
	NextExecution     string `json:"next_execution,omitempty"`
	FollowupEvalID    string `json:"followup_eval_id,omitempty"`

	Revision   uint64    `json:"revision,omitempty"`
	CreateTime time.Time `json:"create_time,omitempty"`
	ModifyTime time.Time `json:"modify_time,omitempty"`
}
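Every field of `ExecutionEvent` carries `omitempty`, which affects what a consumer sees if the struct is serialized with `encoding/json` (an assumption about the transport; the listing does not state it). The sketch below mirrors a few fields in a local `executionEvent` type to show that a zero exit code and `false` flags are dropped from the output, while a `time.Time` field is still emitted because `omitempty` does not treat struct values as empty.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// executionEvent mirrors a subset of ExecutionEvent; tags are copied
// from the listing above.
type executionEvent struct {
	JobID                    string    `json:"job_id,omitempty"`
	ExecutionID              string    `json:"execution_id,omitempty"`
	ComputeState             string    `json:"compute_state,omitempty"`
	RunResultStdoutTruncated bool      `json:"run_result_stdout_truncated,omitempty"`
	RunResultExitCode        int       `json:"run_result_exit_code,omitempty"`
	CreateTime               time.Time `json:"create_time,omitempty"`
}

// encode marshals the event to a JSON string.
func encode(e executionEvent) string {
	raw, err := json.Marshal(e)
	if err != nil {
		panic(err)
	}
	return string(raw)
}

// encodedKeys returns the set of top-level keys present in the JSON form.
func encodedKeys(e executionEvent) map[string]bool {
	fields := map[string]json.RawMessage{}
	if err := json.Unmarshal([]byte(encode(e)), &fields); err != nil {
		panic(err)
	}
	keys := make(map[string]bool, len(fields))
	for k := range fields {
		keys[k] = true
	}
	return keys
}

func main() {
	succeeded := executionEvent{JobID: "j-1", ExecutionID: "e-1", ComputeState: "Completed"}
	failed := succeeded
	failed.RunResultExitCode = 1

	fmt.Println(encode(succeeded)) // no run_result_exit_code key
	fmt.Println(encode(failed))    // run_result_exit_code is present
}
```

One consequence under these assumptions: a missing `run_result_exit_code` key cannot be distinguished from an exit code of 0, so consumers would need to treat absence as zero.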

type GPUInfo

type GPUInfo struct {
	Name   string `json:"name,omitempty"`
	Vendor string `json:"vendor,omitempty"`
}

type JobTerminalEvent

type JobTerminalEvent struct {
	JobID string `json:"job_id"`

	NameSet       bool   `json:"name_set"`
	NamespaceHash string `json:"namespace_hash"`

	Type        string `json:"type"`
	Count       int    `json:"count"`
	LabelsCount int    `json:"labels_count"`
	MetaCount   int    `json:"meta_count"`

	State string `json:"state"`

	Version    uint64    `json:"version"`
	Revision   uint64    `json:"revision"`
	CreateTime time.Time `json:"create_time"`
	ModifyTime time.Time `json:"modify_time"`

	TaskNameHash         string   `json:"task_name_hash"`
	TaskEngineType       string   `json:"task_engine_type"`
	TaskPublisherType    string   `json:"task_publisher_type"`
	TaskEnvVarCount      int      `json:"task_env_var_count"`
	TaskMetaCount        int      `json:"task_meta_count"`
	TaskInputSourceTypes []string `json:"task_input_source_types"`
	TaskResultPathCount  int      `json:"task_result_path_count"`

	Resources Resource `json:"resources,omitempty"`

	TaskNetworkType      string `json:"task_network_type"`
	TaskDomainsCount     int    `json:"task_domains_count"`
	TaskExecutionTimeout int64  `json:"task_execution_timeout"`
	TaskQueueTimeout     int64  `json:"task_queue_timeout"`
	TaskTotalTimeout     int64  `json:"task_total_timeout"`
}

type Option

type Option func(*Config)

func WithEndpoint

func WithEndpoint(endpoint string) Option

func WithInstallationID

func WithInstallationID(id string) Option

func WithInstanceID

func WithInstanceID(id string) Option

func WithNodeID

func WithNodeID(id string) Option

func WithNodeType

func WithNodeType(isRequester, isCompute bool) Option

func WithVersion

func WithVersion(bv *models.BuildVersionInfo) Option
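`Option` is declared as `func(*Config)`, which is the functional options pattern. Since `Config` has only unexported fields, the sketch below uses a local `config` type to show how such options might compose. The field names are hypothetical, and the mapping from the two booleans of `WithNodeType` to the README's 'hybrid', 'orchestrator', and 'compute' values is an inference, not something this listing confirms.

```go
package main

import "fmt"

// Value of DefaultOtelCollectorEndpoint in the constants listing.
const defaultEndpoint = "t.bacalhau.org:4317"

// config is a hypothetical stand-in for the package's opaque Config.
type config struct {
	endpoint string
	nodeID   string
	nodeType string
}

// option mirrors the shape of the package's Option type.
type option func(*config)

func withEndpoint(endpoint string) option {
	return func(c *config) { c.endpoint = endpoint }
}

func withNodeID(id string) option {
	return func(c *config) { c.nodeID = id }
}

// withNodeType maps two role flags to a node type label.
// Assumption: both roles means "hybrid", per the README's list of types.
func withNodeType(isRequester, isCompute bool) option {
	return func(c *config) {
		switch {
		case isRequester && isCompute:
			c.nodeType = "hybrid"
		case isRequester:
			c.nodeType = "orchestrator"
		case isCompute:
			c.nodeType = "compute"
		}
	}
}

// newConfig starts from defaults and applies each option in order,
// so later options override earlier ones.
func newConfig(opts ...option) *config {
	c := &config{endpoint: defaultEndpoint}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func main() {
	c := newConfig(withNodeID("node-1"), withNodeType(true, true))
	fmt.Printf("%+v\n", *c)
}
```

With this pattern a caller passes only the options it cares about, and new settings can be added later without changing the signature of the setup function.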

type Resource

type Resource struct {
	CPUUnits    float64   `json:"cpu_units,omitempty"`
	MemoryBytes uint64    `json:"memory_bytes,omitempty"`
	DiskBytes   uint64    `json:"disk_bytes,omitempty"`
	GPUCount    uint64    `json:"gpu_count,omitempty"`
	GPUTypes    []GPUInfo `json:"gpu_types,omitempty"`
}

type SubmitJobEvent

type SubmitJobEvent struct {
	JobID string `json:"job_id"`

	NameSet       bool   `json:"name_set"`
	NamespaceHash string `json:"namespace_hash"`

	Type        string `json:"type"`
	Count       int    `json:"count"`
	LabelsCount int    `json:"labels_count"`
	MetaCount   int    `json:"meta_count"`

	Version    uint64    `json:"version"`
	Revision   uint64    `json:"revision"`
	CreateTime time.Time `json:"create_time"`
	ModifyTime time.Time `json:"modify_time"`

	TaskNameHash         string   `json:"task_name_hash"`
	TaskEngineType       string   `json:"task_engine_type"`
	TaskPublisherType    string   `json:"task_publisher_type"`
	TaskEnvVarCount      int      `json:"task_env_var_count"`
	TaskMetaCount        int      `json:"task_meta_count"`
	TaskInputSourceTypes []string `json:"task_input_source_types"`
	TaskResultPathCount  int      `json:"task_result_path_count"`

	Resources Resource `json:"resources,omitempty"`

	TaskNetworkType      string `json:"task_network_type"`
	TaskDomainsCount     int    `json:"task_domains_count"`
	TaskExecutionTimeout int64  `json:"task_execution_timeout"`
	TaskQueueTimeout     int64  `json:"task_queue_timeout"`
	TaskTotalTimeout     int64  `json:"task_total_timeout"`

	Warnings []string `json:"warnings"`
	Error    string   `json:"error"`
}

func NewSubmitJobEvent

func NewSubmitJobEvent(j models.Job, warnings ...string) SubmitJobEvent
