analytics

package

v1.5.1-rc1 (latest) · Published: Oct 24, 2024 · License: Apache-2.0 · Imports: 17 · Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version

Repository

github.com/bacalhau-project/bacalhau

README ¶

What Data is shared by users of Bacalhau?

When a job is submitted or completed, data is collected about it to help track, manage, and optimize its execution.

What information is collected on the bacalhau agent:

Node Type: One of: ‘hybrid’, ‘orchestrator’, ‘compute’.
Node Version: The version of bacalhau the node is running.
Node ID: The identifier of the bacalhau node.
Installation ID: The identifier associated with the installation of bacalhau.
Instance ID: An anonymous identifier of the bacalhau node.
Operating System Type: The name of the operating system the bacalhau node is running on.

What information is collected on job submissions and completions:

Job Identification
- ID: A unique identifier for the job.
- Namespace Hash: A hashed version of the job’s namespace, used for grouping related jobs.
- Name Set: Whether a specific name was set for the job.
- Type: The type of job you’re running.
- Count: The number of tasks associated with the job.
- Labels & Metadata Counts: The number of labels and metadata entries attached to the job.
State and Timing Information (Terminal Jobs Only)
- State: The current state of the job (e.g., completed, failed).
- Creation & Modification Times: When the job was created and last modified.
Versioning and Revisions
- Version & Revision: These fields help track changes to the job’s configuration over time.
Task-Specific Information
- Task Name Hash: A hashed version of the task name for internal tracking.
- Task Engine & Publisher Types: The type of engine and publisher used for the task.
- Environment Variables & Metadata: The number of environment variables and metadata entries tied to the task.
- Input Source Types: The types of input sources for the task (e.g., file, database).
- Result Paths Count: The number of result paths generated by the task.
Resource Allocation
- CPU, Memory, Disk, GPU Usage: The amount of CPU, memory, disk, and GPU resources requested by the task.
- Network Details: The network type and number of network domains used by the task.
Timeouts
- Execution Timeout: The maximum allowed time for the task to run.
- Queue Timeout: The maximum time the task can wait in the queue.
- Total Timeout: The total allowed time for the job, including both queue and execution time.
Warnings and Errors (Submitted Jobs Only)
- Any warnings or errors that occurred during the job submission or execution process.
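
Several fields above (Namespace Hash, Task Name Hash) are described as hashed rather than sent in the clear. This document does not say which hash function is used, so the following is only a minimal sketch that assumes a hex-encoded SHA-256 digest; `hashString` is a hypothetical helper, not part of the package. It shows why a hash still works for grouping related jobs: equal inputs always produce equal digests.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashString is a hypothetical helper. It returns the hex-encoded SHA-256
// digest of s. SHA-256 is an assumption made for illustration; the hash
// actually used by Bacalhau is not stated in this document.
func hashString(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

func main() {
	a := hashString("team-a")
	b := hashString("team-a")
	c := hashString("team-b")

	fmt.Println(len(a)) // 64: a 32-byte digest encoded as hex
	fmt.Println(a == b) // true: the same namespace always maps to the same hash
	fmt.Println(a == c) // false: different namespaces map to different hashes
}
```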

What Information is Collected on Job Execution

When a job is executed, detailed information about the execution process is collected to help monitor and optimize performance, as well as assist with troubleshooting. Here’s a breakdown of what is collected:

Execution Identification
- Execution ID: A unique identifier for the execution.
- Job ID: The identifier for the associated job.
- Evaluation ID: An identifier linking the execution to its evaluation process.
- Node Name Hash: A hashed version of the name of the node where the execution is running.
- Namespace Hash: A hashed version of the namespace under which the execution is running.
Execution Metadata
- Execution Name Set: Whether a specific name was set for the execution.
- Previous & Next Executions: Links to any preceding or subsequent executions, if applicable.
- Follow-up Evaluation ID: An identifier for any follow-up evaluations related to the execution.
- Revision: A version number that tracks changes to the execution configuration over time.
- Creation & Modification Times: Timestamps indicating when the execution was created and last modified.
Resource Allocation
- Total CPU Units: The total CPU resources allocated for the execution.
- Total Memory, Disk, and GPU Usage: The memory, disk space, and GPU resources used by the execution.
Execution States
- Desired State: The intended state of the execution (e.g., running, completed).
- Compute State & Message: The actual state of the execution, including any details about its progress or errors.
- Compute Error Code: An error code related to any issues with the execution's state on the compute node.
Published Results
- Published Result Type: The type of result produced by the execution, such as output files or data.
Run Command Results
- Run Output Details: Information about the command’s execution, including:
  - Exit Code: The exit code returned by the executed task (typically 0 for success).
  - RunResultStdoutTruncated: Whether stdout was truncated during execution.
  - RunResultStderrTruncated: Whether stderr was truncated during execution.

How do users opt out of sharing data?

To opt out of sharing data, users may run one of the following commands before starting their bacalhau node:

Disable collection via config set

bacalhau config set DisableAnalytics true

Disable collection via environment variable

export BACALHAU_DISABLEANALYTICS=true

Disable collection via editing the config file

echo 'disableanalytics: true' >> ~/.bacalhau/config.yaml

Disable collection via a config flag

bacalhau --config=DisableAnalytics=true <command>

How can users verify they have opted out?

bacalhau config list | grep disableanalytics

Expected output when collection is disabled:

disableanalytics  true  No description available

Expected output when collection is enabled:

disableanalytics  false  No description available

Documentation ¶

Index ¶

Constants
func EmitEvent(ctx context.Context, event *Event)
func SetupAnalyticsProvider(ctx context.Context, opts ...Option) error
func ShutdownAnalyticsProvider(ctx context.Context) error
type Config
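type Event
- func NewComputeMessageExecutionEvent(e models.Execution) *Event
- func NewCreatedExecutionEvent(e models.Execution) *Event
- func NewEvent(eventType string, properties any) *Event
- func NewJobTerminalEvent(j models.Job) *Event
- func NewTerminalExecutionEvent(e models.Execution) *Event
- func (e *Event) ToLogRecord() (otellog.Record, error)
type EventType
type ExecutionComputeMessage
type ExecutionEvent
type GPUInfo
type JobTerminalEvent
type Option
- func WithEndpoint(endpoint string) Option
- func WithInstallationID(id string) Option
- func WithInstanceID(id string) Option
- func WithNodeID(id string) Option
- func WithNodeType(isRequester, isCompute bool) Option
- func WithVersion(bv *models.BuildVersionInfo) Option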
type Resource
type SubmitJobEvent
- func NewSubmitJobEvent(j models.Job, warnings ...string) SubmitJobEvent

Constants ¶

const (
	NodeInstallationIDKey = "installation_id"
	NodeInstanceIDKey     = "instance_id"
	NodeIDHashKey         = "node_id_hash"
	NodeTypeKey           = "node_type"
	NodeVersionKey        = "node_version"
)

const ComputeMessageExecutionEventType = "bacalhau.execution_v1.compute_message"

const CreatedExecutionEventType = "bacalhau.execution_v1.create"

const DefaultOtelCollectorEndpoint = "t.bacalhau.org:4317"

const ProviderKey = "bacalhau-analytics"

const SubmitJobEventType = "bacalhau.job_v1.submit"

SubmitJobEventType is the event type for a job that has been submitted to an orchestrator.

const TerminalExecutionEventType = "bacalhau.execution_v1.terminal"

const TerminalJobEventType = "bacalhau.job_v1.terminal"

TerminalJobEventType is the event type for a job that has reached a terminal state.

Variables ¶

This section is empty.

Functions ¶

func EmitEvent ¶

func EmitEvent(ctx context.Context, event *Event)

func SetupAnalyticsProvider ¶

func SetupAnalyticsProvider(ctx context.Context, opts ...Option) error

func ShutdownAnalyticsProvider ¶

func ShutdownAnalyticsProvider(ctx context.Context) error

Types ¶

type Config ¶

type Config struct {
	// contains filtered or unexported fields
}

type Event ¶

type Event struct {
	Type       string
	Properties any
}

func NewComputeMessageExecutionEvent ¶

func NewComputeMessageExecutionEvent(e models.Execution) *Event

func NewCreatedExecutionEvent ¶

func NewCreatedExecutionEvent(e models.Execution) *Event

func NewEvent ¶

func NewEvent(eventType string, properties any) *Event

NewEvent creates a new Event.

func NewJobTerminalEvent ¶

func NewJobTerminalEvent(j models.Job) *Event

func NewTerminalExecutionEvent ¶

func NewTerminalExecutionEvent(e models.Execution) *Event

func (*Event) ToLogRecord ¶

func (e *Event) ToLogRecord() (otellog.Record, error)

ToLogRecord converts an Event to a LogRecord.

type EventType ¶

type EventType string

type ExecutionComputeMessage ¶

type ExecutionComputeMessage struct {
	JobID            string `json:"job_id,omitempty"`
	ExecutionID      string `json:"execution_id,omitempty"`
	ComputeMessage   string `json:"compute_message,omitempty"`
	ComputeErrorCode string `json:"compute_state_error_code,omitempty"`
}

type ExecutionEvent ¶

type ExecutionEvent struct {
	JobID       string `json:"job_id,omitempty"`
	ExecutionID string `json:"execution_id,omitempty"`
	EvalID      string `json:"evaluation_id,omitempty"`

	NameSet       bool   `json:"name_set,omitempty"`
	NodeNameHash  string `json:"node_name_hash,omitempty"`
	NamespaceHash string `json:"namespace_hash,omitempty"`

	Resources map[string]Resource `json:"resources,omitempty"`

	DesiredState          string `json:"desired_state,omitempty"`
	DesiredStateErrorCode string `json:"desired_state_error_code,omitempty"`

	ComputeState          string `json:"compute_state,omitempty"`
	ComputeStateErrorCode string `json:"compute_state_error_code,omitempty"`

	PublishedResultType string `json:"publisher_type,omitempty"`

	RunResultStdoutTruncated bool `json:"run_result_stdout_truncated,omitempty"`
	RunResultStderrTruncated bool `json:"run_result_stderr_truncated,omitempty"`
	RunResultExitCode        int  `json:"run_result_exit_code,omitempty"`

	PreviousExecution string `json:"previous_execution,omitempty"`
	NextExecution     string `json:"next_execution,omitempty"`
	FollowupEvalID    string `json:"followup_eval_id,omitempty"`

	Revision   uint64    `json:"revision,omitempty"`
	CreateTime time.Time `json:"create_time,omitempty"`
	ModifyTime time.Time `json:"modify_time,omitempty"`
}

type GPUInfo ¶

type GPUInfo struct {
	Name   string `json:"name,omitempty"`
	Vendor string `json:"vendor,omitempty"`
}

type JobTerminalEvent ¶

type JobTerminalEvent struct {
	JobID string `json:"job_id"`

	NameSet       bool   `json:"name_set"`
	NamespaceHash string `json:"namespace_hash"`

	Type        string `json:"type"`
	Count       int    `json:"count"`
	LabelsCount int    `json:"labels_count"`
	MetaCount   int    `json:"meta_count"`

	State string `json:"state"`

	Version    uint64    `json:"version"`
	Revision   uint64    `json:"revision"`
	CreateTime time.Time `json:"create_time"`
	ModifyTime time.Time `json:"modify_time"`

	TaskNameHash         string   `json:"task_name_hash"`
	TaskEngineType       string   `json:"task_engine_type"`
	TaskPublisherType    string   `json:"task_publisher_type"`
	TaskEnvVarCount      int      `json:"task_env_var_count"`
	TaskMetaCount        int      `json:"task_meta_count"`
	TaskInputSourceTypes []string `json:"task_input_source_types"`
	TaskResultPathCount  int      `json:"task_result_path_count"`

	Resources Resource `json:"resources,omitempty"`

	TaskNetworkType      string `json:"task_network_type"`
	TaskDomainsCount     int    `json:"task_domains_count"`
	TaskExecutionTimeout int64  `json:"task_execution_timeout"`
	TaskQueueTimeout     int64  `json:"task_queue_timeout"`
	TaskTotalTimeout     int64  `json:"task_total_timeout"`
}

type Option ¶

type Option func(*Config)

func WithEndpoint ¶

func WithEndpoint(endpoint string) Option

func WithInstallationID ¶

func WithInstallationID(id string) Option

func WithInstanceID ¶

func WithInstanceID(id string) Option

func WithNodeID ¶

func WithNodeID(id string) Option

func WithNodeType ¶

func WithNodeType(isRequester, isCompute bool) Option

func WithVersion ¶

func WithVersion(bv *models.BuildVersionInfo) Option
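
`type Option func(*Config)` together with the `With...` constructors follows the functional options pattern. Because Config's fields are unexported and not shown in this document, the sketch below uses a stand-in `config` type with hypothetical fields. The mapping from `(isRequester, isCompute)` to the node types named in the README, and the use of the `DefaultOtelCollectorEndpoint` value as the starting endpoint, are assumptions as well.

```go
package main

import "fmt"

// config is a stand-in for the package's Config. Its field names are
// hypothetical; the real fields are unexported and not documented here.
type config struct {
	endpoint string
	nodeType string
}

// option mirrors the documented shape `type Option func(*Config)`.
type option func(*config)

func withEndpoint(endpoint string) option {
	return func(c *config) { c.endpoint = endpoint }
}

// withNodeType maps the two flags to the node type names listed in the README.
// This mapping is an assumption, not taken from the package source.
func withNodeType(isRequester, isCompute bool) option {
	return func(c *config) {
		switch {
		case isRequester && isCompute:
			c.nodeType = "hybrid"
		case isRequester:
			c.nodeType = "orchestrator"
		case isCompute:
			c.nodeType = "compute"
		}
	}
}

// newConfig starts from a default and applies each option in order,
// so later options override earlier ones.
func newConfig(opts ...option) *config {
	c := &config{endpoint: "t.bacalhau.org:4317"} // value of DefaultOtelCollectorEndpoint
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func main() {
	c := newConfig(withNodeType(true, true))
	fmt.Println(c.endpoint, c.nodeType) // t.bacalhau.org:4317 hybrid
}
```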

type Resource ¶

type Resource struct {
	CPUUnits    float64   `json:"cpu_units,omitempty"`
	MemoryBytes uint64    `json:"memory_bytes,omitempty"`
	DiskBytes   uint64    `json:"disk_bytes,omitempty"`
	GPUCount    uint64    `json:"gpu_count,omitempty"`
	GPUTypes    []GPUInfo `json:"gpu_types,omitempty"`
}
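
The JSON tags on Resource and GPUInfo describe the shape of a resource record. As a rough illustration, the sketch below decodes a made-up payload into copies of those two structs. The payload values and the `decodeResource` helper are hypothetical, and it assumes records are plain JSON as the tags suggest.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// GPUInfo and Resource are copied from the type definitions above.
type GPUInfo struct {
	Name   string `json:"name,omitempty"`
	Vendor string `json:"vendor,omitempty"`
}

type Resource struct {
	CPUUnits    float64   `json:"cpu_units,omitempty"`
	MemoryBytes uint64    `json:"memory_bytes,omitempty"`
	DiskBytes   uint64    `json:"disk_bytes,omitempty"`
	GPUCount    uint64    `json:"gpu_count,omitempty"`
	GPUTypes    []GPUInfo `json:"gpu_types,omitempty"`
}

// decodeResource is a hypothetical helper that parses one JSON record.
// Keys missing from the input leave the matching field at its zero value.
func decodeResource(raw string) (Resource, error) {
	var r Resource
	err := json.Unmarshal([]byte(raw), &r)
	return r, err
}

func main() {
	// Made-up sample payload; disk_bytes is intentionally absent.
	raw := `{"cpu_units":0.5,"memory_bytes":536870912,"gpu_count":1,` +
		`"gpu_types":[{"name":"A100","vendor":"NVIDIA"}]}`

	r, err := decodeResource(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%.1f CPU, %d MiB, %d GPU (%s)\n",
		r.CPUUnits, r.MemoryBytes>>20, r.GPUCount, r.GPUTypes[0].Vendor)
	// 0.5 CPU, 512 MiB, 1 GPU (NVIDIA)
}
```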

type SubmitJobEvent ¶

type SubmitJobEvent struct {
	JobID string `json:"job_id"`

	NameSet       bool   `json:"name_set"`
	NamespaceHash string `json:"namespace_hash"`

	Type        string `json:"type"`
	Count       int    `json:"count"`
	LabelsCount int    `json:"labels_count"`
	MetaCount   int    `json:"meta_count"`

	Version    uint64    `json:"version"`
	Revision   uint64    `json:"revision"`
	CreateTime time.Time `json:"create_time"`
	ModifyTime time.Time `json:"modify_time"`

	TaskNameHash         string   `json:"task_name_hash"`
	TaskEngineType       string   `json:"task_engine_type"`
	TaskPublisherType    string   `json:"task_publisher_type"`
	TaskEnvVarCount      int      `json:"task_env_var_count"`
	TaskMetaCount        int      `json:"task_meta_count"`
	TaskInputSourceTypes []string `json:"task_input_source_types"`
	TaskResultPathCount  int      `json:"task_result_path_count"`

	Resources Resource `json:"resources,omitempty"`

	TaskNetworkType      string `json:"task_network_type"`
	TaskDomainsCount     int    `json:"task_domains_count"`
	TaskExecutionTimeout int64  `json:"task_execution_timeout"`
	TaskQueueTimeout     int64  `json:"task_queue_timeout"`
	TaskTotalTimeout     int64  `json:"task_total_timeout"`

	Warnings []string `json:"warnings"`
	Error    string   `json:"error"`
}

func NewSubmitJobEvent ¶

func NewSubmitJobEvent(j models.Job, warnings ...string) SubmitJobEvent
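
The README describes several submit-event fields as counts or flags (Name Set, Labels & Metadata Counts) rather than the underlying values. The sketch below shows one way such a summary could be derived. It is not the package's implementation: `job`, `submitSummary`, and `summarize` are hypothetical stand-ins for `models.Job`, `SubmitJobEvent`, and `NewSubmitJobEvent`, and treating a non-empty name as "set" is an assumption.

```go
package main

import "fmt"

// job is a hypothetical, trimmed-down stand-in for models.Job.
type job struct {
	Name   string
	Labels map[string]string
	Meta   map[string]string
}

// submitSummary holds a few of the SubmitJobEvent fields documented above.
type submitSummary struct {
	NameSet     bool
	LabelsCount int
	MetaCount   int
	Warnings    []string
}

// summarize is a hypothetical helper. It records whether a name was given and
// how many labels and metadata entries exist, but never their keys or values.
// The variadic warnings parameter mirrors the NewSubmitJobEvent signature.
func summarize(j job, warnings ...string) submitSummary {
	return submitSummary{
		NameSet:     j.Name != "", // assumption: a non-empty name counts as "set"
		LabelsCount: len(j.Labels),
		MetaCount:   len(j.Meta),
		Warnings:    warnings,
	}
}

func main() {
	j := job{
		Name:   "nightly-etl",
		Labels: map[string]string{"env": "prod", "team": "data"},
	}
	s := summarize(j, "example warning")
	fmt.Println(s.NameSet, s.LabelsCount, s.MetaCount, len(s.Warnings))
}
```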

