go-spancheck
Checks usage of OpenTelemetry spans from go.opentelemetry.io/otel/trace.
Installation & Usage
go install github.com/jjti/go-spancheck/cmd/spancheck@latest
spancheck ./...
Example
spancheck -checks 'end,set-status,record-error' ./...
func _() error {
    // span.End is not called on all paths, possible memory leak
    // span.SetStatus is not called on all paths
    // span.RecordError is not called on all paths
    _, span := otel.Tracer("foo").Start(context.Background(), "bar")
    if true {
        // return can be reached without calling span.End
        // return can be reached without calling span.SetStatus
        // return can be reached without calling span.RecordError
        return errors.New("err")
    }
    return nil // return can be reached without calling span.End
}
Configuration
Only the span.End() check is enabled by default. The others can be enabled with -checks 'end,set-status,record-error'.
$ spancheck -h
...
Flags:
  -checks string
        comma-separated list of checks to enable (options: end, set-status, record-error) (default "end")
  -ignore-check-signatures string
        comma-separated list of regex for function signatures that disable checks on errors
Ignore check signatures
The span.SetStatus() and span.RecordError() checks warn when there is:
- a path to a return statement
- that returns an error
- without a call to SetStatus or RecordError, respectively
But it's convenient to call SetStatus and RecordError from utility methods [1]. To support that, the -ignore-check-signatures flag suppresses these warnings when a function matching one of the configured patterns is called on the path.
For example, by default, the code below would have warnings as shown:
func task(ctx context.Context) error {
    ctx, span := otel.Tracer("foo").Start(ctx, "bar") // span.SetStatus is not called on all paths
    defer span.End()
    if err := subTask(ctx); err != nil {
        return recordErr(span, err) // return can be reached without calling span.SetStatus
    }
    return nil
}

func recordErr(span trace.Span, err error) error {
    span.SetStatus(codes.Error, err.Error())
    span.RecordError(err)
    return err
}
The warnings can be ignored by setting the -ignore-check-signatures flag to recordErr:
spancheck -checks 'end,set-status,record-error' -ignore-check-signatures 'recordErr' ./...
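The help text describes the flag value as a comma-separated list of regular expressions. As a rough illustration of what that could mean in practice, the sketch below splits such a value and tests function names against it using only the standard library. The helper names compileSignatures and matchesAny, the second pattern, and the sample function names are all hypothetical. Unanchored matching is an assumption too: this README does not show how spancheck itself compiles or applies the patterns.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// compileSignatures splits a comma-separated flag value into compiled
// regular expressions. Hypothetical helper, not part of spancheck.
func compileSignatures(flagValue string) ([]*regexp.Regexp, error) {
	var patterns []*regexp.Regexp
	for _, raw := range strings.Split(flagValue, ",") {
		raw = strings.TrimSpace(raw)
		if raw == "" {
			continue // skip empty entries such as a trailing comma
		}
		re, err := regexp.Compile(raw)
		if err != nil {
			return nil, fmt.Errorf("invalid signature pattern %q: %w", raw, err)
		}
		patterns = append(patterns, re)
	}
	return patterns, nil
}

// matchesAny reports whether name matches at least one pattern.
// Matching is unanchored here, which is an assumption of this sketch.
func matchesAny(patterns []*regexp.Regexp, name string) bool {
	for _, re := range patterns {
		if re.MatchString(name) {
			return true
		}
	}
	return false
}

func main() {
	// The second pattern and all names below are made up for illustration.
	patterns, err := compileSignatures(`recordErr, telemetry\.Record`)
	if err != nil {
		panic(err)
	}
	for _, name := range []string{"recordErr", "telemetry.RecordError", "subTask"} {
		fmt.Printf("%s ignored: %t\n", name, matchesAny(patterns, name))
	}
}
```

With unanchored matching, a pattern such as recordErr would also match a longer name like recordErrors. If that is too broad, anchoring the pattern with ^ and $ narrows it, assuming the tool passes the pattern through to the regex engine unchanged.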
Problem Statement
Tracing is a celebrated [1,2] and well marketed [3,4] pillar of observability. But self-instrumented tracing requires a lot of easy-to-forget boilerplate:
import (
    "context"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/codes"
)

func task(ctx context.Context) error {
    ctx, span := otel.Tracer("foo").Start(ctx, "bar")
    defer span.End() // call `.End()`
    if err := subTask(ctx); err != nil {
        span.SetStatus(codes.Error, err.Error()) // call SetStatus(codes.Error, msg) to set status:error
        span.RecordError(err) // call RecordError(err) to record an error event
        return err
    }
    return nil
}
For spans to be really useful, developers need to:
1. call span.End() always
2. call span.SetStatus(codes.Error, msg) on error
3. call span.RecordError(err) on error
4. call span.SetAttributes() liberally
This linter helps developers with steps 1-3.
Checks
This linter supports three checks, each documented below. Only the check for span.End() is enabled by default. See Configuration for instructions on enabling the others.
span.End()
Enabled by default.
Not calling End can cause memory leaks and prevents spans from being closed.
Any Span that is created MUST also be ended. This is the responsibility of the user. Implementations of this API may leak memory or other resources if Spans are not ended.
source: trace.go
func task(ctx context.Context) error {
    otel.Tracer("app").Start(ctx, "foo") // span is unassigned, probable memory leak
    _, span := otel.Tracer("app").Start(ctx, "foo") // span.End is not called on all paths, possible memory leak
    return nil // return can be reached without calling span.End
}
span.SetStatus(codes.Error, "msg")
Disabled by default. Enable with -checks 'set-status'.
Developers should call SetStatus on spans. The status attribute is an important, first-class attribute:
- observability platforms and APMs differentiate "success" vs "failure" using spans' status codes.
- telemetry collector agents, like the OpenTelemetry Collector's Tail Sampling Processor, are configurable to sample Error spans at a higher rate than OK spans.
- observability platforms, like Datadog, have trace retention filters that use spans' status.
In other words, status:error spans often receive special treatment on the assumption that they are more useful for debugging. Forgetting to set the status can lead to spans with useful debugging information being dropped.
func _() error {
    _, span := otel.Tracer("foo").Start(context.Background(), "bar") // span.SetStatus is not called on all paths
    defer span.End()
    if err := subTask(); err != nil {
        span.RecordError(err)
        return err // return can be reached without calling span.SetStatus
    }
    return nil
}
OpenTelemetry docs: Set span status.
span.RecordError(err)
Disabled by default. Enable with -checks 'record-error'.
Calling RecordError creates a new exception-type event (structured log message) on the span. This is recommended to capture the error's stack trace.
func _() error {
    _, span := otel.Tracer("foo").Start(context.Background(), "bar") // span.RecordError is not called on all paths
    defer span.End()
    if err := subTask(); err != nil {
        span.SetStatus(codes.Error, err.Error())
        return err // return can be reached without calling span.RecordError
    }
    return nil
}
OpenTelemetry docs: Record errors.
Attribution
This linter is the product of liberal copying of: