spanmetricsconnector

Version: v0.110.0 (not the latest version of its module)

Published: Sep 24, 2024 License: Apache-2.0 Imports: 24 Imported by: 7

README

Span Metrics Connector

Status
Distributions: contrib
Code Owners: @portertech, @Frapschen | Seeking more code owners!
Emeritus: @albertteoh

Supported Pipeline Types

Exporter Pipeline Type | Receiver Pipeline Type | Stability Level
traces                 | metrics                | alpha

Overview

Aggregates Request, Error and Duration (R.E.D) OpenTelemetry metrics from span data.

Request counts are computed as the number of spans seen per unique set of dimensions, including Errors. Multiple metrics can be aggregated if, for instance, a user wishes to view call counts just on service.name and span.name.

calls{service.name="shipping",span.name="get_shipping/{shippingId}",span.kind="SERVER",status.code="Ok"}

Error counts are computed from the Request counts which have an Error Status Code metric dimension.

calls{service.name="shipping",span.name="get_shipping/{shippingId}",span.kind="SERVER",status.code="Error"}

Duration is computed from the difference between the span start and end times and inserted into the relevant duration histogram time bucket for each unique set of dimensions.

duration{service.name="shipping",span.name="get_shipping/{shippingId}",span.kind="SERVER",status.code="Ok"}

Each metric will have at least the following dimensions because they are common across all spans:

  • service.name
  • span.name
  • span.kind
  • status.code

Span to Metrics processor to Span to metrics connector

The spanmetrics connector replaces the spanmetrics processor with multiple improvements and breaking changes. This was done to bring the spanmetrics connector closer to the OpenTelemetry specification and to make the component agnostic to exporter logic. The spanmetrics processor essentially mixed OTel and Prometheus conventions by using the OTel data model together with the Prometheus metric and attribute naming convention.

The following changes were made to the connector component.

Breaking changes:

  • The operation metric attribute was renamed to span.name.
  • The latency histogram metric name was changed to duration.
  • The _total metric suffix was dropped from generated metric names.
  • The Prometheus-specific metrics labels sanitization was dropped.

Improvements:

  • Added support for OTel exponential histograms for recording span duration measurements.
  • Added support for the milliseconds and seconds histogram units.
  • Added support for generating metrics resource scope attributes. The spanmetrics connector will generate the number of metrics resource scopes that corresponds to the number of the spans resource scopes meaning that more metrics are generated now. Previously, spanmetrics generated a single metrics resource scope.

Configurations

If you are not already familiar with connectors, you may find it helpful to first visit the Connectors README.

The following settings can be optionally configured:

  • histogram (default: explicit): Configures the type of histogram used to record span duration measurements. Must be either explicit or exponential.

    • disable (default: false): Disable all histogram metrics.
    • unit (default: ms): The time unit for recording span duration measurements. One of either: ms or s.
    • explicit:
      • buckets: the list of durations defining the duration histogram time buckets. Default buckets: [2ms, 4ms, 6ms, 8ms, 10ms, 50ms, 100ms, 200ms, 400ms, 800ms, 1s, 1400ms, 2s, 5s, 10s, 15s]
    • exponential:
      • max_size (default: 160): the maximum number of buckets per positive or negative number range.
  • dimensions: the list of dimensions to add together with the default dimensions defined above.

    Each additional dimension is defined with a name which is looked up in the span's collection of attributes or resource attributes (AKA process tags) such as ip, host.name or region.

    If the named attribute is missing in the span, the optional provided default is used.

    If no default is provided, this dimension will be omitted from the metric.

  • exclude_dimensions: the list of dimensions to be excluded from the default set of dimensions. Use to exclude unneeded data from metrics.

  • dimensions_cache_size (default: 1000): the size of the cache for storing Dimensions, used to improve the collector's memory usage. Must be a positive number.

  • resource_metrics_cache_size (default: 1000): the size of the cache holding metrics for a service. This is mostly relevant for cumulative temporality to avoid memory leaks and correct metric timestamp resets.

  • aggregation_temporality (default: AGGREGATION_TEMPORALITY_CUMULATIVE): Defines the aggregation temporality of the generated metrics. One of either AGGREGATION_TEMPORALITY_CUMULATIVE or AGGREGATION_TEMPORALITY_DELTA.

  • namespace (default: traces.span.metrics): Defines the namespace of the generated metrics. If a namespace is provided, each generated metric name is prefixed with the namespace followed by a dot.

  • metrics_flush_interval (default: 60s): Defines the flush interval of the generated metrics.

  • metrics_expiration (default: 0): Defines the expiration time as time.Duration, after which, if no new spans are received, metrics will no longer be exported. Setting to 0 means the metrics will never expire (default behavior).

  • metric_timestamp_cache_size (default: 1000): Only relevant for delta temporality span metrics. Controls the size of the cache used to keep track of a metric's TimestampUnixNano the last time it was flushed. When a metric is evicted from the cache, its next data point will indicate a "reset" in the series. Downstream components converting from delta to cumulative, like prometheusexporter, may handle these resets by setting cumulative counters back to 0.

  • exemplars: Use to configure how to attach exemplars to metrics.

    • enabled (default: false): enabling will add spans as Exemplars to all metrics. Exemplars are only kept for one flush interval.
  • events: Use to configure the events metric.

    • enabled (default: false): enabling will add the events metric.
    • dimensions (mandatory if enabled): the list of the span's event attributes to add as dimensions to the events metric, which will be included on top of the common and configured dimensions for span and resource attributes.
  • resource_metrics_key_attributes: Filter the resource attributes used to produce the resource metrics key map hash. Use this in case changing resource attributes (e.g. process id) are breaking counter metrics.
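
The lookup rules for additional dimensions described above can be sketched as follows. This is an illustration only: the dimension type and resolve function are hypothetical, attributes are simplified to string maps, and checking span attributes before resource attributes is an assumption about lookup order that the text does not state.

```go
package main

import "fmt"

// dimension mirrors the name/default pair from the configuration.
type dimension struct {
	Name    string
	Default *string // nil means no default was configured
}

// resolve looks up each configured dimension in the span attributes, then in
// the resource attributes (assumed order), then falls back to the default.
// A dimension with no value and no default is omitted from the result.
func resolve(dims []dimension, spanAttrs, resourceAttrs map[string]string) map[string]string {
	out := make(map[string]string)
	for _, d := range dims {
		if v, ok := spanAttrs[d.Name]; ok {
			out[d.Name] = v
		} else if v, ok := resourceAttrs[d.Name]; ok {
			out[d.Name] = v
		} else if d.Default != nil {
			out[d.Name] = *d.Default
		}
	}
	return out
}

func main() {
	get := "GET"
	dims := []dimension{
		{Name: "http.method", Default: &get}, // falls back to GET
		{Name: "http.status_code"},           // found on the span
		{Name: "region"},                     // found on the resource
		{Name: "host.name"},                  // missing, no default: omitted
	}
	spanAttrs := map[string]string{"http.status_code": "200"}
	resourceAttrs := map[string]string{"region": "us-east-1"}
	fmt.Println(resolve(dims, spanAttrs, resourceAttrs))
}
```

A pointer is used for the default so that "no default" can be told apart from an empty-string default, which matches the shape of the Dimension type in this package.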

The feature gate connector.spanmetrics.legacyMetricNames (disabled by default) controls whether the connector uses legacy metric names.

Examples

The following is a simple example usage of the spanmetrics connector.

For configuration examples on other use cases, please refer to More Examples.

The full list of settings exposed for this connector is documented in the Config type in the Documentation section.

receivers:
  nop:

exporters:
  nop:

connectors:
  spanmetrics:
    histogram:
      explicit:
        buckets: [100us, 1ms, 2ms, 6ms, 10ms, 100ms, 250ms]
    dimensions:
      - name: http.method
        default: GET
      - name: http.status_code
    exemplars:
      enabled: true
    exclude_dimensions: ['status.code']
    dimensions_cache_size: 1000
    aggregation_temporality: "AGGREGATION_TEMPORALITY_CUMULATIVE"    
    metrics_flush_interval: 15s
    metrics_expiration: 5m
    events:
      enabled: true
      dimensions:
        - name: exception.type
        - name: exception.message
    resource_metrics_key_attributes:
      - service.name
      - telemetry.sdk.language
      - telemetry.sdk.name

service:
  pipelines:
    traces:
      receivers: [nop]
      exporters: [spanmetrics]
    metrics:
      receivers: [spanmetrics]
      exporters: [nop]
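
The example above sets metrics_flush_interval to 15s and metrics_expiration to 5m. The interplay of the two settings can be sketched as below. This is a simplified model under stated assumptions, not the connector's code: the tracker type and its methods are hypothetical, and a series is identified by a plain string.

```go
package main

import (
	"fmt"
	"time"
)

// tracker remembers when each metric series last received a span.
type tracker struct {
	expiration time.Duration // 0 disables expiry, matching the documented default
	lastSeen   map[string]time.Time
}

func newTracker(expiration time.Duration) *tracker {
	return &tracker{expiration: expiration, lastSeen: make(map[string]time.Time)}
}

// observe records that a span for the given series arrived at time now.
func (t *tracker) observe(series string, now time.Time) {
	t.lastSeen[series] = now
}

// flush returns the series that are still exported at time now and forgets
// those that have seen no spans for longer than the expiration.
func (t *tracker) flush(now time.Time) []string {
	var live []string
	for s, seen := range t.lastSeen {
		if t.expiration > 0 && now.Sub(seen) > t.expiration {
			delete(t.lastSeen, s)
			continue
		}
		live = append(live, s)
	}
	return live
}

// epoch is an arbitrary reference time used by the example.
var epoch = time.Unix(0, 0)

func main() {
	t := newTracker(5 * time.Minute)
	t.observe("calls{service.name=shipping}", epoch)
	fmt.Println(len(t.flush(epoch.Add(15 * time.Second)))) // 1: still within 5m
	fmt.Println(len(t.flush(epoch.Add(6 * time.Minute))))  // 0: expired
}
```

In a running collector, flush would be driven by a ticker firing every metrics_flush_interval. The sketch passes the time in explicitly so the behavior is easy to follow.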

Using spanmetrics with Prometheus components

The spanmetrics connector can be used with Prometheus exporter components.

Some exporter functionality, such as generation of the target_info metric, requires the resource scope attributes of the incoming spans to contain the service.name and service.instance.id attributes.

Let's look at the example of using the spanmetrics connector with the prometheusremotewrite exporter:

receivers:
  otlp:
    protocols:
      http:
      grpc:

exporters:
  prometheusremotewrite:
    endpoint: http://localhost:9090/api/v1/write
    target_info:
      enabled: true

connectors:
  spanmetrics:
    namespace: span.metrics

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [spanmetrics]
    metrics:
      receivers: [spanmetrics]
      exporters: [prometheusremotewrite]

This configures the spanmetrics connector to generate metrics from received spans and export them through the Prometheus Remote Write exporter. The target_info metric is generated for each resource scope, and OpenTelemetry metric names and attributes are normalized to comply with Prometheus naming rules. The generated calls OTel sum metric, for instance, can result in multiple Prometheus calls_total (counter type) time series plus the target_info time series:

target_info{job="shippingservice", instance="...", ...} 1
calls_total{span_name="/Address", service_name="shippingservice", span_kind="SPAN_KIND_SERVER", status_code="STATUS_CODE_UNSET", ...} 142
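
The name normalization visible in the output above (span.name becoming span_name, calls becoming calls_total) can be approximated as below. This is a deliberately simplified sketch with hypothetical helper names. The real translation performed by the Prometheus exporters handles more cases, such as units and names starting with digits.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitize replaces every character outside [a-zA-Z0-9_] with an underscore.
func sanitize(name string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9', r == '_':
			return r
		}
		return '_'
	}, name)
}

// counterName sanitizes the name of a monotonic sum and appends the _total
// suffix unless the name already ends with it.
func counterName(otelName string) string {
	n := sanitize(otelName)
	if !strings.HasSuffix(n, "_total") {
		n += "_total"
	}
	return n
}

func main() {
	fmt.Println(counterName("calls"))     // calls_total
	fmt.Println(sanitize("span.name"))    // span_name
	fmt.Println(sanitize("service.name")) // service_name
}
```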

More Examples

For more example configurations covering other use cases, see the testdata directory.

Documentation

Constants

const (
	DefaultNamespace = "traces.span.metrics"
)

Functions

func NewFactory

func NewFactory() connector.Factory

NewFactory creates a factory for the spanmetrics connector.

Types

type Config

type Config struct {
	// Dimensions defines the list of additional dimensions on top of the provided:
	// - service.name
	// - span.name
	// - span.kind
	// - status.code
	// The dimensions will be fetched from the span's attributes. Examples of some conventionally used attributes:
	// https://github.com/open-telemetry/opentelemetry-collector/blob/main/model/semconv/opentelemetry.go.
	Dimensions        []Dimension `mapstructure:"dimensions"`
	ExcludeDimensions []string    `mapstructure:"exclude_dimensions"`

	// DimensionsCacheSize defines the size of cache for storing Dimensions, which helps to avoid cache memory growing
	// indefinitely over the lifetime of the collector.
	// Optional. See defaultDimensionsCacheSize in connector.go for the default value.
	DimensionsCacheSize int `mapstructure:"dimensions_cache_size"`

	// ResourceMetricsCacheSize defines the size of the cache holding metrics for a service. This is mostly relevant for
	// cumulative temporality to avoid memory leaks and correct metric timestamp resets.
	// Optional. See defaultResourceMetricsCacheSize in connector.go for the default value.
	ResourceMetricsCacheSize int `mapstructure:"resource_metrics_cache_size"`

	// ResourceMetricsKeyAttributes filters the resource attributes used to create the resource metrics key hash.
	// This can be used to avoid situations where resource attributes may change across service restarts, causing
	// metric counters to break (and duplicate). A resource does not need to have all of the attributes. The list
	// must include enough attributes to properly identify unique resources or risk aggregating data from more
	// than one service and span.
	// e.g. ["service.name", "telemetry.sdk.language", "telemetry.sdk.name"]
	// See https://opentelemetry.io/docs/specs/semconv/resource/ for possible attributes.
	ResourceMetricsKeyAttributes []string `mapstructure:"resource_metrics_key_attributes"`

	AggregationTemporality string `mapstructure:"aggregation_temporality"`

	Histogram HistogramConfig `mapstructure:"histogram"`

	// MetricsFlushInterval is the time period between when metrics are flushed or emitted to the configured MetricsExporter.
	MetricsFlushInterval time.Duration `mapstructure:"metrics_flush_interval"`

	// MetricsExpiration is the time period after which, if no new spans are received, metrics are considered stale and will no longer be exported.
	// Default value (0) means that the metrics will never expire.
	MetricsExpiration time.Duration `mapstructure:"metrics_expiration"`

	// TimestampCacheSize controls the size of the cache used to keep track of each delta metric's TimestampUnixNano the last time it was flushed.
	TimestampCacheSize *int `mapstructure:"metric_timestamp_cache_size"`

	// Namespace is the namespace of the metrics emitted by the connector.
	Namespace string `mapstructure:"namespace"`

	// Exemplars defines the configuration for exemplars.
	Exemplars ExemplarsConfig `mapstructure:"exemplars"`

	// Events defines the configuration for events section of spans.
	Events EventsConfig `mapstructure:"events"`
}

Config defines the configuration options for spanmetricsconnector.

func (Config) GetAggregationTemporality

func (c Config) GetAggregationTemporality() pmetric.AggregationTemporality

GetAggregationTemporality converts the string value given in the config into an AggregationTemporality. Returns cumulative, unless delta is correctly specified.
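
The fallback behavior described for GetAggregationTemporality can be sketched as follows. The temporality type and parseTemporality function are hypothetical stand-ins for pmetric.AggregationTemporality and the real method. Only the two string constants come from the configuration documentation.

```go
package main

import "fmt"

// temporality is a local stand-in for pmetric.AggregationTemporality.
type temporality int

const (
	cumulative temporality = iota
	delta
)

// String makes the example output readable.
func (t temporality) String() string {
	if t == delta {
		return "delta"
	}
	return "cumulative"
}

// parseTemporality returns delta only for the exact delta string and
// cumulative for anything else, including an empty or misspelled value.
func parseTemporality(s string) temporality {
	if s == "AGGREGATION_TEMPORALITY_DELTA" {
		return delta
	}
	return cumulative
}

func main() {
	fmt.Println(parseTemporality("AGGREGATION_TEMPORALITY_DELTA"))      // delta
	fmt.Println(parseTemporality("AGGREGATION_TEMPORALITY_CUMULATIVE")) // cumulative
	fmt.Println(parseTemporality(""))                                   // cumulative
}
```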

func (Config) GetDeltaTimestampCacheSize added in v0.103.0

func (c Config) GetDeltaTimestampCacheSize() int

func (Config) Validate added in v0.73.0

func (c Config) Validate() error

Validate checks if the connector configuration is valid.

type Dimension

type Dimension struct {
	Name    string  `mapstructure:"name"`
	Default *string `mapstructure:"default"`
}

Dimension defines the dimension name and an optional default value to use if the dimension is missing from the span's attributes.

type EventsConfig added in v0.89.0

type EventsConfig struct {
	// Enabled is a flag to enable events.
	Enabled bool `mapstructure:"enabled"`
	// Dimensions defines the list of dimensions to add to the events metric.
	Dimensions []Dimension `mapstructure:"dimensions"`
}

type ExemplarsConfig added in v0.82.0

type ExemplarsConfig struct {
	Enabled         bool `mapstructure:"enabled"`
	MaxPerDataPoint *int `mapstructure:"max_per_data_point"`
}

type ExplicitHistogramConfig added in v0.73.0

type ExplicitHistogramConfig struct {
	// Buckets is the list of durations representing explicit histogram buckets.
	Buckets []time.Duration `mapstructure:"buckets"`
}

type ExponentialHistogramConfig added in v0.73.0

type ExponentialHistogramConfig struct {
	MaxSize int32 `mapstructure:"max_size"`
}

type HistogramConfig added in v0.73.0

type HistogramConfig struct {
	Disable     bool                        `mapstructure:"disable"`
	Unit        metrics.Unit                `mapstructure:"unit"`
	Exponential *ExponentialHistogramConfig `mapstructure:"exponential"`
	Explicit    *ExplicitHistogramConfig    `mapstructure:"explicit"`
}
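
To show how the Unit and Explicit.Buckets settings relate, the sketch below converts bucket durations into numeric bounds in the configured unit. The toBounds function and the example buckets are hypothetical, and the unit is simplified to a plain string instead of metrics.Unit. How the connector performs this conversion internally is not shown in this documentation.

```go
package main

import (
	"fmt"
	"time"
)

// toBounds converts bucket durations into float64 upper bounds expressed in
// the given unit ("ms" or "s"). Any other unit is rejected.
func toBounds(buckets []time.Duration, unit string) ([]float64, error) {
	var per time.Duration
	switch unit {
	case "ms":
		per = time.Millisecond
	case "s":
		per = time.Second
	default:
		return nil, fmt.Errorf("unsupported unit %q: want ms or s", unit)
	}
	out := make([]float64, len(buckets))
	for i, b := range buckets {
		out[i] = float64(b) / float64(per)
	}
	return out, nil
}

// exampleBuckets is a short, hypothetical bucket list.
var exampleBuckets = []time.Duration{100 * time.Microsecond, time.Millisecond, 250 * time.Millisecond}

func main() {
	ms, _ := toBounds(exampleBuckets, "ms")
	s, _ := toBounds(exampleBuckets, "s")
	fmt.Println(ms) // [0.1 1 250]
	fmt.Println(s)  // [0.0001 0.001 0.25]
}
```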

Directories

Path Synopsis
internal
