otelarrowreceiver

Version: v0.24.0 (this package is not in the latest version of its module)
Published: Jun 5, 2024 License: Apache-2.0 Imports: 27 Imported by: 1

README

OTel-Arrow Receiver

Status
Stability: development (metrics, traces, logs)
Distributions: contrib

Receives telemetry data using the OTel-Arrow protocol via gRPC and standard OTLP protocol via gRPC.

Getting Started

The OTel-Arrow receiver is an extension of the core OpenTelemetry Collector OTLP receiver component with additional support for the OTel-Arrow protocol.

OTel-Arrow supports column-oriented data transport using the Apache Arrow data format. The OTel-Arrow exporter converts OTLP data into an optimized representation and then sends batches of data using Apache Arrow to encode the stream. This component contains logic to reverse the process used in the OTel-Arrow exporter.

The use of an OTel-Arrow exporter-receiver pair is recommended when the network is expensive. Typically, expect to see a 50% reduction in bandwidth compared with the same data being sent using standard OTLP/gRPC and gzip compression.

This component includes all the features and configuration of the core OTLP receiver, making it possible to upgrade from the core component simply by replacing "otlp" with "otelarrow" as the component name in the collector configuration.

To enable the OTel-Arrow receiver, include it in the list of receivers for a pipeline. No further configuration is needed. This receiver listens on the standard OTLP/gRPC port 4317 and serves standard OTLP over gRPC out of the box.

receivers:
  otelarrow:

Advanced Configuration

Users may wish to configure gRPC settings, for example:

receivers:
  otelarrow:
    protocols:
      grpc:
        ...

Several common configuration structures provide additional capabilities automatically:

Arrow-specific Configuration

In the arrow configuration block, the following settings are available:

  • memory_limit_mib (default: 128): limits the amount of concurrent memory used by Arrow data buffers.

When the limit is reached, the receiver returns RESOURCE_EXHAUSTED error codes to the exporter. These errors are conditionally retryable; see the exporter retry configuration.

  • admission_limit_mib (default: 64): limits the number of requests that are received by the stream, based on the request size information available. This should not be confused with memory_limit_mib, which limits allocations made by the consumer when translating Arrow records into pdata objects. In other words, request size controls how much traffic is admitted, but it does not control how much memory is used during request processing.

  • waiter_limit (default: 1000): limits the number of requests waiting on admission once admission_limit_mib is reached. This is another dimension of memory limiting that ensures waiters are not holding onto a significant amount of memory while waiting to be processed.

admission_limit_mib and waiter_limit are arguments supplied to admission.BoundedQueue. This custom semaphore is meant to be used within receivers to help limit memory within the collector pipeline.
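
To illustrate how the two limits interact, here is a minimal sketch of a byte-weighted semaphore. It is not the admission.BoundedQueue implementation: the names boundedQueue, Acquire, Release, and errTooManyWaiters are assumptions made for this example, and the sketch makes no attempt at FIFO fairness. A request is admitted immediately while bytes are available, blocks when the admission limit is exhausted, and is rejected once the number of blocked requests reaches the waiter limit.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errTooManyWaiters is a hypothetical error for this sketch only.
var errTooManyWaiters = errors.New("too many waiters")

// boundedQueue is an illustrative byte-weighted semaphore. It is NOT the
// real admission.BoundedQueue; it only mirrors the two limits described above.
type boundedQueue struct {
	mu         sync.Mutex
	cond       *sync.Cond
	limitBytes int64 // plays the role of admission_limit_mib, in bytes
	usedBytes  int64 // bytes currently admitted
	maxWaiters int64 // plays the role of waiter_limit
	waiters    int64 // requests currently blocked in Acquire
}

func newBoundedQueue(limitBytes, maxWaiters int64) *boundedQueue {
	q := &boundedQueue{limitBytes: limitBytes, maxWaiters: maxWaiters}
	q.cond = sync.NewCond(&q.mu)
	return q
}

// Acquire admits a request of the given size. It blocks while the admission
// limit is exhausted and fails fast when too many requests are already waiting.
func (q *boundedQueue) Acquire(size int64) error {
	q.mu.Lock()
	defer q.mu.Unlock()
	if size > q.limitBytes {
		return errors.New("request larger than admission limit")
	}
	if q.usedBytes+size <= q.limitBytes {
		q.usedBytes += size
		return nil
	}
	if q.waiters >= q.maxWaiters {
		return errTooManyWaiters
	}
	q.waiters++
	for q.usedBytes+size > q.limitBytes {
		q.cond.Wait() // releases the lock while waiting
	}
	q.waiters--
	q.usedBytes += size
	return nil
}

// Release returns the bytes of a finished request and wakes blocked waiters.
func (q *boundedQueue) Release(size int64) {
	q.mu.Lock()
	q.usedBytes -= size
	q.mu.Unlock()
	q.cond.Broadcast()
}

func main() {
	// 64 MiB admission limit; zero waiters keeps the demo deterministic.
	q := newBoundedQueue(64<<20, 0)
	fmt.Println("first:", q.Acquire(48<<20))  // fits under the limit
	fmt.Println("second:", q.Acquire(32<<20)) // over the limit, no waiters allowed
	q.Release(48 << 20)
	fmt.Println("third:", q.Acquire(32<<20)) // fits again after the release
}
```

With a non-zero waiter limit, the second call would block until Release frees enough bytes instead of failing.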

Compression Configuration

In the arrow configuration block, the zstd sub-section applies to all compression levels used by exporters:

  • memory_limit_mib: limits memory dedicated to Zstd decompression, per stream (default 128)
  • max_window_size_mib: maximum size of the Zstd window in MiB; 0 means the size is determined by the compression level (default 32)
  • concurrency: controls background CPU used for decompression; 0 lets the zstd library decide (default 1)

Keepalive configuration

As a gRPC streaming service, the OTel-Arrow receiver is able to limit stream lifetime through configuration of the underlying HTTP/2 connection via keepalive settings.

Keepalive settings are vital to the operation of OTel-Arrow, because longer-lived streams use more memory and streams are fixed to a single host. Since every stream of data is different, we recommend experimenting to find a good balance between memory usage, stream lifetime, and load balance.

gRPC libraries do not have a built-in facility for long-lived RPCs to learn about impending HTTP/2 connection state changes, including the event that initiates a connection reset. While the receiver knows its own keepalive settings, a shorter maximum connection lifetime can be imposed by intermediate HTTP/2 proxies, and therefore the receiver and exporter are expected to configure these limits independently.

receivers:
  otelarrow:
    protocols:
      grpc:
        keepalive:
          server_parameters:
            max_connection_age: 1m
            max_connection_age_grace: 10m

In the example configuration above, a reset of OTel-Arrow streams is initiated after 10 minutes. Note that max_connection_age is set to a small value; we recommend tuning max_connection_age_grace instead.

OTel-Arrow exporters are expected to configure their max_stream_lifetime property to a value that is slightly smaller than the receiver's max_connection_age_grace setting. This causes the exporter to shut down streams cleanly, allowing requests to complete before the HTTP/2 connection is forcibly closed. While the exporter will retry data that was in flight during an unexpected stream shutdown, instrumentation about the telemetry pipeline will show RPC errors when the exporter's max_stream_lifetime is not configured correctly.

See the exporter README for more guidance. For the example where max_connection_age_grace is set to 10 minutes, the exporter's max_stream_lifetime should be set to the same number minus a reasonable timeout to allow in-flight requests to complete. For example, an exporter with 9m30s stream lifetime:

exporters:
  otelarrow:
    timeout: 30s
    arrow:
      max_stream_lifetime: 9m30s
    endpoint: ...
    tls: ...

Receiver metrics

In addition to the standard obsreport metrics, this component provides network-level measurement instruments, which we anticipate will become part of obsreport in the future. At the normal level of metrics detail:

  • receiver_recv: uncompressed bytes received (the size prior to compression)
  • receiver_recv_wire: compressed bytes received, on the wire

Arrow's compression performance can be derived by dividing the average receiver_recv value by the average receiver_recv_wire value.
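
As a worked example of that division, the following sketch computes the ratio from two averaged values. The function name compressionRatio and the input numbers are assumptions chosen for illustration; they are not measurements and not part of this component.

```go
package main

import "fmt"

// compressionRatio divides uncompressed bytes (receiver_recv) by on-the-wire
// bytes (receiver_recv_wire). ok is false when no wire bytes were observed,
// which avoids a division by zero.
func compressionRatio(recvBytes, recvWireBytes float64) (ratio float64, ok bool) {
	if recvWireBytes <= 0 {
		return 0, false
	}
	return recvBytes / recvWireBytes, true
}

func main() {
	// Illustrative averages only.
	if r, ok := compressionRatio(1_000_000, 250_000); ok {
		fmt.Printf("compression ratio: %.1fx\n", r) // prints "compression ratio: 4.0x"
	}
}
```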

At the detailed level of metrics detail, the stream of data returned from the receiver is also instrumented:

  • receiver_sent: uncompressed bytes sent, prior to compression
  • receiver_sent_wire: compressed bytes sent, on the wire.

There are several OTel-Arrow-consumer related metrics available to help diagnose internal performance. These are disabled at the basic level of detail. At the normal level, these metrics are introduced:

  • arrow_batch_records: Counter of Arrow-IPC records processed
  • arrow_memory_inuse: UpDownCounter of memory in use by current streams
  • arrow_schema_resets: Counter of times the schema was adjusted, by data type.

service:
  ...
  telemetry:
    ...
    metrics:
      ...
      level: detailed

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func NewFactory

func NewFactory() receiver.Factory

NewFactory creates a new OTel-Arrow receiver factory.

Types

type ArrowConfig (added in v0.21.0)

type ArrowConfig struct {
	// MemoryLimitMiB is the size of a shared memory region used
	// by all Arrow streams, in MiB.  When too much load is
	// passing through, they will see ResourceExhausted errors.
	MemoryLimitMiB uint64 `mapstructure:"memory_limit_mib"`

	// AdmissionLimitMiB limits the number of requests that are received by the stream based on
	// request size information available. Request size is used to control how much traffic we admit
	// for processing, but does not control how much memory is used during request processing.
	AdmissionLimitMiB uint64 `mapstructure:"admission_limit_mib"`

	// WaiterLimit is the limit on the number of waiters waiting to be processed and consumed.
	// This is a dimension of memory limiting to ensure waiters are not consuming an
	// unexpectedly large amount of memory in the arrow receiver.
	WaiterLimit int64 `mapstructure:"waiter_limit"`

	// Zstd settings apply to OTel-Arrow use of gRPC specifically.
	Zstd zstd.DecoderConfig `mapstructure:"zstd"`
}

ArrowConfig supports configuring the Arrow receiver.

func (*ArrowConfig) Validate (added in v0.21.0)

func (cfg *ArrowConfig) Validate() error

type Config

type Config struct {
	// Protocols is the configuration for gRPC and Arrow.
	Protocols `mapstructure:"protocols"`
}

Config defines configuration for the OTel-Arrow receiver.

type Protocols

type Protocols struct {
	GRPC  configgrpc.ServerConfig `mapstructure:"grpc"`
	Arrow ArrowConfig             `mapstructure:"arrow"`
}

Protocols is the configuration for the supported protocols.

Directories

Path                 Synopsis
internal/arrow/mock  Package mock is a generated GoMock package.
