OTel-Arrow Receiver
Receives telemetry data using the OTel-Arrow protocol via gRPC and the
standard OTLP protocol via gRPC.
Getting Started
The OTel-Arrow receiver is an extension of the core OpenTelemetry
Collector OTLP receiver component with additional support for the
OTel-Arrow protocol.
OTel-Arrow supports column-oriented data transport using the Apache
Arrow data format. The OTel-Arrow exporter converts OTLP data into an
optimized representation and then sends batches of data using Apache
Arrow to encode the stream. This component contains logic to reverse
the process used in the OTel-Arrow exporter.
The use of an OTel-Arrow exporter-receiver pair is recommended when
network bandwidth is expensive. Typically, expect to see a 50%
reduction in bandwidth compared with the same data being sent using
standard OTLP/gRPC with gzip compression.
This component includes all the features and configuration of the core
OTLP receiver, making it possible to upgrade from the core component
simply by replacing "otlp" with "otelarrow" as the component name in
the collector configuration.
To enable the OTel-Arrow receiver, include it in the list of receivers
for a pipeline. No further configuration is needed. This receiver
listens on the standard OTLP/gRPC port 4317 and serves standard OTLP
over gRPC out of the box.
receivers:
  otelarrow:
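For reference, a fuller configuration might place the receiver in a pipeline as sketched below. The traces pipeline and the debug exporter are assumptions chosen for illustration; substitute the signals and exporters your deployment actually uses.

```yaml
receivers:
  otelarrow:      # default settings: listens on the standard OTLP/gRPC port 4317

exporters:
  debug:          # illustrative choice only, not part of this component

service:
  pipelines:
    traces:       # assumed signal type for this sketch
      receivers: [otelarrow]
      exporters: [debug]
```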
Advanced Configuration
Users may wish to configure gRPC settings, for example:
receivers:
  otelarrow:
    protocols:
      grpc:
        ...
Several common configuration structures provide additional capabilities automatically:
Arrow-specific Configuration
In the arrow configuration block, the following settings are available:
- memory_limit_mib (default: 128): limits the amount of concurrent memory used by Arrow data buffers. When the limit is reached, the receiver will return RESOURCE_EXHAUSTED error codes to the exporter, which are conditionally retryable; see the exporter retry configuration.
- admission_limit_mib (default: 64): limits the requests admitted by the stream, based on the request size information available. This should not be confused with memory_limit_mib, which limits allocations made by the consumer when translating Arrow records into pdata objects. In other words, request size is used to control how much traffic is admitted, but does not control how much memory is used during request processing.
- waiter_limit (default: 1000): limits the number of requests waiting on admission once admission_limit_mib is reached. This is another dimension of memory limiting that ensures waiters are not holding onto a significant amount of memory while waiting to be processed.
admission_limit_mib and waiter_limit are arguments supplied to admission.BoundedQueue. This custom semaphore is meant to be used within receivers to help limit memory within the collector pipeline.
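As a sketch, the three limits described in this section could be set explicitly as follows, here using their documented default values. This assumes the arrow block is nested under protocols, alongside grpc; verify the nesting against the configuration schema of the collector version in use.

```yaml
receivers:
  otelarrow:
    protocols:
      arrow:                      # assumed location of the arrow block
        memory_limit_mib: 128     # memory for Arrow data buffers
        admission_limit_mib: 64   # admission control based on request size
        waiter_limit: 1000        # requests allowed to wait for admission
```

Lowering admission_limit_mib reduces how much traffic is admitted at once, while memory_limit_mib separately bounds what the Arrow consumer allocates during processing.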
Compression Configuration
In the arrow configuration block, the zstd sub-section applies to all
compression levels used by exporters:
- memory_limit_mib: limits memory dedicated to Zstd decompression, per stream (default 128)
- max_window_size_mib: maximum size of the Zstd window in MiB; 0 means the size is determined from the compression level (default 32)
- concurrency: controls background CPU used for decompression; 0 lets the zstd library decide (default 1)
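A minimal sketch of the zstd sub-section with each setting at its documented default is shown below. As before, placing arrow under protocols is an assumption to check against the collector version in use.

```yaml
receivers:
  otelarrow:
    protocols:
      arrow:                       # assumed location of the arrow block
        zstd:
          memory_limit_mib: 128    # per-stream decompression memory
          max_window_size_mib: 32  # 0 would derive the window from the level
          concurrency: 1           # 0 would let the zstd library decide
```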
Keepalive configuration
As a gRPC streaming service, the OTel-Arrow receiver is able to limit
stream lifetime through configuration of the underlying HTTP/2
connection via keepalive settings.
Keepalive settings are vital to the operation of OTel-Arrow, because
longer-lived streams use more memory and streams are fixed to a single
host. Since every stream of data is different, we recommend
experimenting to find a good balance between memory usage, stream
lifetime, and load balance.
gRPC libraries do not provide a built-in facility for long-lived RPCs
to learn about impending HTTP/2 connection state changes, including
the event that initiates connection reset. While the receiver knows
its own keepalive settings, a shorter maximum connection lifetime can
be imposed by intermediate HTTP/2 proxies, and therefore the receiver
and exporter are expected to independently configure these limits.
receivers:
  otelarrow:
    protocols:
      grpc:
        keepalive:
          server_parameters:
            max_connection_age: 1m
            max_connection_age_grace: 10m
In the example configuration above, OTel-Arrow streams will have reset
initiated after 10 minutes. Note that max_connection_age is set to a
small value and we recommend tuning max_connection_age_grace.
OTel-Arrow exporters are expected to configure their
max_stream_lifetime property to a value that is slightly smaller than
the receiver's max_connection_age_grace setting, which causes the
exporter to cleanly shut down streams, allowing requests to complete
before the HTTP/2 connection is forcibly closed. While the exporter
will retry data that was in-flight during an unexpected stream
shutdown, instrumentation about the telemetry pipeline will show RPC
errors when the exporter's max_stream_lifetime is not configured
correctly.
See the exporter README for more guidance. For the example where
max_connection_age_grace is set to 10 minutes, the exporter's
max_stream_lifetime should be set to the same number minus a
reasonable timeout to allow in-flight requests to complete. For
example, an exporter with a 30s timeout and a 9m30s stream lifetime:
exporters:
  otelarrow:
    timeout: 30s
    arrow:
      max_stream_lifetime: 9m30s
    endpoint: ...
    tls: ...
Receiver metrics
In addition to the standard obsreport metrics, this component provides
network-level measurement instruments which we anticipate will become
part of obsreport in the future. At the normal level of metrics detail:
- receiver_recv: uncompressed bytes received, prior to compression
- receiver_recv_wire: compressed bytes received, on the wire.
Arrow's compression performance can be derived by dividing the average
receiver_recv value by the average receiver_recv_wire value.
At the detailed metrics detail level, information about the stream of
data being returned from the receiver will be instrumented:
- receiver_sent: uncompressed bytes sent, prior to compression
- receiver_sent_wire: compressed bytes sent, on the wire.
There are several OTel-Arrow-consumer related metrics available to
help diagnose internal performance. These are disabled at the basic
level of detail. At the normal level, these metrics are introduced:
- arrow_batch_records: Counter of Arrow-IPC records processed
- arrow_memory_inuse: UpDownCounter of memory in use by current streams
- arrow_schema_resets: Counter of times the schema was adjusted, by data type.
service:
  ...
  telemetry:
    ...
    metrics:
      ...
      level: detailed