OTel Arrow Exporter
Status
Stability: development (metrics, traces, logs)
Distributions: []
Exports telemetry data using the
OTel-Arrow protocol and standard OTLP
protocol via gRPC.
Getting Started
The OTel-Arrow exporter is an extension of the core OpenTelemetry
Collector OTLP exporter component with additional support for the
OTel-Arrow protocol.
OTel-Arrow supports column-oriented data transport using the Apache
Arrow data format. This component converts OTLP data into an
optimized representation and then sends batches of data using Apache
Arrow to encode the stream. The OTel-Arrow receiver component contains
logic to reverse the process used in this component.
Using an OTel-Arrow exporter-receiver pair is recommended when network
bandwidth is expensive. Typically, expect to see a 50% reduction in
bandwidth compared with the same data being sent using standard
OTLP/gRPC and gzip compression.
This component includes all the features and configuration of the core
OTLP exporter, making it possible to upgrade from the core component
simply by replacing "otlp" with "otelarrow" as the component name in
the collector configuration.
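As a minimal sketch of that upgrade, assuming a collector build that includes this exporter, only the component name changes. The endpoint, the TLS choice, and the pipeline shown here are hypothetical:

```yaml
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  # Was "otlp:" before the upgrade; the settings underneath are unchanged.
  otelarrow:
    endpoint: collector.example.com:4317  # hypothetical endpoint
    tls:
      insecure: true  # hypothetical; suitable only for a trusted network

service:
  pipelines:
    traces:
      receivers: [otlp]
      # Was "exporters: [otlp]" before the upgrade.
      exporters: [otelarrow]
```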
To enable the OTel-Arrow exporter, include it in the list of exporters
for a pipeline. Two settings are required:
endpoint (no default): host:port to which the exporter sends telemetry
data using the gRPC protocol. The valid syntax is described in the
gRPC name resolution documentation. If the https scheme is used, client
transport security is enabled and overrides the insecure setting.
tls: see TLS Configuration Settings for the full set of available
options.
Example:
    exporters:
      otelarrow/secure:
        endpoint: external-collector:4317
        tls:
          cert_file: file.cert
          key_file: file.key
      otelarrow/insecure:
        endpoint: internal-collector:4317
        tls:
          insecure: true
By default, zstd compression is enabled at the gRPC level. See
compression comparison for details and benchmark information. To
disable gRPC-level compression, configure as follows:
    exporters:
      otelarrow:
        compression: none
        endpoint: ...
        tls: ...
Configuration
Several helper files are leveraged to provide additional capabilities automatically:
Arrow-specific Configuration
In the arrow configuration block, the following settings enable and
disable the use of OTel-Arrow as opposed to standard OTLP:
disabled (default: false): disables use of Arrow, causing the exporter
to use standard OTLP.
disable_downgrade (default: false): prevents this exporter from
falling back to standard OTLP.
The following settings determine the resources that the exporter will use:
num_streams (default: number of CPUs): the number of concurrent Arrow
streams.
max_stream_lifetime (default: unlimited): duration after which streams
are recycled.
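Putting the Arrow-specific settings together, a configuration might look like the following sketch. The endpoint and the chosen values are hypothetical and only illustrate where each setting goes:

```yaml
exporters:
  otelarrow:
    endpoint: collector.example.com:4317  # hypothetical endpoint
    arrow:
      disabled: false          # default: Arrow stays enabled
      disable_downgrade: true  # never use standard OTLP
      num_streams: 4           # illustrative; default is the number of CPUs
      max_stream_lifetime: 5m  # illustrative; default is unlimited
```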
Network Configuration
This component uses round_robin by default as the gRPC load balancer.
This can be modified using the balancer_name setting, for example, to
configure the pick_first balancer:
    exporters:
      otelarrow:
        balancer_name: pick_first
        endpoint: ...
        tls: ...
When the server or an intermediate proxy uses a keepalive setting, the
Arrow-specific max_stream_lifetime setting is critical to avoiding
abrupt termination of Arrow streams, which causes retries of the
in-flight requests. Choose the maximum stream lifetime so that it,
plus the export timeout, does not exceed the smallest keepalive limit
among the server and any intermediate proxies.
    exporters:
      otelarrow:
        timeout: 30s
        arrow:
          max_stream_lifetime: 9m30s
        endpoint: ...
        tls: ...
When this is configured, the stream terminates cleanly with an OK gRPC
status, without causing retries.
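The corresponding otelarrowreceiver keepalive setting, compatible with
the exporter configuration above, reads as follows. Here the 9m30s
stream lifetime plus the 30s export timeout adds up to the 10m
max_connection_age_grace: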
    receivers:
      otelarrow:
        protocols:
          grpc:
            keepalive:
              server_parameters:
                max_connection_age: 1m
                max_connection_age_grace: 10m
Exporter metrics
In addition to the standard exporterhelper and obsreport metrics, this
component provides network-level measurement instruments, which we
anticipate will become part of exporterhelper and/or obsreport in the
future. At the normal level of metrics detail:
exporter_sent: uncompressed bytes sent, prior to compression.
exporter_sent_wire: compressed bytes sent, on the wire.
Arrow's compression performance can be derived by dividing the average
exporter_sent value by the average exporter_sent_wire value.
At the detailed level of metrics detail, the stream of data returned
to the exporter is also instrumented:
exporter_recv: uncompressed bytes received, after decompression.
exporter_recv_wire: compressed bytes received, on the wire.
Compression Configuration
The exporter supports configuring Zstd compression at both the gRPC
and the Arrow level. The exporter metrics described above will be
correct in either case. The default settings are subject to change as
we gain experience.
The gRPC-level Zstd compression can be configured, but there is an
important caveat. The gRPC-Go library requires that compressor
implementations be registered statically. This exporter uses
compressors named zstdarrow1, zstdarrow2, ..., zstdarrow10, supporting
10 configurable compression levels. Note, however, that these
configurations are static and only one unique configuration is
possible per level. Multiple OTel-Arrow exporters can be configured
with different Zstd configurations simply by using distinct levels.
Under arrow, the zstd sub-configuration has the following fields:
level: in the range 1-10; selects the compression level and determines
a number of related defaults (default 5).
window_size_mib: size of the Zstd window in MiB; 0 means the size is
determined by the level (default 0).
concurrency: controls background CPU used for compression; 0 lets the
zstd library decide (default 1).
The exporter supports configuring compression at the Arrow
columnar-protocol level.
payload_compression: compression applied at the Arrow IPC level,
"none" by default, "zstd" supported.
Compression settings at the Arrow IPC level cannot be further configured.
For example, two exporters may be configured with different zstd
configurations, provided they use different levels:
    exporters:
      otelarrow/best:
        compression: zstd  # describes gRPC-level compression (default "zstd")
        arrow:
          zstd:
            level: 10  # describes gRPC-level compression level (default 5)
          payload_compression: zstd  # describes Arrow-IPC compression (default "none")
      otelarrow/fastest:
        compression: zstd
        arrow:
          zstd:
            level: 1  # 1 is the "fastest" compression level
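The remaining zstd fields are set in the same block. The following is a minimal sketch; the endpoint and the chosen numbers are assumptions for illustration, not recommendations:

```yaml
exporters:
  otelarrow:
    endpoint: collector.example.com:4317  # hypothetical endpoint
    compression: zstd
    arrow:
      zstd:
        level: 7            # illustrative level in the range 1-10
        window_size_mib: 8  # illustrative; 0 (default) derives the window from the level
        concurrency: 1      # default; 0 lets the zstd library decide
```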
Experimental Configuration
The exporter uses the signal-specific Arrow stream methods (i.e.,
ArrowTraces, ArrowLogs, and ArrowMetrics) by default. There is an
option to use the generic ArrowStream method instead.
enable_mixed_signals (default: false): use ArrowStream instead of
per-signal stream methods.
In the future, this option could allow the exporter to cross signals,
meaning that traces, metrics, and logs could refer to the same
shared-data items across a single stream. Presently, there is no
cross-signal compression benefit. The option simply causes exporter
instances of different signal types to use one method name instead of
three.