# Deprecated Tanzu Observability (Wavefront) Exporter
> [!WARNING]
> Tanzu Observability (Wavefront) Proxy v11.3, released in June 2022, supports native OpenTelemetry protocol (OTLP) ingestion of traces and metrics.
> This vendor-specific exporter is deprecated and will become unavailable after the end of 2023.
> Refer to our documentation for configuring the Proxy to receive traces and metrics via OTLP gRPC or OTLP HTTP.
This exporter supports sending metrics and traces to Tanzu Observability.
## Prerequisites

This exporter requires a Wavefront proxy that the Collector can reach; see the Configuration section below and the Tanzu Observability documentation for setting up the proxy.
## Configuration
Given a Wavefront proxy at 10.10.10.10 configured with `customTracingListenerPorts=30001`, a basic configuration of the Tanzu Observability exporter follows:
```yaml
receivers:
  examplereceiver:

processors:
  batch:
    timeout: 10s

exporters:
  tanzuobservability:
    traces:
      endpoint: "http://10.10.10.10:30001"
    metrics:
      endpoint: "http://10.10.10.10:2878"

service:
  pipelines:
    traces:
      receivers: [ examplereceiver ]
      processors: [ batch ]
      exporters: [ tanzuobservability ]
    metrics:
      receivers: [ examplereceiver ]
      processors: [ batch ]
      exporters: [ tanzuobservability ]
```
## Advanced Configuration

### Resource Attributes on Metrics
Client programs using an OpenTelemetry SDK can be configured to wrap all emitted telemetry (metrics, spans, logs) with a set of global key-value pairs called resource attributes. By default, the Tanzu Observability Exporter includes resource attributes on spans but excludes them on metrics. To include resource attributes as tags on metrics, set the flag `resource_attrs_included` to `true` as per the example below.
Note: Tanzu Observability has a 254-character limit on tag key-value pairs. If a resource attribute exceeds this
limit, the metric will not show up in Tanzu Observability.
### Application Resource Attributes on Metrics
The Tanzu Observability Exporter will include application resource attributes on metrics (`application`, `service.name`, `cluster`, and `shard`). To exclude these resource attributes as tags on metrics, set the flag `app_tags_excluded` to `true` as per the example below.
Note: A `service.name` tag (if provided) becomes `service` on the transformed Wavefront metric. However, if both `service` and `service.name` tags are provided, the `service` tag is used.
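For example, with hypothetical tag values:

```
service.name="payments"                     →  service="payments"
service="pay", service.name="payments"      →  service="pay"
```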
### Queuing and Retries
This exporter uses OpenTelemetry Collector helpers to queue data and retry on failures.
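The relevant settings come from the Collector's common exporter helper rather than this exporter specifically. Below is a minimal sketch with illustrative values (endpoints omitted for brevity); adjust them for your deployment and Collector version:

```yaml
exporters:
  tanzuobservability:
    retry_on_failure:
      enabled: true
      initial_interval: 5s    # wait before the first retry
      max_interval: 30s       # upper bound on the retry backoff
      max_elapsed_time: 300s  # stop retrying after this much time
    sending_queue:
      enabled: true
      num_consumers: 10       # concurrent senders draining the queue
      queue_size: 5000        # batches buffered while the backend is unreachable
```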
### Recommended Pipeline Processors
The memory_limiter processor is recommended to prevent out-of-memory situations on the collector. It performs periodic checks of memory usage; if usage exceeds the defined limits, it begins dropping data and forcing garbage collection to reduce memory consumption. See the memory_limiter processor documentation for details and defaults.
Note: The order matters when enabling multiple processors in a pipeline (e.g., the memory limiter and batch processors in the example config below). Please refer to the processors' documentation for more information.
### Example Advanced Configuration
```yaml
receivers:
  examplereceiver:

processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 50
    spike_limit_percentage: 30
  batch:
    timeout: 10s

exporters:
  tanzuobservability:
    traces:
      endpoint: "http://10.10.10.10:30001"
    metrics:
      endpoint: "http://10.10.10.10:2878"
      resource_attrs_included: true
      app_tags_excluded: true
    retry_on_failure:
      max_elapsed_time: 3m
    sending_queue:
      queue_size: 10000

service:
  pipelines:
    traces:
      receivers: [ examplereceiver ]
      processors: [ memory_limiter, batch ]
      exporters: [ tanzuobservability ]
    metrics:
      receivers: [ examplereceiver ]
      processors: [ memory_limiter, batch ]
      exporters: [ tanzuobservability ]
```
## Attributes Required by Tanzu Observability

### Source
A `source` field is required in Tanzu Observability spans and metrics. The source is set to the first matching OpenTelemetry Resource Attribute:

1. `source`
2. `host.name`
3. `hostname`
4. `host.id`
host.id
To reduce duplicate data, the matched attribute is excluded from the tags on the exported Tanzu Observability span or
metric.
If none of the above resource attributes exist, the OpenTelemetry Collector's hostname is used as a fallback for source.
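As a hypothetical illustration of this matching:

```
Resource attributes on an incoming span or metric:
  host.name: "prod-host-01"
  env:       "production"

Exported Tanzu Observability data point:
  source = "prod-host-01"      (first match in the list above)
  tags:    env="production"    (host.name is excluded to avoid duplication)
```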
Application identity tags of `application` and `service` are required for all spans in Tanzu Observability.

- `application` is set to the value of the attribute `application` on the OpenTelemetry Span or Resource. Default is "defaultApp".
- `service` is set to the value of the attribute `service` or `service.name` on the OpenTelemetry Span or Resource. Default is "defaultService".
## Data Conversion for Traces
- Trace IDs and Span IDs are converted to UUIDs. For example, span IDs are left-padded with zeros to fit the correct size (see the sketch after this list).
- Events are converted to Span Logs.
- Kind is converted to the `span.kind` tag.
- If a Span's status code is error, a tag of `error=true` is added. If the status also has a description, it's set to `otel.status_description`.
- TraceState is converted to the `w3c.tracestate` tag.
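A hypothetical span ID, left-padded and formatted as a standard 8-4-4-4-12 UUID, would convert roughly as follows:

```
OpenTelemetry span ID (8 bytes):  00f067aa0ba902b7
Left-padded to 16 bytes:          000000000000000000f067aa0ba902b7
Resulting UUID:                   00000000-0000-0000-00f0-67aa0ba902b7
```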
## Data Conversion for Metrics
This section describes the process used by the Exporter when converting from
OpenTelemetry Metrics to
Tanzu Observability by Wavefront Metrics.
| OpenTelemetry Metric Type                | Wavefront Metric Type | Notes          |
|------------------------------------------|-----------------------|----------------|
| Gauge                                    | Gauge                 |                |
| Cumulative Sum                           | Cumulative Counter    |                |
| Delta Sum                                | Delta Counter         |                |
| Cumulative Histogram (incl. Exponential) | Cumulative Counters   | Details below. |
| Delta Histogram (incl. Exponential)      | Histogram             |                |
| Summary                                  | Gauges                | Details below. |
### Cumulative Histogram Conversion (incl. Exponential)
A cumulative histogram is converted to multiple counter metrics: one counter per bucket in the histogram. Each counter
has a special "le" tag that matches the upper bound of the corresponding bucket. The value of the counter metric is the
sum of the histogram's corresponding bucket and all the buckets before it.
When working with OpenTelemetry Cumulative Histograms that have been converted to Wavefront Counters, WQL functions such as `cumulativePercentile()` (used in the example query below) will be of use.
#### Example
Suppose a cumulative histogram named "http.response_times" has
the following buckets and values:
| Bucket             | Value |
|--------------------|-------|
| ≤ 100ms            | 5     |
| > 100ms to ≤ 200ms | 20    |
| > 200ms            | 100   |
The exporter sends the following metrics to tanzuobservability:
| Name                | Tags      | Value |
|---------------------|-----------|-------|
| http.response_times | le="100"  | 5     |
| http.response_times | le="200"  | 25    |
| http.response_times | le="+Inf" | 125   |
#### Example WQL Query on a Cumulative Histogram
Using the cumulative histogram from the section above, this WQL query will produce a graph showing
the 95th percentile of http response times in the last 15 minutes.
```
cumulativePercentile(95, mavg(15m, deriv(sum(ts(http.response_times), le))))
```
The `sum` function aggregates the http response times and groups them by the `le` tag. Since http.response_times has three buckets, the `sum()` function will graph three lines, one for each bucket. `deriv()` shows the per-second rate of change in the three lines from `sum`. The `mavg` function averages the rates of change of the three lines over the last 15 minutes. Since the rates of change are per second, if you multiply the average rate of change for a bucket by 900, you get the number of new http requests falling into that bucket in the last 15 minutes. Finally, `cumulativePercentile` uses the values of the `le` tags, which are http response times, and linear interpolation of the bucket counts to estimate the 95th percentile of http.response_times over the last 15 minutes.
### Summary Conversion
A summary is converted to multiple gauge metrics: one gauge for every quantile in the summary. A special "quantile" tag contains a value between 0 and 1 indicating the quantile to which the value belongs.
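As a hypothetical illustration, a summary named request.latency with quantiles 0.5 and 0.95 would produce gauges roughly like the following (how the summary's count and sum are handled is not covered here):

```
OpenTelemetry Summary "request.latency":
  quantile 0.5  -> 12
  quantile 0.95 -> 45

Exported Tanzu Observability gauges:
  request.latency  quantile="0.5"   value=12
  request.latency  quantile="0.95"  value=45
```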