SignalFx Metrics Exporter
This exporter can be used to send metrics, events, and trace correlation to SignalFx.
Apart from metrics, the exporter is also capable of sending metric metadata
(properties and tags) to SignalFx. Currently, only metric metadata updates from
the k8s_cluster receiver are
supported.
Metrics Configuration
The following configuration options are required:
- access_token (no default): The access token is the authentication token
  provided by Splunk Observability Cloud. The access token can be obtained from the
  web app. For details on how to do so, please refer to the documentation here.
- Either realm or both api_url and ingest_url. Both api_url and
  ingest_url take precedence over realm.
  - realm (no default): SignalFx realm where the data will be received.
  - api_url (no default): Destination to which properties and tags are sent.
    If realm is set, this option is derived and will be
    https://api.{realm}.signalfx.com. If a value is explicitly set, the
    value of realm will not be used in determining api_url. The explicit
    value will be used instead.
  - ingest_url (no default): Destination where SignalFx metrics are sent. If
    realm is set, this option is derived and will be
    https://ingest.{realm}.signalfx.com. If a value is explicitly set, the
    value of realm will not be used in determining ingest_url. The explicit
    value will be used instead. The exporter will automatically append the
    appropriate path: "/v2/datapoint" for metrics, and "/v2/event" for events.
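As a minimal sketch of the two ways to satisfy the required options, the fragment below defines one exporter that relies on realm and one that sets both URLs explicitly. The exporter instance names (signalfx/realm and signalfx/explicit) are illustrative, and the explicit URLs simply follow the derived pattern described above for the us1 realm.

```yaml
exporters:
  # Option 1: let the exporter derive api_url and ingest_url from the realm.
  signalfx/realm:
    access_token: <replace_with_actual_access_token>
    realm: us1

  # Option 2: set both URLs explicitly; realm is then not consulted.
  signalfx/explicit:
    access_token: <replace_with_actual_access_token>
    api_url: https://api.us1.signalfx.com
    # Do not include /v2/datapoint or /v2/event; the exporter appends the path.
    ingest_url: https://ingest.us1.signalfx.com
```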
The following configuration options can also be configured:
- access_token_passthrough (default = true): Whether to use the
  "com.splunk.signalfx.access_token" metric resource attribute, if any, as the
  SignalFx access token. In either case this attribute will be dropped during
  final translation, in this exporter only. Intended to be used in tandem with the
  identical configuration option for the SignalFx receiver to preserve datapoint
  origin for only this exporter, as others will reveal the organization access token
  by not filtering the attribute.
- exclude_metrics: List of metric filters that determine which metrics are
  excluded from sending to the SignalFx backend. The filtering is applied after the default
  translations controlled by the disable_default_translation_rules option.
  See here for examples. In addition to the values explicitly
  provided via this option, the default exclusions are
  also appended to this list. Setting this option to [] will override all the default
  excludes.
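A hedged sketch of an exclude_metrics configuration follows. It assumes that exclusion filters accept the same fields as the include_metrics filters shown in this document (metric_names, metric_name, dimensions); the metric names are taken from the default translated metrics and were picked only for illustration.

```yaml
exporters:
  signalfx:
    access_token: <replace_with_actual_access_token>
    realm: us1
    exclude_metrics:
      # Drop these metrics entirely (illustrative choice of names).
      - metric_names: [vmpage_io.swap.in, vmpage_io.swap.out]
      # Drop only the "per core" datapoints of cpu.steal,
      # i.e. those that carry a cpu dimension.
      - metric_name: cpu.steal
        dimensions:
          cpu: ["*"]
```

Because the default exclusions are appended to this list, the filters above add to the defaults rather than replace them. Only an explicit empty list removes the defaults.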
- include_metrics: List of filters to override exclusion of any metrics.
  This option can be used to include metrics that are otherwise dropped by
  default. See here for a list of metrics
  that are dropped by default. For example, the following configuration can be
  used to send through some of the metrics that are dropped by default.

    include_metrics:
      # When sending in translated metrics.
      - metric_names: [cpu.interrupt, cpu.user, cpu.system]
      # When sending in metrics in OTel convention.
      - metric_name: system.cpu.time
        dimensions:
          state: [interrupt, user, system]
- log_data_points (default = false): If the log level is set to debug
  and this is true, all datapoints dispatched to Splunk Observability Cloud will be logged.
- log_dimension_updates (default = false): Whether or not to log dimension
  updates.
- disable_default_translation_rules (default = false): Disable default translation
  of the OTel metrics to a SignalFx compatible format. The default translation rules are
  defined in translation/constants.go.
- timeout (default = 10s): Amount of time to wait for a send operation to
  complete.
- http2_read_idle_timeout (default = 10s): Send a ping frame for a health check if the connection has been idle for the configured value.
  0s means the HTTP/2 health check will be disabled.
- http2_ping_timeout (default = 10s): Triggered by http2_read_idle_timeout. When there's no response to the ping within the configured value,
  the connection will be closed. If this value is set to 0, it will default to 15s.
- headers (no default): Headers to pass in the payload.
- max_idle_conns (default = 100): The maximum idle HTTP connections the client can keep open.
- max_idle_conns_per_host (default = 100): The maximum idle HTTP connections the client can keep open per host.
- idle_conn_timeout (default = 30s): The maximum amount of time an idle connection will remain open before closing itself.
- More HTTP settings are available, see HTTP settings.
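The HTTP-related options above can be combined as in the following sketch. All values shown are the documented defaults written out explicitly, except the header name and value, which are hypothetical placeholders.

```yaml
exporters:
  signalfx:
    access_token: <replace_with_actual_access_token>
    realm: us1
    timeout: 10s
    # Ping the server after 10s of idle time; close the connection
    # if the ping is not answered within another 10s.
    http2_read_idle_timeout: 10s
    http2_ping_timeout: 10s
    max_idle_conns: 100
    max_idle_conns_per_host: 100
    idle_conn_timeout: 30s
    headers:
      x-example-header: "example value"  # hypothetical header
```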
- sync_host_metadata: Defines whether the exporter should scrape host metadata
  and send it as property updates to the SignalFx backend. Disabled by default.
  IMPORTANT: Host metadata synchronization relies on the resourcedetection
  processor. If this option is enabled, make sure that the resourcedetection
  processor is enabled in the pipeline with one of the cloud provider detectors
  or the environment variable detector setting a unique value to the host.name
  attribute within your k8s cluster. Also keep override=true in the resourcedetection config.
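The requirement above might look like the following fragment. This is a sketch under assumptions: the detector list (env plus one cloud provider detector, here gcp) is only an example and should match your environment, and the hostmetrics receiver is assumed to be configured elsewhere in the same file.

```yaml
processors:
  resourcedetection:
    # Assumed detector choice: the environment variable detector
    # plus one cloud provider detector.
    detectors: [env, gcp]
    override: true

exporters:
  signalfx:
    access_token: <replace_with_actual_access_token>
    realm: us1
    sync_host_metadata: true

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]  # assumed to be defined under receivers:
      processors: [resourcedetection]
      exporters: [signalfx]
```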
- exclude_properties: A list of property filters to limit dimension update content.
  Property filters can contain any number of the following fields, supporting (negated)
  string literals, re2 /regex/, and glob syntax values:
  dimension_name, dimension_value, property_name, and property_value. For any field
  not expressly configured for each filter object, a default catch-all value of /^.*$/ is used
  to allow each specified field to require a match for the filter to take effect:

    # will filter all 'k8s.workload.name' properties from 'k8s.pod.uid' dimension updates:
    exclude_properties:
      - dimension_name: k8s.pod.uid
        property_name: k8s.workload.name
- dimension_client: Contains options controlling the dimension update client configuration used for metadata updates.
  - max_buffered (default = 10,000): The buffer size for queued dimension updates.
  - send_delay (default = 10s): The time to wait between dimension updates for a given dimension.
  - max_idle_conns (default = 20): The maximum idle HTTP connections the client can keep open.
  - max_idle_conns_per_host (default = 20): The maximum idle HTTP connections the client can keep open per host.
  - max_conns_per_host (default = 20): The maximum total number of connections the client can keep open per host.
  - idle_conn_timeout (default = 30s): The maximum amount of time an idle connection will remain open before closing itself.
  - timeout (default = 10s): Amount of time to wait for the dimension HTTP request to complete.
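For reference, a dimension_client block with every documented default written out explicitly might look like this. It is a sketch for orientation; in practice only the values you want to change need to be listed.

```yaml
exporters:
  signalfx:
    access_token: <replace_with_actual_access_token>
    realm: us1
    dimension_client:
      max_buffered: 10000
      send_delay: 10s
      max_idle_conns: 20
      max_idle_conns_per_host: 20
      max_conns_per_host: 20
      idle_conn_timeout: 30s
      timeout: 10s
```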
- nonalphanumeric_dimension_chars (default = "_-."): A string of characters
  that are allowed to be used as a dimension key in addition to alphanumeric
  characters. Each nonalphanumeric dimension key character that isn't in this string
  will be replaced with a _.
- ingest_tls (no default): Exposes a list of TLS settings to establish a secure connection with a signalfx receiver configured on another collector instance.
  - ca_file needs to be set if the exporter's ingest_url is pointing to a signalfx receiver
    with TLS enabled and using a self-signed certificate where its CA is not loaded in the system cert pool.

  The full list of TLS options can be found in the configtls README.
  The following example instructs the signalfx exporter ingest client to use a custom ca_file
  to verify the server certificate.

    ingest_tls:
      ca_file: "/etc/opt/certs/ca.pem"
- api_tls (no default): Exposes a list of TLS settings to establish a secure connection with an http_forwarder extension configured on another collector instance.
  - ca_file needs to be set if the exporter's api_url is pointing to an http_forwarder extension
    with TLS enabled and using a self-signed certificate where its CA is not loaded in the system cert pool.

  The full list of TLS options can be found in the configtls README.
  The following example instructs the signalfx exporter api client to use a custom ca_file
  to verify the server certificate.

    api_tls:
      ca_file: "/etc/opt/certs/ca.pem"
- drop_histogram_buckets (default = false): If set to true, histogram buckets will not be translated into datapoints with the _bucket
  suffix but will be dropped instead; only datapoints with _sum, _count, _min (optional) and _max (optional) suffixes will be sent. Please note that this option does not apply to histograms sent in OTLP format with send_otlp_histograms enabled.
- send_otlp_histograms (default = false): If set to true, any histogram metrics received by the exporter will be sent to the Splunk Observability backend in OTLP format without conversion to SignalFx format. This can only be enabled if the Splunk Observability environment (realm) has the new Histograms feature rolled out. Please note that the exporter configurations include_metrics and exclude_metrics do not apply to histograms sent in OTLP format.
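As a small illustration of the two histogram options, the fragment below keeps the SignalFx conversion but drops the per-bucket datapoints. Whether send_otlp_histograms can be turned on instead depends on the realm, as noted above.

```yaml
exporters:
  signalfx:
    access_token: <replace_with_actual_access_token>
    realm: us1
    # Send only the _sum, _count, _min and _max datapoints; drop _bucket datapoints.
    drop_histogram_buckets: true
    # Keep converting histograms to SignalFx format (the default).
    send_otlp_histograms: false
```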
In addition, this exporter offers queued retry which is enabled by default.
Information about queued retry configuration parameters can be found
here.
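A hedged sketch of tuning the queued retry behavior follows. The sending_queue and retry_on_failure option names are not stated in this document; they are assumed here to be the standard OpenTelemetry Collector exporter helper settings, and the values are illustrative, so verify both against the linked documentation.

```yaml
exporters:
  signalfx:
    access_token: <replace_with_actual_access_token>
    realm: us1
    # Assumed exporter helper settings; names and values are illustrative.
    sending_queue:
      enabled: true
      num_consumers: 10
      queue_size: 1000
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s
```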
Traces Configuration (correlation only)
⚠ Note that traces must still be sent in using sapmexporter to see them in SignalFx.
When traces are sent to the signalfx exporter, it correlates traces to metrics. When a new service or environment is
seen, it associates the source (e.g. host or pod) to that service or environment in SignalFx. Metrics can then be
filtered based on that trace service and environment (sf_service and sf_environment).
One of realm and api_url is required.
- access_token (required, no default): The access token is the authentication token
  provided by SignalFx.
- realm (no default): SignalFx realm where the data will be received.
- api_url (default = https://api.{realm}.signalfx.com/): Destination to which correlation updates
  are sent. If a value is explicitly set, the value of realm will not be used in determining api_url.
  The explicit value will be used instead.
- correlation: Contains options controlling the syncing of service and environment properties onto dimensions.
  - endpoint (required, default = api_url or https://api.{realm}.signalfx.com/): The base URL for API requests (e.g. https://api.us0.signalfx.com).
  - timeout (default = 5s): The timeout for every attempt to send data to the backend.
  - stale_service_timeout (default = 5 minutes): How long to wait after a span's service name is last seen before uncorrelating it.
  - max_requests (default = 20): Max HTTP requests to be made in parallel.
  - max_buffered (default = 10,000): Max number of correlation updates that can be buffered before updates are dropped.
  - max_retries (default = 2): Max number of retries that will be made for failed correlation updates.
  - log_updates (default = false): Whether or not to log correlation updates to dimensions (at DEBUG level).
  - retry_delay (default = 30 seconds): How long to wait between retries.
  - cleanup_interval (default = 1 minute): How frequently to purge duplicate requests.
  - sync_attributes (default = {"k8s.pod.uid": "k8s.pod.uid", "container.id": "container.id"}): Map containing key of the attribute to read from spans to sync to dimensions specified as the value.
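The correlation options can be sketched as follows. The values are the documented defaults written in duration syntax, apart from max_retries, which is raised to 3 purely as an example of an override.

```yaml
exporters:
  signalfx:
    access_token: <replace_with_actual_access_token>
    realm: us1
    correlation:
      timeout: 5s
      stale_service_timeout: 5m
      max_requests: 20
      max_buffered: 10000
      max_retries: 3        # example override; the default is 2
      log_updates: false
      retry_delay: 30s
      cleanup_interval: 1m
      sync_attributes:
        k8s.pod.uid: k8s.pod.uid
        container.id: container.id
```

With realm set and no explicit endpoint, the correlation endpoint falls back to api_url or the realm-derived URL, as described above.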
Default Metric Filters
List of metrics excluded by default
Some OpenTelemetry receivers may send metrics that SignalFx categorizes as custom metrics. To prevent unwanted overage usage due to custom metrics from these receivers, the SignalFx exporter has a set of metrics excluded by default. Some exclusion rules use regex to exclude multiple metric names. Some metrics are only excluded if specific resource labels (dimensions) are present. If translation_rules are configured and new metrics match a default exclusion, the new metric will still be excluded. Users may configure the SignalFx exporter's include_metrics config option to override any of the default exclusions, as include_metrics will always take precedence over any exclusions. An example of include_metrics is shown below.

exporters:
  signalfx:
    include_metrics:
      - metric_names: [cpu.interrupt, cpu.user, cpu.system]
      - metric_name: system.cpu.time
        dimensions:
          state: [interrupt, user, system]
The following include_metrics example would instruct the exporter to send only cpu.interrupt
metrics with a cpu dimension value ("per core" datapoints), and both "per core" and aggregate cpu.idle
metrics:

exporters:
  signalfx:
    include_metrics:
      - metric_name: "cpu.idle"
      - metric_name: "cpu.interrupt"
        dimensions:
          cpu: ["*"]
The translation_rules metrics configuration field accepts a list of metric-transforming actions to
help ensure compatibility with custom charts and dashboards when using the OpenTelemetry Collector. It also provides the ability to produce custom metrics by copying, calculating new, or aggregating other metric values without requiring an additional processor.
The rule language is expressed in YAML mappings and is documented here. Translation rules currently allow the following actions:
- aggregate_metric - Aggregates a metric through removal of specified dimensions
- calculate_new_metric - Creates a new metric via operating on two constituent ones
- convert_values - Converts float values to int or int to float for specified metric names
- copy_metrics - Creates a new metric as a copy of another
- delta_metric - Creates a new delta metric for a specified non-delta one
- divide_int - Scales a metric's integer value by a given factor
- drop_dimensions - Drops dimensions for specified metrics, or globally
- drop_metrics - Drops all metrics with a given name
- multiply_float - Scales a metric's float value by a given float factor
- multiply_int - Scales a metric's int value by a given int factor
- rename_dimension_keys - Renames dimensions for specified metrics, or globally
- rename_metrics - Replaces a given metric name with specified one
- split_metric - Splits a given metric into multiple new ones for a specified dimension
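To make the rule language more concrete, here is a hedged sketch using two of the actions listed above. The per-rule field names (mapping, metric_name, aggregation_method, without_dimensions) are recalled from the exporter's rule documentation and are not spelled out in this text, so treat them as assumptions to verify. The metric names are hypothetical placeholders.

```yaml
exporters:
  signalfx:
    access_token: <replace_with_actual_access_token>
    realm: us1
    translation_rules:
      # Rename a metric; both names are hypothetical.
      - action: rename_metrics
        mapping:
          my.app.requests: my_app.requests
      # Aggregate a hypothetical per-core metric by summing across
      # all values of the cpu dimension, which is then removed.
      - action: aggregate_metric
        metric_name: my_app.cpu.busy
        aggregation_method: sum
        without_dimensions:
          - cpu
```

Since the default rules are only described as the default value of this field, setting translation_rules explicitly presumably replaces them rather than extending them.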
The translation rules defined in translation/constants.go are used by default for this value. The default rules will create the following aggregated metrics from the hostmetrics receiver:
- cpu.idle
- cpu.interrupt
- cpu.nice
- cpu.num_processors
- cpu.softirq
- cpu.steal
- cpu.system
- cpu.user
- cpu.utilization
- cpu.utilization_per_core
- cpu.wait
- disk.summary_utilization
- disk.utilization
- disk_ops.pending
- disk_ops.total
- memory.total
- memory.utilization
- network.total
- process.cpu_time_seconds
- system.disk.io.total
- system.disk.operations.total
- system.network.io.total
- system.network.packets.total
- vmpage_io.memory.in
- vmpage_io.memory.out
- vmpage_io.swap.in
- vmpage_io.swap.out
In addition to the aggregated metrics, the default translation rules make available the following "per core" custom hostmetrics.
The CPU number is assigned to the dimension cpu.
- cpu.interrupt
- cpu.nice
- cpu.softirq
- cpu.steal
- cpu.system
- cpu.user
- cpu.wait
These metrics are intended to be reported directly to Splunk IM by the SignalFx exporter. Any desired changes to their attributes or values should be made via additional translation rules or from their constituent host metrics.
Example Config
exporters:
  signalfx:
    access_token: <replace_with_actual_access_token>
    access_token_passthrough: true
    headers:
      added-entry: "added value"
      dot.test: test
    realm: us1
    timeout: 5s
    max_idle_conns: 80
⚠ When enabling the SignalFx receiver or exporter, configure both the metrics and logs pipelines.
service:
  pipelines:
    metrics:
      receivers: [signalfx]
      processors: [memory_limiter, batch]
      exporters: [signalfx]
    logs:
      receivers: [signalfx]
      processors: [memory_limiter, batch]
      exporters: [signalfx]
    traces:
      receivers: [zipkin]
      processors: []
      exporters: [signalfx]
The full list of settings exposed for this exporter is documented here,
with detailed sample configurations here.
This exporter also offers proxy support as documented here.
Advanced Configuration
Several helper files are leveraged to provide additional capabilities automatically: