resourcedetectionprocessor

package module

v0.83.0 Published: Aug 15, 2023 License: Apache-2.0 Imports: 29 Imported by: 19

Repository

github.com/open-telemetry/opentelemetry-collector-contrib

README ¶

Resource Detection Processor

Status
Stability	beta: traces, metrics, logs
Distributions	contrib, aws, observiq, redhat, splunk, sumo
Code Owners	@Aneurysm9, @dashpole

The resource detection processor can be used to detect resource information from the host, in a format that conforms to the OpenTelemetry resource semantic conventions, and append or override the resource value in telemetry data with this information.

Supported detectors

Environment Variable

Reads resource information from the OTEL_RESOURCE_ATTRIBUTES environment variable. This is expected to be in the format <key1>=<value1>,<key2>=<value2>,..., the details of which are currently pending confirmation in the OpenTelemetry specification.

Example:

processors:
  resourcedetection/env:
    detectors: [env]
    timeout: 2s
    override: false
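
The key=value format described above can be illustrated with a small parser. This is a minimal sketch, not the detector's actual code: parseResourceAttributes is a hypothetical helper, and it assumes plain values with no percent-encoding or escaping.

```go
package main

import (
	"fmt"
	"strings"
)

// parseResourceAttributes splits a string of the form
// "<key1>=<value1>,<key2>=<value2>" into a map.
// Pairs without "=" or with an empty key are skipped.
// Hypothetical helper for illustration only.
func parseResourceAttributes(raw string) map[string]string {
	attrs := make(map[string]string)
	for _, pair := range strings.Split(raw, ",") {
		key, value, ok := strings.Cut(pair, "=")
		key = strings.TrimSpace(key)
		if !ok || key == "" {
			continue
		}
		attrs[key] = strings.TrimSpace(value)
	}
	return attrs
}

func main() {
	attrs := parseResourceAttributes("service.name=checkout,deployment.environment=staging")
	fmt.Println(attrs["service.name"], attrs["deployment.environment"])
}
```

Setting OTEL_RESOURCE_ATTRIBUTES to the string passed in main would, under these assumptions, yield two resource attributes.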

System metadata

Note: use the Docker detector (see below) if running the Collector as a Docker container.

Queries the host machine to retrieve the following resource attributes:

* host.arch
* host.name
* host.id
* os.description
* os.type

By default, host.name is set to the FQDN if possible, with the hostname provided by the OS used as a fallback. This logic can be changed with the hostname_sources configuration, which is set to ["dns", "os"] by default.

Use the following config to skip the FQDN lookup and use only the hostname provided by the OS:

processors:
  resourcedetection/system:
    detectors: ["system"]
    system:
      hostname_sources: ["os"]

All valid options for hostname_sources:
- "dns"
- "os"
- "cname"
- "lookup"

Hostname Sources

dns

The "dns" hostname source uses multiple sources to get the fully qualified domain name. First, it looks up the host name in the local machine's hosts file. If that fails, it looks up the CNAME. Lastly, if that fails, it does a reverse DNS query. Note: this hostname source may produce unreliable results on Windows. To produce an FQDN, Windows hosts might have better results using the "lookup" hostname source, which is described below.

os

The "os" hostname source returns the hostname reported by the local machine's kernel.

cname

The "cname" hostname source provides the canonical name, as provided by net.LookupCNAME in the Go standard library. Note: this hostname source may produce unreliable results on Windows.

lookup

The "lookup" hostname source does a reverse DNS lookup of the current host's IP address.

Docker metadata

Queries the Docker daemon to retrieve the following resource attributes from the host machine:

* host.name
* os.type

You need to mount the Docker socket (/var/run/docker.sock on Linux) to contact the Docker daemon. Docker detection does not work on macOS.

Example:

processors:
  resourcedetection/docker:
    detectors: [env, docker]
    timeout: 2s
    override: false

Heroku metadata

When Heroku dyno metadata is active, Heroku applications publish information through environment variables.

We map these environment variables to resource attributes as follows:

Dyno metadata environment variable	Resource attribute
`HEROKU_APP_ID`	`heroku.app.id`
`HEROKU_APP_NAME`	`service.name`
`HEROKU_DYNO_ID`	`service.instance.id`
`HEROKU_RELEASE_CREATED_AT`	`heroku.release.creation_timestamp`
`HEROKU_RELEASE_VERSION`	`service.version`
`HEROKU_SLUG_COMMIT`	`heroku.release.commit`

For more information, see the Heroku cloud provider documentation under the OpenTelemetry specification semantic conventions.

processors:
  resourcedetection/heroku:
    detectors: [env, heroku]
    timeout: 2s
    override: false
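
The mapping table above can be expressed as a lookup from environment variable to resource attribute. This is a minimal sketch assuming unset or empty variables are skipped. herokuAttributes is a hypothetical helper, and the getenv parameter exists only so the function can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// herokuEnvToAttr reproduces the dyno metadata mapping from the table above.
var herokuEnvToAttr = map[string]string{
	"HEROKU_APP_ID":             "heroku.app.id",
	"HEROKU_APP_NAME":           "service.name",
	"HEROKU_DYNO_ID":            "service.instance.id",
	"HEROKU_RELEASE_CREATED_AT": "heroku.release.creation_timestamp",
	"HEROKU_RELEASE_VERSION":    "service.version",
	"HEROKU_SLUG_COMMIT":        "heroku.release.commit",
}

// herokuAttributes reads each mapped variable through getenv and returns
// the resource attributes for the variables that are set.
// Hypothetical helper for illustration only.
func herokuAttributes(getenv func(string) string) map[string]string {
	attrs := make(map[string]string)
	for envVar, attr := range herokuEnvToAttr {
		if value := getenv(envVar); value != "" {
			attrs[attr] = value
		}
	}
	return attrs
}

func main() {
	attrs := herokuAttributes(os.Getenv)
	fmt.Println(len(attrs), "Heroku attributes detected")
}
```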

GCP Metadata

Uses the Google Cloud Client Libraries for Go to read resource information from the metadata server and environment variables, detect which GCP platform the application is running on, and detect the appropriate attributes for that platform. Use the gcp detector regardless of which GCP platform the application is running on.

Example:

processors:
  resourcedetection/gcp:
    detectors: [env, gcp]
    timeout: 2s
    override: false

GCE Metadata

* cloud.provider ("gcp")
* cloud.platform ("gcp_compute_engine")
* cloud.account.id (project id)
* cloud.region  (e.g. us-central1)
* cloud.availability_zone (e.g. us-central1-c)
* host.id (instance id)
* host.name (instance name)
* host.type (machine type)
* (optional) gcp.gce.instance.hostname
* (optional) gcp.gce.instance.name

GKE Metadata

* cloud.provider ("gcp")
* cloud.platform ("gcp_kubernetes_engine")
* cloud.account.id (project id)
* cloud.region (only for regional GKE clusters; e.g. "us-central1")
* cloud.availability_zone (only for zonal GKE clusters; e.g. "us-central1-c")
* k8s.cluster.name
* host.id (instance id)
* host.name (instance name; only when workload identity is disabled)

One known issue: when GKE workload identity is enabled, the GCE metadata endpoints are not available, so the GKE resource detector cannot determine host.name. In that case, users are encouraged to set host.name from either:

* node.name through the downward API with the env detector
* the Kubernetes node name obtained from the Kubernetes API (with k8s.io/client-go)

Google Cloud Run Services Metadata

* cloud.provider ("gcp")
* cloud.platform ("gcp_cloud_run")
* cloud.account.id (project id)
* cloud.region (e.g. "us-central1")
* faas.id (instance id)
* faas.name (service name)
* faas.version (service revision)

Cloud Run Jobs Metadata

* cloud.provider ("gcp")
* cloud.platform ("gcp_cloud_run")
* cloud.account.id (project id)
* cloud.region (e.g. "us-central1")
* faas.id (instance id)
* faas.name (service name)
* gcp.cloud_run.job.execution ("my-service-ajg89")
* gcp.cloud_run.job.task_index ("0")

Google Cloud Functions Metadata

* cloud.provider ("gcp")
* cloud.platform ("gcp_cloud_functions")
* cloud.account.id (project id)
* cloud.region (e.g. "us-central1")
* faas.id (instance id)
* faas.name (function name)
* faas.version (function version)

Google App Engine Metadata

* cloud.provider ("gcp")
* cloud.platform ("gcp_app_engine")
* cloud.account.id (project id)
* cloud.region (e.g. "us-central1")
* cloud.availability_zone (e.g. "us-central1-c")
* faas.id (instance id)
* faas.name (service name)
* faas.version (service version)

AWS EC2

Uses AWS SDK for Go to read resource information from the EC2 instance metadata API to retrieve the following resource attributes:

* cloud.provider ("aws")
* cloud.platform ("aws_ec2")
* cloud.account.id
* cloud.region
* cloud.availability_zone
* host.id
* host.image.id
* host.name
* host.type

It can also optionally gather tags for the EC2 instance that the collector is running on. Note that in order to fetch EC2 tags, the IAM role assigned to the EC2 instance must have a policy that includes the ec2:DescribeTags permission.

EC2 custom configuration example:

processors:
  resourcedetection/ec2:
    detectors: ["ec2"]
    ec2:
      # A list of regexes matching the tag keys to add as resource attributes
      tags:
        - ^tag1$
        - ^tag2$
        - ^label.*$

If you are using a proxy server on your EC2 instance, it's important that you exempt requests for instance metadata as described in the AWS CLI user guide. Failing to do so can result in proxied or missing instance data.

If the instance is part of AWS ParallelCluster and the detector is failing to connect to the metadata server, check the iptables rules and make sure the PARALLELCLUSTER_IMDS chain contains a rule that allows the OTEL user to access 169.254.169.254/32.

Amazon ECS

Queries the Task Metadata Endpoint (TMDE) to record information about the current ECS Task. Only TMDE V4 and V3 are supported.

* cloud.provider ("aws")
* cloud.platform ("aws_ecs")
* cloud.account.id
* cloud.region
* cloud.availability_zone
* aws.ecs.cluster.arn
* aws.ecs.task.arn
* aws.ecs.task.family
* aws.ecs.task.revision
* aws.ecs.launchtype (V4 only)
* aws.log.group.names (V4 only)
* aws.log.group.arns (V4 only)
* aws.log.stream.names (V4 only)
* aws.log.stream.arns (V4 only)

Example:

processors:
  resourcedetection/ecs:
    detectors: [env, ecs]
    timeout: 2s
    override: false

Amazon Elastic Beanstalk

Reads the AWS X-Ray configuration file available on all Beanstalk instances with X-Ray enabled.

* cloud.provider ("aws")
* cloud.platform ("aws_elastic_beanstalk")
* deployment.environment
* service.instance.id
* service.version

Example:

processors:
  resourcedetection/elastic_beanstalk:
    detectors: [env, elastic_beanstalk]
    timeout: 2s
    override: false

Amazon EKS

* cloud.provider ("aws")
* cloud.platform ("aws_eks")

Example:

processors:
  resourcedetection/eks:
    detectors: [env, eks]
    timeout: 2s
    override: false

AWS Lambda

Uses the AWS Lambda runtime environment variables to retrieve the following resource attributes:

Cloud semantic conventions

* cloud.provider ("aws")
* cloud.platform ("aws_lambda")
* cloud.region ($AWS_REGION)

Function as a Service semantic conventions and AWS Lambda semantic conventions

* faas.name ($AWS_LAMBDA_FUNCTION_NAME)
* faas.version ($AWS_LAMBDA_FUNCTION_VERSION)
* faas.instance ($AWS_LAMBDA_LOG_STREAM_NAME)
* faas.max_memory ($AWS_LAMBDA_FUNCTION_MEMORY_SIZE)

AWS Logs semantic conventions

* aws.log.group.names ($AWS_LAMBDA_LOG_GROUP_NAME)
* aws.log.stream.names ($AWS_LAMBDA_LOG_STREAM_NAME)

Example:

processors:
  resourcedetection/lambda:
    detectors: [env, lambda]
    timeout: 0.2s
    override: false

Azure

Queries the Azure Instance Metadata Service to retrieve the following resource attributes:

* cloud.provider ("azure")
* cloud.platform ("azure_vm")
* cloud.region
* cloud.account.id (subscription ID)
* host.id (virtual machine ID)
* host.name
* azure.vm.name (same as host.name)
* azure.vm.size (virtual machine size)
* azure.vm.scaleset.name (name of the scale set if any)
* azure.resourcegroup.name (resource group name)

Example:

processors:
  resourcedetection/azure:
    detectors: [env, azure]
    timeout: 2s
    override: false

Azure AKS

* cloud.provider ("azure")
* cloud.platform ("azure_aks")

processors:
  resourcedetection/aks:
    detectors: [env, aks]
    timeout: 2s
    override: false

Consul

Queries a Consul agent and reads its configuration endpoint to retrieve the following resource attributes:

* cloud.region (Consul datacenter)
* host.id (Consul node ID)
* host.name (Consul node name)
* exploded Consul metadata: reads all key:value pairs in Consul metadata into label:labelvalue pairs

processors:
  resourcedetection/consul:
    detectors: [env, consul]
    timeout: 2s
    override: false

Heroku

Note: you must first enable the Heroku metadata feature on the application.

Queries Heroku metadata to retrieve the following resource attributes:

* heroku.release.version (identifier for the current release)
* heroku.release.creation_timestamp (time and date the release was created)
* heroku.release.commit (commit hash for the current release)
* heroku.app.name (application name)
* heroku.app.id (unique identifier for the application)
* heroku.dyno.id (dyno identifier; used as host name)

processors:
  resourcedetection/heroku:
    detectors: [env, heroku]
    timeout: 2s
    override: false

OpenShift

Queries the OpenShift and Kubernetes API to retrieve the following resource attributes:

* cloud.provider
* cloud.platform
* cloud.region
* k8s.cluster.name

The following permissions are required:

kind: ClusterRole
metadata:
  name: otel-collector
rules:
- apiGroups: ["config.openshift.io"]
  resources: ["infrastructures", "infrastructures/status"]
  verbs: ["get", "watch", "list"]

By default, the API address is determined from the environment variables KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT, and the service token is read from /var/run/secrets/kubernetes.io/serviceaccount/token. If TLS is not explicitly disabled and no ca_file is configured, /var/run/secrets/kubernetes.io/serviceaccount/ca.crt is used. Automatic determination of the API address, ca_file, and service token is skipped for any value that is set in the configuration.

Example:

processors:
  resourcedetection/openshift:
    detectors: [openshift]
    timeout: 2s
    override: false
    openshift: # optional
      address: "https://api.example.com"
      token: "token"
      tls:
        insecure: false
        ca_file: "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"

See: TLS Configuration Settings for the full set of available options.
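
The default address resolution described above (a configured address takes precedence, otherwise the Kubernetes service environment variables are used) can be sketched as follows. apiAddress is a hypothetical helper, and the https:// scheme is an assumption of this sketch rather than something the README states.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
)

// apiAddress returns the configured address if one is set. Otherwise it
// builds one from KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT.
// The "https://" scheme is an assumption made for this illustration.
func apiAddress(configured string, getenv func(string) string) (string, error) {
	if configured != "" {
		return configured, nil
	}
	host := getenv("KUBERNETES_SERVICE_HOST")
	port := getenv("KUBERNETES_SERVICE_PORT")
	if host == "" || port == "" {
		return "", errors.New("KUBERNETES_SERVICE_HOST or KUBERNETES_SERVICE_PORT is not set")
	}
	// JoinHostPort also brackets IPv6 literals correctly.
	return "https://" + net.JoinHostPort(host, port), nil
}

func main() {
	addr, err := apiAddress("", os.Getenv)
	fmt.Println(addr, err)
}
```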

Configuration

# a list of resource detectors to run, valid options are: "env", "system", "docker", "gcp", "gce", "gke", "ec2", "ecs", "elastic_beanstalk", "eks", "lambda", "azure", "aks", "consul", "heroku", "openshift"
detectors: [ <string> ]
# determines if existing resource attributes should be overridden or preserved, defaults to true
override: <bool>
# [DEPRECATED] When included, only attributes in the list will be appended.  Applies to all detectors.
attributes: [ <string> ]
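
The effect of the override setting can be illustrated with plain maps. This is a minimal sketch of the semantics described in the comment above, not the processor's implementation. mergeAttributes is a hypothetical helper.

```go
package main

import "fmt"

// mergeAttributes returns a copy of existing with detected values added.
// When override is true, detected values replace existing ones. When it is
// false, existing values are preserved and only missing keys are filled in.
func mergeAttributes(existing, detected map[string]string, override bool) map[string]string {
	merged := make(map[string]string, len(existing)+len(detected))
	for k, v := range existing {
		merged[k] = v
	}
	for k, v := range detected {
		if _, present := merged[k]; present && !override {
			continue
		}
		merged[k] = v
	}
	return merged
}

func main() {
	existing := map[string]string{"host.name": "from-sdk"}
	detected := map[string]string{"host.name": "from-detector", "os.type": "linux"}
	fmt.Println(mergeAttributes(existing, detected, false)["host.name"]) // from-sdk
	fmt.Println(mergeAttributes(existing, detected, true)["host.name"])  // from-detector
}
```

In both cases os.type is added, because it is absent from the existing attributes.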

Moreover, you have the ability to specify which detector should collect each attribute with resource_attributes option. An example of such a configuration is:

resourcedetection:
  detectors: [system, ec2]
  system:
    resource_attributes:
      host.name:
        enabled: true
      host.id:
        enabled: false
  ec2:
    resource_attributes:
      host.name:
        enabled: false
      host.id:
        enabled: true
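
The per-detector enabled flags in the configuration above act as a filter on what each detector reports. The following minimal sketch shows the idea with hypothetical names (attributeConfig, filterEnabled). Attributes with no entry are dropped here, which is an assumption of this sketch and not necessarily the default each detector uses.

```go
package main

import "fmt"

// attributeConfig mirrors the "enabled" flag shown in the YAML above.
type attributeConfig struct {
	Enabled bool
}

// filterEnabled keeps only the attributes whose config entry has Enabled set.
// Attributes without an entry are dropped (an assumption of this sketch).
func filterEnabled(detected map[string]string, cfg map[string]attributeConfig) map[string]string {
	kept := make(map[string]string)
	for name, value := range detected {
		if cfg[name].Enabled {
			kept[name] = value
		}
	}
	return kept
}

func main() {
	systemCfg := map[string]attributeConfig{"host.name": {Enabled: true}, "host.id": {Enabled: false}}
	ec2Cfg := map[string]attributeConfig{"host.name": {Enabled: false}, "host.id": {Enabled: true}}

	fromSystem := filterEnabled(map[string]string{"host.name": "web-1", "host.id": "machine-id"}, systemCfg)
	fromEC2 := filterEnabled(map[string]string{"host.name": "ip-10-0-0-1", "host.id": "i-0123456789abcdef0"}, ec2Cfg)
	fmt.Println(fromSystem["host.name"], fromEC2["host.id"])
}
```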

Migration from attributes to resource_attributes

The attributes option is deprecated and will be removed soon. From now on, enable or disable attributes through resource_attributes. For example, this config:

resourcedetection:
  detectors: [system]
  attributes: ['host.name', 'host.id']

can be replaced with:

resourcedetection:
  detectors: [system]
  system:
    resource_attributes:
      host.name:
        enabled: true
      host.id:
        enabled: true
      os.type:
        enabled: false

Ordering

Note that if multiple detectors insert the same attribute name, the first detector to insert wins. For example, if you had detectors: [eks, ec2], then cloud.platform would be aws_eks instead of aws_ec2. The following ordering is recommended.

GCP

gke
gce

AWS

lambda
elastic_beanstalk
eks
ecs
ec2
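
The "first detector to insert wins" rule can be sketched with plain maps. detectorResult and mergeFirstWins are hypothetical names for this illustration, and the attribute values are made up.

```go
package main

import "fmt"

// detectorResult pairs a detector name with the attributes it reported.
type detectorResult struct {
	Name  string
	Attrs map[string]string
}

// mergeFirstWins walks the results in list order and keeps the first value
// seen for each attribute name.
func mergeFirstWins(results []detectorResult) map[string]string {
	merged := make(map[string]string)
	for _, r := range results {
		for k, v := range r.Attrs {
			if _, seen := merged[k]; !seen {
				merged[k] = v
			}
		}
	}
	return merged
}

func main() {
	merged := mergeFirstWins([]detectorResult{
		{Name: "eks", Attrs: map[string]string{"cloud.provider": "aws", "cloud.platform": "aws_eks"}},
		{Name: "ec2", Attrs: map[string]string{"cloud.platform": "aws_ec2", "host.id": "i-0123456789abcdef0"}},
	})
	fmt.Println(merged["cloud.platform"], merged["host.id"]) // aws_eks i-0123456789abcdef0
}
```

Listing the more specific detector first keeps its cloud.platform value, while attributes that only the later detector reports (host.id here) are still added.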

The full list of settings exposed for this processor is documented in the Config and DetectorConfig types below; detailed sample configurations are available in the repository.

Documentation ¶

Overview ¶

package resourcedetectionprocessor implements a processor which can be used to detect resource information from the host, in a format that conforms to the OpenTelemetry resource semantic conventions, and append or override the resource value in telemetry data with this information.

Index ¶

func NewFactory() processor.Factory
type Config
type DetectorConfig
- func (d *DetectorConfig) GetConfigFromType(detectorType internal.DetectorType) internal.DetectorConfig

Functions ¶

func NewFactory ¶

func NewFactory() processor.Factory

NewFactory creates a new factory for ResourceDetection processor.

Types ¶

type Config ¶

type Config struct {

	// Detectors is an ordered list of named detectors that should be
	// run to attempt to detect resource information.
	Detectors []string `mapstructure:"detectors"`
	// Override indicates whether any existing resource attributes
	// should be overridden or preserved. Defaults to true.
	Override bool `mapstructure:"override"`
	// DetectorConfig is a list of settings specific to all detectors
	DetectorConfig DetectorConfig `mapstructure:",squash"`
	// HTTP client settings for the detector
	// Timeout default is 5s
	confighttp.HTTPClientSettings `mapstructure:",squash"`
	// Attributes is an allowlist of attributes to add.
	// If a supplied attribute is not a valid attribute of a supplied detector it will be ignored.
	// Deprecated: Please use detector's resource_attributes config instead
	Attributes []string `mapstructure:"attributes"`
}

Config defines configuration for Resource processor.

type DetectorConfig ¶ added in v0.18.0

type DetectorConfig struct {
	// EC2Config contains user-specified configurations for the EC2 detector
	EC2Config ec2.Config `mapstructure:"ec2"`

	// ECSConfig contains user-specified configurations for the ECS detector
	ECSConfig ecs.Config `mapstructure:"ecs"`

	// EKSConfig contains user-specified configurations for the EKS detector
	EKSConfig eks.Config `mapstructure:"eks"`

	// Elasticbeanstalk contains user-specified configurations for the elasticbeanstalk detector
	ElasticbeanstalkConfig elasticbeanstalk.Config `mapstructure:"elasticbeanstalk"`

	// Lambda contains user-specified configurations for the lambda detector
	LambdaConfig lambda.Config `mapstructure:"lambda"`

	// Azure contains user-specified configurations for the azure detector
	AzureConfig azure.Config `mapstructure:"azure"`

	// Aks contains user-specified configurations for the aks detector
	AksConfig aks.Config `mapstructure:"aks"`

	// ConsulConfig contains user-specified configurations for the Consul detector
	ConsulConfig consul.Config `mapstructure:"consul"`

	// DockerConfig contains user-specified configurations for the docker detector
	DockerConfig docker.Config `mapstructure:"docker"`

	// GcpConfig contains user-specified configurations for the gcp detector
	GcpConfig gcp.Config `mapstructure:"gcp"`

	// HerokuConfig contains user-specified configurations for the heroku detector
	HerokuConfig heroku.Config `mapstructure:"heroku"`

	// SystemConfig contains user-specified configurations for the System detector
	SystemConfig system.Config `mapstructure:"system"`

	// OpenShift contains user-specified configurations for the Openshift detector
	OpenShiftConfig openshift.Config `mapstructure:"openshift"`
}

DetectorConfig contains user-specified configurations unique to all individual detectors

func (*DetectorConfig) GetConfigFromType ¶ added in v0.18.0

func (d *DetectorConfig) GetConfigFromType(detectorType internal.DetectorType) internal.DetectorConfig

Directories ¶

Path	Synopsis
internal	Package internal contains an interface for detecting resource information, and a provider to merge the resources returned by a slice of custom detectors.
aws/ec2
aws/ec2/internal/metadata
aws/ecs
aws/ecs/internal/metadata
aws/eks
aws/eks/internal/metadata
aws/elasticbeanstalk
aws/elasticbeanstalk/internal/metadata
aws/lambda
aws/lambda/internal/metadata
azure
azure/aks
azure/aks/internal/metadata
azure/internal/metadata
consul
consul/internal/metadata
docker
docker/internal/metadata
env	Package env provides a detector that loads resource information from the OTEL_RESOURCE environment variable.
gcp
gcp/internal/metadata
heroku
heroku/internal/metadata
metadata
openshift
openshift/internal/metadata
system
system/internal/metadata
