# Kubelet Stats Receiver
The Kubelet Stats Receiver pulls node, pod, container, and volume metrics from the API server on a kubelet and sends them down the metric pipeline for further processing.
## Metrics
Details about the metrics produced by this receiver can be found in `metadata.yaml` with further documentation in `documentation.md`.
## Configuration
A kubelet runs on a Kubernetes node and has an API server to which this receiver connects. To configure this receiver, you have to tell it how to connect and authenticate to the API server and how often to collect data and send it to the next consumer.
The Kubelet Stats Receiver supports both the secure Kubelet endpoint exposed at port 10250 by default and the read-only Kubelet endpoint exposed at port 10255. If `auth_type` is set to `none`, the read-only endpoint will be used. The secure endpoint will be used if `auth_type` is set to any of the following values:

- `tls` tells the receiver to use TLS for auth and requires that the fields `ca_file`, `key_file`, and `cert_file` also be set.
- `serviceAccount` tells this receiver to use the default service account token to authenticate to the kubelet API, along with the default certificate signed by the cluster's root CA cert:
  - `/var/run/secrets/kubernetes.io/serviceaccount/token`
  - `/var/run/secrets/kubernetes.io/serviceaccount/ca.crt`
- `kubeConfig` tells this receiver to use the kubeconfig file (`KUBECONFIG` env variable or `~/.kube/config`) to authenticate and use the API server proxy to access the kubelet API.
- `initial_delay` (default = `1s`): defines how long this receiver waits before starting.
### TLS Example
```yaml
receivers:
  kubeletstats:
    collection_interval: 20s
    initial_delay: 1s
    auth_type: "tls"
    ca_file: "/path/to/ca.crt"
    key_file: "/path/to/apiserver.key"
    cert_file: "/path/to/apiserver.crt"
    endpoint: "https://192.168.64.1:10250"
    insecure_skip_verify: true
exporters:
  file:
    path: "fileexporter.txt"
service:
  pipelines:
    metrics:
      receivers: [kubeletstats]
      exporters: [file]
```
### Service Account Authentication Example
Although it's possible to use Kubernetes' hostNetwork feature to talk to the kubelet API from a pod, the preferred approach is to use the downward API. Make sure the pod spec sets the node name as follows:
```yaml
env:
  - name: K8S_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
```
Then the otel config can reference the `K8S_NODE_NAME` environment variable:
```yaml
receivers:
  kubeletstats:
    collection_interval: 20s
    auth_type: "serviceAccount"
    endpoint: "https://${env:K8S_NODE_NAME}:10250"
    insecure_skip_verify: true
exporters:
  file:
    path: "fileexporter.txt"
service:
  pipelines:
    metrics:
      receivers: [kubeletstats]
      exporters: [file]
```
Note: a missing or empty `endpoint` will cause the hostname on which the collector is running to be used as the endpoint. If the hostNetwork flag is set, and the collector is running in a pod, this hostname will resolve to the node's network namespace.
### Read Only Endpoint Example
The following config can be used to collect Kubelet metrics from the read-only endpoint:
```yaml
receivers:
  kubeletstats:
    collection_interval: 20s
    auth_type: "none"
    endpoint: "http://${env:K8S_NODE_NAME}:10255"
exporters:
  file:
    path: "fileexporter.txt"
service:
  pipelines:
    metrics:
      receivers: [kubeletstats]
      exporters: [file]
```
### Kubeconfig example
The following config can be used to collect Kubelet metrics from the read-only endpoint, proxied by the API server:
```yaml
receivers:
  kubeletstats:
    collection_interval: 20s
    auth_type: "kubeConfig"
    context: "my-context"
    insecure_skip_verify: true
    endpoint: "${env:K8S_NODE_NAME}"
exporters:
  file:
    path: "fileexporter.txt"
service:
  pipelines:
    metrics:
      receivers: [kubeletstats]
      exporters: [file]
```
Note that when using `auth_type: kubeConfig`, the endpoint should only be the node name, as the communication with the kubelet is proxied by the API server configured in the kubeconfig. `insecure_skip_verify` still applies and overrides the kubeconfig settings. If no `context` is specified, the current context or the default context is used.
### Extra metadata labels

By default, all produced metrics get resource labels based on what the kubelet `/stats/summary` endpoint provides. For some use cases this might not be enough, so it's possible to leverage other endpoints to fetch additional metadata entities and set them as extra labels on the metric resource. Currently supported metadata include the following:

- `container.id` - to augment metrics with a Container ID label obtained from container statuses exposed via `/pods`.
- `k8s.volume.type` - to collect the volume type from the Pod spec exposed via `/pods` and have it as a label on volume metrics. If there's more information available from the endpoint than just the volume type, those are synced as well depending on the available fields and the type of volume. For example, `aws.volume.id` would be synced from `awsElasticBlockStore` and `gcp.pd.name` is synced for `gcePersistentDisk`.
If you want to have the `container.id` label added to your metrics, use the `extra_metadata_labels` field to enable it, for example:
```yaml
receivers:
  kubeletstats:
    collection_interval: 10s
    auth_type: "serviceAccount"
    endpoint: "${env:K8S_NODE_NAME}:10250"
    insecure_skip_verify: true
    extra_metadata_labels:
      - container.id
```
If `extra_metadata_labels` is not set, no additional API calls are made to fetch extra metadata.
When dealing with Persistent Volume Claims, it is possible to optionally sync metadata from the underlying storage resource rather than just the volume claim. This is achieved by talking to the Kubernetes API. Below is an example configuration to achieve this:
```yaml
receivers:
  kubeletstats:
    collection_interval: 10s
    auth_type: "serviceAccount"
    endpoint: "${env:K8S_NODE_NAME}:10250"
    insecure_skip_verify: true
    extra_metadata_labels:
      - k8s.volume.type
    k8s_api_config:
      auth_type: serviceAccount
```
If `k8s_api_config` is set, the receiver will attempt to collect metadata from the underlying storage resources for Persistent Volume Claims. For example, if a Pod is using a PVC backed by an EBS volume on AWS, the receiver would set the `k8s.volume.type` label to `awsElasticBlockStore` rather than `persistentVolumeClaim`.
### Metric Groups
A list of metric groups from which metrics should be collected. By default, metrics from containers, pods and nodes will be collected. If `metric_groups` is set, only metrics from the listed groups will be collected. Valid groups are `container`, `pod`, `node` and `volume`. For example, if you're looking to collect only `node` and `pod` metrics from the receiver, use the following configuration:
```yaml
receivers:
  kubeletstats:
    collection_interval: 10s
    auth_type: "serviceAccount"
    endpoint: "${env:K8S_NODE_NAME}:10250"
    insecure_skip_verify: true
    metric_groups:
      - node
      - pod
```
### Collect `k8s.{container,pod}.{cpu,memory}.node.utilization` as a ratio of the node's total capacity
In order to calculate the `k8s.container.cpu.node.utilization`, `k8s.pod.cpu.node.utilization`, `k8s.container.memory.node.utilization` and `k8s.pod.memory.node.utilization` metrics, the node's capacity must be retrieved from the k8s API. For this, the `k8s_api_config` needs to be set.

In addition, the node name must be identified properly. The `K8S_NODE_NAME` env var can be set using the downward API inside the collector pod spec as follows:
```yaml
env:
  - name: K8S_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
```
Then set the `node` value to `${env:K8S_NODE_NAME}` in the receiver's configuration:
```yaml
receivers:
  kubeletstats:
    collection_interval: 10s
    auth_type: 'serviceAccount'
    endpoint: '${env:K8S_NODE_NAME}:10250'
    node: '${env:K8S_NODE_NAME}'
    k8s_api_config:
      auth_type: serviceAccount
    metrics:
      k8s.container.cpu.node.utilization:
        enabled: true
      k8s.pod.cpu.node.utilization:
        enabled: true
      k8s.container.memory.node.utilization:
        enabled: true
      k8s.pod.memory.node.utilization:
        enabled: true
```
### Optional parameters
The following parameters can also be specified:

- `collection_interval` (default = `10s`): The interval at which to collect data.
- `insecure_skip_verify` (default = `false`): Whether or not to skip certificate verification.
The full list of settings exposed for this receiver is documented here with detailed sample configurations here.
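As an illustration only, here is a minimal sketch combining both optional parameters with a service-account setup; the endpoint and the 30s interval are placeholder values, not recommendations:

```yaml
receivers:
  kubeletstats:
    auth_type: "serviceAccount"
    endpoint: "https://${env:K8S_NODE_NAME}:10250"  # placeholder endpoint
    collection_interval: 30s       # overrides the 10s default
    insecure_skip_verify: true     # overrides the false default
```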
## Role-based access control
The Kubelet Stats Receiver needs `get` permissions on the `nodes/stats` resources. Additionally, when using `extra_metadata_labels` or any of the `{request|limit}_utilization` metrics, the receiver also needs `get` permissions for `nodes/proxy` resources.
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector
rules:
  - apiGroups: [""]
    resources: ["nodes/stats"]
    verbs: ["get"]
  # Only needed if you are using extra_metadata_labels or
  # are collecting the request/limit utilization metrics
  - apiGroups: [""]
    resources: ["nodes/proxy"]
    verbs: ["get"]
```
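A ClusterRole only takes effect once it is bound to the identity the collector runs as. The following is a minimal sketch, assuming the collector uses a service account named `otel-collector` in the `default` namespace; both names are assumptions, so adjust them to your deployment:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector
subjects:
  - kind: ServiceAccount
    name: otel-collector    # assumed service account name
    namespace: default      # assumed namespace
roleRef:
  kind: ClusterRole
  name: otel-collector
  apiGroup: rbac.authorization.k8s.io
```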
## Warning about metrics' deprecation
The following metrics will be renamed in a future version:
- `k8s.node.cpu.utilization` (renamed to `k8s.node.cpu.usage`)
- `k8s.pod.cpu.utilization` (renamed to `k8s.pod.cpu.usage`)
- `container.cpu.utilization` (renamed to `container.cpu.usage`)
The above metrics report usage counted in CPUs, not a percentage of used resources, and were previously incorrectly named using the utilization term.
`receiver.kubeletstats.enableCPUUsageMetrics` feature gate (see the sketch after this list for toggling it):

- alpha: when enabled, it makes the `.cpu.usage` metrics enabled by default, disabling the `.cpu.utilization` metrics.
- beta: `.cpu.usage` metrics are enabled by default and any configuration enabling the deprecated `.cpu.utilization` metrics will fail. Explicitly disabling the feature gate provides the old (deprecated) behavior.
- stable: `.cpu.usage` metrics are enabled by default and the deprecated metrics are completely removed.
- removed three releases after stable.
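As a sketch of how the gate can be toggled during the transition, the collector's `--feature-gates` command-line flag can be passed via the container args; the container name and image tag below are placeholders:

```yaml
containers:
  - name: otel-collector                                # placeholder name
    image: otel/opentelemetry-collector-contrib:latest  # placeholder image/tag
    args:
      - "--config=/conf/collector.yaml"
      # opt in to the renamed metrics early:
      - "--feature-gates=receiver.kubeletstats.enableCPUUsageMetrics"
      # or keep the old (deprecated) behavior while the gate is in beta:
      # - "--feature-gates=-receiver.kubeletstats.enableCPUUsageMetrics"
```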
More information about the deprecation plan and
the background reasoning can be found at https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/27885.