Kubernetes cluster state monitoring with Netdata
Kubernetes is an open-source container orchestration system for automating software
deployment, scaling, and management.
This module collects health metrics for the following Kubernetes resources: nodes and pods (including pod containers).
Requirements
- Only works when Netdata is running inside a Kubernetes cluster.
- RBAC: needs list, watch verbs for pod and node resources.
- RBAC: needs get verb for namespace resource.
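
One way to check the RBAC requirements above is to ask the API server whether the service account may perform each verb. The following is a minimal sketch, not part of the module: the `default` namespace and the `netdata` service account name are assumptions, so substitute the account Netdata actually runs as in your cluster.

```shell
# Hypothetical defaults: replace with the namespace and service account Netdata uses.
NAMESPACE="${NAMESPACE:-default}"
SERVICE_ACCOUNT="${SERVICE_ACCOUNT:-netdata}"

# Print the API server's answer for every verb/resource pair from the requirements.
check_rbac() {
  subject="system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}"
  for pair in "list pods" "watch pods" "list nodes" "watch nodes" "get namespaces"; do
    verb="${pair%% *}"      # text before the first space
    resource="${pair#* }"   # text after the first space
    printf '%s: ' "$pair"
    # kubectl exits non-zero when the answer is "no"; keep going to report every pair.
    kubectl auth can-i "$verb" "$resource" --as="$subject" --all-namespaces || true
  done
}
```

Calling `check_rbac` from a shell with cluster-admin access to the cluster prints one line per permission; any "no" answer points at a missing rule in the role bound to that service account.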
Metrics
All metrics have the "k8s_state." prefix.
Node
| Metric | Dimensions | Units |
|--------|------------|-------|
| node_allocatable_cpu_requests_utilization | requests | % |
| node_allocatable_cpu_requests_used | requests | millicpu |
| node_allocatable_cpu_limits_utilization | limits | % |
| node_allocatable_cpu_limits_used | limits | millicpu |
| node_allocatable_mem_requests_utilization | requests | % |
| node_allocatable_mem_requests_used | requests | bytes |
| node_allocatable_mem_limits_utilization | limits | % |
| node_allocatable_mem_limits_used | limits | bytes |
| node_allocatable_pods_utilization | allocated | % |
| node_allocatable_pods_usage | available, allocated | pods |
| node_condition | added dynamically | status |
| node_schedulability | schedulable, unschedulable | state |
| node_pods_readiness | ready | % |
| node_pods_readiness_state | ready, unready | pods |
| node_pods_condition | pod_ready, pod_scheduled, pod_initialized, containers_ready | pods |
| node_pods_phase | running, failed, succeeded, pending | pods |
| node_containers | containers, init_containers | containers |
| node_containers_state | running, waiting, terminated | containers |
| node_init_containers_state | running, waiting, terminated | containers |
| node_age | age | seconds |
Pod
| Metric | Dimensions | Units |
|--------|------------|-------|
| pod_cpu_requests_used | requests | millicpu |
| pod_cpu_limits_used | limits | millicpu |
| pod_mem_requests_used | requests | bytes |
| pod_mem_limits_used | limits | bytes |
| pod_condition | pod_ready, pod_scheduled, pod_initialized, containers_ready | state |
| pod_phase | running, failed, succeeded, pending | state |
| pod_age | age | seconds |
| pod_containers | containers, init_containers | containers |
| pod_containers_state | running, waiting, terminated | containers |
| pod_init_containers_state | running, waiting, terminated | containers |
Pod container
| Metric | Dimensions | Units |
|--------|------------|-------|
| pod_container_readiness_state | ready | state |
| pod_container_restarts | restarts | restarts/s |
| pod_container_state | running, waiting, terminated | state |
| pod_container_waiting_state_reason | added dynamically | state |
| pod_container_terminated_state_reason | added dynamically | state |
Labels
- The 'k8s_cluster_id' value is the UID of the 'kube-system' namespace.
- 'k8s_cluster_name' currently only appears when running on GKE.
| Label | Node | Pod | Container |
|-------|------|-----|-----------|
| k8s_kind | yes | yes | yes |
| k8s_cluster_id | yes | yes | yes |
| k8s_cluster_name | yes | yes | yes |
| k8s_node_name | yes | yes | yes |
| k8s_namespace | | yes | yes |
| k8s_controller_kind | | yes | yes |
| k8s_controller_name | | yes | yes |
| k8s_pod_uid | | yes | yes |
| k8s_pod_name | | yes | yes |
| k8s_qos_class | | yes | yes |
| k8s_container_id | | | yes |
| k8s_container_name | | | yes |
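
Because 'k8s_cluster_id' is the UID of the 'kube-system' namespace, the expected label value can be looked up directly with kubectl. A small sketch; the function name `cluster_id` is only illustrative.

```shell
# Print the UID of the kube-system namespace, which is the value
# the k8s_cluster_id label is documented to carry.
cluster_id() {
  kubectl get namespace kube-system -o jsonpath='{.metadata.uid}'
}
```

Running `cluster_id` and comparing the result with the label shown on a chart is a quick way to confirm which cluster a metric came from.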
Configuration
No configuration is needed. This module is enabled when you install Netdata
using netdata/helmchart.
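
For reference, a typical Helm installation might look like the sketch below. The repository URL and the `netdata/netdata` chart name are taken from the chart's published instructions as best I know them, and the release name is a hypothetical default; verify all three against the netdata/helmchart README before use.

```shell
# Install Netdata from its Helm chart.
# Assumption: the chart repository is served at https://netdata.github.io/helmchart/.
install_netdata() {
  release="${1:-netdata}"   # hypothetical release name; any valid name works
  helm repo add netdata https://netdata.github.io/helmchart/ &&
    helm repo update &&
    helm install "$release" netdata/netdata
}
```

Calling `install_netdata` with no arguments uses the release name `netdata`; pass a different name as the first argument if that one is taken.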
Troubleshooting
To troubleshoot issues with the k8s_state collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn't working.
First, navigate to your plugins directory, usually /usr/libexec/netdata/plugins.d/. If that's not the case on your system, open netdata.conf and look for the plugins directory setting. Once you're in the plugins directory, switch to the netdata user.
cd /usr/libexec/netdata/plugins.d/
sudo -u netdata -s
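
If the default path does not exist, the plugins directory setting can be read from netdata.conf instead of opening the file by hand. This is a sketch only: the `/etc/netdata/netdata.conf` location is an assumption about a typical install, and the helper name `find_plugins_dir` is hypothetical.

```shell
# Print the value of the "plugins directory" setting from netdata.conf.
# A commented-out line is matched too, since it usually shows the default value.
find_plugins_dir() {
  conf="${1:-/etc/netdata/netdata.conf}"   # assumed default location
  sed -n 's/^[[:space:]]*#\{0,1\}[[:space:]]*plugins directory[[:space:]]*=[[:space:]]*//p' "$conf" |
    head -n 1
}
```

Pass a different path as the first argument if your netdata.conf lives elsewhere, for example `find_plugins_dir /opt/netdata/etc/netdata/netdata.conf`.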
You can now run the go.d.plugin to debug the collector:
./go.d.plugin -d -m k8s_state
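
The debug run keeps printing until it is interrupted, so it can help to capture a bounded sample and filter it. The sketch below assumes the coreutils `timeout` command is available; the log path, the 30-second limit, and the search words are arbitrary choices and do not reflect any specific message the plugin is known to print.

```shell
# Run the collector in debug mode for a limited time, save everything it prints,
# then show only the lines that look like problems.
debug_k8s_state() {
  log_file="${1:-/tmp/k8s_state_debug.log}"   # hypothetical log location
  plugin="${2:-./go.d.plugin}"
  # timeout exits non-zero when it has to stop the plugin; that is expected here.
  timeout 30 "$plugin" -d -m k8s_state >"$log_file" 2>&1 || true
  grep -iE 'error|warn|forbidden' "$log_file"
}
```

Run `debug_k8s_state` from the plugins directory as the netdata user; the full output remains in the log file for closer reading.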