cni-metrics-helper
The cni-metrics-helper is a tool that scrapes ENI and IP information, aggregates it at the cluster level, and publishes the metrics to CloudWatch. The worker nodes require the following IAM permission to publish metrics:
"cloudwatch:PutMetricData"
By default, IPAM will publish Prometheus metrics on :61678/metrics.
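To make the scrape-and-aggregate step concrete, here is a minimal Python sketch. It is not the helper's actual Go implementation. It parses Prometheus text-format output of the kind a node might return on :61678/metrics and sums each metric across nodes. The sample metric names and values are illustrative assumptions, and the parser deliberately skips labeled samples and comment lines.

```python
from collections import defaultdict


def parse_prometheus_text(text):
    """Parse unlabeled 'name value' samples from Prometheus text format."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comments.
        if not line or line.startswith("#"):
            continue
        # Skip labeled samples such as name{label="x"} 1 to keep the sketch small.
        if "{" in line:
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        samples[parts[0]] = float(parts[1])
    return samples


def aggregate(per_node_samples):
    """Sum every metric across the per-node sample dictionaries."""
    totals = defaultdict(float)
    for samples in per_node_samples:
        for name, value in samples.items():
            totals[name] += value
    return dict(totals)


# Hypothetical scrape output from two nodes (names and values are assumptions).
NODE_A = """\
# HELP awscni_eni_allocated The number of ENIs allocated
# TYPE awscni_eni_allocated gauge
awscni_eni_allocated 2
awscni_total_ip_addresses 28
"""
NODE_B = """\
awscni_eni_allocated 3
awscni_total_ip_addresses 42
"""

cluster_totals = aggregate(
    [parse_prometheus_text(NODE_A), parse_prometheus_text(NODE_B)]
)
print(cluster_totals)
```

Summing is only meaningful for additive values such as allocated ENIs or IP counts; a latency metric would need a different aggregation.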
The following diagram shows how cni-metrics-helper works in a cluster:
As you can see in the diagram, the cni-metrics-helper connects to the API Server over HTTPS (tcp/443), and the API Server opens another connection to the worker node over HTTP (tcp/61678). If you deploy Amazon EKS with the recommended security groups from Restricting cluster traffic, make sure that a security group rule allows inbound connections from the API Server to the worker nodes over tcp/61678.
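As an illustration of the security group requirement, the following Python sketch builds the keyword arguments for the EC2 AuthorizeSecurityGroupIngress call as exposed by boto3. The security group IDs are placeholders, the helper function name is hypothetical, and the actual API call is left commented out because it needs boto3 and AWS credentials.

```python
METRICS_PORT = 61678  # metrics port named in this README


def build_ingress_rule(node_sg_id, control_plane_sg_id, port=METRICS_PORT):
    """Return kwargs for ec2.authorize_security_group_ingress.

    node_sg_id: security group attached to the worker nodes (placeholder).
    control_plane_sg_id: security group of the API Server (placeholder).
    """
    return {
        "GroupId": node_sg_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                # Allow traffic whose source is the control plane security group.
                "UserIdGroupPairs": [
                    {
                        "GroupId": control_plane_sg_id,
                        "Description": "API Server to CNI metrics endpoint",
                    }
                ],
            }
        ],
    }


rule = build_ingress_rule("sg-0123456789abcdef0", "sg-0fedcba9876543210")
print(rule)

# To apply the rule (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("ec2").authorize_security_group_ingress(**rule)
```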
Adding the CNI metrics helper will publish the following metrics to CloudWatch:
"addReqCount",
"assignIPAddresses",
"awsAPIErr",
"awsAPILatency",
"awsUtilErr",
"delReqCount",
"eniAllocated",
"eniMaxAvailable",
"ipamdActionInProgress",
"ipamdErr",
"maxIPAddresses",
"podENIErr",
"reconcileCount",
"totalIPAddresses",
"totalIPv4Prefixes",
"totalAssignedIPv4sPerCidr"
Using IRSA
As per the AWS EKS Security Best Practices, if you are using IRSA for pods, then the following requirements must be satisfied to successfully publish metrics to CloudWatch:
- The IAM role for your service account (IRSA) must have the following policy attached:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudwatch:PutMetricData"
            ],
            "Resource": "*"
        }
    ]
}
- Specify the IRSA name in the cni-metrics-helper deployment spec along with AWS_CLUSTER_ID (as described below). The value of AWS_CLUSTER_ID will show up under the dimension 'CLUSTER_ID' for your published metrics. Specifying a value for this field is mandatory only if you are blocking IMDS access.
AWS_CLUSTER_ID
Type: String
Default: ""
An identifier for your Cluster which will be used as the dimension for published metrics. Ideally it should be ClusterName or ClusterID.
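To show where this identifier ends up, here is a minimal Python sketch that shapes aggregated values into a MetricData list of the form accepted by boto3's put_metric_data, with CLUSTER_ID as the dimension name. The "Count" unit, the "Kubernetes" namespace in the commented-out call, the sample values, and the function name are assumptions for illustration, not taken from the helper's source.

```python
def build_metric_data(values, cluster_id):
    """Shape {metric name: value} into a CloudWatch MetricData list."""
    return [
        {
            "MetricName": name,
            # The AWS_CLUSTER_ID value becomes the CLUSTER_ID dimension value.
            "Dimensions": [{"Name": "CLUSTER_ID", "Value": cluster_id}],
            "Value": float(value),
            "Unit": "Count",  # assumed unit
        }
        for name, value in sorted(values.items())
    ]


payload = build_metric_data(
    {"eniAllocated": 799, "maxIPAddresses": 11200}, "my-cluster"
)
print(payload)

# To publish (requires boto3 and cloudwatch:PutMetricData permission):
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(
#       Namespace="Kubernetes",  # assumed namespace
#       MetricData=payload,
#   )
```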
kind: Deployment
apiVersion: apps/v1
metadata:
  name: cni-metrics-helper
  namespace: kube-system
  labels:
    k8s-app: cni-metrics-helper
spec:
  selector:
    matchLabels:
      k8s-app: cni-metrics-helper
  template:
    metadata:
      labels:
        k8s-app: cni-metrics-helper
    spec:
      containers:
      - env:
        - name: AWS_CLUSTER_ID
          value: ""
        - name: USE_CLOUDWATCH
          value: "true"
        name: cni-metrics-helper
        image: <image>
      serviceAccountName: <IRSA name>
With IRSA, the AWS_REGION environment variable is injected automatically into the above deployment spec, and it is used to determine the Region when metrics are published. Possible scenarios for the above configuration:
- If you are not using IRSA, the Region and CLUSTER_ID are fetched using IMDS, so IMDS access must not be blocked.
- If you are using IRSA but have not specified AWS_CLUSTER_ID, the value for CLUSTER_ID is fetched from IMDS, provided IMDS access is not blocked.
- If you have blocked IMDS access, you must specify a value for AWS_CLUSTER_ID in the deployment spec.
- If you have not blocked IMDS access but have specified an AWS_CLUSTER_ID value, the specified value is used.
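The scenarios above amount to a simple precedence rule, sketched here in Python. This is an illustration of the described behavior, not the helper's actual code; the function name and the imds_lookup callable are hypothetical stand-ins for whatever performs the IMDS query.

```python
def resolve_cluster_id(env, imds_lookup):
    """Pick the value for the CLUSTER_ID dimension.

    env: mapping of environment variables (for example os.environ).
    imds_lookup: zero-argument callable that returns an ID from IMDS and
                 raises an exception when IMDS access is blocked.
    """
    configured = env.get("AWS_CLUSTER_ID", "")
    if configured:
        # An explicit value is used even when IMDS is reachable.
        return configured
    try:
        return imds_lookup()
    except Exception as exc:
        raise RuntimeError(
            "IMDS is not reachable, so AWS_CLUSTER_ID must be set"
        ) from exc


def blocked_imds():
    """Simulate a pod that cannot reach IMDS."""
    raise ConnectionError("IMDS unreachable")


print(resolve_cluster_id({"AWS_CLUSTER_ID": "prod"}, blocked_imds))
print(resolve_cluster_id({}, lambda: "id-from-imds"))
```

Calling resolve_cluster_id({}, blocked_imds) raises RuntimeError, which corresponds to the case where IMDS is blocked and no AWS_CLUSTER_ID is configured.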
Installing the cni-metrics-helper
To install the CNI metrics helper, follow the installation instructions from the target version release notes.
Creating a metrics dashboard
After you have deployed the CNI metrics helper, you can view the CNI metrics in the Amazon CloudWatch console.
To create a CNI metrics dashboard
- Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.
- In the left navigation pane, choose Metrics and then select All metrics.
- Choose the Graphed metrics tab.
- Choose Add metrics using browse or query.
- Make sure that under Metrics, you've selected the AWS Region for your cluster.
- In the Search box, enter Kubernetes and then press Enter.
- Select the metrics that you want to add to the dashboard.
- At the upper right of the console, select Actions, and then Add to dashboard.
- In the Select a dashboard section, choose Create new, enter a name for your dashboard, such as EKS-CNI-metrics, and then choose Create.
- In the Widget type section, select Number.
- In the Customize widget title section, enter a logical name for your dashboard title, such as EKS CNI metrics.
- Choose Add to dashboard to finish.
Now your CNI metrics are added to a dashboard that you can monitor. For more information about Amazon CloudWatch Logs metrics, see Using Amazon CloudWatch metrics in the Amazon CloudWatch User Guide.
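The same dashboard could also be created programmatically. The Python sketch below builds a CloudWatch dashboard body with a single metric widget using the "singleValue" view, which corresponds to the Number widget type. The "Kubernetes" namespace, the CLUSTER_ID dimension value, the Region, the widget size, and the statistic are assumptions for illustration; the put_dashboard call is commented out because it needs boto3 and AWS credentials.

```python
import json


def build_dashboard_body(metric_names, cluster_id, region):
    """Return a JSON dashboard body with one Number widget."""
    widget = {
        "type": "metric",
        "x": 0,
        "y": 0,
        "width": 12,
        "height": 6,
        "properties": {
            # Each entry is [namespace, metric name, dimension name, dimension value].
            "metrics": [
                ["Kubernetes", name, "CLUSTER_ID", cluster_id]
                for name in metric_names
            ],
            "view": "singleValue",  # rendered as a Number widget
            "region": region,
            "stat": "Average",
            "period": 300,
            "title": "EKS CNI metrics",
        },
    }
    return json.dumps({"widgets": [widget]})


body = build_dashboard_body(
    ["eniAllocated", "totalIPAddresses"], "my-cluster", "us-west-2"
)
print(body)

# To create the dashboard (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("cloudwatch").put_dashboard(
#       DashboardName="EKS-CNI-metrics", DashboardBody=body
#   )
```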
Get cni-metrics-helper logs
kubectl get pod -n kube-system
NAME                                  READY   STATUS    RESTARTS   AGE
aws-node-248ns                        1/1     Running   0          6h
aws-node-257bn                        1/1     Running   0          2h
...
cni-metrics-helper-6dcff5ddf4-v5l6d   1/1     Running   0          7h
kube-dns-75fddcb66f-48tzn             3/3     Running   0          1d
kubectl logs cni-metrics-helper-6dcff5ddf4-v5l6d -n kube-system
cni-metrics-helper key log messages
Example of some aggregated metrics
I0516 17:11:58.489648 7 metrics.go:350] Produce GAUGE metrics: ipamdActionInProgress, value: 0.000000
I0516 17:11:58.489657 7 metrics.go:350] Produce GAUGE metrics: assignIPAddresses, value: 2.000000
I0516 17:11:58.489665 7 metrics.go:350] Produce GAUGE metrics: totalIPAddresses, value: 11186.000000
I0516 17:11:58.489674 7 metrics.go:350] Produce GAUGE metrics: eniMaxAvailable, value: 800.000000
I0516 17:11:58.489685 7 metrics.go:340] Produce COUNTER metrics: ipamdErr, value: 1.000000
I0516 17:11:58.489695 7 metrics.go:350] Produce GAUGE metrics: eniAllocated, value: 799.000000
I0516 17:11:58.489715 7 metrics.go:350] Produce GAUGE metrics: maxIPAddresses, value: 11200.000000
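When reviewing these logs in bulk, it can help to extract the metric names and values. The Python sketch below does this with a regular expression written against the log format shown above; the pattern and function name are illustrative, and the format may differ between versions of the helper.

```python
import re

# Matches e.g. "Produce GAUGE metrics: eniAllocated, value: 799.000000"
LOG_PATTERN = re.compile(
    r"Produce (GAUGE|COUNTER) metrics: (\w+), value: (-?\d+(?:\.\d+)?)"
)


def parse_metric_lines(log_text):
    """Return {metric name: (kind, value)} for every matching log line."""
    results = {}
    for match in LOG_PATTERN.finditer(log_text):
        kind, name, value = match.groups()
        results[name] = (kind, float(value))
    return results


SAMPLE = """\
I0516 17:11:58.489674 7 metrics.go:350] Produce GAUGE metrics: eniMaxAvailable, value: 800.000000
I0516 17:11:58.489685 7 metrics.go:340] Produce COUNTER metrics: ipamdErr, value: 1.000000
I0516 17:11:58.489695 7 metrics.go:350] Produce GAUGE metrics: eniAllocated, value: 799.000000
"""

metrics = parse_metric_lines(SAMPLE)
# Remaining ENI capacity across the cluster, from the two gauges above.
eni_headroom = metrics["eniMaxAvailable"][1] - metrics["eniAllocated"][1]
print(metrics)
print(eni_headroom)
```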
How to build
In the base folder of the project:
make docker-metrics
To run tests
make docker-metrics-test