Managing Kubernetes Quota with confidence
Kotary
What is it?
It is an operator that brings a layer of verification and policy to the native ResourceQuotas mechanism.
It introduces a new resource, the ResourceQuotaClaim, that lets users request a change to the specification of their quota.
The verification checks that:
- There are enough resources (CPU and memory) on the cluster to allocate the claim. This compares the total amount
of resources of the cluster (worker nodes) with the sum of all the other ResourceQuotas.
- (Optional) The claim respects the maximum bound, expressed as a ratio of the cluster resources.
Example: a namespace cannot claim more than 1/3 of the cluster.
- (Optional) For flexibility, an over-commit or under-commit ratio can be set to define what is claimable
compared to the actual resources.
Example: in a development environment you could allow reserving more resources than are actually usable.
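The checks above can be sketched for a single resource as follows. This is only an illustration, not Kotary's implementation: the `Policy` class, the `verify_claim` function and the exact formulas are assumptions, in particular the choice to apply the maximum bound to the raw cluster capacity.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Hypothetical policy holder; the field names are illustrative."""
    ratio_max_allocation: float = 1.0  # share of the cluster one namespace may claim
    ratio_over_commit: float = 1.0     # > 1 over-commits, < 1 under-commits


def verify_claim(claim, cluster_capacity, other_quotas, policy):
    """Return the reasons a claim fails; an empty list means it passes.

    All quantities describe one resource in one unit (e.g. CPU in millicores).
    """
    reasons = []
    # What may be handed out in total once the commit ratio is applied.
    claimable = cluster_capacity * policy.ratio_over_commit
    # Check 1: the claim plus every other quota must fit in what is claimable.
    if claim + sum(other_quotas) > claimable:
        reasons.append("not enough resources left on the cluster")
    # Check 2 (assumption: the bound is a ratio of the raw capacity).
    if claim > cluster_capacity * policy.ratio_max_allocation:
        reasons.append("exceeds the maximum allocation for a namespace")
    return reasons


policy = Policy(ratio_max_allocation=0.5, ratio_over_commit=1.25)
small = verify_claim(4000, 16000, [6000, 8000], policy)
large = verify_claim(9000, 16000, [6000, 8000], policy)
print(small)
print(large)
```

With a 16000m cluster over-committed at 125%, the first claim fits both checks, while the second one exceeds both the remaining capacity and the per-namespace bound.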
To ease the adoption of ResourceQuotaClaims, it is possible to enforce a default claim for namespaces.
The feature is activated on namespaces that carry the label quota=managed.
Why not use an admission controller?
Using the admission controller mechanism in Kubernetes could have been an elegant solution. It would have avoided the need
for a Custom Resource Definition, since users could have asked to modify a ResourceQuota directly. However, it would have left out users of
managed clusters such as EKS, AKS or GKE, which is why we implemented the operator pattern instead.
Adding or Scaling-Up a Quota
Installation
Add the CRD
kubectl apply -f https://raw.githubusercontent.com/ca-gip/kotary/master/artifacts/crd.yml
Configuration
Options
| Name | Description | Mandatory | Type | Default |
|------|-------------|-----------|------|---------|
| defaultClaimSpec | Default claim that will be added to a watched Namespace | no | ResourceList | cpu: 2, memory: 6Gi |
| ratioMaxAllocationMemory | Maximum amount of Memory claimable by a Namespace | no | Float | 1 |
| ratioMaxAllocationCPU | Maximum amount of CPU claimable by a Namespace | no | Float | 1 |
| ratioOverCommitMemory | Memory over-commitment | no | Float | 1 |
| ratioOverCommitCPU | CPU over-commitment | no | Float | 1 |
Example
In the following sample configuration we set:
- A default claim of 2 CPU and 10Gi of memory
- A maximum of 33% of the total resources claimable by a single namespace
- An over-commit of 130%
cat <<EOF | kubectl -n kube-system create -f -
apiVersion: v1
kind: ConfigMap
data:
  defaultClaimSpec: |
    cpu: "2"
    memory: "10Gi"
  ratioMaxAllocationMemory: "0.33"
  ratioMaxAllocationCPU: "0.33"
  ratioOverCommitMemory: "1.3"
  ratioOverCommitCPU: "1.3"
metadata:
  name: kotary-config
EOF
Deployment
Deploy the controller
kubectl apply -f https://raw.githubusercontent.com/ca-gip/kotary/master/artifacts/deployment.yml
(Optional) Deploy the service monitor
kubectl apply -f https://raw.githubusercontent.com/ca-gip/kotary/master/artifacts/metrics.yml
Getting Started
Update a ResourceQuota
To update a ResourceQuota, create a ResourceQuotaClaim with a specification for CPU and memory.
You can use the same units as those available in Kubernetes;
please refer to the official documentation.
Example
cat <<EOF | kubectl apply -n demo-ns -f -
apiVersion: cagip.github.com/v1
kind: ResourceQuotaClaim
metadata:
  name: demo
spec:
  memory: 20Gi
  cpu: 5
EOF
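To compare claims written in different units, it can help to normalise them to base units first. The following is a simplified sketch of the Kubernetes quantity notation and is not part of Kotary; `parse_quantity` is a name chosen for this example, and only plain numbers and the standard suffixes are handled.

```python
from decimal import Decimal

# Binary (power-of-two) and decimal suffixes of the Kubernetes quantity format.
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}
_DECIMAL = {"m": Decimal("0.001"), "k": 10**3, "M": 10**6, "G": 10**9,
            "T": 10**12, "P": 10**15, "E": 10**18}


def parse_quantity(text):
    """Convert a quantity such as "20Gi", "5" or "500m" to a Decimal in base units."""
    text = text.strip()
    # Two-letter binary suffixes are tested first so "Gi" is not read as "G".
    for suffixes in (_BINARY, _DECIMAL):
        for suffix, factor in suffixes.items():
            if text.endswith(suffix):
                return Decimal(text[: -len(suffix)]) * factor
    return Decimal(text)


memory_bytes = parse_quantity("20Gi")  # memory of the claim above, in bytes
cpu_cores = parse_quantity("5")        # CPU of the claim above, in cores
half_core = parse_quantity("500m")     # millicore notation
print(memory_bytes, cpu_cores, half_core)
```

`Decimal` is used instead of `float` so that values such as `500m` stay exact when they are summed or compared.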
Status
After creating a ResourceQuotaClaim there are three possible outcomes:
- Accepted: The modifications are applied to the ResourceQuota and the claim is deleted.
- Rejected: The modification could not be accepted. The claim shows the status "REJECTED" with details.
- Pending: The claim requests fewer resources than what is currently requested in the namespace. It will be accepted once it is possible to scale down.
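The three outcomes can be read as a small decision function. This is a sketch under assumptions, not Kotary's code: `claim_status` is a hypothetical helper, `limit` stands for whatever ceiling the verification computes, and the `ACCEPTED` string is illustrative, since an accepted claim is deleted rather than displayed.

```python
def claim_status(claim, limit, current_requests):
    """Decide the outcome for one resource; all values share the same unit."""
    if claim > limit:
        # The claim asks for more than the policy allows.
        return "REJECTED"
    if claim < current_requests:
        # Running workloads already request more than the new quota would allow.
        return "PENDING"
    return "ACCEPTED"


# Memory figures in Gi: the 24 used as limit in the second call is made up.
rejected = claim_status(claim=20, limit=18, current_requests=10)
pending = claim_status(claim=16, limit=24, current_requests=18)
print(rejected, pending)
```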
Example of a rejected claim
$ kubectl get quotaclaim
NAME   CPU   RAM    STATUS     DETAILS
demo   5     20Gi   REJECTED   Exceeded Memory allocation limit claiming 20Gi but limited to 18Gi
Example of a pending claim
$ kubectl get quotaclaim
NAME   CPU   RAM    STATUS    DETAILS
demo   5     16Gi   PENDING   Awaiting lower CPU consumption claiming 16Gi but current total of CPU request is 18Gi
Default claim
If you are using the default claim policy, each managed namespace automatically receives a claim. If all the verifications
pass, a ResourceQuota named managed-quota is applied.
$ kubectl get resourcequota
NAME CREATED AT
managed-quota 2020-01-24T08:31:32Z
Plan
Implementing ResourceQuotas when you already have running workloads on your cluster can be a tedious task.
To help you get started you can use our CLI, kotaplan. It lets you test various scenarios beforehand
by simulating how the quotas could be implemented according to the desired settings.
Here is a quick example:
$ kotaplan -label quota=managed -cpuover 0.95 -memover 0.95 -memratio 0.33 -cpuratio 0.33
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Namespaces Details |
+--------------+------+---------+-----------------+---------------+---------+-----------------+---------------+-------------+---------------------------+-------------------------------+
| NAMESPACE | PODS | MEM REQ | CURRENT MEM USE | MEM REQ USAGE | CPU REQ | CURRENT CPU USE | CPU REQ USAGE | FIT DEFAULT | RESPECT MAX ALLOCATION NS | SPEC |
+--------------+------+---------+-----------------+---------------+---------+-----------------+---------------+-------------+---------------------------+-------------------------------+
| team-1-dev | 14 | 7GiB | 859.4MiB | 11.9897 % | 2800 | 6 | 0.2143 % | false | true | CPU : 3360m MEM: 8.4GiB |
| team-1-int | 14 | 7GiB | 852MiB | 11.8868 % | 2800 | 0 | 0 % | false | true | CPU : 3360m MEM: 8.4GiB |
| team-1-prd | 16 | 8GiB | 1.125GiB | 14.0568 % | 3200 | 0 | 0 % | false | true | CPU : 3840m MEM: 9.6GiB |
| team-2-dev | 16 | 8GiB | 1.119GiB | 13.9935 % | 3200 | 0 | 0 % | false | true | CPU : 3840m MEM: 9.6GiB |
| team-2-dev | 8 | 4GiB | 531.4MiB | 12.9745 % | 1600 | 0 | 0 % | false | true | CPU : 1920m MEM: 6GiB |
| team-2-dev | 28 | 9GiB | 1.033GiB | 11.4748 % | 3600 | 6 | 0.1667 % | false | true | CPU : 4320m MEM: 10.8GiB |
| team-3-dev | 0 | 0B | 0B | 0 % | 0 | 0 | 0 % | true | true | CPU : 1000m MEM: 6GiB |
| team-3-prd | 0 | 0B | 0B | 0 % | 0 | 0 | 0 % | true | true | CPU : 1000m MEM: 6GiB |
+--------------+------+---------+-----------------+---------------+---------+-----------------+---------------+-------------+---------------------------+-------------------------------+
| 8 | 96 | | | | | | | | | CPU : 22692M MEM: 64.8 GIB |
+--------------+------+---------+-----------------+---------------+---------+-----------------+---------------+-------------+---------------------------+-------------------------------+
+-------------------------------------------------------------+
| Summary |
+-------------------------------------------------------------+
| Number of nodes 8 |
| Available resources (real) CPU : 64000m MEM: 250.5GiB |
| Available resources (commit) CPU : 60800m MEM: 238GiB |
| Max per NS CPU : 21120m MEM: 82.67GiB |
+-------------------------------------------------------------+
| RESULT OK |
+-------------------------------------------------------------+
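The figures in the Summary box can be reproduced from the command-line ratios. The snippet below is a worked check, not kotaplan's code: the variable names are made up, and the formula for "Max per NS" (ratio applied to the real capacity) is an assumption chosen because it matches the output above.

```python
from decimal import Decimal

# Values copied from the summary and the command line above.
real_cpu_m = Decimal("64000")            # Available resources (real), millicores
real_mem_gib = Decimal("250.5")          # Available resources (real), GiB
cpu_over = mem_over = Decimal("0.95")    # -cpuover / -memover (an under-commit)
cpu_ratio = mem_ratio = Decimal("0.33")  # -cpuratio / -memratio

# "Available resources (commit)": real capacity scaled by the commit ratio.
commit_cpu_m = real_cpu_m * cpu_over
commit_mem_gib = real_mem_gib * mem_over

# "Max per NS": assumed to be the ratio applied to the real capacity.
max_cpu_m = real_cpu_m * cpu_ratio
max_mem_gib = real_mem_gib * mem_ratio

print("commit:", commit_cpu_m, commit_mem_gib)
print("max per namespace:", max_cpu_m, max_mem_gib)
```

The CPU values come out at exactly 60800m and 21120m. The memory values are 237.975 and 82.665 GiB, which the summary shows rounded as 238GiB and 82.67GiB.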
Manage
To help you manage your ResourceQuotas effectively you can use the provided Grafana dashboard. You can
set it up according to your configuration by modifying the dashboard variables.
Global
The global section lets users check the current running configuration (manual) so they can size their claims accordingly.
It also shows what is currently available, reserved and claimable in terms of resources.
Namespaces
This section lists all the managed namespaces to give a rough idea of what is currently used by running containers. It lets you
quickly check which quotas should be increased or decreased.
Namespace Details
This section shows a detailed view of the memory and CPU consumption of a particular namespace. It lets you visually compare
the specification of the quota, the total amount of requests made by containers, and their actual consumption.
Developed with
Versioning
Since version v1.24.0, we have changed the version naming scheme to make it easier to read and understand.
Example: v1.24.0 means that the operator was developed for Kubernetes version 1.24, and the last 0 is the patch number of the operator itself.
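As an illustration of this scheme, a version tag can be split into the targeted Kubernetes version and the operator patch number. `parse_version` is a hypothetical helper written for this example, not something shipped with Kotary.

```python
import re


def parse_version(tag):
    """Split a tag such as "v1.24.0" into (Kubernetes version, operator patch)."""
    match = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"unexpected version tag: {tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return f"{major}.{minor}", patch


kubernetes_version, operator_patch = parse_version("v1.24.0")
print(kubernetes_version, operator_patch)
```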