Understanding Kubernetes Resource Utilization: Unveiling the Mystery of Memory and CPU Usage Beyond 100%
Ever had a moment of surprise while checking the metrics of your node during a Kubernetes experiment? I definitely did! I was scratching my head when I noticed resource utilization going beyond 100%. It got me curious, so I dug deeper to figure out what was going on. In this article, I'll share my findings and demystify how Kubernetes manages to go above and beyond the 100% mark in resource usage.
As you might be familiar, `kubectl top` can be used to get the point-in-time resource utilization of the nodes and pods in the cluster.
$ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
aritra-node-1 878m 94% 6973Mi 102%
After some digging, I found some discussions on GitHub that explain this behavior:
- The CPU and memory utilization (the actual numbers) of a node is the actual resource utilization on that node, based on the metrics API data reported by cAdvisor. This may also include resources consumed by processes running outside of pods (for example, the kubelet and OS daemons). This is also why the sum of the resources consumed by the pods running on the node (`kubectl top pod`) may not match the aggregate resource utilization shown by `kubectl top node`.
- The percentages in the above output are calculated against the node's allocatable resources, not the actual capacity of the node. For example:
  Memory(%) = node_memory_working_set_bytes * 100 / allocatable memory
(Source: Kubernetes)

allocatable = node capacity - kube-reserved - system-reserved - eviction-threshold
Allocatable is a logical concept that guides pod scheduling in Kubernetes. The scheduler assigns pods to nodes based on the node's allocatable memory/CPU and the pods' requests/limits. It is therefore possible for pods to end up using resources beyond allocatable (most likely pods without limits). However, if resource usage on the node reaches the eviction threshold, these pods might get evicted when the node comes under memory pressure.
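You can verify these numbers yourself by pulling the raw working-set figure from the metrics API (the same source `kubectl top` reads, served by metrics-server) and comparing it with the node's allocatable and capacity values. A minimal sketch, assuming metrics-server is installed and jq is available; the node name is taken from the earlier output:

```
# Node name taken from the earlier example output
NODE=aritra-node-1

# Working-set memory reported by the metrics API
# (this is the numerator of the MEMORY% shown by kubectl top node)
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes/${NODE}" | jq -r '.usage.memory'

# Allocatable memory (the default denominator) and total capacity of the node
kubectl get node "${NODE}" \
  -o jsonpath='{.status.allocatable.memory}{"\n"}{.status.capacity.memory}{"\n"}'
```

Note that the values come back in Kubernetes quantities (typically Ki), so you need to normalize units before dividing them to reproduce the percentage.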
From Kubernetes version 1.23, there is an option to calculate these percentages against node capacity instead. The flag is false by default and has to be specified explicitly:
kubectl top node --show-capacity=true
a@cloudshell:~$ kubectl top node
NAME                                              CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
gke-cluster-1-aritra-default-pool-cc5f2b3e-kzpe   163m         17%    1062Mi          37%
gke-cluster-1-aritra-default-pool-cc5f2b3e-mwgg   156m         16%    1012Mi          36%
gke-cluster-1-aritra-default-pool-cc5f2b3e-p0sr   162m         17%    1001Mi          35%

a@cloudshell:~$ kubectl top node --show-capacity=true
NAME                                              CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
gke-cluster-1-aritra-default-pool-cc5f2b3e-kzpe   163m         8%     1062Mi          27%
gke-cluster-1-aritra-default-pool-cc5f2b3e-mwgg   156m         7%     1012Mi          25%
gke-cluster-1-aritra-default-pool-cc5f2b3e-p0sr   162m         8%     1001Mi          25%
As you can see, the actual resource utilization is the same, but the percentages differ depending on whether they are calculated against allocatable or node capacity.
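If you want to see how much kube-reserved, system-reserved, and the eviction threshold eat into your own nodes, you can print allocatable next to capacity straight from the node objects. A quick sketch (the column names here are just labels I picked):

```
# Compare raw capacity with allocatable for every node in the cluster
kubectl get nodes -o custom-columns=\
NAME:.metadata.name,\
CPU_CAPACITY:.status.capacity.cpu,\
CPU_ALLOCATABLE:.status.allocatable.cpu,\
MEM_CAPACITY:.status.capacity.memory,\
MEM_ALLOCATABLE:.status.allocatable.memory
```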
- This discussion doesn't apply to `kubectl top pod`, which doesn't display percentage utilization.
- On the other hand, `kubectl describe node` displays the requests and limits of the pods on the node, all of which are expressed against allocatable.
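For instance, the "Allocated resources" section of the describe output summarizes pod requests and limits as a percentage of allocatable. A sketch to print just that part, using the node name from the earlier output (the amount of surrounding output may vary slightly by version):

```
# Show the requests/limits summary that kubectl describe computes against allocatable
kubectl describe node aritra-node-1 | grep -A 10 "Allocated resources"
```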
Today, numerous observability tools have alerts set up based on these metrics. To make the most of these tools and respond with appropriate actions, it's crucial to understand exactly what these metrics represent.