In the dynamic world of container orchestration, Kubernetes metrics are essential for maintaining the health and performance of your cluster. As the leading container orchestrator, Kubernetes exposes a wealth of metrics that offer insight into the state and performance of your deployments, nodes, and pods. By monitoring these metrics effectively, you can ensure that your applications run smoothly, resources are used efficiently, and potential issues are identified and resolved promptly. The result is a robust, efficient, and reliable Kubernetes environment with maximum uptime, strong performance, and operational stability.
This guide will walk you through the key Kubernetes metrics and how to leverage them for optimal cluster management.
Key Kubernetes metrics to monitor
Monitoring the health of your entire Kubernetes cluster is paramount. It’s essential to know the resource utilization of your cluster, the number of applications running on each node, and whether the nodes are functioning correctly and at what capacity.
Key Metrics
1. Node Resource Usage Metrics: Metrics such as disk and memory utilization, CPU usage, and network bandwidth help determine whether you need to adjust the number or size of nodes in the cluster. Monitoring memory and disk usage at the node level provides crucial insight into your cluster’s performance and workload management. When containers exceed their memory limits, they are terminated (OOM-killed). If a node runs low on memory or disk space, the kubelet flags the condition (MemoryPressure or DiskPressure) and begins reclaiming resources by evicting pods.
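As a quick sketch, these node-level signals can be checked directly with kubectl. The `kubectl top` command assumes the metrics-server add-on is installed, and the jsonpath expression below assumes the standard Node status condition schema:

```shell
# Per-node CPU and memory usage (requires the metrics-server add-on)
kubectl top nodes

# List any node where the kubelet reports memory or disk pressure;
# the jsonpath walks each node's status conditions
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .status.conditions[*]}{.type}={.status}{" "}{end}{"\n"}{end}' \
  | grep -E 'MemoryPressure=True|DiskPressure=True'
```

If the grep prints nothing, no node is currently under memory or disk pressure.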
2. Number of Nodes: The node count indicates the overall capacity of your cluster and directly drives the costs incurred, particularly when using cloud providers.
3. Running Pods per Node: This metric shows whether the available nodes are sufficiently sized and can absorb the pod workload if a node fails. It is especially important if you use node affinity to constrain pod scheduling based on node labels.
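One way to get a quick pod count per node is to aggregate `kubectl get pods -o wide` output. The awk field index is an assumption based on the default column layout, where NODE is the eighth column when `--all-namespaces` is set:

```shell
# Count running pods per node; with --all-namespaces and -o wide,
# the NODE column is field 8 in the default output layout
kubectl get pods --all-namespaces -o wide --field-selector=status.phase=Running \
  | awk 'NR > 1 { count[$8]++ } END { for (n in count) print n, count[n] }'
```

A heavily skewed distribution suggests that losing one node could overwhelm the rest.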
4. Memory and CPU Requests and Limits: Requests define the minimum resources the scheduler guarantees to a container, while limits cap the maximum the kubelet allows it to consume. Allocatable memory reflects the memory available to pods after accounting for the OS and Kubernetes system processes. These metrics tell you whether your nodes have enough capacity for current pod memory requirements and whether the scheduler can place new pods.
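In a pod spec, requests and limits are declared per container. The manifest below is a minimal illustrative sketch; the pod name, image, and values are hypothetical, not recommendations:

```yaml
# Illustrative container resource settings: the scheduler places the pod
# based on requests; the kubelet enforces the limits at runtime
apiVersion: v1
kind: Pod
metadata:
  name: example-app            # hypothetical pod name
spec:
  containers:
    - name: app
      image: example/app:1.0   # hypothetical image
      resources:
        requests:
          cpu: "250m"          # guaranteed minimum: a quarter of a CPU core
          memory: "256Mi"
        limits:
          cpu: "500m"          # hard cap; usage above this is throttled
          memory: "512Mi"      # exceeding this gets the container OOM-killed
```

You can compare these declarations against a node's capacity with `kubectl describe node`, which lists both Allocatable resources and the requests and limits already scheduled onto the node.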
Kubernetes Deployments & Pod Metrics
Pod-level monitoring involves three types of metrics: Kubernetes metrics, container metrics, and application metrics.
Kubernetes Metrics
Kubernetes metrics ensure all pods in a deployment are running and healthy. They provide information on the number of pod instances and their expected count. If the running count falls below the expected count, your cluster may be running out of resources. It's also important to track deployment progress and network throughput.
Key Kubernetes metrics include:
- Current Deployment and DaemonSet Metrics: These track important controllers in your Kubernetes cluster, such as the number of pods created by Deployments and DaemonSets.
- Missing and Failed Pods: These metrics show how many pods are missing (not running when they should be) or have failed.
- Pod Restarts: This metric tracks the number of times pods have restarted.
- CrashLoopBackOff Pods: This indicates issues such as application crashes or faulty configurations that cause pods to crash repeatedly.
- Running vs. Desired Pods: This metric shows the actual vs. expected number of pod instances.
- Pod Resource Usage vs. Requests and Limits: Comparing actual CPU and memory usage against configured requests and limits reveals over- and under-provisioned pods.
- Available and Unavailable Pods: Monitoring these metrics helps identify pods that are running but not ready to accept traffic, indicating possible configuration problems.
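Most of these signals can be checked ad hoc with kubectl; the commands below are a sketch (for ongoing monitoring, dashboards or Prometheus with kube-state-metrics are the usual approach):

```shell
# Desired vs. ready/available replicas per Deployment
kubectl get deployments --all-namespaces

# Pods sorted by restart count (first container's counter)
kubectl get pods --all-namespaces --sort-by='.status.containerStatuses[0].restartCount'

# Pods currently stuck in CrashLoopBackOff
kubectl get pods --all-namespaces | grep CrashLoopBackOff
```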
Container Metrics
Container metrics help determine how close you are to configured resource limits and detect pods stuck in CrashLoopBackOff. Key metrics include:
- Container CPU Usage: This metric shows CPU usage against pod limits.
- Container Memory Utilization: This metric shows memory usage against pod limits.
- Network Usage: This metric shows sent and received data packets and bandwidth usage.
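Per-container usage comes from `kubectl top` (again assuming the metrics-server add-on). As a sketch, the awk step below totals memory per container name, assuming the default `POD NAME CPU(cores) MEMORY(bytes)` column order and Mi units:

```shell
# Per-container CPU and memory usage (requires the metrics-server add-on),
# then sum memory by container name across all pods
kubectl top pods --containers --no-headers \
  | awk '{ gsub(/Mi/, "", $4); mem[$2] += $4 }
         END { for (c in mem) printf "%s %dMi\n", c, mem[c] }'
```

Comparing these totals against configured limits shows how much headroom each container has before throttling or OOM kills.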
Application Metrics
Application metrics measure the performance and availability of applications running inside your Kubernetes pods. These metrics are usually exposed by the applications themselves and depend on the application's business scope. Common application metrics include:
- Application Availability: This metric measures uptime and response times, crucial for optimal performance and user experience.
- Application Health and Performance: This includes performance issues, responsiveness, latency, and error identification in the application layer.
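Applications commonly expose such metrics in the Prometheus text format on a `/metrics` endpoint. The deployment name, port, and metric names below are hypothetical; check your own application's exposition:

```shell
# Forward a local port to a (hypothetical) application deployment
kubectl port-forward deploy/my-app 8080:8080 &

# Scrape its metrics endpoint and filter for request-count and
# latency series (metric names are illustrative)
curl -s http://localhost:8080/metrics \
  | grep -E '^(http_requests_total|http_request_duration_seconds)'
```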
Kubernetes is complex, but understanding the health of your clusters is simple
Monitoring is hard, and Kubernetes makes it even harder. But with a good understanding of Kubernetes, its objects, and its metrics, the challenge becomes much easier to overcome.
Once you’ve learned these basics, you’ll be much more confident in understanding the state of your own Kubernetes clusters.
PerfectScale is the industry's only Kubernetes cost optimization platform designed to improve cost efficiency, application stability, and resilience. PerfectScale displays all your Kubernetes resources and their health on out-of-the-box dashboards. The PerfectScale K8s optimization platform is free to start, so give it a try to see if it's the right solution for you!
Happy building!