Kubernetes CPU Limits: Best Practices for Optimal Performance

Kubernetes CPU limits specify the maximum amount of CPU resources that a container can use. These limits prevent a container from consuming excessive CPU, which otherwise might affect other containers within the same node. By defining CPU limits, Kubernetes ensures that each container operates within its defined resource boundaries, promoting fair CPU resource usage across all services.

When a container tries to use more CPU than its limit allows, its CPU usage is throttled, meaning the CPU time available to the container is temporarily cut off. This mechanism is intended to keep the system stable and to stop any single container from monopolizing CPU resources, thereby supporting overall system performance and responsiveness.

The Misconception About K8s CPU Limits

A common piece of advice for Kubernetes newcomers is to set CPU limits to prevent any single container from monopolizing CPU resources, thereby protecting other containers. However, this advice is somewhat misleading. It's actually CPU requests, not limits, that ensure a minimum amount of CPU for your workloads. CPU requests guarantee that a container will receive a certain amount of CPU resources, while CPU limits only cap the maximum CPU resources a container can use. This distinction is important for understanding how to effectively manage CPU resources in a Kubernetes cluster.
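
To make the guarantee concrete, here is a minimal sketch of how CPU might be divided when every container on a node is busy at the same time. It assumes a simplified model in which CPU time is split in proportion to each container's request (Kubernetes translates requests into cgroup CPU weights). The function name, container names, and numbers are hypothetical and only serve the illustration.

```python
def contended_cpu_split(requests_m, node_capacity_m):
    """Split a fully contended node's CPU in proportion to CPU requests.

    requests_m: dict of container name -> CPU request in millicores.
    node_capacity_m: total CPU on the node in millicores.
    Simplified model: every container wants as much CPU as it can get.
    """
    total = sum(requests_m.values())
    if total == 0:
        raise ValueError("at least one container must have a non-zero request")
    return {name: node_capacity_m * req / total for name, req in requests_m.items()}


# Hypothetical containers on a 1-vCPU node (1000m), both CPU-bound at once.
split = contended_cpu_split({"api": 600, "worker": 200}, 1000)
print(split)  # {'api': 750.0, 'worker': 250.0}
```

In this model each container ends up with at least its request (750m vs. 600m, 250m vs. 200m) without any limit being set, because the scheduler never places more requests on a node than the node can serve.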

>> Take a look at how you can significantly improve performance by removing CPU limits

How Kubernetes CPU Limits Work

Kubernetes relies on Linux cgroup (control group) parameters to enforce CPU limits. On cgroup v1, these parameters are (cgroup v2 stores the same two values in a single cpu.max file):

cpu.cfs_period_us: This parameter defines the duration of a "CPU period" in microseconds. By default, Kubernetes sets this to 100,000µs (100ms). This period is the time frame within which the CPU quota is enforced.

cpu.cfs_quota_us: This parameter specifies the total amount of CPU time a container can use within each period. For example, if you set a CPU limit of 0.5 vCore (500 millicores), the quota will be 50,000µs (50ms) per 100ms period.
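
The relationship between a CPU limit and the quota can be sketched in a few lines. This is an illustration of the arithmetic described above, not Kubernetes source code; the function and constant names are our own.

```python
CFS_PERIOD_US = 100_000  # default period mentioned above (100ms)


def cfs_quota_us(limit_millicores, period_us=CFS_PERIOD_US):
    """Return the CPU time (in µs) a container may use per period for a given limit."""
    return limit_millicores * period_us // 1000


print(cfs_quota_us(500))   # 50000  -> 50ms of CPU time per 100ms period
print(cfs_quota_us(2000))  # 200000 -> a 2-core limit allows 200ms of CPU time per period
```

Note that the quota counts CPU time summed across all cores, which is why a limit above one core yields a quota larger than the period itself.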

When a container uses up its quota before the period ends, it gets throttled, meaning it must wait for the next CPU period to continue processing. This throttling can severely impact the performance of your application, especially for multi-threaded or multi-process workloads. The throttling mechanism ensures that no single container can consume more than its allocated CPU resources, but it also means that idle CPU capacity on the node cannot be used by that container.
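
One way to see how often a container is throttled is to look at the cpu.stat file of its cgroup, which exposes the counters nr_periods and nr_throttled. The sketch below parses such a file's contents and computes the share of periods in which throttling occurred. The sample text is made up for illustration, and the location of cpu.stat depends on your node and cgroup version, so we pass the contents in as a string rather than assume a path.

```python
def throttled_fraction(cpu_stat_text):
    """Return the fraction of enforcement periods in which the cgroup was throttled."""
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0  # no limit configured or no periods elapsed yet
    return stats.get("nr_throttled", 0) / periods


# Made-up sample in the cgroup v1 layout; real values come from the node.
SAMPLE_CPU_STAT = """\
nr_periods 1200
nr_throttled 300
throttled_time 26100000000
"""

print(throttled_fraction(SAMPLE_CPU_STAT))  # 0.25
```

A value of 0.25 would mean the container ran out of quota in a quarter of all periods, which is usually a sign that the limit is too tight for the workload.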

Single-Threaded vs. Multi-Threaded Workloads

For single-threaded applications, setting a CPU limit might not cause immediate issues. These applications typically use only one core, so they are less likely to hit the CPU limit quickly. However, for multi-threaded applications, the situation is different. If your container uses multiple cores, it will consume its CPU quota much faster and will be throttled for the remainder of the period. For example, a container using 4 cores with a 0.5 vCore limit will consume its quota in just 12.5ms and be throttled for the remaining 87.5ms of the period. This can lead to significant performance degradation, as the application will spend a considerable amount of time waiting for the next CPU period.
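
The numbers in this example follow directly from the quota arithmetic. Here is a small sketch, assuming the worst case in which every thread is CPU-bound and runs on its own core from the start of the period; the function name is hypothetical.

```python
def throttle_window_ms(limit_millicores, busy_threads, period_ms=100.0):
    """Return (running_ms, throttled_ms) per period for a CPU-bound container.

    Assumes every thread runs flat out on its own core from the start of the period.
    """
    quota_ms = period_ms * limit_millicores / 1000
    running_ms = min(period_ms, quota_ms / busy_threads)
    return running_ms, period_ms - running_ms


print(throttle_window_ms(500, 4))  # (12.5, 87.5)
print(throttle_window_ms(500, 1))  # (50.0, 50.0)
```

The second call shows that even a single thread gets throttled once it is fully CPU-bound and the limit is below one core. It simply takes four times longer to burn through the quota than the 4-thread case.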

Real-Life Implications

Consider a practical scenario: You have a Kubernetes cluster with a node that has 2 vCPUs. You start two pods with no resource requests or limits, so they fall into the BestEffort QoS (Quality of Service) class. These pods will share the CPU resources equally. Now, if you start another two pods in the Guaranteed QoS class, each with a CPU request and limit of 0.5 vCore (500m CPU), you might expect these pods to get more CPU than the BestEffort pods. However, due to the CPU limits, these Guaranteed pods will only use up to their limit, even if there is idle CPU available on the node. This scenario highlights the inefficiency of CPU limits in utilizing available resources.
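
This scenario can be approximated with a small model. The sketch below is a deliberate simplification: it assumes all four pods are CPU-bound, treats them as a flat list of weights, and ignores the real cgroup hierarchy and scheduler details. The weights follow the cgroup v1 convention of 1024 CPU shares per core (so 500m becomes 512) and a minimum of 2 shares for pods without a request. Function and pod names are hypothetical.

```python
def allocate_cpu(pods, capacity_m):
    """Share capacity_m millicores among CPU-bound pods by weight, honoring caps.

    pods maps a pod name to (weight, cap_m), where cap_m is None for "no limit".
    """
    alloc = {name: 0.0 for name in pods}
    active = set(pods)
    remaining = float(capacity_m)
    while active and remaining > 0:
        total_weight = sum(pods[name][0] for name in active)
        # Pods whose weighted share would reach their cap are pinned at the cap.
        capped = {
            name for name in active
            if pods[name][1] is not None
            and remaining * pods[name][0] / total_weight >= pods[name][1]
        }
        if not capped:
            # Nobody hits a cap: hand out everything that is left by weight.
            for name in active:
                alloc[name] = remaining * pods[name][0] / total_weight
            break
        for name in capped:
            alloc[name] = float(pods[name][1])
            remaining -= pods[name][1]
        active -= capped
    return alloc


NODE_M = 2000            # node with 2 vCPUs
BEST_EFFORT_WEIGHT = 2   # minimum weight, used when there is no CPU request
REQUEST_500M_WEIGHT = 512  # 500m * 1024 / 1000

# pod-a and pod-b are BestEffort; pod-c and pod-d request 500m with a 500m limit.
with_limits = allocate_cpu({
    "pod-a": (BEST_EFFORT_WEIGHT, None),
    "pod-b": (BEST_EFFORT_WEIGHT, None),
    "pod-c": (REQUEST_500M_WEIGHT, 500),
    "pod-d": (REQUEST_500M_WEIGHT, 500),
}, NODE_M)

# Same requests but no limits (pod-c and pod-d would then be Burstable, not Guaranteed).
without_limits = allocate_cpu({
    "pod-a": (BEST_EFFORT_WEIGHT, None),
    "pod-b": (BEST_EFFORT_WEIGHT, None),
    "pod-c": (REQUEST_500M_WEIGHT, None),
    "pod-d": (REQUEST_500M_WEIGHT, None),
}, NODE_M)

print({name: round(m) for name, m in sorted(with_limits.items())})
print({name: round(m) for name, m in sorted(without_limits.items())})
```

Under these assumptions, the pods with limits are held at 500m each and the BestEffort pods absorb the rest of the node. Without limits, the same two pods would receive nearly the whole node when they need it, while still leaving CPU to the BestEffort pods whenever they are idle.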

The Drawbacks of Kubernetes CPU Limits

The key takeaway is that CPU limits are not effective for preventing noisy neighbors or protecting nodes from overallocation. Instead, they prevent the use of idle CPU resources, leading to wasted capacity. Therefore, it's generally advisable to avoid setting CPU limits if performance is a priority. CPU limits can lead to underutilization of available resources, which is counterproductive in a resource-constrained environment. They can also cause unpredictable performance, as containers may be throttled even when there is idle CPU capacity available.

When to Use K8s CPU Limits

While CPU limits are generally not recommended for production environments, there are specific scenarios where they can be useful:

Staging Environments: To simulate worst-case scenarios where no idle CPU is available. This can help in testing the behavior of applications under resource constraints.

Consistent Performance: For organizations that prioritize consistent performance over optimal resource utilization, such as Google or managed Kubernetes services like GKE Autopilot. In such cases, the predictability of performance is more important than the efficient use of resources.

>> Take a look at best practices for right-sizing your Kubernetes cluster

How to Limit CPU Usage in Kubernetes

While we generally recommend avoiding CPU limits, there may be situations where you need to implement them. Here's how you can set CPU limits in Kubernetes:

1. In your pod or deployment YAML file, add the following under the resources section:

resources:
  limits:
    cpu: "500m"

This sets a CPU limit of 500 millicores (0.5 CPU cores).

2. Apply the configuration:

kubectl apply -f your-deployment.yaml

Example: Let's say you have a web application that typically uses about 200m CPU but occasionally spikes to 400m during high load. You can set up limits like this:

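apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  # apps/v1 Deployments require a selector that matches the pod template labels;
  # the "app: web-app" label is an illustrative choice
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web-app
        image:   # image reference left blank here; supply your own
        resources:
          requests:
            cpu: "200m"   # typical usage
          limits:
            cpu: "500m"   # headroom for spikes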

This configuration requests 200m CPU (typical usage) and sets a limit of 500m to allow for spikes.

>> Take a look at how you can set resource limits and requests in Kubernetes to improve cluster efficiency

Kubernetes CPU Limit Best Practices

1. Set CPU Requests Appropriately: Ensure that your CPU requests reflect the relative weight you want each container to have. This guarantees that your workloads receive the necessary CPU resources. CPU requests should be based on the actual needs of the application, taking into account its typical and peak usage patterns.

2. Consider Concurrency: Don't set CPU requests higher than the number of concurrent threads or processes your application can utilize. This ensures that the allocated CPU resources are effectively used.

3. Avoid CPU Limits for Performance: If performance is your goal, avoid setting CPU limits to allow your containers to utilize idle CPU resources. This approach maximizes the use of available resources and improves the overall performance of your applications.

Understanding and managing CPU resources in Kubernetes is important for optimizing the performance and efficiency of your applications. By setting appropriate CPU requests and avoiding unnecessary CPU limits, you help your workloads run smoothly and let them use idle CPU when it is available. Remember, CPU limits are not a silver bullet for preventing noisy neighbors; they are more suited for scenarios where consistent performance is a higher priority than optimal resource utilization.

Automatically Set Kubernetes CPU Limits with PerfectScale by DoiT

At PerfectScale by DoiT, we understand that manually setting CPU limits can be challenging and time-consuming. Our platform automates this process, ensuring optimal resource allocation while maintaining system reliability and performance. Our general recommendation is to avoid CPU limits, because they are inefficient, wasteful, and may negatively impact performance. However, PerfectScale's automation is highly customizable: if predictability and consistency are priorities for your organization, or if certain clusters or workloads benefit from limits, our platform can provide optimized CPU limit values tailored to your specific needs.

Ready to get your clusters optimized? Schedule a demo today with PerfectScale by DoiT.
