April 11, 2025

How to Optimize Karpenter for Efficiency and Cost

Tania Duggal
Technical Writer

Karpenter is a Kubernetes-native autoscaler designed to improve cluster resource efficiency. It dynamically provisions nodes in response to pending workloads, ensuring that applications have the necessary resources precisely when needed. This approach aims to enhance operational efficiency and reduce cloud expenditures.

However, achieving both efficiency and cost-effectiveness with Karpenter isn't easy. While Karpenter offers powerful capabilities, its default settings and some recommended best practices might unintentionally lead to higher costs, even if they improve performance. Therefore, it's crucial to not only follow general guidelines but also adjust them to meet your specific workload needs and cost considerations.

In this article, we’ll discuss Karpenter, its architecture, and best practices for cost and efficiency, and lastly share our own experience of why best practices should be tailored to your environment.

What is Karpenter?

Karpenter is a modern, Kubernetes-native autoscaler designed to address the dynamic needs of containerized workloads. Unlike traditional autoscaling tools, Karpenter is built to automatically and rapidly provision nodes based on the demands of your cluster. Its real-time responsiveness helps ensure that your applications have the required resources exactly when needed, thereby reducing both latency and overhead.

How Does Karpenter Work?

Karpenter is a Kubernetes-native autoscaler designed to dynamically adjust the size of your Kubernetes cluster based on real-time workload demands. At its core, Karpenter continuously monitors the state of your cluster, including metrics from both pods and nodes. This monitoring allows Karpenter to make informed decisions about scaling actions. When it detects that the current resources are insufficient to handle the workload, Karpenter initiates a scaling-up process. This involves provisioning new nodes with the appropriate instance types and sizes that best match the resource requirements of the pending pods. Conversely, when the workload decreases and nodes become underutilized, Karpenter safely scales down the cluster by de-provisioning these nodes, ensuring that running workloads are not disrupted.


One of the key strengths of Karpenter is its ability to optimize resource allocation, which helps reduce operational costs. It achieves this by selecting the most cost-effective instance types and sizes and by efficiently packing workloads onto nodes to maximize resource utilization. However, it's important to note that Karpenter can only optimize resource allocation if the pods are right-sized: it looks at container resource requests and scheduling constraints in order to perform node selection. PerfectScale can help with pod right-sizing, ensuring that Karpenter has accurate information to work with. For more information on how PerfectScale can enhance Karpenter's effectiveness, check out this post.

Karpenter's decision-making process is driven by a set of customizable policies and configurations. Users can define custom provisioning logic using NodePool Custom Resource Definitions (CRDs), specifying parameters such as instance types, zones, and resource limits. This allows for fine-grained control over how resources are allocated and managed within the cluster. Scaling policies can be defined to set minimum and maximum node counts, as well as cooldown periods to control the frequency of scaling actions.
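As a sketch of the NodePool constraints described above, the manifest below restricts instance categories, zones, and architecture, and caps the pool's total capacity; the specific categories, zones, and limits are illustrative values, not recommendations:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default          # references a separately defined EC2NodeClass
      requirements:
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]          # compute-, general-, memory-optimized
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["us-east-1a", "us-east-1b"]  # illustrative zones
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
  limits:
    cpu: "100"       # stop provisioning from this pool past 100 vCPUs
    memory: 200Gi
```

Karpenter will only launch nodes that satisfy every requirement listed, and stops provisioning from this NodePool once the aggregate limits are reached.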

Karpenter was originally developed by AWS to enhance node lifecycle management within Amazon EKS clusters. Seeing its potential, Microsoft later introduced a provider (NAP) for running Karpenter on Azure Kubernetes Service (AKS), providing similar benefits for AKS users.

>> Take a look at the Karpenter: The Ultimate

Key Features of Karpenter:

Rapid Node Provisioning: One of Karpenter’s standout features is its ability to spin up new nodes quickly. It continuously monitors your cluster, identifying when pods are waiting to be scheduled due to a lack of resources. By provisioning nodes on demand, it minimizes the waiting time and keeps your applications running smoothly.

Workload-Aware Instance Selection: Karpenter doesn't just add compute capacity; it adds the right kind of compute. It dynamically assesses the resource requirements of incoming workloads (such as CPU, memory, and even GPU needs) and selects the most appropriate instance types available in your cloud environment. This workload-aware approach ensures that you aren’t overpaying for unnecessary capacity and you get the optimal performance for your specific applications.

Cloud-Native Integration: Designed from the ground up with modern cloud infrastructure in mind, Karpenter seamlessly integrates with major cloud providers. It uses native APIs to make intelligent decisions based on current pricing, available instance types, and regional capacity.

>> Learn more about the Karpenter Pitfalls

Node Lifecycle and Disruption Processes

Node Expiration: One of Karpenter's key features is node expiration, controlled by the expireAfter parameter. This setting defines the maximum lifespan of a node, ensuring that nodes are periodically recycled to incorporate the latest configurations and security updates. By default, nodes are set to expire after 30 days (720 hours), but this duration can be customized to align with specific operational requirements. Upon reaching their expiration, Karpenter initiates a graceful shutdown process: it taints the node to prevent new pods from being scheduled, evicts existing pods while respecting their Pod Disruption Budgets (PDBs), and finally, terminates the node. This approach maintains cluster stability and minimizes service interruptions.

Disruption: Karpenter has consolidation policies to optimize resource utilization and reduce costs. Consolidation identifies opportunities to remove underutilized nodes by either redistributing their workloads to other nodes with available capacity or replacing them with more cost-effective instances. Karpenter evaluates nodes for consolidation based on their utilization, aiming to decrease overall expenses.

The behavior of consolidation is controlled by two configurations:

a. consolidationPolicy: It determines when a node is considered “consolidatable.” You can choose between:

WhenEmpty: Nodes are consolidated only when they have no running (non-daemon) pods.

WhenEmptyOrUnderutilized: This consolidation policy allows for the removal of nodes that are either completely empty or only lightly used.

b. consolidateAfter: This sets a delay after a scheduling event before Karpenter checks if consolidation is possible. It helps avoid unnecessary node churn due to short-term workload changes.
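A minimal disruption block combining both settings might look like this; the 5-minute delay is an illustrative value, not a recommendation:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: nodepool
spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized  # consolidate empty or lightly used nodes
    consolidateAfter: 5m                           # wait 5 minutes after scheduling changes
```

A longer consolidateAfter smooths out short-lived workload spikes at the cost of slower cost recovery; a shorter one reclaims capacity faster but risks node churn.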

Now, let’s look at the three types of consolidation strategies Karpenter uses:

1. Empty Node Consolidation

This is the simplest case. If a node has no meaningful pods on it (just daemonsets, for example), it gets shut down immediately. These can be deleted in parallel across your cluster.

2. Multi-Node Consolidation

This is a more complex optimization where Karpenter tries to replace two or more underutilized nodes with a single, cheaper one. It evaluates combinations of underutilized nodes to find the best set to consolidate.

3. Single Node Consolidation

In this case, each node is evaluated individually. If the workloads on one node can either move to other existing nodes or be replaced by a cheaper instance, Karpenter will trigger that swap. 

Drift management: Drift occurs when a node's actual state diverges from its desired configuration due to changes in the NodePool or EC2NodeClass specifications. Karpenter continuously monitors for such inconsistencies and automatically corrects them by updating or replacing the affected nodes. This self-healing capability maintains consistency across the cluster and ensures that all nodes adhere to the defined configurations, which improves reliability and performance.

To provide granular control over disruptions, Karpenter provides both pod-level and node-level annotations. By adding the karpenter.sh/do-not-disrupt: "true" annotation to a pod or node, users can prevent Karpenter from disrupting these resources during consolidation or drift management activities. This is useful for workloads that require high availability or have long-running processes that should not be interrupted.
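At the pod level, the annotation goes in the pod's metadata; the pod name and image below are hypothetical examples:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: critical-batch-job               # hypothetical workload name
  annotations:
    karpenter.sh/do-not-disrupt: "true"  # pod-level control: blocks voluntary disruption
spec:
  containers:
    - name: worker
      image: busybox                     # illustrative image
```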

How is a decision made behind the scenes?

Karpenter evaluates different factors to determine the most appropriate nodes for consolidation:​

a. Nodes with Fewer Pods: Prioritizing nodes that host fewer pods ensures that the consolidation process affects the fewest workloads, thereby reducing potential disruption.

b. Nodes Nearing Expiration: Nodes that are approaching their predefined expiration time are considered prime candidates for consolidation, aligning with maintenance schedules and resource optimization strategies.

c. Nodes Running Low-Priority Pods: By targeting nodes that predominantly run lower-priority pods, Karpenter ensures that critical workloads remain unaffected during the consolidation process.

If a node cannot be removed, you can check the Karpenter logs for detailed events explaining why.

Best Practices for Optimizing Karpenter

Let’s discuss the best practices to optimize Karpenter: 

1. Configuring Node Expiration: You should set the expireAfter parameter to ensure that nodes are periodically replaced, incorporating the latest security patches and performance improvements. This proactive approach reduces the risk of long-term drift and potential vulnerabilities. It's essential to set an appropriate expiration time based on your workload type to balance security and cost-effectiveness.

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: nodepool
spec:
  template:
    spec:
      expireAfter: 168h # 7 days

2. Setting the Termination Grace Period: The terminationGracePeriod parameter defines the maximum duration Karpenter will wait for a node to drain before forcefully terminating it. A well-configured termination grace period balances giving pods sufficient time to exit gracefully and ensuring nodes are cycled promptly to save costs. Setting this period too short may lead to abrupt terminations and unstable workloads, while an excessively long period can delay cost savings. It's advisable to calculate a grace period according to your environment.

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: nodepool
spec:
  template:
    spec:
      terminationGracePeriod: 30m

3. Using the karpenter.sh/do-not-disrupt Annotation: You should apply this annotation to critical pods to prevent them from being evicted during consolidation activities. While it safeguards essential workloads, overusing this annotation can block node deletion during scheduled refreshes, leading to resource inefficiencies. You should reserve this annotation for truly critical, short-lived processes or interactive jobs, and avoid applying it to long-running stateful services unless absolutely necessary.

apiVersion: v1
kind: Node
metadata:
  annotations:
    karpenter.sh/do-not-disrupt: "true"  # Node-level control

4. Applying Pod Disruption Budgets (PDBs): PDBs ensure application availability during node disruption events by specifying the minimum number of pods that must remain available during disruptions. Configuring PDBs requires choosing between minAvailable and maxUnavailable settings based on your workload's Service Level Agreements (SLAs). Well-configured PDBs allow Karpenter to drain nodes gracefully and minimize downtime. You should also regularly update PDBs to reflect autoscaling dynamics to maintain their effectiveness.
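A simple PDB for a hypothetical three-replica web deployment might look like this; the name and labels are illustrative and must match your own workload:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb          # hypothetical name
spec:
  minAvailable: 2        # keep at least 2 replicas running during voluntary disruptions
  selector:
    matchLabels:
      app: web           # must match your deployment's pod labels
```

With this in place, Karpenter's node draining will evict at most one of the three replicas at a time.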

5. NodePool Configuration and Layered Constraints: You can design NodePools tailored to different workloads to enhance resource utilization and resilience. For example, separate NodePools for stateful and stateless workloads allow for optimized instance type selections. You can also use node selectors, affinities, and tolerations to further refine scheduling, ensuring workloads are placed on appropriate nodes without compromising resilience.
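As a sketch, a dedicated stateful NodePool could taint its nodes so that only workloads carrying a matching toleration land there; the taint key and instance category below are assumptions, not Karpenter defaults:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: stateful
spec:
  template:
    spec:
      taints:
        - key: workload-type        # hypothetical taint key
          value: stateful
          effect: NoSchedule        # only pods tolerating this taint can schedule here
      requirements:
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["r"]             # memory-optimized instances for stateful workloads
```

Stateful pods would then carry a matching toleration (and optionally a node selector) to be scheduled onto these nodes, while stateless workloads land on a separate, untainted NodePool.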

6. Consolidation Policies for Cost Optimization: Karpenter's consolidation feature optimizes resource usage by identifying underutilized nodes and consolidating workloads onto fewer nodes. This process can lead to cost savings through efficient bin-packing and resource aggregation. However, it's essential to balance the potential disruptions during consolidation with the benefits of cost savings and operational efficiency improvements.

7. Balancing Spot and On-Demand Instances: Using a mix of Spot and On-Demand instances allows you to capitalize on cost benefits while ensuring stability for critical components. You should configure NodePools with appropriate weighting and instance limits to align with your Savings Plan commitments. This strategy enables you to optimize costs without compromising the reliability of essential workloads.
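One way to sketch this is a NodePool that permits both capacity types; on AWS, Karpenter prefers Spot when both are allowed. The name, weight, and limits here are illustrative assumptions:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-first              # hypothetical name
spec:
  weight: 100                   # evaluated before lower-weight NodePools
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]   # allow both; Spot is preferred when available
  limits:
    cpu: "64"                   # cap this pool, e.g. to align with Savings Plan commitments
```

A lower-weight, On-Demand-only NodePool can then serve as the fallback for critical workloads.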

Our Story: When Best Practices Backfire

- Written by Olexandr Veleten

In our journey, we followed Karpenter's recommended best practices. We configured nodes with the expireAfter parameter to ensure regular rotation.

For our critical workloads, we applied the karpenter.sh/do-not-disrupt annotation to prevent disruptions during consolidation activities. On paper, this approach seemed flawless.​

When the nodes reached their expiration, Karpenter initiated the provisioning of new nodes as expected. But a complication arose: the existing nodes couldn't be terminated because the do-not-disrupt annotation prevented the eviction of the critical pods they hosted. This resulted in a temporary scenario where both the old and new nodes ran concurrently, effectively doubling our capacity and, consequently, our costs.​

To address this, we set the terminationGracePeriod parameter, which defines the maximum duration for node draining before forceful termination. While this parameter ensures that nodes are eventually decommissioned, it introduces its own set of challenges. If multiple stateful nodes are terminated simultaneously without properly configured Pod Disruption Budgets (PDBs), the cluster's stability could be compromised.

Given these complexities, we decided to disable the expireAfter setting for our stateful workloads. Instead, we opted for manual updates of these nodes during scheduled maintenance windows or along with Kubernetes upgrades, roughly every six months. This approach allowed us to maintain control over node lifecycle events, ensuring both cost efficiency and cluster stability.

This experience highlighted a crucial lesson: while best practices serve as valuable guidelines, they aren't one-size-fits-all solutions. It's imperative to assess and tailor configurations to align with the unique demands of your workloads and operational environment. By doing so, you can harness the full potential of tools like Karpenter without unexpected consequences.

Cost and Usage graph

PerfectScale Dashboard

Karpenter vs. Cluster Autoscaler

Karpenter represents a more modern, flexible approach to Kubernetes cluster scaling, offering faster provisioning times and more efficient resource utilization. It's particularly well-suited for dynamic environments with varying workload requirements. Cluster Autoscaler, on the other hand, is a more established solution that works well with predefined node groups and offers broader cloud provider support. It's a reliable choice for more static environments or when working with multiple cloud providers.

Karpenter vs. Cluster Autoscaler
>> See here how to get the most out of Karpenter with smart pod right-sizing.

Get the Most Out of Karpenter with PerfectScale

Integrating Karpenter with PerfectScale can significantly enhance your Kubernetes cluster's efficiency. While Karpenter offers intelligent, just-in-time node provisioning, it may lack deep insights into your workloads' historical resource utilization and reliability needs. PerfectScale fills this gap by analyzing workload patterns and providing optimization recommendations. This synergy has enabled customers to achieve an additional 30 to 50% cost reduction beyond what Karpenter alone offers. For example, PerfectScale can identify scenarios where Karpenter might overprovision resources and suggest configurations to prevent unnecessary costs and potential reliability impacts. Try PerfectScale to see how it can enhance your Karpenter-managed cluster. Sign up or Book a demo to learn more.
