Karpenter is an open-source, flexible, and high-performance Kubernetes cluster autoscaler developed by AWS. It is designed to optimize the provisioning and management of compute resources in a Kubernetes cluster. Unlike traditional autoscalers that rely on static configurations and predefined instance types, Karpenter dynamically provisions the right compute resources based on the specific needs of the workloads running in the cluster.
In this ultimate guide, we'll cover how Karpenter works, how to implement it for dynamic node provisioning, how to optimize it, and best practices for both Karpenter and NodePools.
Let’s dig in!
How Does Karpenter Work?
Karpenter is a Kubernetes-native autoscaler designed to dynamically adjust the size of your Kubernetes cluster based on real-time workload demands. At its core, Karpenter continuously monitors the state of your cluster, including metrics from both pods and nodes. This monitoring allows Karpenter to make informed decisions about scaling actions. When it detects that the current resources are insufficient to handle the workload, Karpenter initiates a scaling-up process. This involves provisioning new nodes with the appropriate instance types and sizes that best match the resource requirements of the pending pods. Conversely, when the workload decreases and nodes become underutilized, Karpenter safely scales down the cluster by de-provisioning these nodes, ensuring that running workloads are not disrupted.
One of the key strengths of Karpenter is its ability to optimize resource allocation, which helps in reducing operational costs. It achieves this by selecting the most cost-effective instance types and sizes and by efficiently packing workloads onto nodes to maximize resource utilization. However, it's important to note that Karpenter can only optimize resource allocation if the pods themselves are right-sized: it looks at container resource requests and scheduling constraints to perform node selection. PerfectScale can help with pod right-sizing, ensuring that Karpenter has accurate information to work with. For more information on how PerfectScale can enhance Karpenter's effectiveness, check out this post.
Karpenter's decision-making process is driven by a set of customizable policies and configurations. Users can define custom provisioning logic using NodePool Custom Resource Definitions (CRDs), specifying parameters such as instance types, zones, and resource limits. This allows for fine-grained control over how resources are allocated and managed within the cluster. Scaling policies can be defined to set minimum and maximum node counts, as well as cooldown periods to control the frequency of scaling actions.
Implementing Karpenter for Dynamic Node Provisioning
Prerequisites:
a. AWS CLI: Command-line interface for AWS.
b. kubectl: Kubernetes command-line tool.
c. eksctl (>= v0.180.0): CLI for creating and managing EKS clusters.
d. helm: Kubernetes package manager.
Configure the AWS CLI with a user that has sufficient privileges to create an EKS cluster. Verify the CLI can authenticate properly by running:
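```bash
aws sts get-caller-identity
```

This should return the account ID and ARN of the user you configured.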
Step 1: Set Environment Variables
Set the necessary environment variables for Karpenter and Kubernetes:
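For example (the version numbers and names below are placeholders; use values that match your environment):

```bash
export KARPENTER_VERSION="0.37.0"     # Karpenter release to install
export K8S_VERSION="1.30"             # EKS Kubernetes version
export CLUSTER_NAME="karpenter-demo"
export AWS_DEFAULT_REGION="us-west-2"
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
```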
Step 2: Create a Cluster and install Karpenter
Use eksctl with a configuration file to create your EKS cluster and install Karpenter simultaneously. Here’s how:
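Save a ClusterConfig manifest as cluster.yaml. The sketch below relies on eksctl's built-in Karpenter support; the cluster name, region, versions, and node group sizing are illustrative:

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: karpenter-demo
  region: us-west-2
  version: "1.30"
  tags:
    karpenter.sh/discovery: karpenter-demo   # used later by the EC2NodeClass selectors
iam:
  withOIDC: true                             # Karpenter authenticates via IRSA
karpenter:
  version: "0.37.0"                          # match KARPENTER_VERSION set earlier
  createServiceAccount: true
  withSpotInterruptionQueue: true            # creates the SQS queue for Spot interruptions
managedNodeGroups:
  - name: baseline-ng                        # small static node group for system pods
    instanceType: m5.large
    minSize: 2
    maxSize: 3
    desiredCapacity: 2
```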
Create the cluster using:
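```bash
eksctl create cluster -f cluster.yaml
kubectl get pods -n karpenter   # verify the Karpenter controller is up
```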
This configuration not only creates the cluster but also installs Karpenter and sets up the SpotInterruptionQueue, which allows Karpenter to replace Spot Instances before they are reclaimed.
It's important to note that even though Karpenter does dynamic node provisioning, we still need a predefined nodegroup with 2 nodes. This initial nodegroup serves as a baseline for running critical system pods and ensures that there's always a minimum capacity available for the cluster to function, even if Karpenter encounters issues.
Step 3: Create a NodePool
In this step, we're creating both a NodePool and an EC2NodeClass. The NodePool defines the requirements and limits for the nodes that Karpenter will provision, while the EC2NodeClass specifies the details of the EC2 instances that will be created. This NodePool uses securityGroupSelectorTerms and subnetSelectorTerms to discover resources for launching nodes.
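A minimal pair of manifests might look like this (a sketch using the v1beta1 API; the node role name and discovery tag are assumptions based on the cluster setup above):

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
  limits:
    cpu: 100                                  # cap on total CPU this NodePool may provision
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2
  role: "KarpenterNodeRole-karpenter-demo"    # node IAM role from your cluster setup
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: karpenter-demo
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: karpenter-demo
```

Apply both with kubectl apply -f nodepool.yaml.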
Step 4: Scale Up Deployment
Deploy a sample application to test Karpenter's scaling capabilities:
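A common test workload is the pause-container "inflate" deployment used in Karpenter's getting-started guide:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1    # large enough that several replicas won't fit on the baseline nodes
```

Scale it up and watch Karpenter provision capacity:

```bash
kubectl apply -f inflate.yaml
kubectl scale deployment inflate --replicas 5
kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller
```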
Step 5: Scale Down Deployment
Delete the deployment to observe Karpenter's node consolidation:
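```bash
kubectl delete deployment inflate
```

With consolidation enabled, Karpenter notices the now-empty nodes and terminates them; the controller logs show the disruption decisions as they happen.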
Step 6: Manual Node Deletion
If you need to delete a node manually, Karpenter will handle the graceful shutdown:
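```bash
kubectl delete node "${NODE_NAME}"   # a node that Karpenter provisioned
```

Karpenter adds a finalizer to the nodes it provisions, so deleting the Node object triggers a cordon and drain before the underlying instance is terminated.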
Step 7: Clean Up
To avoid charges, remove the demo infrastructure:
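```bash
kubectl delete deployment inflate --ignore-not-found
eksctl delete cluster -f cluster.yaml
```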
Advantages of Karpenter
Karpenter does more than just scale nodes. Here are some of its key advantages:
1. Cost Optimization
Karpenter helps in cost optimization by dynamically provisioning the most cost-effective compute resources based on real-time workload demands. It supports a wide range of instance types, including spot instances, which are cheaper than on-demand instances. By intelligently selecting and scaling down underutilized nodes, Karpenter helps organizations minimize their cloud infrastructure costs.
It's important to note that pod right-sizing is essential for actual optimization of node selection. PerfectScale can help with this process, making sure that your pods are requesting the appropriate resources. Readers can use PerfectScale's InfraFit Plugin to evaluate and fine-tune their NodePool configurations. InfraFit provides a granular view of resource allocation across the various nodes supporting your Kubernetes clusters, helping you make informed decisions about your infrastructure.
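A NodePool along these lines illustrates the idea (a sketch using the v1beta1 API; the name and nodeClassRef are placeholders):

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: cost-optimized
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]       # prefer Spot, fall back to on-demand
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
  disruption:
    consolidationPolicy: WhenUnderutilized    # actively repack and remove underutilized nodes
```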
This configuration allows both spot and on-demand instances to be used, ensuring flexibility in cost optimization. By enabling consolidationPolicy: WhenUnderutilized, Karpenter actively removes underutilized nodes to minimize costs.
2. Support for Diverse Workloads
Karpenter is designed to handle a variety of workloads, including machine learning (ML) and generative AI applications. These workloads often have unique resource requirements and can be highly variable in nature. Karpenter's flexible provisioning capabilities ensure that the right type and amount of resources are available to meet the specific needs of these complex workloads, as long as the pods have been right-sized. For right-sizing the pods, PerfectScale provides plugins and insights to optimize your workload resource allocation.
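For example, a GPU-oriented NodePool might look like this (a sketch; the instance types and timings mirror the description below):

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: gpu-ml
spec:
  template:
    spec:
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["p3.2xlarge", "p3.8xlarge"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 60s    # terminate nodes 60 seconds after they become empty
```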
This configuration ensures that only on-demand p3.2xlarge and p3.8xlarge instances are used. Nodes are terminated 60 seconds after becoming empty, conserving resources.
3. Simplified Upgrades and Patching
Managing upgrades and patches in a Kubernetes environment can be challenging. Karpenter simplifies this process by enabling node replacements and rolling updates. It can automatically provision new nodes with the latest patches and updates, and gracefully decommission old nodes. This automated approach reduces the operational burden and enhances cluster security and stability.
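For example, in a NodePool spec (excerpt):

```yaml
spec:
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h    # nodes older than 30 days are drained and replaced
```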
Here, expireAfter: 720h ensures nodes are replaced after 30 days, keeping your cluster secure and up-to-date.
4. Kubernetes Native
Being Kubernetes-native, Karpenter integrates seamlessly with the Kubernetes ecosystem. It leverages Kubernetes APIs and works in harmony with other Kubernetes components, such as the scheduler and controller manager. This native integration ensures that Karpenter can efficiently manage resources and scale applications without requiring significant changes to existing Kubernetes setups. It also benefits from the robustness and reliability of the Kubernetes platform.
5. Advanced Scheduling Capabilities
Karpenter provides advanced scheduling features like bin-packing and topology-aware scheduling. Bin-packing optimizes resource utilization by packing workloads onto fewer nodes, reducing the overall number of nodes required. Topology-aware scheduling ensures that workloads are distributed in a way that maximizes performance and resilience, taking into account factors like network latency and fault domains. These advanced scheduling capabilities help in achieving better resource efficiency and application performance.
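Two illustrative excerpts follow: a NodePool that enables consolidation, and a pod spec that requests zone-aware spreading (the app label is a placeholder):

```yaml
# NodePool excerpt: enable consolidation-driven bin-packing
spec:
  disruption:
    consolidationPolicy: WhenUnderutilized
```

```yaml
# Pod spec excerpt: spread replicas across zones; Karpenter honors this when provisioning
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: my-app
```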
In the first excerpt, the consolidationPolicy setting enables bin-packing, which optimizes resource utilization by packing workloads onto fewer nodes.
6. Enhanced Scalability and Performance
Karpenter's real-time data analysis and integration with cloud provider APIs enable it to respond quickly to changing workload demands. This ensures that applications have the necessary resources to maintain optimal performance, even under varying loads. By dynamically scaling resources up and down, Karpenter enhances the overall scalability and responsiveness of Kubernetes clusters.
>> Take a look at how to get the most out of Karpenter with PerfectScale
Karpenter Optimization
Karpenter optimizes cost using a feature called consolidation. Let's say your Kubernetes cluster's worker nodes look like this: over time, you have four EC2 worker nodes running. The first worker node is running efficiently, with all pods tightly packed and no wasted capacity. The last three worker nodes, however, are far less efficient, with a significant amount of unused, wasted capacity.
You might wonder how the cluster worker nodes ended up like this. It can happen that over time, traffic increased and the Horizontal Pod Autoscaler created many pods, prompting Karpenter to provision more nodes. At one point, all four nodes had four pods each, assuming each EC2 instance can host up to four application pods. As traffic decreased, some pods were terminated, leaving the EC2 instances underutilized.
With Karpenter, you can enable consolidation in the NodePool by setting the disruption consolidationPolicy to WhenUnderutilized. Karpenter will then automatically detect underutilized EC2 instances and bin-pack their pods. For example, it can move pods from the last two EC2 instances onto the second one and then terminate the third and fourth EC2 instances. This leads to better utilization of worker nodes and reduced costs.
Consider another scenario where the second and third worker nodes are m5.xlarge instances. Even if Karpenter bin-packs their two pods onto one of these instances, capacity is still wasted, as 50% of that instance remains idle. Karpenter is smart enough to instead create a new, smaller m5.large instance and move the two pods from the larger instances onto it. This way, Karpenter gets rid of both larger nodes and launches a smaller node that accommodates both pods, leading to a better selection of worker nodes and reduced costs.
Karpenter works to reduce cluster costs by:
- Removing empty nodes.
- Removing nodes by moving pods to other underutilized nodes.
- Replacing nodes with cheaper variants.
You can turn on consolidation by specifying it under the disruption field, as shown in the example after this list. consolidateAfter determines how long Karpenter waits before disrupting an eligible EC2 instance, and the consolidationPolicy can be set to WhenEmpty or WhenUnderutilized.
- WhenEmpty: Karpenter will only consider nodes for consolidation that contain no workload pods. It's fine if system pods or DaemonSets are running; Karpenter treats the node as empty as long as no application pods are on it.
- WhenUnderutilized: Karpenter will attempt to remove or replace nodes whenever a node is underutilized and could be changed to reduce costs.
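A minimal example in a NodePool spec (excerpt; note that in the v1beta1 API, consolidateAfter can only be combined with WhenEmpty):

```yaml
spec:
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 30s    # wait 30 seconds after a node becomes empty before removing it
```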
Once enabled, consolidation will start saving you money. However, you might not want certain nodes or pods to be consolidated, even when a node is heavily underutilized or empty. There are ways to control the different kinds of disruption, such as consolidation, expiration, and drift; once you understand their default behavior, you can tune them to prevent unwanted disruptions.
Note: Karpenter's consolidation feature can lead to reliability issues for memory burstable pods. Consolidation and scheduling in general work by comparing the pods' resource requests vs. the amount of allocatable resources on a node. The resource limits are not considered. As an example, pods that have a memory limit that is larger than the memory request can burst above the request. If several pods on the same node burst at the same time, this can cause some of the pods to be terminated due to an out-of-memory (OOM) condition. Consolidation can make this more likely to occur as it works to pack pods onto nodes only considering their requests.
Karpenter Best Practices
1. Use Karpenter for Dynamic Workloads
Karpenter performs well in environments where workloads have fluctuating capacity needs. Unlike Amazon EC2 Auto Scaling Groups (ASGs) and Managed Node Groups (MNGs), which rely on AWS-level metrics like EC2 CPU load, Karpenter integrates more closely with Kubernetes-native APIs. This integration allows for more flexible and efficient scaling, especially for workloads that experience high, spiky demand or have diverse compute requirements. While ASGs and MNGs are suitable for static and consistent workloads, Karpenter is ideal for dynamic environments. You can also use a mix of dynamically and statically managed nodes to meet your specific requirements.
2. Run Karpenter Controller on EKS Fargate or a Dedicated Node Group
Karpenter is installed using a Helm chart, which deploys the Karpenter controller and a webhook pod as a Deployment. It is important to ensure that these components run on a stable environment. We recommend running the Karpenter controller on EKS Fargate or a dedicated node group. This setup ensures that Karpenter itself is not managed by Karpenter, thereby avoiding potential disruptions.
>> Take a look also at How to use Karpenter with AWS Reserved Instances
3. Exclude Unnecessary Instance Types
When configuring Karpenter, it is essential to exclude instance types that do not fit your workload requirements. For example, if your workloads do not require large Graviton instances, you can exclude them using the node.kubernetes.io/instance-type key. This exclusion helps in optimizing resource utilization and cost. PerfectScale's InfraFit plugin can provide valuable insights into which instance types are most suitable for your workloads, helping you make informed decisions about which types to include or exclude.
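For example, a requirements entry like this excludes specific types (the instance types listed are just for illustration):

```yaml
requirements:
  - key: node.kubernetes.io/instance-type
    operator: NotIn
    values: ["m6g.16xlarge", "r6g.16xlarge"]   # large Graviton types this workload never needs
```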
4. Enable Interruption Handling for Spot Instances
Karpenter supports native interruption handling, which is important for managing Spot Instances. Spot Instances can be interrupted with little notice, and Karpenter can handle these interruptions gracefully. By configuring the --interruption-queue CLI argument with the name of an SQS queue, Karpenter can taint, drain, and terminate affected nodes ahead of time, ensuring that workloads are moved to new nodes before the interruption occurs.
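With the eksctl configuration shown earlier, the interruption queue is created and wired up for you. If you install Karpenter with Helm yourself, you pass the queue name as a chart setting, roughly like this (setting names vary between chart versions, so verify against the version you run):

```bash
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace karpenter --create-namespace \
  --version "${KARPENTER_VERSION}" \
  --set settings.interruptionQueue="${CLUSTER_NAME}"
```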
5. Configure for Private EKS Clusters
If you are running an Amazon EKS cluster in a VPC without outbound internet access, you need to configure your environment according to the private cluster requirements. This configuration includes creating an STS VPC regional endpoint and an SSM VPC endpoint. These endpoints are necessary for Karpenter to function correctly in a private cluster.
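For example, the STS endpoint can be created like this (the VPC, subnet, and security group IDs are hypothetical; repeat with com.amazonaws.<region>.ssm for the SSM endpoint):

```bash
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --vpc-endpoint-type Interface \
  --service-name "com.amazonaws.${AWS_DEFAULT_REGION}.sts" \
  --subnet-ids subnet-0123456789abcdef0 \
  --security-group-ids sg-0123456789abcdef0
```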
6. Overprovisioning to Improve Responsiveness
In scenarios where you expect a sudden surge in workload, such as a data pipeline process that needs to launch a large number of pods simultaneously, overprovisioning can significantly improve responsiveness. This involves deploying a "dummy" workload with a low PriorityClass to reserve capacity in advance. When the actual workload is deployed, the "dummy" pods are evicted, making room for the new pods to start almost immediately. However, it's important to note that this approach trades off resource efficiency for responsiveness. Overprovisioning can lead to increased costs and resource waste, as you are maintaining additional capacity that may not always be utilized. Therefore, use this strategy judiciously and monitor your resource usage and costs closely. To implement overprovisioning, follow these steps:
1. Deploy the "dummy" workload:
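A minimal sketch: a low-priority PriorityClass plus a pause-container deployment whose only job is to hold capacity (names and sizes are placeholders):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -1                      # below the default (0), so these pods are evicted first
globalDefault: false
description: "Placeholder pods that reserve spare capacity."
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning
spec:
  replicas: 2
  selector:
    matchLabels:
      app: overprovisioning
  template:
    metadata:
      labels:
        app: overprovisioning
    spec:
      priorityClassName: overprovisioning
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9    # does nothing; just holds the requested resources
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
```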
2. Deploy the actual workload:
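Its pods, running at default or higher priority, preempt the placeholders and start almost immediately:

```bash
kubectl apply -f my-pipeline.yaml   # hypothetical manifest for your real workload
```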
3. Scale down the "dummy" workload if not needed:
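```bash
kubectl scale deployment overprovisioning --replicas=0
```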
>> Take a look at Guide to monitoring Karpenter with Prometheus
NodePool Best Practices
1. Create Multiple NodePools for Different Requirements
When different teams share a cluster, or when workloads have varying OS or instance type requirements, it is advisable to create multiple NodePools. For example, one team may require GPU instances for machine learning workloads, while another team may need general-purpose instances. By creating multiple NodePools, you can ensure that each team has access to the most appropriate resources.
2. Use Mutually Exclusive or Weighted NodePools
To provide consistent scheduling behavior, create NodePools that are either mutually exclusive or weighted. If multiple NodePools match a workload, Karpenter will choose one at random, which can lead to unexpected results. For example, you can create a NodePool for GPU instances with specific taints and another for general compute instances with node affinities. This setup ensures that workloads are scheduled on the appropriate nodes.
In a Kubernetes environment, managing costs and assigning them to the appropriate teams can be a complex task. One effective strategy is to use mutually exclusive NodePools, which can help in assigning cost ownership to different billing teams.
Let’s take an example scenario of different billing teams and how they can use mutually exclusive NodePools:
a. Define NodePools with Specific Constraints:
Create NodePools with constraints that match the resource requirements of each team. For example, you can create a NodePool for GPU instances and another for general-purpose instances.
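A sketch of two such NodePools (the names, taints, and instance families are illustrative):

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: team-a-gpu
spec:
  template:
    spec:
      taints:
        - key: team-a/gpu                     # only pods that tolerate this taint land here
          effect: NoSchedule
      requirements:
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["p3", "g5"]
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: team-b-general
spec:
  template:
    spec:
      labels:
        team: team-b                          # workloads select this label
      requirements:
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["m", "c"]
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
```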
b. Deploy Workloads with Specific Affinities and Tolerations:
Ensure that the workloads are scheduled on the appropriate NodePools by using node affinities and tolerations.
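For example, Team A's GPU workload tolerates the GPU taint and selects a GPU instance family (the image and resource figures are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: team-a-training
spec:
  replicas: 1
  selector:
    matchLabels:
      app: team-a-training
  template:
    metadata:
      labels:
        app: team-a-training
    spec:
      tolerations:
        - key: team-a/gpu
          operator: Exists
          effect: NoSchedule
      nodeSelector:
        karpenter.k8s.aws/instance-family: p3
      containers:
        - name: trainer
          image: my-registry/trainer:latest   # hypothetical training image
          resources:
            requests:
              cpu: "4"
              memory: 16Gi
```

Team B's workloads would instead carry a nodeSelector such as team: team-b and no toleration, so they can never land on the tainted GPU nodes.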
With this, you can ensure that Team A's GPU workloads are scheduled on GPU instances, and Team B's general compute workloads are scheduled on general-purpose instances. This not only optimizes resource utilization but also ensures that the costs are accurately attributed to each team. For example, if Team A's GPU instances are more expensive, the cost will be reflected in their billing, allowing for better budget management and accountability. Similarly, Team B will only be charged for the general-purpose instances they use, preventing any unexpected cost overruns.
3. Use Timers to Automatically Delete Nodes
Karpenter allows you to set timers on provisioned nodes to automatically delete them when they are no longer needed. This feature is useful for upgrading nodes, as it enables you to retire and replace nodes with updated versions. You can configure node expiry using the spec.disruption.expireAfter field in the NodePool specification.
4. Avoid Overly Constraining Instance Types
When using Spot Instances, avoid placing too many constraints on the instance types that Karpenter can provision. Karpenter uses the Price Capacity Optimized allocation strategy to provision instances from the deepest pools with the lowest risk of interruption. By allowing Karpenter to use a diverse set of instance types, you can optimize the availability and cost of your Spot Instances.
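For instance, constraining by broad categories rather than explicit types leaves Karpenter room to choose from many Spot pools (an illustrative excerpt):

```yaml
requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["spot"]
  - key: karpenter.k8s.aws/instance-category
    operator: In
    values: ["c", "m", "r"]    # many families and sizes remain eligible
```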
Enhancing Karpenter with PerfectScale
Combining PerfectScale with Karpenter can improve your Kubernetes cluster's efficiency and deliver an additional 30 to 50% in cost reductions on top of what can be achieved with Karpenter alone. Through a series of key steps, PerfectScale's insights empower Karpenter to make data-driven, intelligent decisions for node provisioning:
1. Resource Analysis: PerfectScale provides detailed insights into workload behavior and resource utilization, allowing teams to identify underutilized or overprovisioned pods.
2. Pod Right-Sizing: Using PerfectScale’s InfraFit plugin, you can fine-tune NodePool configurations to ensure pods are right-sized, aligning with real workload needs.
3. NodePool Optimization: With a granular view of resource allocation, PerfectScale helps teams adjust NodePools based on workload requirements, recommending instance types, and other settings for optimal cost and performance.
4. Continuous Monitoring: PerfectScale continuously monitors your workloads, alerting you to potential inefficiencies and offering actionable recommendations. This real-time insight enables proactive cluster management, maximizing the benefits of Karpenter’s dynamic scaling capabilities.
By combining Karpenter's dynamic provisioning with PerfectScale's insights, teams can achieve a truly optimized, cost-efficient Kubernetes environment. Try PerfectScale to see how it can enhance your Karpenter-managed cluster. Sign up or Book a demo to learn more.