January 7, 2025

Guide to KEDA (Kubernetes Event-Driven Autoscaler)

Tania Duggal
Technical Writer

In this article, we will introduce KEDA (Kubernetes Event-Driven Autoscaler), and walk through an example of using KEDA for Cron-based scaling.

But first, let’s talk about load-based scaling vs event-driven scaling. Kubernetes provides the scalability and flexibility necessary to handle workloads of different sizes, but choosing the right scaling strategy is difficult.

Load-based Scaling

Load-based scaling is the traditional approach, where Kubernetes adjusts the number of replicas in a deployment based on metrics like CPU or memory usage. For example, the Horizontal Pod Autoscaler (HPA) increases or decreases the number of pods to maintain a target CPU utilization percentage. Load-based scaling relies on metrics, such as CPU and memory usage, that are familiar to most system administrators, and it adjusts workloads in real time, dynamically responding to changes in demand. However, the approach has its limitations. Its triggers are limited: it cannot scale based on custom metrics or external events out of the box. Furthermore, load-based scaling may lag behind sudden spikes or drops in demand, making it less effective for highly dynamic workloads.
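
To make this concrete, here is a minimal HPA manifest sketch (the Deployment name my-app and the 70% target are illustrative placeholders, not from this article) that holds average CPU utilization near a target:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app               # placeholder Deployment name
  minReplicas: 2               # a standard HPA keeps at least this many replicas
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add/remove pods to keep average CPU near 70%

Note that a standard HPA does not scale to zero; that gap is exactly what event-driven scaling, and KEDA in particular, fills.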

Event-Driven Scaling

Event-driven scaling adjusts the number of replicas based on external events or custom metrics. It’s ideal for scenarios like processing messages from a queue, where the number of items dictates the scaling needs. The event-driven approach works well for scale-to-zero scenarios, such as Function-as-a-Service (FaaS) applications, where workloads scale down to zero when not in use. This is where KEDA comes in.

| Aspect | Load-Based Scaling | Event-Driven Scaling |
| --- | --- | --- |
| Trigger Mechanism | Adjusts the number of pod replicas based on resource usage metrics like CPU or memory utilization. | Adjusts the number of pod replicas based on external events or custom metrics, such as message queue length or HTTP request count. |
| Use Cases | Suitable for applications with predictable workloads where resource consumption correlates with demand. | Ideal for event-driven architectures, such as processing tasks from a message queue or handling scheduled jobs. |
| Scalability | Scales between a minimum and maximum number of replicas but doesn't inherently scale to zero. | Can scale applications down to zero replicas when there are no events to process, conserving resources. |
| Configuration | Configured using Kubernetes' Horizontal Pod Autoscaler (HPA), focusing on resource utilization thresholds. | Configured using tools like KEDA, which support various event sources and custom metrics for scaling decisions. |
| Resource Efficiency | May lead to over-provisioning if resource usage doesn't accurately represent workload demand. | Enhances resource efficiency by scaling precisely based on event load, reducing unnecessary resource consumption. |

What is KEDA (Kubernetes Event-Driven Autoscaler)?

KEDA (Kubernetes Event-Driven Autoscaler) is an open-source project that extends Kubernetes' scaling capabilities by enabling event-driven scaling for any container workload.

KEDA was created by Microsoft and Red Hat to bridge the gap between Kubernetes' native scaling and event-driven architectures. Since its launch, KEDA has become a graduated CNCF project, with a growing community and wide adoption in production environments.

KEDA supports over 65 scalers, including Azure Service Bus, AWS SQS Queue, Kafka, Prometheus, and more, allowing it to handle a wide range of event sources. It integrates with Kubernetes' Horizontal Pod Autoscaler (HPA), extending its capabilities without introducing complexity. KEDA is lightweight and introduces minimal overhead, making it an efficient choice for production environments.

How Kubernetes Event-Driven Autoscaler Works

Event Detection: KEDA monitors various event sources (e.g., message queues, databases) using components called scalers. Each scaler is designed for a specific event source and knows how to query it for metrics.

Metric Evaluation: When an event occurs, the scaler evaluates its associated metrics. For example, a scaler monitoring a message queue might check the number of pending messages.

Scaling Decision: Based on the metrics, KEDA determines whether to adjust the number of application instances (pods). If the metric exceeds a defined threshold, KEDA instructs Kubernetes to scale up the application; if it's below the threshold, it scales down.

Integration with Kubernetes: KEDA acts as a metrics server within Kubernetes, providing these event-based metrics to the Horizontal Pod Autoscaler (HPA). This integration allows Kubernetes to make informed scaling decisions based on both traditional metrics (like CPU usage) and external event metrics.
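
One way to see this integration for yourself (assuming kubectl access to a cluster with KEDA installed): KEDA's metrics adapter registers under the external metrics API group, which you can confirm with:

# Check that the external metrics API is served (by KEDA's metrics apiserver once installed)
kubectl get apiservice v1beta1.external.metrics.k8s.io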

>> Take a Look at How HPA works and its Best Practices.

KEDA Architecture

The architecture of KEDA has different components working together:

[Diagram: KEDA architecture. Source: KEDA]

Scalers: These are responsible for connecting to external event sources and retrieving metrics. KEDA supports a wide range of scalers for different event sources, including message queues, databases, and monitoring systems.

Metrics Server: Kubernetes Event-Driven Autoscaler includes a metrics server that exposes the retrieved metrics to Kubernetes. This allows the HPA to access these metrics and make scaling decisions based on them.

Operator: The KEDA operator manages the lifecycle of the scalers and ensures they are properly configured and running. It also handles scaling the application up or down based on the metrics provided by the scalers.

Admission Webhooks: Kubernetes Event-Driven Autoscaler uses admission webhooks to validate resource changes and prevent misconfigurations. For example, it makes sure that multiple ScaledObject resources do not target the same application, which could lead to conflicting scaling behaviors.

KEDA Scalers

KEDA scalers define the events or custom metrics that trigger scaling. Example scalers include:

Message queues: KEDA can scale based on the number of messages in an Azure Service Bus queue or Kafka topic.

Database metrics: KEDA adjusts the number of replicas based on, for example, the length of a Redis stream or the result of a MySQL query.

Cron-based: KEDA scales workloads up or down at specific times using cron expressions.

Each scaler is configured using simple YAML, allowing users to define scaling behavior in a declarative manner.
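
For example, a trigger for the Kafka scaler might look like the following sketch (a partial ScaledObject spec; the broker address, consumer group, and topic are hypothetical placeholders):

triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka-broker:9092   # placeholder broker address
      consumerGroup: my-consumer-group      # placeholder consumer group
      topic: orders                         # placeholder topic name
      lagThreshold: "50"                    # target consumer lag per replica used for scaling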

>> Want to Maximize Cost Savings by Putting Your Kubernetes Resources to Sleep During Off-Hours? Check out this post.

Tutorial: Cron-based Scaling with KEDA

Let’s consider a scenario where you need to scale a job to run only during business hours. This can be achieved using KEDA’s Cron scaler.

Step 1: Install KEDA

Install KEDA in your cluster using Helm (or another supported installation method):

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
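
You can verify the installation by listing the pods in the keda namespace (exact pod names can vary slightly between chart versions):

kubectl get pods -n keda
# Expect the KEDA operator, metrics API server, and admission webhooks pods to be Running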

Step 2: Define the Deployment and the Cron Scaler

First, define the Deployment. Save the following manifest as keda-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-cron-job
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-cron-job
  template:
    metadata:
      labels:
        app: my-cron-job
    spec:
      containers:
        - name: busybox
          image: busybox
          command: ["sleep", "3600"]

Apply the manifest:

kubectl apply -f keda-deployment.yaml

Then, create a `ScaledObject` that defines the scaling behavior based on a cron schedule. Save it as scaledobject.yaml:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cron-scaler
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-cron-job
  triggers:
  - type: cron
    metadata:
      timezone: Asia/Kolkata
      start: 0 6 * * * # At 6:00 AM
      end: 0 20 * * * # At 8:00 PM
      desiredReplicas: "10"

Apply it:

kubectl apply -f scaledobject.yaml

With this configuration, the `my-cron-job` deployment will scale to ten replicas from 6 AM to 8 PM in the Asia/Kolkata timezone and scale down to zero outside these hours.
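
To observe this in action, you can inspect the resources KEDA manages. KEDA creates an HPA on your behalf, typically named keda-hpa-<scaledobject-name>:

kubectl get scaledobject cron-scaler
kubectl get hpa keda-hpa-cron-scaler
# Watch the replica count change at the start and end of the window
kubectl get deployment my-cron-job --watch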

Want to know more about event-driven autoscaling with KEDA? Join us for an engaging session with Zbyněk Roubalík, CTO of Kedify and KEDA project maintainer.

KEDA is an excellent choice for dynamic and event-heavy workloads. Under the hood, KEDA uses the Horizontal Pod Autoscaler (HPA) to implement its scaling decisions. While HPA traditionally depends on metrics like CPU and memory usage, KEDA introduces event-driven triggers, so scaling decisions no longer have to rely on resource utilization alone.

In PerfectScale, you can easily jump to the HPA view using the switcher above the table.

[Screenshot: PerfectScale HPA view]

The HPA view provides a clear overview of workloads that use the Horizontal Pod Autoscaler (HPA). It lets you quickly identify the workloads where HPA has been introduced and adjust HPA thresholds with the help of informative tooltips that offer tailored recommendations. These recommendations are particularly helpful for optimizing scaling decisions, minimizing resource waste, and keeping workloads running efficiently.

| Column | Description |
| --- | --- |
| HPA | Indicates whether HPA has been introduced for the workload. You can sort the column by clicking the header or apply specific filters. |
| CPU (%) | Displays the CPU trigger for HPA. A red indicator signifies a threshold below 60%, indicating potential significant CPU waste; a yellow indicator signifies a threshold between 60% and 80%, pointing to potential moderate CPU waste. |
| Memory (%) | Displays the Memory trigger for HPA. A red indicator signifies a threshold below 60%, indicating potential significant Memory waste; a yellow indicator signifies a threshold between 60% and 80%, pointing to potential moderate Memory waste. |
| Custom Metric | Indicates whether a custom metric has been detected. |

At PerfectScale, we discover HPA configurations for workloads in clusters and optimize pod resource requests and limits while accounting for horizontal scaling behaviors. With KEDA, this process becomes even more seamless, as scaling decisions are decoupled from resource metrics and focus solely on external events or custom triggers. This enables teams to achieve cost efficiency and performance consistency without over-provisioning or under-utilizing resources. Sign up or book a demo with PerfectScale today!
