In this article, we will introduce KEDA (Kubernetes Event-Driven Autoscaler), and walk through an example of using KEDA for Cron-based scaling.
Kubernetes provides the scalability and flexibility needed to handle workloads of all sizes, but choosing the right scaling strategy can be difficult. So before diving into KEDA, let’s compare load-based scaling with event-driven scaling.
Load-based Scaling
Load-based scaling is the traditional approach, where Kubernetes adjusts the number of replicas in a deployment based on metrics like CPU or memory usage. For example, the Horizontal Pod Autoscaler (HPA) increases or decreases the number of pods to maintain a target CPU utilization percentage. Its main advantage is familiarity: CPU and memory are metrics most system administrators already understand, and the HPA adjusts workloads in near real time as demand changes. However, this approach has limitations. Out of the box, its triggers are limited: it cannot scale on external events such as the depth of a message queue. It can also lag behind sudden spikes or drops in demand, making it less effective for highly dynamic workloads.
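For reference, a minimal HPA manifest that maintains roughly 70% average CPU utilization might look like the following (the deployment name `web` and the replica bounds are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web            # illustrative target deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale to keep average CPU near 70%
```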
Event-Driven Scaling
Event-driven scaling adjusts the number of replicas based on external events or custom metrics. It’s ideal for scenarios like processing messages from a queue, where the number of items dictates the scaling needs. The event-driven approach works well for scale-to-zero scenarios, such as Function-as-a-Service (FaaS) applications, where workloads scale down to zero when not in use. This is where KEDA comes in.
What is KEDA (Kubernetes Event-Driven Autoscaler)?
KEDA (Kubernetes Event-Driven Autoscaler) is an open-source project that extends Kubernetes' scaling capabilities by enabling event-driven scaling for any container workload.
KEDA was created by Microsoft and Red Hat to bridge the gap between Kubernetes' scaling model and event-driven architectures. Since its launch, KEDA has become a graduated CNCF project, with a growing community and wide adoption in production environments.
KEDA supports over 65 scalers, including Azure Service Bus, AWS SQS Queue, Kafka, Prometheus, and more, allowing it to handle a wide range of event sources. It integrates with the Kubernetes Horizontal Pod Autoscaler (HPA), extending its capabilities without introducing extra complexity. KEDA is lightweight and introduces minimal overhead, making it an efficient choice for production environments.
How Kubernetes Event-Driven Autoscaler Works
Event Detection: KEDA monitors various event sources (e.g., message queues, databases) using components called scalers. Each scaler is designed for a specific event source and knows how to query it for metrics.
Metric Evaluation: When an event occurs, the scaler evaluates its associated metrics. For example, a scaler monitoring a message queue might check the number of pending messages.
Scaling Decision: Based on the metrics, KEDA determines whether to adjust the number of application instances (pods). If the metric exceeds a defined threshold, KEDA instructs Kubernetes to scale up the application; if it's below the threshold, it scales down.
Integration with Kubernetes: KEDA acts as a metrics server within Kubernetes, providing these event-based metrics to the Horizontal Pod Autoscaler (HPA). This integration allows Kubernetes to make informed scaling decisions based on both traditional metrics (like CPU usage) and external event metrics.
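To make these steps concrete, here is a sketch of a queue-based trigger using KEDA’s `aws-sqs-queue` scaler (the queue URL and threshold are placeholders). The scaler polls the queue’s approximate message count, and KEDA scales the target so that each replica handles roughly `queueLength` messages:

```yaml
# Fragment of a ScaledObject spec; a full example appears in the tutorial below.
triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/orders  # placeholder queue
      queueLength: "5"        # target messages per replica
      awsRegion: us-east-1
```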
>> Take a Look at How HPA works and its Best Practices.
KEDA Architecture
KEDA's architecture consists of several components working together:
Scalers: These are responsible for connecting to external event sources and retrieving metrics. KEDA supports a wide range of scalers for different event sources, including message queues, databases, and monitoring systems.
Metrics Server: Kubernetes Event-Driven Autoscaler includes a metrics server that exposes the retrieved metrics to Kubernetes. This allows the HPA to access these metrics and make scaling decisions based on them.
Operator: The KEDA operator manages the lifecycle of the scalers and ensures they are properly configured and running. It also handles scaling the application up or down based on the metrics provided by the scalers.
Admission Webhooks: Kubernetes Event-Driven Autoscaler uses admission webhooks to validate resource changes and prevent misconfigurations. For example, it makes sure that multiple ScaledObject resources do not target the same application, which could lead to conflicting scaling behaviors.
KEDA Scalers
K8s Event-Driven Autoscaler scalers define the events or custom metrics that trigger scaling. Some example scalers include:
Message queues: scale based on the number of messages in an Azure Service Bus queue or Kafka topic.
Database metrics: adjust the number of replicas based on metrics such as the number of pending entries in a Redis Stream or the result of a MySQL query.
Cron: scale workloads up or down at specific times using cron expressions.
Each scaler is configured using simple YAML, allowing users to define scaling behavior in a declarative manner.
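For instance, a Kafka trigger that scales on consumer lag might look like this (the broker address, consumer group, and topic are placeholders):

```yaml
# Fragment of a ScaledObject spec using KEDA's Kafka scaler.
triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka.default.svc:9092  # placeholder broker address
      consumerGroup: my-consumer-group          # placeholder consumer group
      topic: orders                             # placeholder topic
      lagThreshold: "50"                        # target lag per replica
```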
>> Want to Maximize Cost Savings by Putting Your Kubernetes Resources to Sleep During Off-Hours? Check out this post.
Tutorial: Cron-based Scaling with KEDA
Let’s consider a scenario where you need to scale a job to run only during business hours. This can be achieved using KEDA’s Cron scaler.
Step 1: Install KEDA
First, install KEDA in your cluster. The most common method is Helm:
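```bash
# Add the official KEDA Helm repository and install KEDA into its own namespace
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
```

You can verify the installation by checking that the KEDA operator and metrics server pods are running in the `keda` namespace (`kubectl get pods -n keda`).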
Step 2: Define the Deployment and the Cron Scaler
First, define the Deployment that the scaler will target:
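A minimal example Deployment, with a placeholder busybox container standing in for your real workload:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-cron-job
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-cron-job
  template:
    metadata:
      labels:
        app: my-cron-job
    spec:
      containers:
        - name: worker
          image: busybox   # placeholder image; replace with your workload
          command: ["sh", "-c", "while true; do echo working; sleep 30; done"]
```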
Then, create a `ScaledObject` that defines the scaling behavior based on a cron schedule:
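One way to write it (the `ScaledObject` name is illustrative; the trigger fields follow KEDA’s Cron scaler):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-cron-scaledobject
spec:
  scaleTargetRef:
    name: my-cron-job      # the Deployment defined above
  minReplicaCount: 0       # scale to zero outside the schedule
  maxReplicaCount: 10
  triggers:
    - type: cron
      metadata:
        timezone: Asia/Kolkata   # IANA timezone name
        start: 0 6 * * *         # 6:00 AM
        end: 0 20 * * *          # 8:00 PM
        desiredReplicas: "10"    # replicas to run during the window
```

Apply both manifests with `kubectl apply -f`, and KEDA will create and manage the underlying HPA for you.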
With this configuration, the `my-cron-job` deployment will scale to ten replicas from 6 AM to 8 PM in the Asia/Kolkata timezone and scale down to zero outside these hours.
Want to know more about event-driven autoscaling with KEDA? Join us for an engaging session with Zbyněk Roubalík, CTO of Kedify and KEDA Project Maintainer.
KEDA is an excellent choice for dynamic and event-heavy workloads. Under the hood, KEDA uses the Horizontal Pod Autoscaler (HPA) to implement its scaling decisions. While HPA traditionally depends on metrics like CPU and memory usage, KEDA introduces event-driven triggers, so scaling decisions no longer need to be tied directly to resource utilization.
In PerfectScale, you can easily switch to the HPA view using the switcher above the table.
The HPA view provides a clear overview of workloads utilizing the Horizontal Pod Autoscaler (HPA). It lets you quickly identify the workloads where HPA has been introduced and adjust HPA thresholds, with informative tooltips offering tailored recommendations. These recommendations help optimize scaling decisions, minimize resource waste, and ensure workloads run efficiently.
At PerfectScale, we discover HPA configurations for workloads in clusters and optimize pod resource requests and limits while accounting for horizontal scaling behaviors. With KEDA, this process becomes even more seamless, as scaling decisions are decoupled from resource metrics, focusing solely on external events or custom triggers. This enables teams to achieve cost efficiency and performance consistency without over-provisioning or under-utilizing resources. Sign up or book a demo with PerfectScale today!