Last week AWS announced the general availability of EKS Auto Mode. On the surface it’s just a switch in the management console - but this new feature packs a lot of value. And if you were wondering - no, it’s not free. Besides the cost of vendor lock-in, you also pay a little extra for each node it manages in your cluster. You can check out the prices for each instance type here.
What’s in the Bundle?
According to the docs, EKS Auto Mode brings the following capabilities to the table:
- Application load balancing
- Block Storage
- Compute Autoscaling
- GPU support
- Cluster DNS
- Pod and service networking
Which basically means that it takes care of the following add-ons that we previously had to install and configure separately:
- AWS Load Balancer Controller
- EBS CSI driver
- VPC CNI add-on
- CoreDNS
- Karpenter (!)
- NVIDIA and Neuron device plugins
All of these make a lot of sense (to me). Most clusters out there already have the EBS CSI driver, the ALB Controller and the VPC CNI enabled. Karpenter is also slowly but surely becoming the go-to autoscaling solution for EKS. The interesting thing here is the default support for GPU instances - AWS definitely sees what we are seeing: more and more businesses running AI/ML workloads on their own infrastructure. Which makes our upcoming GPU optimization support ever more relevant.
Yes, sure - if you already have all of these add-ons automated and managed with IaC, it’s not such a huge deal. But if you’re only starting out, this definitely changes the onboarding experience, and going forward it takes a lot of the maintenance burden off the ops team’s shoulders.
Bye Bye Cluster Autoscaler!
By integrating managed Karpenter node provisioning into EKS Auto Mode, AWS practically removes the need to ever use cluster-autoscaler again. Just-in-time node provisioning is faster, more cost efficient and covers all of our autoscaling needs. And if there are critical workloads that must always be immediately available - Auto Mode comes with a dedicated built-in system NodePool.
If you want to learn more about the advantages of using Karpenter and how to get the most out of it - read here and here.
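Beyond the built-in pools, Auto Mode also accepts custom NodePools. Here is a minimal sketch of what one might look like, assuming the `default` NodeClass that Auto Mode creates alongside its built-in pools. The pool name, instance categories and CPU limit are illustrative placeholders, not recommendations.

```yaml
# Hypothetical custom NodePool for an EKS Auto Mode cluster.
# Names and values are illustrative; adjust them to your workloads.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: batch-pool            # placeholder name
spec:
  template:
    spec:
      # Auto Mode NodePools reference an EKS-managed NodeClass
      # (assumed here to be the built-in "default" one).
      nodeClassRef:
        group: eks.amazonaws.com
        kind: NodeClass
        name: default
      requirements:
        # Restrict the instance families Auto Mode may pick from.
        - key: eks.amazonaws.com/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
  # Upper bound on the total CPU this pool may provision.
  limits:
    cpu: "100"
```

Note that Auto Mode uses `eks.amazonaws.com/...` label keys where self-managed Karpenter uses `karpenter.k8s.aws/...`, so existing NodePool manifests may need adjusting before they can be reused.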
Troubleshooting EKS Auto Mode
And the moment we say “maintenance” - we’re faced with the dilemma of all managed services. Yes, we are relieved of the maintenance burden - but it also means we give up control. Now that the pods for all of the add-ons are nowhere to be seen - how do we access their logs and metrics? How do we know why something went wrong? As we know it inevitably will, right?
I already faced this when trying to enable Auto Mode on an existing cluster. As you can see in the image below - I failed miserably 4 times in a row. Mind you - I had to wait an hour for each one of the failures.
Why? No idea. All the AWS console told me was that the update attempts failed, but there were no errors:
(BTW - if you want to see me fail - I’ll release a video describing this experience next week.)
But in general - AWS documentation suggests we retrieve node logs with the help of AWS CLI and persist them to S3 with the help of the new NodeDiagnostic CRD.
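For reference, a NodeDiagnostic resource is quite small. The sketch below follows the shape shown in the AWS documentation as I understand it. The node name and the upload URL are placeholders that you need to supply yourself, and the upload destination is assumed to be a pre-signed S3 PUT URL.

```yaml
# Minimal NodeDiagnostic sketch: asks the node to upload its logs.
apiVersion: eks.amazonaws.com/v1alpha1
kind: NodeDiagnostic
metadata:
  # Assumption: the resource name must match the name of the node
  # you want logs from (placeholder instance ID below).
  name: i-0123456789abcdef0
spec:
  logCapture:
    # Placeholder: replace with a pre-signed S3 PUT URL you generated.
    destination: "<PRESIGNED_S3_PUT_URL>"
```

Once applied, the status of the resource should indicate whether the upload succeeded - but, as noted, this only covers node-level logs.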
What isn’t totally clear is how we can, for example, access the logs and metrics of Karpenter or the ALB Controller. And we know that they can have issues. We even created our own very popular Grafana dashboard for Karpenter troubleshooting.
Getting Started with AWS EKS Auto Mode
While enabling it on an existing cluster definitely has issues, getting started with EKS Auto Mode from scratch is easy - eksctl create cluster now has the --enable-auto-mode flag. Or you can use the following very simple YAML config:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: auto
  region: eu-central-1
autoModeConfig:
  enabled: true
The Terraform EKS module has also been updated. If you’re using another IaC tool - check its documentation.
Spinning up a new Auto Mode cluster is actually a breeze - no nodes are needed up front. And that’s a huge game changer in itself. With managed Karpenter we need neither a managed node group nor a Fargate profile in our cluster. Just deploy your workloads and the nodes will come.
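To see "the nodes will come" in action, any workload with resource requests will do. Below is a hedged sketch of a throwaway Deployment. The name `inflate`, the replica count and the request sizes are arbitrary choices for illustration, and the nodeSelector is only relevant on clusters that mix Auto Mode with other compute.

```yaml
# Throwaway Deployment whose pending pods should trigger node provisioning.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate               # arbitrary example name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      # Optional: pins the pods to Auto Mode nodes on mixed clusters.
      nodeSelector:
        eks.amazonaws.com/compute-type: auto
      containers:
        - name: pause
          # The pause image does nothing; it only reserves resources.
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
```

After a kubectl apply, watching kubectl get nodes should show capacity appear once the pods go Pending. Deleting the Deployment should let Auto Mode consolidate the nodes away again.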
Again - if you want to see my onboarding experience - watch out for the video next week.
EKS Auto Mode and PerfectScale
To sum things up - EKS Auto Mode makes provisioning and managing a fully featured cluster much easier by packaging most of the necessary components in one ready-to-use bundle. With Karpenter now being the built-in autoscaling mechanism, these clusters are cost effective out of the box, relying on just-in-time node provisioning. Yet a truly optimized cluster also requires responsible pod right-sizing and careful NodePool fine-tuning. And this is where PerfectScale shines. Just as with self-managed Karpenter, PerfectScale can give you an additional 30 to 50% cost reduction while improving your workloads’ reliability. And it’s also fully automated!
So, ready to put your Kubernetes optimization in auto mode? What’s holding you back?