What is the core checklist for using Spot Instances in Managed Node Groups?
Introduction
Thank you for clicking through to my article. I've been a DevOps engineer for 2 years in a dev team of 7 engineers.
My name is MINSEOK, LEE, but I use Unchaptered as an alias on the internet. So you can call me either "MINSEOK, LEE" or "Unchaptered" when you want to ask something.
[Notice]
This post is largely a translation of a blog post by "Masatoshi Hayashi" in the official AWS docs.
Topic
Spot Instance's Warning Point and Usage Type
Previous Restrictions of Spot Instances in Custom Node Groups in Amazon EKS
Current Restrictions of Spot Instances in Managed Node Groups in Amazon EKS
Prerequisites
Spot Instance's Warning Point and Usage Type
With Amazon EC2 Spot Instances, customers can use spare EC2 capacity at a discounted price for as long as that capacity remains available in the capacity pool.
[Warning Point]
An EC2 Spot Instance can be interrupted with a 2-minute notification.
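To make the 2-minute notification concrete, here is a minimal sketch in Python. As far as I know, the notice shows up in the instance metadata under the `spot/instance-action` path as a small JSON document with an `action` and a `time` field. The sample payload, the fixed "now" value, and the function name `seconds_until_interruption` are my own assumptions for illustration, not part of the original post.

```python
import json
from datetime import datetime, timezone

# Metadata path where a Spot interruption notice is published on the instance.
SPOT_ACTION_PATH = "/latest/meta-data/spot/instance-action"


def seconds_until_interruption(notice_json: str, now: datetime) -> float:
    """Return how many seconds remain before the action in the notice takes effect."""
    notice = json.loads(notice_json)
    # The notice carries a UTC timestamp such as "2020-12-01T08:22:00Z".
    action_time = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    return (action_time - now).total_seconds()


# Hypothetical sample payload, shaped like an interruption notice.
sample = '{"action": "terminate", "time": "2020-12-01T08:22:00Z"}'
now = datetime(2020, 12, 1, 8, 20, 0, tzinfo=timezone.utc)
print(seconds_until_interruption(sample, now))  # 120.0
```

In a real workload, this remaining time is the budget you have to finish in-flight work and drain the node.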
[Usage Type]
Stateless API endpoints
Batch computing
Machine learning workloads
Big data ETL using Apache Spark
Queue-processing applications
CI/CD pipelines
Previous Restrictions of Spot Instances in Custom Node Groups in Amazon EKS
Before December 2020, customers had to use a Custom Node Group to run Spot Instances.
[Restrictions]
Set Configuration
Maintenance
Deploy tools to handle Spot Instance interruptions.
Deploy new AMIs.
Auto Scaling Group
Current Restrictions of Spot Instances in Managed Node Groups in Amazon EKS
Use Spot Instances for fault-tolerant applications
What is a fault-tolerant application?
Use more than 2 Spot Instances at the same time for availability.
Use all Availability Zones
Is this instance family supported in all Availability Zones?
Some instance families aren't supported in every Availability Zone.
So, check whether the instance family you want is supported in all Availability Zones.
Depending on your workloads, consider EFS (Elastic File System) instead of EBS, because an EBS volume is tied to a single Availability Zone.
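The Availability Zone check above can be sketched as a simple set intersection. This is only an illustration under my own assumptions: the zone names and offerings below are hypothetical, and in practice you could collect the real data with `aws ec2 describe-instance-type-offerings --location-type availability-zone`. The function name `types_offered_everywhere` is also made up for this example.

```python
from typing import Dict, Set


def types_offered_everywhere(offerings: Dict[str, Set[str]]) -> Set[str]:
    """Return the instance types that are offered in every Availability Zone."""
    zone_sets = list(offerings.values())
    if not zone_sets:
        return set()
    # Keep only the types that appear in all zones.
    return set.intersection(*zone_sets)


# Hypothetical offerings per Availability Zone (not real data).
offerings = {
    "ap-northeast-2a": {"m5.large", "m5a.large", "c5.large"},
    "ap-northeast-2b": {"m5.large", "c5.large"},
    "ap-northeast-2c": {"m5.large", "m5a.large", "c5.large"},
}
print(sorted(types_offered_everywhere(offerings)))  # ['c5.large', 'm5.large']
```

Here `m5a.large` drops out because one zone does not offer it, so it would not be a safe choice for a node group that spans all three zones.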
EKS managed node groups with Spot Instances: a look under the hood
We've learned some constraints and recommendations.
However, there are a few additional considerations that haven't been covered so far.
Allocation Strategy: Capacity-Optimized
When the node group scales out, new instances are launched from the capacity pools with the most available capacity. This has two advantages.
Fewer Spot interruptions in the node group.
Greater resilience for the application.
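As a rough mental model of the capacity-optimized strategy, here is a toy sketch in Python. It is heavily simplified and based on my own assumptions: AWS does not publish spare-capacity numbers, so the counts below are invented, and `pick_capacity_optimized` is a hypothetical helper, not an AWS API.

```python
from typing import Dict, Tuple

# A capacity pool is identified by (instance type, Availability Zone).
Pool = Tuple[str, str]


def pick_capacity_optimized(available: Dict[Pool, int]) -> Pool:
    """Pick the pool with the most spare capacity (toy model only)."""
    if not available:
        raise ValueError("no capacity pools configured")
    return max(available, key=lambda pool: available[pool])


# Invented spare-capacity figures for illustration.
available = {
    ("m5.large", "ap-northeast-2a"): 12,
    ("m5a.large", "ap-northeast-2a"): 40,
    ("c5.large", "ap-northeast-2b"): 7,
}
print(pick_capacity_optimized(available))  # ('m5a.large', 'ap-northeast-2a')
```

The point of the model is only that a pool with more spare capacity is less likely to be reclaimed soon, which is where the two advantages come from.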
No Additional Configuration Tools: Node Termination Handler
Managed Node Groups handle Spot interruptions
The underlying Auto Scaling Group is opted in to Capacity Rebalancing, which means that when one of the Spot Instances is at elevated risk of interruption and receives an EC2 instance rebalance recommendation, the Auto Scaling Group attempts to launch a replacement instance.
Managed Node Group Instance Families
The more instance types are configured in a Managed Node Group, the higher the probability that EC2 Auto Scaling will be able to launch a Spot Instance.
Automatic Pod Balancing in Each Availability Zone
Draining for Balancing
If the node group needs to rebalance capacity between AZs, it automatically drains the pods from the instances that are being scaled in.
Automatic Tagging
Auto Scaling Groups are automatically tagged
The tags support the auto-discovery functionality of the Kubernetes CAS (Cluster Autoscaler).
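To show what the automatic tags are used for, here is a small Python sketch that builds the Cluster Autoscaler auto-discovery flag. To my understanding, the tag keys are `k8s.io/cluster-autoscaler/enabled` and `k8s.io/cluster-autoscaler/<cluster-name>`; please verify them against the Cluster Autoscaler documentation for your version. The cluster name `my-cluster` and the helper `autodiscovery_flag` are assumptions for this example.

```python
def autodiscovery_flag(cluster_name: str) -> str:
    """Build the Cluster Autoscaler flag that discovers tagged Auto Scaling Groups."""
    tags = [
        "k8s.io/cluster-autoscaler/enabled",
        f"k8s.io/cluster-autoscaler/{cluster_name}",
    ]
    return "--node-group-auto-discovery=asg:tag=" + ",".join(tags)


# "my-cluster" is a hypothetical cluster name.
print(autodiscovery_flag("my-cluster"))
# --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
```

Because the Managed Node Group applies the tags for you, the only thing left to configure on the Cluster Autoscaler side is this flag.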
Conclusion
Starting with eksctl version 0.33, you can launch EC2 Spot Instances in Managed Node Groups with:
No configuration overhead
No operation overhead
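As a minimal sketch of how little configuration is needed, the Python snippet below generates an eksctl cluster config with a Spot-backed Managed Node Group. The cluster name, region, node group name, sizes, and instance types are all placeholders I chose for illustration. I am assuming that the JSON output is accepted by `eksctl create cluster -f`, since JSON is also valid YAML, but I have not confirmed this for every eksctl version.

```python
import json
from typing import Dict, List


def spot_nodegroup_config(cluster: str, region: str, instance_types: List[str]) -> Dict:
    """Build an eksctl ClusterConfig with one Spot-backed managed node group."""
    return {
        "apiVersion": "eksctl.io/v1alpha5",
        "kind": "ClusterConfig",
        "metadata": {"name": cluster, "region": region},
        "managedNodeGroups": [
            {
                "name": "spot-ng",  # placeholder node group name
                # Several instance types widen the set of capacity pools.
                "instanceTypes": list(instance_types),
                "spot": True,  # request Spot capacity for this node group
                "minSize": 2,
                "maxSize": 6,
                "desiredCapacity": 3,
            }
        ],
    }


# Placeholder values for illustration only.
config = spot_nodegroup_config(
    "my-cluster", "ap-northeast-2", ["m5.large", "m5a.large", "c5.large"]
)
print(json.dumps(config, indent=2))
```

Note that there is no interruption handler, launch template, or Auto Scaling Group definition in this config, which is what the two points above mean in practice.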
When resources are scarce, such as when traffic spikes, you may need to use Kubernetes Karpenter.