What's Amazon EKS?

Mar 05, 2024

Introduction

Thank you for clicking through to my article. I've been a DevOps engineer for 2 years on a dev team of 7 engineers.

My name is MINSEOK, LEE, but I use Unchaptered as an alias on the internet. So you can call me either "MINSEOK, LEE" or "Unchaptered" when you want to ask something.

Topic

While attending AWS Summit Seoul 2023, I learned about Amazon EKS.

In this article, I'll cover the following:

  1. The stability of Amazon EKS

  2. The stability of Control Plane

  3. The stability of Data Plane

  4. The stability of Version Update

The stability of Amazon EKS

  • Basically, Kubernetes is divided into a Control Plane and a Data Plane.
    If you run Kubernetes yourself, rather than using a SaaS offering, you must manage every element of Kubernetes.

  • Amazon EKS, by contrast, runs reliable container applications with minimal resources on AWS.

  • Amazon EKS manages the Control Plane to reduce the maintenance burden of Kubernetes itself. For the Data Plane, EKS supports Self-Managed EC2, Managed Node Groups, and Fargate (serverless compute) to reduce the maintenance burden of applications (a hedged sketch of these options follows below).
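
For illustration only, here is a minimal sketch of how those Data Plane options could be declared with an eksctl ClusterConfig. eksctl is not mentioned in the talk itself, and the cluster name, region, instance type, and sizes are assumptions:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo-cluster          # assumed name
  region: ap-northeast-2      # assumed region
managedNodeGroups:            # EKS-managed EC2 worker nodes
- name: managed-ng-1
  instanceType: m5.large
  desiredCapacity: 2
fargateProfiles:              # serverless compute for matching pods
- name: fp-default
  selectors:
  - namespace: default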

Amazon EKS has five core values.

What's the SRM (Shared Responsibility Model)?

  • AWS is responsible for both the hardware underlying EKS and the stability of the Control Plane components.

  • The customer is responsible for the Data Plane, which contains the application code.
    For example, the customer must manage the pods running the application code, as well as the Deployments and Services that expose them.

  • A Cross-Account ENI exists to connect the control plane and the data plane. AWS provisions Cross-Account ENIs in the target AZs so that AWS and the customer's VPC can communicate securely.

Amazon EKS Control Plane

The EKS Control Plane ensures the reliability of each of its components:

  1. API Server : handles the Kubernetes API

  2. Cloud Controller : connects AWS cloud services and Kubernetes

  3. Controller Manager : manages the various controllers

  4. Scheduler : decides which node each pod is deployed to

  5. etcd : stores Kubernetes data

The stability of Control Plane

  • The API server instances run Active-Active for high availability, with a minimum of 2 replicas.

  • The etcd instances are implemented as key-value storage, with a minimum of 3 replicas.

  • The API server instances and the etcd instances share the following traits:

    • Implemented as an Auto Scaling group for scalability.

    • Distributed across multiple AZs for high availability.

The scale-up of Control Plane

  • In unexpected situations, control plane instances must scale up based on scaling signals.

    • Control plane instances are scaled up/down according to CPU and memory usage.

    • Control plane instances are scaled up/down according to disk usage and remaining capacity.

  • The customer (engineer) doesn't configure this through the API or the console.
    AWS reinforces the performance of the control plane automatically.

  • There is no additional cost.

The stability of Data Plane

  • Amazon EKS supports high availability in three ways:

    • Self Recovery

    • Minimize Affected Area

    • Fast Scale-out

Stability 1 - Self Recovery

  • Kubernetes has a built-in self-recovery function based on its state management mechanism.

  • When you create resources in Kubernetes, you write Kubernetes manifest files in YAML format. There, you declare the desired state.

  • Several controllers continuously try to reconcile the current state with the desired state. This is called the "Kubernetes control loop", and it consists of three steps:

    • Observe

    • Diff

    • Act

  • There are several ways to create a pod.

  • The simplest way is to create pods directly, but this is less useful because no controller monitors and recovers them.

  • Therefore you will typically create pods using a Deployment.
    A Deployment uses a ReplicaSet (RS) to maintain the desired number of pods.

Stability 1.a. - Three General Ways to Health Check

General health-check methods for a deployed application:

  1. HTTP Request

  2. Command

  3. TCP Socket

Stability 1.b. - Probes, Health Checks in Kubernetes

In Kubernetes, the kubelet (on each worker node) uses probes to perform health checks.

  • readinessProbe

    • Command → Readiness (5s) : Is the application ready to receive requests?

      readinessProbe:
        exec:
          command:
          - cat
          - /tmp/healthy
        initialDelaySeconds: 5
        periodSeconds: 5
    • The worker node has the kubelet installed.

    • The kubelet uses the readinessProbe for health checks.
      If a problem occurs in the node group, the kubelet removes the affected pod from the Service endpoints. When traffic comes in, unhealthy pods don't receive requests.
      Only healthy pods receive requests.

    • After the problem is fixed, the kubelet adds the recovered pods back to the Service endpoints.

  • livenessProbe

    • HTTP GET → Liveness (5s) : Is the application still alive?

      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
    • The worker node has the kubelet installed.

    • If a problem occurs in the node group, the kubelet removes the affected pods from the Service endpoints and then restarts the container.

    • After the problem is fixed, the kubelet adds the recovered pods back to the Service endpoints.
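
Of the three general methods from Stability 1.a, the Command and HTTP GET variants are shown above; the TCP socket variant can be expressed the same way. A minimal sketch, assuming the application listens on port 8080:

readinessProbe:
  tcpSocket:
    port: 8080           # assumed application port
  initialDelaySeconds: 5
  periodSeconds: 5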

Stability 2 - Minimize Affected Area

  • If you operate services in an MSA (microservices architecture) environment, trouble in one service can spread to other services. Such trouble can affect the whole Kubernetes cluster.

Stability 2.a. - Pod Scheduling

  • Kubernetes Manifest File

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 8
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
  • In the manifest file, replicas is set to 8.

  • The Kubernetes scheduler decides which node each pod is placed on, considering:

    • priority : the priority set for pod deployment

    • affinity : the pods' preference for particular nodes

    • request / limit : the resources the pod requests and may consume

    • node resources : each node's CPU and memory usage

  • Sometimes, pods may end up concentrated on only a few nodes.

  • In this case...
    If a node that contains many pods dies, the remaining system cannot handle even the baseline traffic.

  • Because the kubelet health-check period is 5 minutes by default,
    the worst case is that the node's trouble is only detected after 5 minutes.

  • During this period, the pods are not redeployed and remain in an unhealthy state.

Stability 2.b. - podAntiAffinity

  • The simplest solution is the podAntiAffinity setting.
    It ensures that pods are distributed evenly across all nodes.

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - nginx
          topologyKey: "kubernetes.io/hostname"

  containers:
  - name: nginx
    image: nginx

Stability 2.c. - topologySpreadConstraints

  • Another way is the topologySpreadConstraints setting.

spec:
  topologySpreadConstraints:
   - maxSkew: 1
     topologyKey: topology.kubernetes.io/zone
     whenUnsatisfiable: DoNotSchedule
     labelSelector:
       matchLabels:
         app: nginx

  containers:
   - name: nginx
     image: nginx

Stability 2.d. - podAntiAffinity & topologySpreadConstraints

  • By setting both podAntiAffinity and topologySpreadConstraints, you can distribute pods across all nodes and across availability zones, as in the rough sketch below.
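
A rough combined sketch, reusing the nginx labels from the examples above (the weight value is only an assumption):

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100                                  # assumed weight
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: nginx
          topologyKey: "kubernetes.io/hostname"      # spread across nodes
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone         # spread across AZs
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: nginx
  containers:
  - name: nginx
    image: nginx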

Stability 2.e. - Maintain High-Importance Services

Let's take an example to aid understanding.

  • Four pods are running on the node group.

  • Two types of pods are running on the node group:

    • The first type is the pods for the ordering system.

    • The second type is the pods for the review system.

  • The pods that matter most to the business model are the ordering-system pods.

  • When the remaining CPU or memory on a node is not enough, the node evicts some pods. If the order-related pods are evicted, this can cause significant problems for the business.

  • To prevent this, there is a way to allocate resources to containers.
    Two keys can be set (see the sketch after this list):

    • Request : the minimum amount required to schedule the pod.

    • Limit : the maximum amount the pod may consume.
      → Exceeding the limit causes CPU throttling, or an out-of-memory condition for memory, in which case the offending process is killed.

  • Through container resource allocation, you can specify the QoS (Quality of Service) class.

    • Guaranteed > Burstable > BestEffort, in order of how strongly the pod's resource use is protected.

    • BestEffort : neither Request nor Limit is set.
      → If node resources are sufficient, the pod keeps running.
      → If node resources are not sufficient, the pod is killed first.

    • Burstable : Request and Limit are set (Request < Limit).
      → If node resources are sufficient, the pod keeps running.
      → If the pod needs a little more than its Request, it can burst up to its Limit.
      → If node resources are not sufficient, the pod is killed second.

    • Guaranteed : Request and Limit are set (Request == Limit).
      → The pod is kept alive as long as possible.
      → It won't be terminated unless node resources are extremely low.
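
A minimal sketch of the Guaranteed case, assuming a hypothetical order-api container (the image name and resource values are illustrative only):

apiVersion: v1
kind: Pod
metadata:
  name: order-api
spec:
  containers:
  - name: order-api
    image: order-api:latest        # hypothetical image
    resources:
      requests:                    # Request == Limit → Guaranteed QoS
        cpu: "500m"
        memory: "512Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"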

Stability 3 - Fast Scale-out

  • When spike traffic occurs, you can use the following services:

    • HPA (Horizontal Pod Autoscaler)

    • CAS (Cluster Auto Scaler)

    • Karpenter

Stability 3.a. - HPA (Horizontal Pod Autoscaler)

  • You have 2 nodes and the pods are running normally.

  • One of the pods receives heavy traffic. The pod's CPU/memory usage metrics are collected by the Metrics Server, and the HPA controller uses those metrics to control the pods.

  • The HPA controller changes the replica count of the ReplicaSet (RS) so that new pods are created, as in the sketch below.
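
A minimal sketch of an HPA targeting the nginx-deployment from earlier; the utilization threshold and replica bounds are assumptions:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:                  # the Deployment to scale
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2                   # assumed lower bound
  maxReplicas: 10                  # assumed upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70     # assumed CPU target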

Stability 3.b. - CAS (Cluster Autoscaler)

  • If CPU/memory is not enough, the pods created by the HPA controller stay in the Pending state.
    In this case, the node group needs more nodes to turn the pending pod(s) into running pod(s).

  • CAS changes the Auto Scaling Group's desired capacity to create a new node.
    After the new node is provisioned, the pending pod(s) become running pod(s).

Stability 3.c. - Cons of CAS

  • CAS has the following cons:

    • Scale-out is not fast.

    • In an MSA environment, you need to create a node group for each instance type.
      (One CAS handles one ASG, and one ASG can use one instance type.)

    • If you want to use Spot instances, you need to deploy a new node group.

Stability 3.d. - Karpenter

  • Karpenter creates nodes directly.

  • If pending pods exist, Karpenter creates EC2 instances (fleet).
    Provisioning through an ASG is generally slower than provisioning a single EC2 instance.

  • Because Karpenter picks a right-sized instance family, scaling is cheaper than with CAS. And when nodes are underutilized, Karpenter terminates the surplus EC2 instances (fleet).

Amazon EKS Version Update

  • Slow Major Release : 150 days

  • Minor Release : 3 months

  • Technical Support : 14 months

Update EKS Version in-place

  • An in-place update can only move to the next version.

  • Update the EKS cluster.

  • Update the EKS node group.

Update process of EKS as in-place update

For your understanding, here is an example.

  1. Start with 3 running worker nodes.

  2. Create a new worker node for each running worker node.

  3. Send a cordon command to each running worker node.
    Cordoning means "don't create new pods on this worker node";
    the scheduler no longer targets this node.

  4. Send a drain command to each running worker node.
    The pods on the old worker nodes are moved to the new worker nodes.

  • This approach doesn't ensure availability on its own,
    so you must set PDB (Pod Disruption Budget) options.
    A drain request by itself doesn't guarantee that the same number of pods stays available, so the service can struggle to handle the existing traffic. Therefore, you must set a minAvailable amount.

    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: nginx
    spec:
      minAvailable: 4
      selector:
        matchLabels:
          app: nginx

Cons of in-place update

  1. Can't roll back to the previous version.

  2. Version compatibility issues can cause unexpected failures.

  3. Requires the version to be updated one step at a time.
    If you missed update cycles, you'll have to repeat the process N times.

Update EKS Version as Blue/Green

  • You can perform a blue/green update by using Route 53's weighted routing feature, as in the sketch below.
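
For illustration only, a CloudFormation sketch of two weighted Route 53 records shifting traffic between a blue and a green cluster; the domain names, weights, and record type are assumptions, not from the original talk:

Resources:
  BlueRecord:
    Type: AWS::Route53::RecordSet
    Properties:
      HostedZoneName: example.com.           # assumed hosted zone
      Name: app.example.com.
      Type: CNAME
      TTL: "60"
      SetIdentifier: blue-cluster
      Weight: 90                             # most traffic to the old (blue) cluster
      ResourceRecords:
      - blue-ingress.example.com             # assumed blue cluster endpoint
  GreenRecord:
    Type: AWS::Route53::RecordSet
    Properties:
      HostedZoneName: example.com.
      Name: app.example.com.
      Type: CNAME
      TTL: "60"
      SetIdentifier: green-cluster
      Weight: 10                             # small share to the new (green) cluster
      ResourceRecords:
      - green-ingress.example.com            # assumed green cluster endpoint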

Manage EKS Cluster

  • Finally, you can integrate several tools to automate EKS cluster management:

    • Amazon CDK

    • ArgoCD

    • Add-ons

    • GitHub

