What's Amazon EKS?

Mar 05, 2024

Introduction

Thank you for clicking through to my article. I've been a DevOps engineer for 2 years on a dev team of 7 engineers.

My name is MINSEOK, LEE, but I use Unchaptered as an alias on the internet. So you can call me either "MINSEOK, LEE" or "Unchaptered" when you want to ask something.

Topic

While attending AWS Summit Seoul 2023, I learned about Amazon EKS.

In this article, I'll cover the following:

  1. The stability of Amazon EKS

  2. The stability of Control Plane

  3. The stability of Data Plane

  4. The stability of Version Update

The stability of Amazon EKS

  • Basically, Kubernetes is divided into a Control Plane and a Data Plane.
    If you run Kubernetes yourself, rather than using a SaaS offering, you must manage every element of Kubernetes.

  • Amazon EKS, by contrast, runs reliable container applications with minimal resources on AWS.

  • Amazon EKS manages the Control Plane to reduce the maintenance burden of Kubernetes itself. For the Data Plane, EKS supports Self-Managed EC2, Managed Node Groups, and Fargate (serverless compute) to reduce the maintenance burden of applications (a hedged sketch of these options follows below).
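
For illustration only, here is a minimal sketch of how those Data Plane options could be declared with an eksctl ClusterConfig. eksctl is not mentioned in the talk itself, and the cluster name, region, instance type, and sizes are assumptions:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo-cluster          # assumed name
  region: ap-northeast-2      # assumed region
managedNodeGroups:            # EKS-managed EC2 worker nodes
- name: managed-ng-1
  instanceType: m5.large
  desiredCapacity: 2
fargateProfiles:              # serverless compute for matching pods
- name: fp-default
  selectors:
  - namespace: default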

Amazon EKS has five core values.

What's the SRM (Shared Responsibility Model)?

  • AWS is responsible for both the hardware underlying EKS and the stability of the Control Plane components.

  • The customer is responsible for the Data Plane, which contains the application code.
    For example, the customer must manage the pods running the application code, as well as the Deployments and Services that expose them.

  • A Cross-Account ENI exists to connect the control plane and the data plane. AWS provisions Cross-Account ENIs in the target AZs so that AWS and the customer's VPC can communicate securely.

Amazon EKS Control Plane

The EKS Control Plane ensures the reliability of each of its components:

  1. API Server : handles the Kubernetes API

  2. Cloud Controller : connects AWS cloud services and Kubernetes

  3. Controller Manager : manages the various controllers

  4. Scheduler : decides which node each pod is deployed to

  5. etcd : stores Kubernetes data

The stability of Control Plane

  • The API server instances run Active-Active for high availability, with a minimum of 2 replicas.

  • The etcd instances are implemented as key-value storage, with a minimum of 3 replicas.

  • The API server instances and the etcd instances share the following traits:

    • Implemented as an Auto Scaling group for scalability.

    • Distributed across multiple AZs for high availability.

The scale-up of Control Plane

  • In unexpected situations, control plane instances must scale up based on scaling signals.

    • Control plane instances are scaled up/down according to CPU and memory usage.

    • Control plane instances are scaled up/down according to disk usage and remaining capacity.

  • The customer (engineer) doesn't configure this through the API or the console.
    AWS reinforces the performance of the control plane automatically.

  • There is no additional cost.

The stability of Data Plane

  • Amazon EKS supports high availability in three ways:

    • Self Recovery

    • Minimize Affected Area

    • Fast Scale-out

Stability 1 - Self Recovery

  • Kubernetes has a built-in self-recovery function based on its state management mechanism.

  • When you create resources in Kubernetes, you write Kubernetes manifest files in YAML format. There, you declare the desired state.

  • Several controllers continuously try to reconcile the current state with the desired state. This is called the "Kubernetes control loop", and it consists of three steps:

    • Observe

    • Diff

    • Act

  • There are several ways to create a pod.

  • The simplest way is to create pods directly, but this is less useful because no controller monitors and recovers them.

  • Therefore you will typically create pods using a Deployment.
    A Deployment uses a ReplicaSet (RS) to maintain the desired number of pods.

Stability 1.a. - Three General Ways to Health Check

General health-check methods for a deployed application:

  1. HTTP Request

  2. Command

  3. TCP Socket

Stability 1.b. - Probes, Health Checks in Kubernetes

In Kubernetes, the kubelet (on each worker node) uses probes to perform health checks.

  • readinessProbe

    • Command → Readiness (5s) : Is the application ready to receive requests?

      readinessProbe:
        exec:
          command:
          - cat
          - /tmp/healthy
        initialDelaySeconds: 5
        periodSeconds: 5
    • The worker node has the kubelet installed.

    • The kubelet uses the readinessProbe for health checks.
      If a problem occurs in the node group, the kubelet removes the affected pod from the Service endpoints. When traffic comes in, unhealthy pods don't receive requests.
      Only healthy pods receive requests.

    • After the problem is fixed, the kubelet adds the recovered pods back to the Service endpoints.

  • livenessProbe

    • HTTP GET → Liveness (5s) : Is the application still alive?

      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
    • The worker node has the kubelet installed.

    • If a problem occurs in the node group, the kubelet removes the affected pods from the Service endpoints and then restarts the container.

    • After the problem is fixed, the kubelet adds the recovered pods back to the Service endpoints.
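
Of the three general methods from Stability 1.a, the Command and HTTP GET variants are shown above; the TCP socket variant can be expressed the same way. A minimal sketch, assuming the application listens on port 8080:

readinessProbe:
  tcpSocket:
    port: 8080           # assumed application port
  initialDelaySeconds: 5
  periodSeconds: 5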

Stability 2 - Minimize Affected Area

  • If you operate services in an MSA (microservices architecture) environment, trouble in one service can spread to other services. Such trouble can affect the whole Kubernetes cluster.

Stability 2.a. - Pod Scheduling

  • Kubernetes Manifest File

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 8
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
  • In the manifest file, replicas is set to 8.

  • The Kubernetes scheduler decides which node each pod is placed on, considering:

    • priority : the priority set for pod deployment

    • affinity : the pods' preference for particular nodes

    • request / limit : the resources the pod requests and may consume

    • node resources : each node's CPU and memory usage

  • Sometimes, pods may end up concentrated on only a few nodes.

  • In this case...
    If a node that contains many pods dies, the remaining system cannot handle even the baseline traffic.

  • Because the kubelet health-check period is 5 minutes by default,
    the worst case is that the node's trouble is only detected after 5 minutes.

  • During this period, the pods are not redeployed and remain in an unhealthy state.

Stability 2.b. - podAntiAffinity

  • The simplest solution is the podAntiAffinity setting.
    It ensures that pods are distributed evenly across all nodes.

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - nginx
          topologyKey: "kubernetes.io/hostname"

  containers:
  - name: nginx
    image: nginx

Stability 2.c. - topologySpreadConstraints

  • Another way is the topologySpreadConstraints setting.

spec:
  topologySpreadConstraints:
   - maxSkew: 1
     topologyKey: topology.kubernetes.io/zone
     whenUnsatisfiable: DoNotSchedule
     labelSelector:
       matchLabels:
         app: nginx

  containers:
   - name: nginx
     image: nginx

Stability 2.d. - podAntiAffinity & topologySpreadConstraints

  • By setting both podAntiAffinity and topologySpreadConstraints, you can distribute pods across all nodes and across availability zones, as in the rough sketch below.
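
A rough combined sketch, reusing the nginx labels from the examples above (the weight value is only an assumption):

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100                                  # assumed weight
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: nginx
          topologyKey: "kubernetes.io/hostname"      # spread across nodes
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone         # spread across AZs
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: nginx
  containers:
  - name: nginx
    image: nginx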

Stability 2.e. - Maintain High-Importance Services

Let's take an example to aid understanding.

  • Four pods are running on the node group.

  • Two types of pods are running on the node group:

    • The first type is the pods for the ordering system.

    • The second type is the pods for the review system.

  • The pods that matter most to the business model are the ordering-system pods.

  • When the remaining CPU or memory on a node is not enough, the node evicts some pods. If the order-related pods are evicted, this can cause significant problems for the business.

  • To prevent this, there is a way to allocate resources to containers.
    Two keys can be set (see the sketch after this list):

    • Request : the minimum amount required to schedule the pod.

    • Limit : the maximum amount the pod may consume.
      → Exceeding the limit causes CPU throttling, or an out-of-memory condition for memory, in which case the offending process is killed.

  • Through container resource allocation, you can specify the QoS (Quality of Service) class.

    • Guaranteed > Burstable > BestEffort, in order of how strongly the pod's resource use is protected.

    • BestEffort : neither Request nor Limit is set.
      → If node resources are sufficient, the pod keeps running.
      → If node resources are not sufficient, the pod is killed first.

    • Burstable : Request and Limit are set (Request < Limit).
      → If node resources are sufficient, the pod keeps running.
      → If the pod needs a little more than its Request, it can burst up to its Limit.
      → If node resources are not sufficient, the pod is killed second.

    • Guaranteed : Request and Limit are set (Request == Limit).
      → The pod is kept alive as long as possible.
      → It won't be terminated unless node resources are extremely low.
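
A minimal sketch of the Guaranteed case, assuming a hypothetical order-api container (the image name and resource values are illustrative only):

apiVersion: v1
kind: Pod
metadata:
  name: order-api
spec:
  containers:
  - name: order-api
    image: order-api:latest        # hypothetical image
    resources:
      requests:                    # Request == Limit → Guaranteed QoS
        cpu: "500m"
        memory: "512Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"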

Stability 3 - Fast Scale-out

  • When spike traffic occurs, you can use the following services:

    • HPA (Horizontal Pod Autoscaler)

    • CAS (Cluster Auto Scaler)

    • Karpenter

Stability 3.a. - HPA (Horizontal Pod Autoscaler)

  • You have 2 nodes and the pods are running normally.

  • One of the pods receives heavy traffic. The pod's CPU/memory usage metrics are collected by the Metrics Server, and the HPA controller uses those metrics to control the pods.

  • The HPA controller changes the replica count of the ReplicaSet (RS) so that new pods are created, as in the sketch below.
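
A minimal sketch of an HPA targeting the nginx-deployment from earlier; the utilization threshold and replica bounds are assumptions:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:                  # the Deployment to scale
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2                   # assumed lower bound
  maxReplicas: 10                  # assumed upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70     # assumed CPU target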

Stability 3.b. - CAS (Cluster Autoscaler)

  • If CPU/memory is not enough, the pods created by the HPA controller stay in the Pending state.
    In this case, the node group needs more nodes to turn the pending pod(s) into running pod(s).

  • CAS changes the Auto Scaling Group's desired capacity to create a new node.
    After the new node is provisioned, the pending pod(s) become running pod(s).

Stability 3.c. - Cons of CAS

  • CAS has the following cons:

    • Scale-out is not fast.

    • In an MSA environment, you need to create a node group for each instance type.
      (One CAS handles one ASG, and one ASG can use one instance type.)

    • If you want to use Spot instances, you need to deploy a new node group.

Stability 3.d. - Karpenter

  • Karpenter creates nodes directly.

  • If pending pods exist, Karpenter creates EC2 instances (fleet).
    Provisioning through an ASG is generally slower than provisioning a single EC2 instance.

  • Because Karpenter picks a right-sized instance family, scaling is cheaper than with CAS. And when nodes are underutilized, Karpenter terminates the surplus EC2 instances (fleet).

Amazon EKS Version Update

  • Slow Major Release : 150 days

  • Minor Release : 3 months

  • Technical Support : 14 months

Update EKS Version in-place

  • An in-place update can only move to the next version.

  • Update the EKS cluster.

  • Update the EKS node group.

Update process of EKS as in-place update

For your understanding, here is an example.

  1. Start with 3 running worker nodes.

  2. Create a new worker node for each running worker node.

  3. Send a cordon command to each running worker node.
    Cordoning means "don't create new pods on this worker node";
    the scheduler no longer targets this node.

  4. Send a drain command to each running worker node.
    The pods on the old worker nodes are moved to the new worker nodes.

  • This approach doesn't ensure availability on its own,
    so you must set PDB (Pod Disruption Budget) options.
    A drain request by itself doesn't guarantee that the same number of pods stays available, so the service can struggle to handle the existing traffic. Therefore, you must set a minAvailable amount.

    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: nginx
    spec:
      minAvailable: 4
      selector:
        matchLabels:
          app: nginx

Cons of in-place update

  1. Can't roll back to the previous version.

  2. Version compatibility issues can cause unexpected failures.

  3. Requires the version to be updated one step at a time.
    If you missed update cycles, you'll have to repeat the process N times.

Update EKS Version as Blue/Green

  • You can perform a blue/green update by using Route 53's weighted routing feature, as in the sketch below.
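
For illustration only, a CloudFormation sketch of two weighted Route 53 records shifting traffic between a blue and a green cluster; the domain names, weights, and record type are assumptions, not from the original talk:

Resources:
  BlueRecord:
    Type: AWS::Route53::RecordSet
    Properties:
      HostedZoneName: example.com.           # assumed hosted zone
      Name: app.example.com.
      Type: CNAME
      TTL: "60"
      SetIdentifier: blue-cluster
      Weight: 90                             # most traffic to the old (blue) cluster
      ResourceRecords:
      - blue-ingress.example.com             # assumed blue cluster endpoint
  GreenRecord:
    Type: AWS::Route53::RecordSet
    Properties:
      HostedZoneName: example.com.
      Name: app.example.com.
      Type: CNAME
      TTL: "60"
      SetIdentifier: green-cluster
      Weight: 10                             # small share to the new (green) cluster
      ResourceRecords:
      - green-ingress.example.com            # assumed green cluster endpoint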

Manage EKS Cluster

  • Finally, you can integrate several tools to automate EKS cluster management:

    • Amazon CDK

    • ArgoCD

    • Add-ons

    • GitHub

