Introduction
Thank you for clicking through to my article. I've been a DevOps engineer for 2 years on a dev team of 7 engineers.
My name is MINSEOK, LEE, but I use Unchaptered as an alias on the internet. So, you can call me either "MINSEOK, LEE" or "Unchaptered" if you want to ask me something.
Topic
In this article, we cover these topics...
When you delete a pod or deployment.
Something goes wrong with a Worker Node.
When kubelet is shut down on a Worker Node.
When containerd is shut down on a Worker Node.
When containerd is shut down and recovered on a Worker Node, does kube-scheduler work normally?
Something goes wrong with the Master Node.
When kube-scheduler is shut down on the Master Node.
When kubelet is shut down on the Master Node.
When containerd is shut down on the Master Node.
When you delete a pod or deployment.
[When you only use a pod]
If you delete the pod, the pod isn't recovered.
[When you use a deployment for pods]
If you delete a pod, the pod is automatically recovered.
The rs (ReplicaSet) regenerates the missing pods, because the rs tries to maintain a fixed number of pods.
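As a minimal sketch of this difference (the bare pod name del-pod is just an example; del-deploy is the deployment name used later in this article):
# A bare pod: once deleted, nothing recreates it.
kubectl run del-pod --image=nginx
kubectl delete pod del-pod
kubectl get pods   # del-pod is gone for good
# A deployment: the rs recreates any deleted pod.
kubectl create deployment del-deploy --image=nginx --replicas=3
kubectl delete pod <one-of-the-del-deploy-pods>
kubectl get pods   # still 3 pods; the rs created a replacement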
Something goes wrong with a Worker Node.
When kubelet is shut down on a Worker Node.
When containerd is shut down on a Worker Node.
When containerd is shut down and recovered on a Worker Node, does kube-scheduler work normally?
When kubelet is shut down on a Worker Node.
You can stop kubelet with this command.
systemctl stop kubelet
When kubelet is shut down, pods aren't regenerated.
New pods will stay Pending.
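As a rough sketch of how to observe this from the master node (how quickly the node shows NotReady depends on the cluster defaults):
systemctl stop kubelet     # run on the worker node
kubectl get nodes          # the worker node eventually shows NotReady
kubectl get pods -o wide   # existing pods aren't recreated; new pods aimed at that node stay Pending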
When containerd is shut down on a Worker Node.
You can stop containerd with this command.
systemctl stop containerd
When containerd is shut down on a specific worker node:
On that worker node, new pods can't be generated.
On that worker node, existing pods die immediately.
But Kubernetes might not react to the dead pods for up to about 5 minutes (the default), because pods tolerate a NotReady node for 300 seconds before being evicted.
After Kubernetes realizes the pods are dead, replacement pods are generated on other worker nodes.
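A possible way to watch this, assuming a deployment such as del-deploy is already spread across the worker nodes:
systemctl stop containerd    # run on one worker node
kubectl get nodes            # that node eventually shows NotReady
kubectl get pods -o wide -w  # after roughly 5 minutes, replacement pods appear on the other nodes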
When containerd is shut down and recovered on a Worker Node, does kube-scheduler work normally?
You can stop containerd with this command.
systemctl stop containerd
After that, change the rs value from 3 to 9.
kubectl scale deployment del-deploy --replicas=9
Check the pods' count with this command.
kubectl get pods -o wide
> The worker node with containerd stopped has 1 pod.
That pod is effectively dead, and a replacement pod is generated on the other nodes.
> The other 2 worker nodes have 4 pods each.
After that, change the rs value from 9 to 0.
kubectl scale deployment del-deploy --replicas=0
Check the pods' count with this command.
kubectl get pods -o wide
> All 3 worker nodes have 0 pods.
Start containerd with this command.
systemctl start containerd
Wait a few minutes, then change the rs from 0 to 9.
kubectl scale deployment del-deploy --replicas=9
Check the pods' count with this command.
kubectl get pods -o wide
> Each of the 3 worker nodes has 3 pods.
Something goes wrong with the Master Node.
When kube-scheduler is shut down on the Master Node.
When kubelet is shut down on the Master Node.
When containerd is shut down on the Master Node.
When kube-scheduler is shut down on the Master Node.
You can find the kube-scheduler pod with this command.
kubectl get pods -n kube-system -o wide
Output
# ...
# The kube-scheduler pod name may vary slightly.
kube-scheduler-m-k8s
You can delete kube-scheduler with this command.
kubectl delete pod kube-scheduler-m-k8s -n kube-system
Some components on the master node are fully managed by Kubernetes itself.
Therefore kube-scheduler is automatically recovered.
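For example, you can watch the recovery happen (the pod name comes from the output above and may differ in your cluster):
kubectl delete pod kube-scheduler-m-k8s -n kube-system
kubectl get pods -n kube-system -w   # a new kube-scheduler-m-k8s pod comes back almost immediately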
When kubelet is shut down on the Master Node.
[Conclusion]
When kubelet is stopped, the Kubernetes system stays stable.
In the process below, you run "delete kube-scheduler", but the deletion never actually completes.
Your "delete kube-scheduler" request is recorded in etcd through kube-apiserver.
So, when you check the status using "kubectl get pods -n kube-system", the status of kube-scheduler will be Terminating.
Because kube-scheduler is actually still running, there is no issue with scaling pods out by changing the rs (ReplicaSet) count of a deployment.
[Process]
You can stop kubelet with this command.
systemctl stop kubelet
You can delete kube-scheduler with this command.
kubectl delete pod kube-scheduler-m-k8s -n kube-system
(After a few minutes) Check kube-scheduler with this command.
kubectl get pods -n kube-system
Output
kube-scheduler-m-k8s 1/1 Terminating 7 3m40s
Create a deployment of nginx with this command.
kubectl create deployment nginx --image=nginx
Output
deployment.apps/nginx created
Check the pods of nginx with this command.
kubectl get pods
Output
nginx-6799fc88d8-t9c22 0/1 ContainerCreating 0 5s
nginx-6799fc88d8-t9c22 1/1 Running 0 5s
Change the rs (ReplicaSet) count of the nginx deployment with this command.
kubectl scale deployment nginx --replicas=3
Check the pods of nginx with this command.
kubectl get pods
Output
nginx-6799fc88d8-t9c22 1/1 Running           0 5s
nginx-6799fc88c8-gzskt 0/1 ContainerCreating 0 5s
nginx-6799fc88c8-t9c22 0/1 ContainerCreating 0 5s
nginx-6799fc88d8-t9c22 1/1 Running 0 5s
nginx-6799fc88c8-gzskt 0/1 Running 0 5s
nginx-6799fc88c8-t9c22 0/1 Running 0 5s
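As a possible follow-up, if you restart kubelet on the master node, the Terminating kube-scheduler should be cleaned up and recreated, because it is a static pod managed by kubelet:
systemctl start kubelet
kubectl get pods -n kube-system   # the Terminating kube-scheduler pod is replaced by a fresh one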
When containerd is shut down on the Master Node.
[Conclusion]
The behavior of Docker and containerd isn't the same.
With Kubernetes 1.25 and containerd, after containerd is stopped, existing pods are not deleted. It means that even if containerd is terminated, the k8s system remains stable.
With Docker, if Docker dies on the master node, no kube-apiserver command works at all.
However, when using containerd, even if containerd on the master node dies, kube-apiserver is still alive, so commands keep working normally.
The coredns-** pods try to regenerate on other worker nodes.
The DaemonSet pods aren't affected.
As a slightly unusual aspect...
Because containerd is terminated...
When you try to delete kube-scheduler, the value in etcd is changed and the status of kube-scheduler becomes Terminating. However, since containerd isn't running, the deletion can't actually be carried out on the node.
If we restart containerd, the existing kube-scheduler will be terminated.
And a new kube-scheduler will be regenerated (see the sketch after the process below).
[Process]
You can stop containerd with this command.
systemctl stop containerd
Check status with this command.
systemctl status containerd
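To confirm the conclusion above, a possible continuation is to restart containerd and watch the kube-scheduler pod being replaced:
systemctl start containerd
kubectl get pods -n kube-system -w   # the Terminating kube-scheduler disappears and a new one starts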
Conclusion
The Master Node and Worker Nodes have a degree of high availability (HA).
Even if a particular resource becomes unavailable, most resources have the ability to recover automatically.
However, for worker nodes, it seems safe to increase their number to N for HA (see the example below).
And if you use Amazon EKS, you don't have to worry about these problems, because the control plane is managed for you.
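For example, if your worker nodes are in an eksctl-managed node group, scaling them out for HA is a single command (the cluster and node group names below are placeholders):
eksctl scale nodegroup --cluster=my-cluster --name=my-nodegroup --nodes=3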