Troubleshooting a kubeadm installation on Ubuntu 22.04

Issue: the kubelet health check fails

  • sudo kubeadm init
    [init] Using Kubernetes version: v1.25.1
    [preflight] Running pre-flight checks
          [WARNING Swap]: swap is enabled; production deployments should disable swap unless testing the NodeSwap feature gate of the kubelet
          [WARNING SystemVerification]: missing optional cgroups: blkio
    [preflight] Pulling images required for setting up a Kubernetes cluster
    [preflight] This might take a minute or two, depending on the speed of your internet connection
    [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
    ...
    [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
    [kubelet-check] Initial timeout of 40s passed.
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.


    Unfortunately, an error has occurred:
            timed out waiting for the condition

    This error is likely caused by:
            - The kubelet is not running
            - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

    If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
            - 'systemctl status kubelet'
            - 'journalctl -xeu kubelet'

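While running kubeadm init, the kubelet health check kept failing as shown above. In hindsight, the "[WARNING Swap]: swap is enabled" line in the preflight output already pointed at the cause: it tells you whether swap is configured on the node. If you see this warning, swap is on and, with the kubelet's default settings, the installation will fail. Turn swap off with sudo swapoff -a.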

  • Output of systemctl status kubelet
    ● kubelet.service - kubelet: The Kubernetes Node Agent
         Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
        Drop-In: /etc/systemd/system/kubelet.service.d
                 └─10-kubeadm.conf
         Active: activating (auto-restart) (Result: exit-code) since Mon 2022-09-19 00:14:41 UTC; 5s ago
           Docs: https://kubernetes.io/docs/home/
        Process: 3454 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILURE)
       Main PID: 3454 (code=exited, status=1/FAILURE)
            CPU: 37ms

The kubelet had exited and systemd kept restarting it, so I checked its logs.

  • journalctl -xeu kubelet
    Sep 19 00:15:43 noelbirdk8smaster systemd[1]: Started kubelet: The Kubernetes Node Agent.
    ░░ Subject: A start job for unit kubelet.service has finished successfully
    ░░ Defined-By: systemd
    ░░ Support: http://www.ubuntu.com/support
    ░░
    ░░ A start job for unit kubelet.service has finished successfully.
    ░░
    ░░ The job identifier is 13011.
    Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: Flag --container-runtime has been deprecated, will be removed in 1.27 as the only valid value is 'remote'
    Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collector will get sandbox image information from CRI.
    Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: I0919 00:15:43.262689 3510 server.go:200] "--pod-infra-container-image will not be pruned by the image garbage collector in kubelet and should also be set in the remote runtime"
    Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: Flag --container-runtime has been deprecated, will be removed in 1.27 as the only valid value is 'remote'
    Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collector will get sandbox image information from CRI.
    Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: I0919 00:15:43.264320 3510 server.go:413] "Kubelet version" kubeletVersion="v1.25.0"
    Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: I0919 00:15:43.264391 3510 server.go:415] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
    Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: I0919 00:15:43.264504 3510 server.go:825] "Client rotation is on, will bootstrap in background"
    Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: I0919 00:15:43.265239 3510 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
    Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: I0919 00:15:43.266589 3510 server.go:660] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /"
    Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: E0919 00:15:43.266659 3510 run.go:74] "command failed" err="failed to run Kubelet: running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false. /pro>
    Sep 19 00:15:43 noelbirdk8smaster systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE

The last kubelet line in the log gives the reason: "running with swap on is not supported, please disable swap!".
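
The journal is noisy, so it can help to filter it down to the error lines. A minimal sketch, assuming the usual klog prefix (E or F followed by the month and day, as in E0919 above). kubelet_errors is a helper name made up for this post, not a kubelet or journalctl feature.

```shell
# Hypothetical helper: keep only kubelet log lines whose klog severity
# is E (error) or F (fatal). Reads journal text on stdin.
kubelet_errors() {
  grep -E 'kubelet\[[0-9]+\]: [EF][0-9]{4} '
}

# Usage on a node (needs systemd, so it is left commented out here):
# journalctl -u kubelet --no-pager | kubelet_errors | tail -n 5
```

With the log above, only the "command failed" line from run.go would pass the filter.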

Checking with free -h confirmed that swap was allocated:


noelbird@noelbirdk8smaster:~$ free -h
               total        used        free      shared  buff/cache   available
Mem:           1.9Gi       275Mi       734Mi       3.0Mi       899Mi       1.4Gi
Swap:          2.0Gi          0B       2.0Gi

After turning swap off and checking again, it was gone:


noelbird@noelbirdk8smaster:~$ sudo swapoff -a
noelbird@noelbirdk8smaster:~$ free -h
               total        used        free      shared  buff/cache   available
Mem:           1.9Gi       258Mi       751Mi       3.0Mi       900Mi       1.5Gi
Swap:             0B          0B          0B
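
swapoff -a only lasts until the next reboot. To keep swap off, the swap entry in /etc/fstab is usually commented out as well. A minimal sketch, assuming a standard fstab layout in which the third field is the filesystem type. disable_swap_in_fstab is a helper name invented for this post.

```shell
# Hypothetical helper: comment out every active swap entry in an
# fstab-style file. A backup is written next to it with a .bak suffix.
disable_swap_in_fstab() {
  local fstab="${1:-/etc/fstab}"
  # Match lines that do not start with '#' and whose third field is "swap",
  # then prefix them with '#'.
  sed -i.bak -E \
    '/^[[:space:]]*[^#[:space:]][^[:space:]]*[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]]/ s/^/#/' \
    "$fstab"
}

# Usage from a root shell (for example after sudo -i):
# disable_swap_in_fstab /etc/fstab
```

Try it on a copy of the file first and compare the result with the .bak backup before touching the real /etc/fstab.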

https://stackoverflow.com/questions/60297810/kubelet-config-yaml-is-missing-when-restart-work-node-docker-service


Other troubleshooting

Installing a container runtime

The CRI-O installation guide linked below has a separate set of instructions for Ubuntu 22.04.
https://github.com/cri-o/cri-o/blob/main/install.md#apt-based-operating-systems

Letting iptables see bridged traffic when installing the runtime

https://kubernetes.io/docs/setup/production-environment/container-runtimes/#forwarding-ipv4-and-letting-iptables-see-bridged-traffic
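
For reference, the linked section boiled down to loading two kernel modules and setting three sysctl values. A minimal sketch of those settings as I understand them for this Kubernetes version. The current page may list something different, so treat the values as an assumption and check the link. The two function names are made up for this post.

```shell
# Hypothetical helpers that print the config file contents, so they can be
# reviewed before being written anywhere.
k8s_modules_conf() {
  printf '%s\n' overlay br_netfilter
}

k8s_sysctl_conf() {
  printf '%s\n' \
    'net.bridge.bridge-nf-call-iptables = 1' \
    'net.bridge.bridge-nf-call-ip6tables = 1' \
    'net.ipv4.ip_forward = 1'
}

# Usage on a node (left commented out because it needs root):
# k8s_modules_conf | sudo tee /etc/modules-load.d/k8s.conf
# sudo modprobe overlay
# sudo modprobe br_netfilter
# k8s_sysctl_conf | sudo tee /etc/sysctl.d/k8s.conf
# sudo sysctl --system
```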

Hyper-V

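I actually ran everything in virtual machines on Windows Hyper-V, with one VM for the master node and one for the worker node.
Troubleshooting: exchanging files with the Hyper-V guests was harder than I expected, so I shared a folder from Windows over SMB and mounted it in Linux.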

https://www.maketecheasier.com/mount-windows-share-folder-linux/

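From k8s 1.24, Docker Engine can no longer be used directly (use cri-o or containerd, or add cri-dockerd)

If you have been using Docker Engine as the container runtime, note that it does not implement the CRI (Container Runtime Interface).
The built-in Docker Engine support was removed from the kubelet in 1.24,
so k8s can no longer use Docker Engine as its runtime unless cri-dockerd is installed alongside it.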


Note: Docker Engine does not implement the CRI which is a requirement for a container runtime to work with Kubernetes. For that reason, an additional service cri-dockerd has to be installed. cri-dockerd is a project based on the legacy built-in Docker Engine support that was removed from the kubelet in version 1.24.

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#installing-runtime
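
If more than one runtime ends up installed on a node, kubeadm can be told which one to use with its --cri-socket flag. A small sketch, assuming the default socket paths listed in the Kubernetes container runtime docs. Your installation may use different paths, and cri_socket_for is a helper name invented for this post.

```shell
# Hypothetical helper: map a runtime name to its assumed default CRI socket.
cri_socket_for() {
  case "$1" in
    containerd)  echo "unix:///var/run/containerd/containerd.sock" ;;
    cri-o|crio)  echo "unix:///var/run/crio/crio.sock" ;;
    cri-dockerd) echo "unix:///var/run/cri-dockerd.sock" ;;
    *) echo "unknown runtime: $1" >&2; return 1 ;;
  esac
}

# Usage on a node (left commented out because it needs root and kubeadm):
# sudo kubeadm init --cri-socket "$(cri_socket_for containerd)"
```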

Setting containerd to use the systemd cgroup driver

containerd config default | sudo tee /etc/containerd/config.toml

In config.toml, change SystemdCgroup to true, then restart containerd so the change takes effect

  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    BinaryName = ""
    CriuImagePath = ""
    CriuPath = ""
    CriuWorkPath = ""
    IoGid = 0
    IoUid = 0
    NoNewKeyring = false
    NoPivotRoot = false
    Root = ""
    ShimCgroup = ""
    SystemdCgroup = true

https://github.com/containerd/containerd/issues/4581
https://github.com/containerd/containerd/issues/4581#issuecomment-733704174
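
The SystemdCgroup edit above can also be scripted instead of done by hand. A minimal sketch, assuming the generated default config contains a "SystemdCgroup = false" line as mine did. enable_systemd_cgroup is a helper name made up for this post.

```shell
# Hypothetical helper: flip SystemdCgroup from false to true in a containerd
# config file, keeping the original indentation. A .bak backup is written.
enable_systemd_cgroup() {
  local config="${1:-/etc/containerd/config.toml}"
  sed -i.bak -E \
    's/^([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*)false/\1true/' \
    "$config"
}

# Usage from a root shell (for example after sudo -i):
# enable_systemd_cgroup /etc/containerd/config.toml
# systemctl restart containerd
```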

If you are still stuck on the cgroup setting, here is another approach to try (setting Docker's cgroup driver)

https://almost-native.tistory.com/415

Setting up kubectl autocompletion and the k shorthand


source <(kubectl completion bash) # setup autocomplete in bash into the current shell, bash-completion package should be installed first.
echo "source <(kubectl completion bash)" >> ~/.bashrc # add autocomplete permanently to your bash shell.

alias k=kubectl
complete -o default -F __start_kubectl k

Reference
https://kubernetes.io/docs/reference/kubectl/cheatsheet/
