Troubleshooting a kubeadm installation on Ubuntu 22.04
- 2022. 9. 19.
Issue: the kubelet health check fails
- sudo kubeadm init
[init] Using Kubernetes version: v1.25.1
[preflight] Running pre-flight checks
[WARNING Swap]: swap is enabled; production deployments should disable swap unless testing the NodeSwap feature gate of the kubelet
[WARNING SystemVerification]: missing optional cgroups: blkio
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
...
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
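During the kubeadm installation, the kubelet health check failed as shown above. (In hindsight, the "[WARNING Swap]: swap is enabled;" line in the preflight output already shows whether swap is configured. If you see this message, swap is enabled and the installation will fail. Disable it with swapoff -a.)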
- Output of systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Mon 2022-09-19 00:14:41 UTC; 5s ago
Docs: https://kubernetes.io/docs/home/
Process: 3454 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILURE)
Main PID: 3454 (code=exited, status=1/FAILURE)
CPU: 37ms
The kubelet process had exited, so I checked its logs.
- journalctl -xeu kubelet
Sep 19 00:15:43 noelbirdk8smaster systemd[1]: Started kubelet: The Kubernetes Node Agent.
░░ Subject: A start job for unit kubelet.service has finished successfully
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ A start job for unit kubelet.service has finished successfully.
░░
░░ The job identifier is 13011.
Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: Flag --container-runtime has been deprecated, will be removed in 1.27 as the only valid value is 'remote'
Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collector will get sandbox image information from CRI.
Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: I0919 00:15:43.262689 3510 server.go:200] "--pod-infra-container-image will not be pruned by the image garbage collector in kubelet and should also be set in the remote runtime"
Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: Flag --container-runtime has been deprecated, will be removed in 1.27 as the only valid value is 'remote'
Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collector will get sandbox image information from CRI.
Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: I0919 00:15:43.264320 3510 server.go:413] "Kubelet version" kubeletVersion="v1.25.0"
Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: I0919 00:15:43.264391 3510 server.go:415] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: I0919 00:15:43.264504 3510 server.go:825] "Client rotation is on, will bootstrap in background"
Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: I0919 00:15:43.265239 3510 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: I0919 00:15:43.266589 3510 server.go:660] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /"
Sep 19 00:15:43 noelbirdk8smaster kubelet[3510]: E0919 00:15:43.266659 3510 run.go:74] "command failed" err="failed to run Kubelet: running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false. /pro>
Sep 19 00:15:43 noelbirdk8smaster systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
The log says "running with swap on is not supported" and asks to "please disable swap".
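The ">" at the end of the error line above means the pager cut the message short. As a minimal sketch, assuming a systemd host: journalctl's --no-pager option prints each line in full, and the helper below keeps only the kubelet's error-level lines, which klog marks with a leading "E" followed by the month and day. The function name kubelet_error_lines is hypothetical, not part of any tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read journal text on stdin and print only the
# kubelet's error-level lines (klog prefixes them with "E" + MMDD).
kubelet_error_lines() {
  # "|| true" keeps the exit status at 0 when no error lines are found.
  grep -E 'kubelet\[[0-9]+\]: E[0-9]{4} ' || true
}

# Typical use on the node (not run here):
#   journalctl -u kubelet --no-pager | kubelet_error_lines
```

Filtering this way would have surfaced the single "command failed" line without scrolling past the deprecation notices.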
Checking whether swap was allocated showed that it was, as follows.
noelbird@noelbirdk8smaster:~$ free -h
total used free shared buff/cache available
Mem: 1.9Gi 275Mi 734Mi 3.0Mi 899Mi 1.4Gi
Swap: 2.0Gi 0B 2.0Gi
After turning swap off and checking again, it was indeed gone.
noelbird@noelbirdk8smaster:~$ sudo swapoff -a
noelbird@noelbirdk8smaster:~$ free -h
total used free shared buff/cache available
Mem: 1.9Gi 258Mi 751Mi 3.0Mi 900Mi 1.5Gi
Swap: 0B 0B 0B
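Note that swapoff -a only lasts until the next reboot. If /etc/fstab still lists a swap entry, swap comes back and the kubelet fails again. The following is a minimal sketch, not a definitive procedure: the function names swap_is_active and fstab_without_swap are hypothetical, and the second one only prints a modified copy so that the result can be reviewed before anything is overwritten.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the file arguments default to the real system paths.

# Succeeds if the given /proc/swaps-style file lists at least one swap device
# (the first line is a header, so real entries start on line 2).
swap_is_active() {
  local swaps_file="${1:-/proc/swaps}"
  [ "$(tail -n +2 "$swaps_file" | grep -c .)" -gt 0 ]
}

# Prints the given fstab with every uncommented swap entry commented out.
fstab_without_swap() {
  local fstab="${1:-/etc/fstab}"
  sed -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}

# Typical use on the node (not run here):
#   swap_is_active && sudo swapoff -a
#   fstab_without_swap > /tmp/fstab.new   # review, then copy over /etc/fstab
```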
Other troubleshooting
Installing a container runtime
The installation guide linked below has a separate section for Ubuntu 22.04.
https://github.com/cri-o/cri-o/blob/main/install.md#apt-based-operating-systems
Bridged traffic when installing the runtime
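For the bridged-traffic step, here is a minimal sketch assuming the prerequisites described in the Kubernetes container runtime documentation: load the overlay and br_netfilter kernel modules and enable three sysctl settings. The function name write_k8s_net_config and the idea of staging the files under a root directory are hypothetical conveniences. The k8s.conf file name follows the example in the Kubernetes docs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write the two config files under a root directory.
# Pass "/" (as root) for the real system, or a scratch directory to preview.
write_k8s_net_config() {
  local root="${1:?usage: write_k8s_net_config ROOT_DIR}"
  mkdir -p "$root/etc/modules-load.d" "$root/etc/sysctl.d"

  # Kernel modules to load at boot.
  printf '%s\n' overlay br_netfilter > "$root/etc/modules-load.d/k8s.conf"

  # Let iptables see bridged traffic and enable IPv4 forwarding.
  printf '%s\n' \
    'net.bridge.bridge-nf-call-iptables  = 1' \
    'net.bridge.bridge-nf-call-ip6tables = 1' \
    'net.ipv4.ip_forward                 = 1' \
    > "$root/etc/sysctl.d/k8s.conf"
}

# After writing to the real system (not run here):
#   sudo modprobe overlay && sudo modprobe br_netfilter
#   sudo sysctl --system
```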
Hyper-V
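For context, I ran the virtual machines on Hyper-V on Windows and created the master node and the worker node as separate VMs.
Troubleshooting: exchanging files with the Hyper-V guests was harder than expected, so I shared a folder from Windows over SMB.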
https://www.maketecheasier.com/mount-windows-share-folder-linux/
Docker Engine cannot be used as of k8s 1.24 (use cri-o or containerd)
If you previously used Docker as the container runtime, it appears that Docker Engine can no longer serve as the runtime for k8s on its own, because it does not implement the CRI (Container Runtime Interface).
Note: Docker Engine does not implement the CRI which is a requirement for a container runtime to work with Kubernetes. For that reason, an additional service cri-dockerd has to be installed. cri-dockerd is a project based on the legacy built-in Docker Engine support that was removed from the kubelet in version 1.24.
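To make the runtime choice explicit, kubeadm init accepts a --cri-socket option. The sketch below assumes the default socket paths listed in the Kubernetes documentation for containerd, CRI-O, and cri-dockerd. Your installation may place them elsewhere, and the function name detect_cri_socket is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first CRI endpoint whose socket path exists.
# The optional argument is a root prefix, used only to try the function
# against a scratch directory.
detect_cri_socket() {
  local root="${1:-}"
  local path
  for path in \
    /var/run/containerd/containerd.sock \
    /var/run/crio/crio.sock \
    /var/run/cri-dockerd.sock
  do
    if [ -e "$root$path" ]; then
      echo "unix://$path"
      return 0
    fi
  done
  return 1
}

# Example (not run here):
#   sudo kubeadm init --cri-socket "$(detect_cri_socket)"
```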
How to set systemd as the cgroup driver in the containerd config
containerd config default > /etc/containerd/config.toml
In config.toml, change SystemdCgroup to true
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
BinaryName = ""
CriuImagePath = ""
CriuPath = ""
CriuWorkPath = ""
IoGid = 0
IoUid = 0
NoNewKeyring = false
NoPivotRoot = false
Root = ""
ShimCgroup = ""
SystemdCgroup = true
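The manual edit above can also be scripted. This is a minimal sketch, assuming the generated config contains a "SystemdCgroup = false" line as the default output does. The function name enable_systemd_cgroup is hypothetical, and it keeps a .bak copy of the original file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: flip SystemdCgroup from false to true in place,
# leaving the original file next to it with a .bak suffix.
enable_systemd_cgroup() {
  local config="${1:-/etc/containerd/config.toml}"
  sed -i.bak -E \
    's/^([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*)false/\1true/' \
    "$config"
}

# On the node, run the function as root, then restart the runtime so the
# new setting takes effect (not run here):
#   sudo systemctl restart containerd
```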
https://github.com/containerd/containerd/issues/4581
https://github.com/containerd/containerd/issues/4581#issuecomment-733704174
If you get stuck on the cgroup configuration, here is another approach to try (setting Docker's cgroup driver)
https://almost-native.tistory.com/415
Setting up kubectl autocompletion and the k shorthand alias
source <(kubectl completion bash) # setup autocomplete in bash into the current shell, bash-completion package should be installed first.
echo "source <(kubectl completion bash)" >> ~/.bashrc # add autocomplete permanently to your bash shell.
alias k=kubectl
complete -o default -F __start_kubectl k
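Of the commands above, only the completion line is written to ~/.bashrc, so the k alias and its completion disappear in a new shell. Here is a minimal sketch for persisting all three without adding duplicates on repeated runs. The function names append_once and persist_kubectl_shortcuts are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a line to a file only if it is not already there.
append_once() {
  local line="$1" file="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Hypothetical helper: persist completion, the k alias, and completion for k.
persist_kubectl_shortcuts() {
  local rc="${1:-$HOME/.bashrc}"
  append_once 'source <(kubectl completion bash)' "$rc"
  append_once 'alias k=kubectl' "$rc"
  append_once 'complete -o default -F __start_kubectl k' "$rc"
}

# Typical use (not run here):
#   persist_kubectl_shortcuts && source ~/.bashrc
```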
References
https://kubernetes.io/docs/reference/kubectl/cheatsheet/