If you are setting up a k8s cluster using Vagrant and kubeadm, you may have encountered a problem where the master node is somehow unable to locate a pod that is running on another worker node within the same cluster.
vagrant@kubemaster1:~$ kubectl get po -o wide
NAME           READY   STATUS    RESTARTS   AGE   IP          NODE          NOMINATED NODE   READINESS GATES
cka-node-app   1/1     Running   0          84s   10.32.0.2   kubeworker2   <none>           <none>
vagrant@kubemaster1:~$ kubectl exec -it cka-node-app -- sh
error: unable to upgrade connection: pod does not exist
cka-node-app is definitely running on kubeworker2, but the master node is unable to locate it.
Running kubectl exec with the verbose flag -v=10 shows the request that fails:
curl -k -v -XPOST -H "X-Stream-Protocol-Version: v4.channel.k8s.io" -H "X-Stream-Protocol-Version: v3.channel.k8s.io" -H "X-Stream-Protocol-Version: v2.channel.k8s.io" -H "X-Stream-Protocol-Version: channel.k8s.io" -H "User-Agent: kubectl/v1.20.2 (linux/amd64) kubernetes/faecb19" 'https://192.168.60.2:6443/api/v1/namespaces/default/pods/cka-node-app/exec?command=%2Fbin%2Fbash&container=cka-node-app&stdin=true&stdout=true&tty=true'
I0130 07:14:48.012674 15693 round_trippers.go:445] POST https://192.168.60.2:6443/api/v1/namespaces/default/pods/cka-node-app/exec?command=%2Fbin%2Fbash&container=cka-node-app&stdin=true&stdout=true&tty=true 404 Not Found in 22 milliseconds
After some google-fu, I realised it could be due to the nodes being registered with an incorrect internal IP address. This matters because kubectl exec is proxied by the API server to the kubelet on the node where the pod runs, using that node's registered INTERNAL-IP; if the address is wrong (or, as here, the same on every node), the request lands on a kubelet that does not have the pod, which is exactly the 404 above.
vagrant@kubemaster1:~$ kubectl get nodes -o wide
NAME          STATUS     ROLES                  AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
kubemaster1   NotReady   control-plane,master   7m27s   v1.20.2   10.0.2.15     <none>        Ubuntu 16.04.7 LTS   4.4.0-197-generic   docker://20.10.2
kubeworker1   NotReady   <none>                 9s      v1.20.2   10.0.2.15     <none>        Ubuntu 16.04.7 LTS   4.4.0-197-generic   docker://20.10.2
kubeworker2   Ready      <none>                 2m33s   v1.20.2   10.0.2.15     <none>        Ubuntu 16.04.7 LTS   4.4.0-197-generic   docker://20.10.2
Indeed, all the nodes have INTERNAL-IP registered as 10.0.2.15.
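If you only want that one field for a single node, a jsonpath query works as well (a quick check; kubeworker2 is the node from the earlier output):
vagrant@kubemaster1:~$ kubectl get node kubeworker2 -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}'
10.0.2.15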
Vagrant assigns a NAT interface, enp0s3 here, to every VM; all hosts get the same IP address 10.0.2.15 on it, and it is used for external traffic that gets NATed. The other interface, enp0s8, carries the per-node 192.168.60.x addresses that we actually want the cluster to use.
vagrant@kubemaster1:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 02:ae:b0:fc:20:e8 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global enp0s3 <-------
valid_lft forever preferred_lft forever
inet6 fe80::ae:b0ff:fefc:20e8/64 scope link
valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:ac:20:fc brd ff:ff:ff:ff:ff:ff
inet 192.168.60.2/24 brd 192.168.60.255 scope global enp0s8
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:feac:20fc/64 scope link tentative dadfailed
valid_lft forever preferred_lft forever
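The 192.168.60.x addresses on enp0s8 come from the private network defined in the Vagrantfile. I won't reproduce the full Vagrantfile here, but the relevant lines look roughly like this (a sketch; the box name and IPs are assumptions based on this setup):
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/xenial64"

  # host-only/private network -> shows up as enp0s8 inside each VM
  config.vm.define "kubemaster1" do |node|
    node.vm.network "private_network", ip: "192.168.60.2"
  end

  config.vm.define "kubeworker1" do |node|
    node.vm.network "private_network", ip: "192.168.60.101"
  end

  config.vm.define "kubeworker2" do |node|
    node.vm.network "private_network", ip: "192.168.60.102"
  end
end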
When creating a cluster with kubeadm, the IP address of the first non-loopback interface is used as the INTERNAL-IP by default. In this case, that is enp0s3 with IP address 10.0.2.15.
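On a stock Vagrant box, enp0s3 is also the interface that owns the default route, which is easy to confirm (output is roughly what you should see; 10.0.2.2 is the usual Vagrant NAT gateway):
vagrant@kubemaster1:~$ ip route show default
default via 10.0.2.2 dev enp0s3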
How can we fix this?
kubeadm ships with configuration for how systemd should run the kubelet. When the kubeadm init or kubeadm join commands are run, kubeadm generates various configuration files for the kubelet to read.
For instance-specific configuration, kubeadm also writes an environment file to /var/lib/kubelet/kubeadm-flags.env, which contains the flags to pass to the kubelet when it starts.
root@kubemaster1:/home/vagrant# cat /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="--network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.2"
The kubelet will also pick up any additional flags supplied by users/admins through the KUBELET_EXTRA_ARGS variable in the /etc/default/kubelet file. This file does not exist by default and has to be created manually.
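You can see how both environment files are wired into the kubelet service in the systemd drop-in installed by the kubeadm Debian packages, typically /etc/systemd/system/kubelet.service.d/10-kubeadm.conf. The relevant parts look roughly like this (abridged; exact path and contents vary by version):
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# instance-specific flags generated by kubeadm init/join
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# user-supplied KUBELET_EXTRA_ARGS; sourced last so it can add extra flags
EnvironmentFile=-/etc/default/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS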
To solve the incorrect node INTERNAL-IP problem, we can pass in the --node-ip flag and specify the correct IP address to use.
For my use case, the IP addresses are:
kubemaster1: 192.168.60.2
kubeworker1: 192.168.60.101
kubeworker2: 192.168.60.102
Run the following on each node before running the kubeadm init or kubeadm join commands, substituting that node's own IP address (the example below uses kubemaster1's):
$ sudo su
# echo 'KUBELET_EXTRA_ARGS="--node-ip=192.168.60.2"' >> /etc/default/kubelet
After which, you can initialise the cluster and add the worker nodes to it.
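If your cluster is already up, the same fix can be applied retroactively: add the line to /etc/default/kubelet on each node (again with that node's own IP) and restart the kubelet. A minimal sketch, assuming the systemd-managed kubelet set up by the Debian packages (example shown for kubeworker1):
# echo 'KUBELET_EXTRA_ARGS="--node-ip=192.168.60.101"' >> /etc/default/kubelet
# systemctl restart kubelet
The node object should pick up the new address shortly after the restart; if it does not, re-joining the node is the fallback.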
Now the nodes should be registered with the correct INTERNAL-IP.
root@kubemaster1:/home/vagrant# kubectl get nodes -o wide
NAME          STATUS     ROLES                  AGE     VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
kubemaster1   NotReady   control-plane,master   12m     v1.20.2   192.168.60.2     <none>        Ubuntu 16.04.7 LTS   4.4.0-197-generic   docker://20.10.2
kubeworker1   NotReady   <none>                 5m40s   v1.20.2   192.168.60.101   <none>        Ubuntu 16.04.7 LTS   4.4.0-197-generic   docker://20.10.2
kubeworker2   NotReady   <none>                 7s      v1.20.2   192.168.60.102   <none>        Ubuntu 16.04.7 LTS   4.4.0-197-generic   docker://20.10.2
What about the problem of the master node being unable to locate pods on the worker nodes?
root@kubemaster1:/home/vagrant# kubectl get po -o wide
NAME           READY   STATUS    RESTARTS   AGE   IP          NODE          NOMINATED NODE   READINESS GATES
cka-node-app   1/1     Running   0          46s   10.32.0.2   kubeworker1   <none>           <none>
root@kubemaster1:/home/vagrant# kubectl exec -it cka-node-app -- sh
#
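If you want to double-check that the kubelet on a node actually picked up the flag, one quick (if crude) way is to look for it on the running process's command line:
root@kubemaster1:/home/vagrant# ps aux | grep -o -- '--node-ip=[^ ]*'
--node-ip=192.168.60.2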
Thinking of getting your hands dirty and running your own cluster?
Check out my repo for provisioning the VMs and setting up a k8s cluster with an Ansible playbook.