Running k8s clusters on a local machine using Vagrant? Read on

Jian Hao
Jan 31, 2021

If you are setting up a k8s cluster using Vagrant and kubeadm, you may have encountered a problem where the master node is somehow unable to locate a pod that is running on a worker node within the same cluster.

vagrant@kubemaster1:~$ kubectl get po
NAME           READY   STATUS    RESTARTS   AGE   IP          NODE          NOMINATED NODE   READINESS GATES
cka-node-app   1/1     Running   0          84s   10.32.0.2   kubeworker2   <none>           <none>

vagrant@kubemaster1:~$ kubectl exec -it cka-node-app -- sh
error: unable to upgrade connection: pod does not exist

cka-node-app is definitely running on kubeworker2, but the master node is unable to locate it.

Running kubectl exec with the verbose flag -v=10 shows that the API server answers the request with 404 Not Found:

curl -k -v -XPOST  -H "X-Stream-Protocol-Version: v4.channel.k8s.io" -H "X-Stream-Protocol-Version: v3.channel.k8s.io" -H "X-Stream-Protocol-Version: v2.channel.k8s.io" -H "X-Stream-Protocol-Version: channel.k8s.io" -H "User-Agent: kubectl/v1.20.2 (linux/amd64) kubernetes/faecb19" 'https://192.168.60.2:6443/api/v1/namespaces/default/pods/cka-node-app/exec?command=%2Fbin%2Fbash&container=cka-node-app&stdin=true&stdout=true&tty=true'
I0130 07:14:48.012674 15693 round_trippers.go:445] POST https://192.168.60.2:6443/api/v1/namespaces/default/pods/cka-node-app/exec?command=%2Fbin%2Fbash&container=cka-node-app&stdin=true&stdout=true&tty=true 404 Not Found in 22 milliseconds

After some google-fu, I realised it could be due to the nodes being registered with an incorrect internal IP address.

vagrant@kubemaster1:~$ kubectl get nodes -o wide

NAME          STATUS     ROLES                  AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
kubemaster1   NotReady   control-plane,master   7m27s   v1.20.2   10.0.2.15     <none>        Ubuntu 16.04.7 LTS   4.4.0-197-generic   docker://20.10.2
kubeworker1   NotReady   <none>                 9s      v1.20.2   10.0.2.15     <none>        Ubuntu 16.04.7 LTS   4.4.0-197-generic   docker://20.10.2
kubeworker2   Ready      <none>                 2m33s   v1.20.2   10.0.2.15     <none>        Ubuntu 16.04.7 LTS   4.4.0-197-generic   docker://20.10.2

Indeed, all the nodes have their INTERNAL-IP registered as 10.0.2.15.
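
A quick way to spot this situation is to list each node's InternalIP and look for repeats. The following is a minimal sketch and not part of the original setup: the helper name duplicate_node_ips is made up for illustration, and it only assumes that its input consists of "<node name> <IP>" lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every IP address that is claimed by more
# than one node. Reads "<node-name> <internal-ip>" lines from stdin.
duplicate_node_ips() {
  awk 'NF >= 2 { seen[$2]++ }
       END { for (ip in seen) if (seen[ip] > 1) print ip }' | sort
}

# Possible use against a live cluster (commented out so the snippet
# can be pasted without a cluster at hand):
#   kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}' \
#     | duplicate_node_ips
```

Fed with the three nodes listed above, the helper prints 10.0.2.15; an empty result means every node has a distinct address.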

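Vagrant typically assigns two interfaces to every VM in addition to lo, here enp0s3 and enp0s8. The first, on which all hosts are assigned the same IP address 10.0.2.15, is for external traffic that gets NATed. The second carries the private network address (192.168.60.2 on kubemaster1) that the nodes can use to reach each other.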

vagrant@kubemaster1:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:ae:b0:fc:20:e8 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global enp0s3 <-------
       valid_lft forever preferred_lft forever
    inet6 fe80::ae:b0ff:fefc:20e8/64 scope link
       valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:ac:20:fc brd ff:ff:ff:ff:ff:ff
    inet 192.168.60.2/24 brd 192.168.60.255 scope global enp0s8
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:feac:20fc/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever

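When creating a cluster with kubeadm, the IP address of the first non-loopback interface is used as the INTERNAL-IP. In this case, that is enp0s3 with IP address 10.0.2.15. Because that address is identical on every VM, an exec request that the API server sends to 10.0.2.15 ends up at the master's own kubelet, which knows nothing about the pod.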
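
Since the address we actually want lives on a known interface, it can be read from the output of ip rather than typed by hand. Here is a sketch, assuming the private network sits on enp0s8 as in the output above. ipv4_of_iface is a hypothetical helper name, and it expects the one-line-per-address format printed by ip -o -4 addr show.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ip -o -4 addr show` output on stdin and
# print the IPv4 address (without the /prefix) of the interface in $1.
# In that format field 2 is the interface name, field 3 is "inet" and
# field 4 is "address/prefix".
ipv4_of_iface() {
  awk -v ifname="$1" '
    $2 == ifname && $3 == "inet" { split($4, parts, "/"); print parts[1]; exit }
  '
}

# Possible use on a node (commented out, needs the real interface):
#   ip -o -4 addr show | ipv4_of_iface enp0s8
```

The interface name may differ between base boxes, so it is worth checking ip a on one VM before relying on it.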

How can we fix this?

kubeadm ships with configuration for how systemd should run the kubelet.

When the kubeadm init or kubeadm join commands are run, kubeadm generates various configuration files for the kubelet to read.

For instance-specific configuration, kubeadm also writes an environment file to /var/lib/kubelet/kubeadm-flags.env, which contains a list of flags to pass to the kubelet when it starts.

root@kubemaster1:/home/vagrant# cat /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="--network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.2"

The kubelet also picks up any additional flags that users/admins set via KUBELET_EXTRA_ARGS in the /etc/default/kubelet file.

The /etc/default/kubelet file does not exist by default and has to be created manually.
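
To make the interplay of the two files concrete, here is a rough sketch of how the flags from both end up on the kubelet command line. It is a simplification: the real work is done by the systemd drop-in that kubeadm ships, which passes further variables as well, and effective_kubelet_args is a hypothetical helper, not a kubeadm command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the flags that the two environment files
# contribute, kubeadm-generated flags first, user-supplied flags last.
#   $1: kubeadm's file (default /var/lib/kubelet/kubeadm-flags.env)
#   $2: the user's file (default /etc/default/kubelet)
effective_kubelet_args() {
  local kubeadm_env="${1:-/var/lib/kubelet/kubeadm-flags.env}"
  local extra_env="${2:-/etc/default/kubelet}"
  (
    # Work in a subshell so nothing leaks into the caller, and start
    # from a clean slate in case the variables are already set.
    unset KUBELET_KUBEADM_ARGS KUBELET_EXTRA_ARGS
    # Both files hold plain KEY="value" lines, so they can be sourced.
    # A missing file is simply skipped.
    if [ -f "$kubeadm_env" ]; then . "$kubeadm_env"; fi
    if [ -f "$extra_env" ]; then . "$extra_env"; fi
    printf '%s\n' "${KUBELET_KUBEADM_ARGS:-}${KUBELET_EXTRA_ARGS:+ $KUBELET_EXTRA_ARGS}"
  )
}
```

Running effective_kubelet_args with no arguments on a node reads the two default paths, which makes it easy to confirm that an extra flag has actually been picked up from /etc/default/kubelet.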

To solve the incorrect node INTERNAL-IP problem, we can pass the --node-ip flag and specify the correct IP address to use.

For my use case, the IP addresses are:

kubemaster1: 192.168.60.2
kubeworker1: 192.168.60.101
kubeworker2: 192.168.60.102

Run the following on every node, substituting that node's own IP address, before running the kubeadm init or kubeadm join commands:

$ sudo su
# echo 'KUBELET_EXTRA_ARGS="--node-ip=192.168.60.2"' >> /etc/default/kubelet
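
The echo ... >> form appends a new line every time it runs, which matters if the command ends up in a provisioning script that is executed more than once. A hedged sketch of a re-runnable variant follows. set_node_ip is a hypothetical helper, its second argument exists only so the function can be tried on a scratch file, and it replaces any existing KUBELET_EXTRA_ARGS line rather than merging flags into it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: make sure the given file contains exactly one
# KUBELET_EXTRA_ARGS line, pointing the kubelet at the given node IP.
#   $1: IP address to register for this node
#   $2: file to write (default /etc/default/kubelet, needs root)
set_node_ip() {
  local ip="$1"
  local file="${2:-/etc/default/kubelet}"
  local tmp="${file}.tmp"

  touch "$file"
  # Keep every line except an earlier KUBELET_EXTRA_ARGS assignment.
  # grep exits non-zero when nothing is kept, which is fine here.
  grep -v '^KUBELET_EXTRA_ARGS=' "$file" > "$tmp" || true
  printf 'KUBELET_EXTRA_ARGS="--node-ip=%s"\n' "$ip" >> "$tmp"
  mv "$tmp" "$file"
}

# Example, run as root on kubeworker1 before kubeadm join:
#   set_node_ip 192.168.60.101
```

On a node that has already joined the cluster, the kubelet would most likely need a restart (for example with systemctl restart kubelet) before the new flag takes effect.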

After that, you can initialise the cluster and add nodes to it.

Now the nodes should be registered with the correct INTERNAL-IP.

root@kubemaster1:/home/vagrant# kubectl get nodes -o wide
NAME          STATUS     ROLES                  AGE     VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
kubemaster1   NotReady   control-plane,master   12m     v1.20.2   192.168.60.2     <none>        Ubuntu 16.04.7 LTS   4.4.0-197-generic   docker://20.10.2
kubeworker1   NotReady   <none>                 5m40s   v1.20.2   192.168.60.101   <none>        Ubuntu 16.04.7 LTS   4.4.0-197-generic   docker://20.10.2
kubeworker2   NotReady   <none>                 7s      v1.20.2   192.168.60.102   <none>        Ubuntu 16.04.7 LTS   4.4.0-197-generic   docker://20.10.2

What about the problem of the master node being unable to locate pods on worker nodes? With the correct INTERNAL-IP in place, kubectl exec now opens a shell:

root@kubemaster1:/home/vagrant# kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cka-node-app 1/1 Running 0 46s 10.32.0.2 kubeworker1 <none> <none>
root@kubemaster1:/home/vagrant# kubectl exec -it cka-node-app -- sh
#
