Hi,
We are running a mid-sized Rancher setup on top of an OpenNebula VM cluster, with Ubuntu 18.04.4 LTS on the worker VMs. Rancher is 2.3.5, with K8S 1.17.2-rancher1-2. Rancher itself runs in separate containers outside the K8S cluster. Networking is via flannel (canal) without "Project Network Isolation", which assigns pod addresses in the 10.42.X.Y range.
Intermittently (every few weeks) we see the following issue: in "kubectl get pods -o wide" (or in the Pod overview in Rancher) some pods have IP addresses from the Docker bridge, 172.17.X.Y, rather than the 10.42.X.Y addresses from canal. These 172.17.X.Y pods are then not reachable from pods running on other worker nodes. This causes all sorts of havoc, from unreachable apps, to the K8S metrics-server being unavailable (which makes Rancher unhappy), to even the K8S DNS breaking.
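In case it helps others reproduce or monitor this, a one-liner we could use to spot affected pods cluster-wide (a sketch; it assumes the IP is in column 7 of the default "-o wide" output, which holds for our kubectl version):

```shell
# List pods whose IP landed on the Docker bridge (172.17.0.0/16)
# instead of the canal pod range (10.42.0.0/16).
kubectl get pods --all-namespaces -o wide --no-headers \
  | awk '$7 ~ /^172\.17\./ { print $1 "/" $2 " -> " $7 }'
```

Running this periodically would at least let us catch the bad pods before DNS or the metrics-server start failing.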
We can recover simply by redeploying the affected pods, which then usually come back with a correct 10.42.X.Y address.
Any ideas on how to debug (or fix!) this are welcome. Could this be a race condition? If so, where? Is it a failure of the canal pod running on each node? Which logs should we look at?
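For reference, these are the checks we would know to run on an affected node (a sketch; the pod label k8s-app=canal, the container name kube-flannel, and the "kubelet" container name are assumptions based on a standard RKE canal install, so please correct us if the right places to look are elsewhere):

```shell
# CNI config that kubelet reads at pod-creation time. If this directory
# is empty or incomplete when kubelet comes up, new pods could plausibly
# fall back to the docker0 bridge (172.17.X.Y).
ls -l /etc/cni/net.d/

# Logs of the canal pod on the affected node, around the time the
# mis-addressed pod was started.
kubectl -n kube-system logs -l k8s-app=canal -c kube-flannel --tail=200

# On RKE, kubelet runs as a Docker container named "kubelet"; grep its
# logs for CNI / network-plugin errors.
docker logs --since 1h kubelet 2>&1 | grep -iE 'cni|networkplugin'
```

Is this the right set of places to look, or is there a better signal for "pod was wired to docker0 instead of CNI"?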
Yours,
Steffen