Docker Network interfaces are bouncing

We’ve had a cluster running for a couple of months on bare metal (Ubuntu 18.04) and have recently seen the network start ‘bouncing’.

Our cluster is arranged as three blades running the Rancher HA configuration, plus nine additional blades as workers. It was working fine, or at least we hadn’t noticed this problem before. It is quite noticeable now: health checks randomly and momentarily fail, connections between pods stall or drop, and connections to outside the cluster periodically hang for a moment.

I’ve googled for answers and see a couple of people reporting similar symptoms, but no answers. I’ve tried various network diagnostics and so far have come up empty.

While diagnosing, I’ve updated to Kubernetes 1.14.5 and Docker 18.09.7 without any impact on the problem. I have also tried rebooting individual blades, but have not (yet) tried rebooting the entire cluster.

The blades are lightly loaded, with a load average around 1. Most have 24 cores (some have 32), and all have 96 GB RAM and 1 TB RAID-1 storage, so this does not look like a resource constraint.

Any suggestions on what to try are greatly appreciated.

This is what I’m seeing in /var/log/syslog on one of the blades:

Aug 15 06:35:41 FLL01S07 kernel: [75848.255089] docker0: port 1(veth5dc0782) entered blocking state
Aug 15 06:35:41 FLL01S07 kernel: [75848.255093] docker0: port 1(veth5dc0782) entered disabled state
Aug 15 06:35:41 FLL01S07 kernel: [75848.255182] device veth5dc0782 entered promiscuous mode
Aug 15 06:35:41 FLL01S07 kernel: [75848.255320] IPv6: ADDRCONF(NETDEV_UP): veth5dc0782: link is not ready
Aug 15 06:35:41 FLL01S07 networkd-dispatcher[1078]: WARNING:Unknown index 845 seen, reloading interface list
Aug 15 06:35:41 FLL01S07 systemd-udevd[20295]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Aug 15 06:35:41 FLL01S07 systemd-udevd[20295]: Could not generate persistent MAC address for veth6e28cd1: No such file or directory
Aug 15 06:35:41 FLL01S07 systemd-udevd[20297]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Aug 15 06:35:41 FLL01S07 systemd-udevd[20297]: Could not generate persistent MAC address for veth5dc0782: No such file or directory
Aug 15 06:35:41 FLL01S07 containerd[1258]: time="2019-08-15T06:35:41.255225538-04:00" level=info msg="shim containerd-shim started" address="/containerd-shim/moby/8d4ef71171ac2d0fb21f52ad769ed37c6fa2d536ff09064577533a08989c1405/shim.sock" debug=false pid=20327
Aug 15 06:35:41 FLL01S07 kernel: [75848.487507] eth0: renamed from veth6e28cd1
Aug 15 06:35:41 FLL01S07 systemd-networkd[29798]: veth5dc0782: Gained carrier
Aug 15 06:35:41 FLL01S07 systemd-networkd[29798]: docker0: Gained carrier
Aug 15 06:35:41 FLL01S07 kernel: [75848.511465] IPv6: ADDRCONF(NETDEV_CHANGE): veth5dc0782: link becomes ready
Aug 15 06:35:41 FLL01S07 kernel: [75848.511520] docker0: port 1(veth5dc0782) entered blocking state
Aug 15 06:35:41 FLL01S07 kernel: [75848.511523] docker0: port 1(veth5dc0782) entered forwarding state
Aug 15 06:35:41 FLL01S07 containerd[1258]: time="2019-08-15T06:35:41.870759012-04:00" level=info msg="shim reaped" id=8d4ef71171ac2d0fb21f52ad769ed37c6fa2d536ff09064577533a08989c1405
Aug 15 06:35:41 FLL01S07 dockerd[1922]: time="2019-08-15T06:35:41.879945285-04:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Aug 15 06:35:41 FLL01S07 kernel: [75848.936083] veth6e28cd1: renamed from eth0
Aug 15 06:35:41 FLL01S07 systemd-networkd[29798]: veth5dc0782: Lost carrier
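
One thing the excerpt does show is a container shim starting and being reaped within about 0.6 seconds (06:35:41.255 to 06:35:41.870), at the same moment the veth pair comes up and loses carrier. To see how often that happens, a rough Python sketch like the one below could pair the containerd "shim started" and "shim reaped" messages and list containers that lived only a few seconds. The function names and the 5-second threshold are my own illustrative choices, and the patterns assume the containerd message format shown in the excerpt; this only narrows down what is churning, it is not a diagnosis.

```python
import re
from datetime import datetime

# Patterns assume the containerd log format seen in the excerpt.
# Both straight and curly quotes are tolerated, since pasted logs may contain either.
TIME_RE = re.compile(r'time=["“”]([^"“”]+)["“”]')
STARTED_RE = re.compile(r'shim containerd-shim started.*?/moby/([0-9a-f]{64})/')
REAPED_RE = re.compile(r'shim reaped.*?id=([0-9a-f]{64})')
STAMP_RE = re.compile(
    r'(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?(Z|[+-]\d{2}:\d{2})$'
)


def parse_time(stamp):
    """Parse an RFC 3339 timestamp, truncating nanoseconds to microseconds."""
    match = STAMP_RE.match(stamp)
    if not match:
        raise ValueError(f"unrecognized timestamp: {stamp!r}")
    base, frac, tz = match.groups()
    frac = (frac or "0")[:6].ljust(6, "0")  # datetime keeps at most 6 digits
    tz = "+00:00" if tz == "Z" else tz
    return datetime.fromisoformat(f"{base}.{frac}{tz}")


def short_lived_containers(lines, threshold_seconds=5.0):
    """Return (short_id, lifetime_seconds) for containers reaped within the threshold."""
    started = {}
    results = []
    for line in lines:
        stamp = TIME_RE.search(line)
        if not stamp:
            continue
        start = STARTED_RE.search(line)
        if start:
            started[start.group(1)] = parse_time(stamp.group(1))
            continue
        reaped = REAPED_RE.search(line)
        if reaped and reaped.group(1) in started:
            began = started.pop(reaped.group(1))
            lifetime = (parse_time(stamp.group(1)) - began).total_seconds()
            if lifetime <= threshold_seconds:
                results.append((reaped.group(1)[:12], lifetime))
    return results
```

To try it, open /var/log/syslog, pass the file object to short_lived_containers(), and print the returned pairs. If the same few containers keep appearing, their IDs should at least indicate which workload is being started and torn down on docker0.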