Hi all
I have a problem with my cluster networking.
At one point I tried to install Multus, and then I tried to remove it.
Now, on a daily basis, I lose all my flannel interface routes. For example, ip route show gives:
default via 192.168.30.254 dev enx00e04c680237 proto dhcp src 192.168.30.15 metric 100
192.168.30.0/24 dev enx00e04c680237 proto kernel scope link src 192.168.30.15 metric 100
192.168.30.1 dev enx00e04c680237 proto dhcp scope link src 192.168.30.15 metric 100
192.168.30.254 dev enx00e04c680237 proto dhcp scope link src 192.168.30.15 metric 100
192.168.40.0/24 via 192.168.30.1 dev enx00e04c680237 proto dhcp src 192.168.30.15 metric 100
If I restart the k3s-agent service, the cni0 and flannel.1 routes come back:
default via 192.168.30.254 dev enx00e04c680237 proto dhcp src 192.168.30.15 metric 100
10.42.0.0/24 dev cni0 proto kernel scope link src 10.42.0.1
10.42.10.0/24 via 10.42.10.0 dev flannel.1 onlink
10.42.11.0/24 via 10.42.11.0 dev flannel.1 onlink
10.42.12.0/24 via 10.42.12.0 dev flannel.1 onlink
10.42.13.0/24 via 10.42.13.0 dev flannel.1 onlink
10.42.14.0/24 via 10.42.14.0 dev flannel.1 onlink
192.168.30.0/24 dev enx00e04c680237 proto kernel scope link src 192.168.30.15 metric 100
192.168.30.1 dev enx00e04c680237 proto dhcp scope link src 192.168.30.15 metric 100
192.168.30.254 dev enx00e04c680237 proto dhcp scope link src 192.168.30.15 metric 100
192.168.40.0/24 via 192.168.30.1 dev enx00e04c680237 proto dhcp src 192.168.30.15 metric 100
Everything is then fine for roughly the next 24 hours. I'm assuming some daily scheduled job is causing this, but I don't know where to start looking. Does anyone have any ideas?
TIA
Daz
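One way to narrow down when the routes vanish is to check the routing table periodically and log a timestamp whenever the flannel routes are missing. The following is only a rough sketch: the function name has_flannel_routes is made up for illustration, and it assumes the flannel VXLAN device is called flannel.1, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: reads routing table text on stdin and succeeds
# (exit 0) if at least one route goes via the flannel.1 device.
has_flannel_routes() {
    grep -qE ' dev flannel\.1( |$)'
}
```

It could be used from cron or a shell loop, for example: ip route show | has_flannel_routes || echo "flannel routes missing at $(date)". Comparing those timestamps with syslog should show what happened just before the routes went away.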
I now think the 24-hour pattern was a red herring. Here is what actually shows up in syslog:
Aug 31 18:59:05 k8s-x86-1 systemd-networkd[658]: enx00e04c680237: Lost carrier
Aug 31 18:59:05 k8s-x86-1 systemd-networkd[658]: enx00e04c680237: DHCP lease lost
Aug 31 18:59:05 k8s-x86-1 systemd-networkd[658]: enx00e04c680237: DHCPv6 lease lost
Aug 31 18:59:05 k8s-x86-1 dbus-daemon[673]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.4' (uid=101 pid=658 comm="/lib/systemd/systemd-networkd " label="unconfined")
Aug 31 18:59:05 k8s-x86-1 systemd-timesyncd[608]: No network connectivity, watching for changes.
Aug 31 18:59:05 k8s-x86-1 systemd-networkd[658]: flannel.1: Link DOWN
Aug 31 18:59:05 k8s-x86-1 systemd-networkd[658]: flannel.1: Lost carrier
Aug 31 18:59:05 k8s-x86-1 systemd[1]: Starting Hostname Service...
Aug 31 18:59:05 k8s-x86-1 kernel: [735595.865208] usb 2-2: new SuperSpeed USB device number 8 using xhci_hcd
Aug 31 18:59:05 k8s-x86-1 dbus-daemon[673]: [system] Successfully activated service 'org.freedesktop.hostname1'
Aug 31 18:59:05 k8s-x86-1 systemd[1]: Started Hostname Service.
Aug 31 18:59:05 k8s-x86-1 systemd-hostnamed[2816384]: Hostname set to <k8s-x86-1> (static)
Aug 31 18:59:05 k8s-x86-1 kernel: [735595.885831] usb 2-2: New USB device found, idVendor=0bda, idProduct=8156, bcdDevice=31.00
Aug 31 18:59:05 k8s-x86-1 kernel: [735595.885835] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=6
Aug 31 18:59:05 k8s-x86-1 kernel: [735595.885837] usb 2-2: Product: USB 10/100/1G/2.5G LAN
Aug 31 18:59:05 k8s-x86-1 kernel: [735595.885839] usb 2-2: Manufacturer: Realtek
Aug 31 18:59:05 k8s-x86-1 kernel: [735595.885840] usb 2-2: SerialNumber: 001000001
Aug 31 18:59:05 k8s-x86-1 kernel: [735596.017913] usb 2-2: reset SuperSpeed USB device number 8 using xhci_hcd
Aug 31 18:59:05 k8s-x86-1 kernel: [735596.043304] r8152 2-2:1.0: load rtl8156b-2 v1 04/15/21 successfully
Aug 31 18:59:05 k8s-x86-1 networkd-dispatcher[684]: WARNING:Unknown index 48 seen, reloading interface list
Aug 31 18:59:05 k8s-x86-1 systemd-timesyncd[608]: Network configuration changed, trying to establish connection.
Aug 31 18:59:05 k8s-x86-1 kernel: [735596.078368] r8152 2-2:1.0 eth0: v1.12.13
Aug 31 18:59:05 k8s-x86-1 systemd-udevd[2816380]: Using default interface naming scheme 'v249'.
Aug 31 18:59:05 k8s-x86-1 kernel: [735596.100702] r8152 2-2:1.0 enx00e04c680237: renamed from eth0
Aug 31 18:59:05 k8s-x86-1 systemd-networkd[658]: eth0: Interface name change detected, renamed to enx00e04c680237.
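The reason I installed Multus in the first place was that I had added a 2.5 Gb USB NIC.
From the syslog, it looks like the USB NIC is dropping out: the device re-enumerates on the USB bus, the interface loses carrier and its DHCP lease, flannel.1 goes down with it, and the flannel routes are gone until k3s-agent is restarted.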
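To see how often the NIC actually drops out, the journal can be filtered for carrier-loss events on that interface. This is just a sketch: carrier_losses is a made-up helper name, and it assumes the classic syslog timestamp layout shown above (month, day, time as the first three fields).

```shell
#!/bin/sh
# Hypothetical helper: reads log lines on stdin and prints the timestamp
# (first three fields) of every "Lost carrier" event for the interface
# named in $1.
carrier_losses() {
    iface="$1"
    grep -F "${iface}: Lost carrier" | awk '{ print $1, $2, $3 }'
}
```

For example: journalctl -u systemd-networkd --since "7 days ago" | carrier_losses enx00e04c680237. If the timestamps line up with the times the flannel routes disappeared, that would support the USB dropout theory.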
I have now added a script /etc/NetworkManager/dispatcher.d/10-flannel-dispatcher.sh containing:
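#!/bin/sh
# NetworkManager dispatcher script: $1 is the interface, $2 is the action.
DEVICE="${1}"
STATE="${2}"
# Restart the k3s agent when the USB NIC comes back up, since a restart restores the flannel routes.
if [ "$DEVICE" = "enx00e04c680237" ]; then
    if [ "$STATE" = "up" ]; then
        systemctl restart k3s-agent.service
    fi
fi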
Hopefully, that will sort it out.
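One caveat: the syslog above shows systemd-networkd and networkd-dispatcher handling this interface, so a NetworkManager dispatcher script may never be called on this host. If that turns out to be the case, a networkd-dispatcher hook might be an alternative. The sketch below is untested on this cluster; the file name 10-restart-k3s-agent and the helper is_target_iface are made up for illustration. It assumes networkd-dispatcher exports the interface name in the IFACE environment variable and runs scripts placed in /etc/networkd-dispatcher/routable.d/ when an interface becomes routable.

```shell
#!/bin/sh
# Hypothetical location: /etc/networkd-dispatcher/routable.d/10-restart-k3s-agent
# Assumption: networkd-dispatcher sets IFACE to the interface that changed state.

TARGET_IFACE="enx00e04c680237"

# Succeeds only when the given interface name is the USB NIC.
is_target_iface() {
    [ "$1" = "$TARGET_IFACE" ]
}

# Do nothing unless IFACE is set (i.e. we were invoked by the dispatcher)
# and it names the USB NIC.
if [ -n "${IFACE:-}" ] && is_target_iface "$IFACE"; then
    systemctl restart k3s-agent.service
fi
```

As far as I know, networkd-dispatcher only runs scripts that are executable and owned by root, so the file would need chown root:root and chmod 755. Either way this only works around the symptom; the USB NIC resetting itself is still the underlying problem.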