Looking for a piece of advice on troubleshooting an issue with Rancher + Calico on a bare-metal Ubuntu 20.04 node.
Here is the issue.
We have a few Rancher (2.5.7) clusters built on top of Ubuntu 20.04 running on KVM (Proxmox) VMs.
All clusters have a similar setup and use Calico as the CNI. Everything works like a charm.
The other day we decided to add a bare metal Ubuntu 20.04 node to one of the clusters.
And everything went pretty well - Rancher shows the new node as healthy and k8s schedules pods there - however,
it turned out that pods on that node can’t access the service network (10.43.x.x). Specifically, they can’t access DNS at 10.43.0.10.
If I run “nc 10.43.0.10 53” on a VM Ubuntu host, it connects to the DNS pod through the service network with no issues. If I try the same on the bare-metal host, the connection hangs.
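To make the check repeatable on both hosts, the same probe can be sketched in Python. This is only a rough equivalent of the nc test above, not anything Rancher or Calico provides: the function name can_connect and the 3 second timeout are my own choices, and a timeout is reported the same way as a refused connection.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers the "hang" case (timeout), refused connections and missing routes.
        return False
```

Calling can_connect("10.43.0.10", 53) on the VM and on the BM should mirror what I see with nc, but with a bounded wait instead of a hang.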
The Ubuntu setup is exactly the same for VM and BM. All VMs and BMs are on the same VLAN. For the sake of the experiment we configured only one NIC on the BM, with no fancy stuff like bonding.
calicoctl shows all the BGP peers Established.
I created a fresh cluster and reproduced the same problem: a cluster built only of VMs works with no issues, and each VM (and the pods on it) can connect to the service network. Once I add a BM node, that node has issues connecting to the service network.
My guess is that the issue is somewhere in iptables, but I’m not sure how to troubleshoot WHY iptables would be different on BM and on VM.
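One way I could compare the two hosts is to save the output of iptables-save on each and diff the rules after stripping packet counters and comments. The sketch below assumes the dumps are already available as text; the names normalize and diff_rules are hypothetical helpers of mine. I would expect some noise in the result, since Calico's per-interface chains presumably differ between nodes anyway.

```python
import re
from typing import List, Set, Tuple

# Packet/byte counters such as "[12:3456]" change constantly and are not part of a rule.
COUNTERS = re.compile(r"\[\d+:\d+\]")


def normalize(dump: str) -> Set[str]:
    """Turn iptables-save text into a set of 'table: rule' strings."""
    rules: Set[str] = set()
    table = ""
    for raw in dump.splitlines():
        line = COUNTERS.sub("", raw).strip()
        # Skip blank lines, "# Generated by ..." comments and COMMIT markers.
        if not line or line.startswith("#") or line == "COMMIT":
            continue
        if line.startswith("*"):
            # "*nat", "*filter", ... starts a new table section.
            table = line[1:]
            continue
        rules.add(f"{table}: {line}")
    return rules


def diff_rules(vm_dump: str, bm_dump: str) -> Tuple[List[str], List[str]]:
    """Return (rules only on the VM, rules only on the BM), each sorted."""
    vm, bm = normalize(vm_dump), normalize(bm_dump)
    return sorted(vm - bm), sorted(bm - vm)
```

With the two dumps read from files, diff_rules(vm_text, bm_text) would list what is missing on each side, for example whether the KUBE-SERVICES entries for 10.43.0.10 exist on the BM at all.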
I will greatly appreciate any piece of advice.
Thank you.
DK