I’ve been using Rancher for several months and I’m very happy with it. Until today I used Rancher to deploy small containers, with all the services for each stack deployed on a single host.
Now I’d like to connect an LDAP stack to other containers located on different hosts. During my tests I discovered that the overlay network, and specifically the IP addresses, are unreachable between hosts!
Here is my context:
Host (A) with 2 containers 1/ and 2/
Host (B) with 1 container (LDAP) 3/
1/, 2/ and 3/ each have an IP address assigned by the Rancher overlay network in 10.42.x.x
From 1/ I can ping 2/ (and the reverse works as well), but from 1/ or 2/ I get a timeout when I try to ping 3/
It’s a really simple case; in my production environment I have 5 active hosts with 15 containers, and I can reproduce the problem on all containers.
How can I troubleshoot a network issue with Rancher?
I changed my firewall rules to permit these ports, and according to your link it seems that I now have another problem.
Here is an extract of my netfilter nat table:
> Chain CATTLE_PREROUTING (1 references)
num target prot opt source destination
1 MARK tcp -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL tcp dpt:1639 MARK set 0x668a0
2 DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL tcp dpt:1639 to:10.99.139.37:639
3 MARK tcp -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL tcp dpt:1389 MARK set 0x668a0
4 DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL tcp dpt:1389 to:10.99.139.37:389
5 DNAT tcp -- 10.99.0.0/16 10.99.0.1 tcp dpt:53 to:169.254.169.250
6 DNAT udp -- 10.99.0.0/16 10.99.0.1 udp dpt:53 to:169.254.169.250
7 MARK all -- !10.99.0.0/16 169.254.169.250 MAC 02:BD:E1:97:06:AD MARK set 0x272e
8 MARK all -- !10.99.0.0/16 169.254.169.250 MAC 02:BD:E1:6E:0D:43 MARK set 0x21f1d
9 MARK all -- !10.99.0.0/16 169.254.169.250 MAC 02:BD:E1:E5:E2:68 MARK set 0x1c1f8
10 MARK all -- !10.99.0.0/16 169.254.169.250 MAC 02:BD:E1:6A:BA:AB MARK set 0x23e79
So these rules are missing:
2 DNAT udp -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL udp dpt:4500 to:10.42.179.222:4500
3 DNAT udp -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL udp dpt:500 to:10.42.179.222:500
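To check for this on several hosts without reading each listing by eye, a small script can scan the chain output for the two DNAT rules. The following is only a rough sketch: it assumes the listing is in the numbered format shown above (for example as printed by `iptables -t nat -L CATTLE_PREROUTING -n --line-numbers`), and the function names and the `SAMPLE` text are made up for illustration. It only reports what is missing; it does not recreate any rule.

```python
import re

# Matches numbered DNAT lines such as:
#   "2 DNAT udp -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL udp dpt:4500 to:10.42.179.222:4500"
# The "opt" column is skipped rather than matched, so it may be "--" or a typographic dash.
DNAT_RULE = re.compile(
    r"^\s*\d+\s+DNAT\s+(?P<proto>tcp|udp)\s.*?\bdpt:(?P<dport>\d+)\s+to:(?P<target>\S+)"
)

# The two UDP ports from the rules reported missing in this post.
REQUIRED_IPSEC_PORTS = {("udp", 500), ("udp", 4500)}


def dnat_rules(listing):
    """Return {(protocol, destination port): DNAT target} for every DNAT line found."""
    rules = {}
    for line in listing.splitlines():
        match = DNAT_RULE.match(line)
        if match:
            rules[(match["proto"], int(match["dport"]))] = match["target"]
    return rules


def missing_ipsec_rules(listing):
    """Return the required (protocol, port) pairs that have no DNAT rule in the listing."""
    return sorted(REQUIRED_IPSEC_PORTS - set(dnat_rules(listing)))


# Hypothetical sample: two of the DNAT lines from the extract above.
SAMPLE = """\
Chain CATTLE_PREROUTING (1 references)
num target prot opt source destination
2 DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL tcp dpt:1639 to:10.99.139.37:639
4 DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL tcp dpt:1389 to:10.99.139.37:389
"""

if __name__ == "__main__":
    for proto, port in missing_ipsec_rules(SAMPLE):
        print(f"no DNAT rule for {proto} port {port}")
```

Piping the real listing from each host into `missing_ipsec_rules` would show which hosts lack the UDP 500 and 4500 forwards.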
I tried restarting the containers “rancher/agent:v1.0.2” and “rancher/agent-instance:v0.8.3” (the latter is listed as: “/etc/init.d/agent-in”, created 4 days ago, Up 4 minutes, 0.0.0.0:500->500/udp, 0.0.0.0:4500->4500/udp), but the rules are still missing…
Sniffing the network, specifically UDP port 500, I can see requests from the other hosts but no reply coming from my host, which confirms that UDP port 500 isn’t being forwarded to the container.
How can I regenerate the missing rules without adding them by hand?