[HELP] Firewall NAT rules on host1 replaced by host2

I have two hosts, both with:

  • Docker: 17.03.0-ce
  • Kernel Version: 3.19.0-80-generic
  • Operating System: Ubuntu 14.04.5 LTS

Also:

  • Rancher v1.5.2

First I added host1 to a new Rancher Cattle environment and added some stacks with exposed ports, which resulted in a bunch of NAT rules consistent with the ports displayed on the Ports tab of the host1 page in the UI. So far so good.

When I add host2 to the environment and add a stack with its own ports, the host1 NAT rules are replaced by the rules that belong to _host2_. Host2’s table also correctly holds its own rules. :fearful: Needless to say, host1 services are no longer accessible. When I stop the Docker daemon on host2, the host1 rules return to the correct ones. I even tried restarting the host2 server.

I also tried restarting network-services and scheduler in the Infrastructure stack.

I really don’t have much more information and I can’t mess around too much with these servers since they hold critical services. I manage another four Rancher environments and I never had this issue before.

Any help in pinpointing the cause of this strange behavior would be appreciated.

@rsilva4 Can you please share more details? Rancher programs various rules depending on services and ports published by them. Do the two stacks have the same ports?

The output of the iptables-save command from all the hosts would be very useful.
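
For anyone comparing such output against the Ports tab, here is a minimal sketch that lists the DNAT rules in the nat table of an iptables-save dump. The sample dump, the chain name EXAMPLE_CHAIN and the addresses are made up for illustration and are not taken from either host; the function names are hypothetical.

```python
import shlex

# Made-up sample in iptables-save format; chain name and addresses are illustrative only.
SAMPLE = """\
*nat
:PREROUTING ACCEPT [0:0]
-A PREROUTING -j EXAMPLE_CHAIN
-A EXAMPLE_CHAIN -p tcp -m tcp --dport 8080 -j DNAT --to-destination 10.42.1.2:80
COMMIT
*filter
-A FORWARD -p tcp -m tcp --dport 8080 -j ACCEPT
COMMIT
"""


def value_after(tokens, flag):
    """Return the token following `flag`, or None if the flag is absent."""
    return tokens[tokens.index(flag) + 1] if flag in tokens[:-1] else None


def dnat_rules(dump):
    """Return (chain, dport, destination) for each DNAT rule in the *nat table."""
    rules = []
    table = None
    for raw in dump.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # skip blanks and "# Generated by ..." comments
        if line.startswith("*"):
            table = line[1:]              # table header such as "*nat" or "*filter"
            continue
        if table != "nat":
            continue
        tokens = shlex.split(line)        # handles quoted --comment values
        if tokens[0].startswith("["):
            tokens = tokens[1:]           # drop counters written by `iptables-save -c`
        if len(tokens) < 2 or tokens[0] != "-A":
            continue                      # chain declarations and COMMIT
        if value_after(tokens, "-j") == "DNAT":
            rules.append((tokens[1],
                          value_after(tokens, "--dport"),
                          value_after(tokens, "--to-destination")))
    return rules


print(dnat_rules(SAMPLE))
```

Feeding it the real dump from each host (read from a file) should make it easier to see at a glance which published ports each host actually forwards.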

I will try to provide more details. In my last attempt I just added the second host to the environment, with no stacks or services associated with it. The result was an immediate change to the NAT rules on the first host, and since there were no services on host2, basically all the rules disappeared. I can share my current iptables for host1 right now (https://transfer.sh/xPLt9/host1_iptables), but not for host2; I will try to run some tests after service hours.

I will also try to collect logs from the Rancher agent and server. Do you suggest collecting anything else during my tests?

How were the hosts created? Is there a VM template or similar involved? (i.e. do they both have the same Rancher uuid on disk because one is a copy of the other?)

The hosts are bare metal. I can’t be sure how the original OS installation was done (the servers are years old), but I’m pretty confident that they were installed from an Ubuntu ISO.

Both hosts are under Puppet configuration management and share common configs, but I haven’t found anything there that is relevant to this problem.

How can I check for the same Rancher uuid on disk?

I tested this issue again with the steps described below:

  • Exported iptables from both hosts before adding host2 to Rancher.
  • Docker was already running on host2, so I only needed to add the Rancher Agent.
  • Added the Rancher Agent. The node showed up in the UI.
  • Exported iptables again.
  • Also collected logs for:
    • Rancher Agent
    • Syslog (docker daemon)
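
To compare the two exports from a host, something like the sketch below can help. It assumes both dumps were produced with iptables-save; the function names are hypothetical. It ignores comment lines and packet/byte counters, so only added or removed rules are reported.

```python
import difflib
import re

# Counters such as "[12:720]" change constantly and would only add noise to the diff.
_COUNTERS = re.compile(r"\[\d+:\d+\]")


def normalize(dump):
    """Return the dump as a list of lines without comments, blanks or live counters."""
    lines = []
    for raw in dump.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                          # timestamps in comments differ on every export
        lines.append(_COUNTERS.sub("[0:0]", line))
    return lines


def rule_changes(before, after):
    """Return lines prefixed with '- ' (removed) or '+ ' (added) between two dumps."""
    diff = difflib.ndiff(normalize(before), normalize(after))
    return [entry for entry in diff if entry.startswith(("- ", "+ "))]
```

Usage would be along the lines of `rule_changes(open("before.txt").read(), open("after.txt").read())`, where the two file names are placeholders for the exported dumps.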

@vincent @leodotcloud I’m more comfortable sharing those via PM. Sending now.

Can you point me to the code that affects the host iptables? I was looking around the rancher and cattle repos but I can’t find anything…

As mentioned in the private chat, @rsilva4, your custom iptables rules are dropping the traffic before it hits the CATTLE_FORWARD chain.
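
For others hitting something similar, a rough way to check for this kind of ordering problem in an iptables-save dump is sketched below. It assumes the jump to CATTLE_FORWARD lives in the FORWARD chain of the filter table, which is an assumption on my part and not confirmed above, and the function name is hypothetical. It only looks at rules directly in that chain; it does not follow jumps into other custom chains or consider the chain policy.

```python
import shlex


def blockers_before_jump(dump, chain="FORWARD", target="CATTLE_FORWARD"):
    """Return DROP/REJECT rules listed in `chain` ahead of the jump to `target`.

    Returns None if the filter table has no jump from `chain` to `target`.
    """
    blockers = []
    table = None
    for raw in dump.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("*"):
            table = line[1:]                      # e.g. "*filter"
            continue
        if line.startswith("["):
            line = line.split("] ", 1)[-1]        # drop counters from `iptables-save -c`
        if table != "filter" or not line.startswith("-A "):
            continue
        tokens = shlex.split(line)
        if len(tokens) < 2 or tokens[1] != chain or "-j" not in tokens[:-1]:
            continue
        verdict = tokens[tokens.index("-j") + 1]
        if verdict == target:
            return blockers                       # rules are evaluated in order; stop here
        if verdict in ("DROP", "REJECT"):
            blockers.append(line)
    return None
```

Any rule it returns is evaluated before the jump, so traffic matching it never reaches the CATTLE_FORWARD chain.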