Too much data transfer in a short time can cause Rancher networking (IPSec) to fail

Recently I observed that transferring too much data in a short time through the load balancer can cause the Rancher network to fail. The details are below:

  1. Add 3 hosts (A, B, C) to Rancher.
  2. Deploy a private Docker registry on host A and a load balancer on host B. Add a selector rule to access the registry through the load balancer.
  3. Push a large image (~1 GB).
  4. Pull the image from host C. The download was very fast because of the fast network connection between C and A: 300 MB can be downloaded within 3-4 s. The Rancher network went down before the image finished downloading.

After I cancelled the image download, the network recovered after a few minutes.
Downloading the image over a slower connection (3-4 MB/s) does not cause any problem.
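
To put the two observations side by side, here is a rough back-of-the-envelope calculation in Python. It only uses the figures quoted above (300 MB in 3-4 s versus 3-4 MB/s); the helper name `throughput_mb_per_s` is made up for illustration, and 1 MB is assumed to equal 8 Mbit.

```python
def throughput_mb_per_s(size_mb: float, seconds: float) -> float:
    """Average transfer rate in MB/s for size_mb transferred in seconds."""
    return size_mb / seconds


# Fast pull that took the network down: 300 MB in 3-4 s.
fast_low = throughput_mb_per_s(300, 4)   # 75.0 MB/s
fast_high = throughput_mb_per_s(300, 3)  # 100.0 MB/s

# Slow pull that worked without problems: 3-4 MB/s.
slow_low, slow_high = 3.0, 4.0

# The same fast rates in Mbit/s, assuming 1 MB = 8 Mbit.
fast_mbit = (fast_low * 8, fast_high * 8)

# How many times faster was the failing transfer?
ratio_min = fast_low / slow_high
ratio_max = fast_high / slow_low

print(f"fast pull: {fast_low:.0f}-{fast_high:.0f} MB/s "
      f"({fast_mbit[0]:.0f}-{fast_mbit[1]:.0f} Mbit/s)")
print(f"about {ratio_min:.1f}x to {ratio_max:.1f}x the slow rate")
```

So the failing pull ran at roughly 75-100 MB/s (600-800 Mbit/s), somewhere between 19 and 33 times the rate that worked.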

Does anyone have any idea about this issue?

Update: Rancher server version 1.5.1

@truongdo What do you mean by fail? Is this consistently reproducible? Where are your hosts running? What kind of network connectivity do the hosts have?

I meant that the health check containers are restarted and the network connections between other containers are also broken (they cannot ping each other).
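
If it helps to pin down when the connectivity drops, a small ping logger like the sketch below could be left running in a container on the managed network while the image is being pulled. This is not something from my setup: the function names and the target IPs are placeholders, and it assumes a Linux `ping` that accepts `-c` (count) and `-W` (timeout in seconds).

```python
import subprocess
import time
from typing import Callable, Dict, Iterable, List, Tuple


def ping_once(ip: str) -> bool:
    """Return True if one ICMP echo to ip succeeds (assumes Linux ping)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def probe(targets: Iterable[str],
          ping_fn: Callable[[str], bool] = ping_once) -> Dict[str, bool]:
    """Ping every target once and map each IP to its reachability."""
    return {ip: ping_fn(ip) for ip in targets}


def watch(targets: List[str], rounds: int, interval: float,
          ping_fn: Callable[[str], bool] = ping_once
          ) -> List[Tuple[float, Dict[str, bool]]]:
    """Probe the targets `rounds` times, printing and collecting each result."""
    log = []
    for _ in range(rounds):
        stamp = time.time()
        result = probe(targets, ping_fn)
        print(time.strftime("%H:%M:%S", time.localtime(stamp)), result)
        log.append((stamp, result))
        time.sleep(interval)
    return log


# Example call (the container IPs are hypothetical placeholders):
# watch(["10.42.0.10", "10.42.0.11"], rounds=300, interval=1.0)
```

Comparing the timestamps of the first failed ping with the start of the pull would show how quickly the network breaks and how long it takes to come back.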

I can't reproduce it right now; I'm not sure whether it was somehow fixed in Rancher v1.5.3 (I've upgraded the Rancher server).
The hosts are Vultr VPS instances, and since they are deployed in the same region (Japan), they have quite a fast network connection (a 10 gigabit redundant network, according to the Vultr website).