Rancher NFS - I/O errors between managed and host network

Summary :

  • NFS server (NFSv4) running in Docker on the “managed” network (IP like 10.42.x.x).
  • Rancher driver (Rancher 1.6.14 & rancher-nfs 0.8.5, or Rancher 1.6.16 & rancher-nfs 0.9) running on the “host” network.
  • Stack with a Rancher NFS volume, as usual.

Everything works fine if the Rancher NFS volume is mounted on only one host. If I launch a stack with the Rancher NFS volume on another host, I get I/O errors on read, alternately and randomly on each stack.

Detailed description :

Server 1 : Rancher Server

  • Type : bare metal
  • Os : Rancher v1.0.4 or v1.3.0
  • DB : external MariaDB in Docker

Server 2 : Rancher Data

  • Type : bare metal
  • Os : Rancher v1.0.4 or v1.3.0 (with Kernel headers) or Ubuntu 16.04
  • Rancher agent
  • NFS Server in docker using “managed” network and fixed ip (10.42.x.x)

Server 3 : Rancher HOST 01

  • Type : bare metal
  • Os : Rancher v1.0.4 or v1.3.0
  • Rancher agent
  • Simple stack : App01 with alpine and nginx : with Rancher NFS Volume mounted

Server 4 : Rancher HOST 02

  • Type : bare metal
  • Os : Rancher v1.0.4 or v1.3.0
  • Rancher agent
  • Simple stack : App02 with alpine and nginx : with Rancher NFS Volume mounted

Steps to reproduce

Open a shell on App01 and on App02, then read the content of a single-line file in a loop on each. I/O errors appear randomly.
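For anyone who wants to reproduce this in a repeatable way, the read loop could look roughly like the sketch below. It is not the exact command used above, only an illustration: the function name `read_loop` and its parameters are made up for this example, and it assumes Python is available in the container. It reopens the file on every iteration and records the errno of each failed read.

```python
import time


def read_loop(path, iterations=100, delay=0.0):
    """Read the first line of `path` repeatedly.

    Returns (successful_reads, failures), where failures is a list of
    (iteration_index, errno) tuples for every read that raised OSError.
    """
    ok = 0
    failures = []
    for i in range(iterations):
        try:
            # Reopen the file each time so every iteration opens and reads it again.
            with open(path, "r") as fh:
                fh.readline()
            ok += 1
        except OSError as exc:
            # An I/O error on the NFS mount would typically show up here (e.g. EIO).
            failures.append((i, exc.errno))
        if delay:
            time.sleep(delay)
    return ok, failures
```

Running something like `read_loop("/data/test.txt", iterations=1000, delay=0.1)` in App01 and App02 at the same time (the path is hypothetical, use a file on the NFS volume) should give a count of failed reads per container that can be compared between the “managed” and “host” network setups.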

Important:

If I run the NFS server container on the “host” network and point the Rancher driver at the real IP of Rancher Data, it seems to work well.

Does anybody else have the same issue?

Thanks.

Hello,

I have exactly the same issue here.

When my NFS server is on the managed network, the rancher-nfs “plugin” seems to behave inconsistently (I/O errors). Once I have more than ~20 mounts (some shared by 3 or 4 containers), this can lead to the Rancher server “freezing”.

I had to go back to running the NFS server container on the host network, with the Rancher DNS service discovery enabled and the [service].[stack] hostname of the NFS service in the rancher-nfs configuration.

I’d rather have my NFS server protected inside the Rancher private network but, as far as my tests go, it doesn’t seem to be stable that way here either.