Rancher-nfs: struggling to get it working

Hi,

I recently upgraded to 1.2 and have been trying to get the new rancher-nfs working. I’m loosely following the instructions here but using rancher-nfs instead of convoy-nfs: http://rancher.com/setting-shared-volumes-convoy-nfs/

docker-nfs is working and the rancher-nfs driver is running, but it won’t let me mount a volume. I see the following error in the nfs-driver logs:

12/2/2016 12:50:46 PM time="2016-12-02T12:50:46Z" level=info msg=mount.request name="test_volume"
12/2/2016 12:50:46 PM time="2016-12-02T12:50:46Z" level=error msg="Failed to mount test_volume: Failed mount -o proto=tcp,port=2049,nfsvers=4 10.42.208.44://test_volume /var/lib/rancher/volumes/rancher-nfs/test_volume: mount.nfs: Failed to find 'tcp' protocol"
12/2/2016 12:50:46 PM time="2016-12-02T12:50:46Z" level=error msg=mount.response error="Failed mount -o proto=tcp,port=2049,nfsvers=4 10.42.208.44://test_volume /var/lib/rancher/volumes/rancher-nfs/test_volume: mount.nfs: Failed to find 'tcp' protocol"
12/2/2016 12:50:46 PM time="2016-12-02T12:50:46Z" level=info msg=unmount.request name="test_volume"
12/2/2016 12:50:46 PM time="2016-12-02T12:50:46Z" level=info msg=unmount.response

The volume remains in the detached state in the UI.

My configuration:

    image: rancher/storage-nfs:v0.6.0
    environment:
      MOUNT_DIR: /
      MOUNT_OPTS: proto=tcp,port=2049,nfsvers=4
      NFS_SERVER: 10.42.208.44

I have also left out the mount options because I saw the same error message about the “missing” tcp protocol. Make sure you can mount the NFS share on the target hosts by hand first.
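
For the "mount it by hand first" check, here is a minimal sketch that only prints the command to run on a host; it does not mount anything itself. The helper name `build_mount_cmd` and the scratch directory `/mnt/nfstest` are assumptions for illustration; the server address, export directory and option come from the first post.

```shell
#!/bin/sh
# Sketch: print (not run) a manual NFS mount command built from the same
# settings the rancher-nfs stack uses. build_mount_cmd is a hypothetical helper.
build_mount_cmd() {
  server=$1; export_dir=$2; opts=$3; target=$4
  if [ -n "$opts" ]; then
    # With mount options, e.g. nfsvers=4
    printf 'mount -t nfs -o %s %s:%s %s\n' "$opts" "$server" "$export_dir" "$target"
  else
    # Without options, letting mount pick its defaults
    printf 'mount -t nfs %s:%s %s\n' "$server" "$export_dir" "$target"
  fi
}

# Values from the first post; /mnt/nfstest is an assumed scratch directory.
build_mount_cmd 10.42.208.44 / nfsvers=4 /mnt/nfstest
build_mount_cmd 10.42.208.44 / "" /mnt/nfstest
```

If the printed command fails when run as root on the host, the problem is most likely on the NFS server or network side rather than in Rancher.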

Is there any documentation for the Rancher-NFS service? http://rancher.com/setting-shared-volumes-convoy-nfs/ looks too old and may no longer be relevant.

http://docs.rancher.com/rancher/v1.2/en/rancher-services/storage-service/

I found that, but that’s really more a generic document about the Storage Services, and only briefly mentions Rancher NFS. I was looking for something more like the old Convoy NFS instructions at https://github.com/rancher/rancher/wiki/Setup:-convoy-nfs .

Maybe Rancher NFS doesn’t need much, but a paragraph or two describing what Rancher NFS is, why we would want to use it, and whether it needs NFSv4, plus a couple of quick examples, might help to sell it.

For me it worked by setting MOUNT_OPTS to nfsvers=4.

Can’t get it to work either. Could someone please post short instructions?
Otherwise I think the AWS solution for NFS is the right way to go.

Hi,

The cpuguy83/nfs-image mentioned in the convoy guide won’t work out of the box, so it is better to go for an alternative.

In brief you need to make sure your /etc/exports file is correctly configured and permissions are correctly set.

The following should work with an NFS server on Ubuntu Xenial, assuming /exports is the directory you want to share.

$ sudo mkdir /exports
$ sudo chown nobody:nogroup /exports

Then edit /etc/exports, adding the following:

/exports          *(rw,sync,no_root_squash,no_subtree_check,fsid=0)

Make sure your firewall is not blocking ports 111/udp and 2049/tcp.

As mentioned before, when mounting your NFS share in Rancher, just put “/” as your mount directory and “nfsvers=4” in the options, omitting “proto=tcp,port=2049”, and it should work.
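
To illustrate the option change, here is a small sketch that drops proto= and port= from a comma-separated option string and keeps everything else. `strip_proto_port` is a hypothetical helper, not part of rancher-nfs; the input string is the MOUNT_OPTS value from the first post.

```shell
#!/bin/sh
# Hypothetical helper: remove proto= and port= entries from a
# comma-separated mount option string, keeping the remaining options.
strip_proto_port() {
  printf '%s\n' "$1" \
    | tr ',' '\n' \
    | grep -Ev '^(proto|port)=' \
    | paste -sd, -
}

# MOUNT_OPTS value from the first post in this thread.
MOUNT_OPTS=$(strip_proto_port "proto=tcp,port=2049,nfsvers=4")
echo "$MOUNT_OPTS"   # prints: nfsvers=4
```

The resulting value is what would go into the MOUNT_OPTS field of the stack, with MOUNT_DIR left as “/”.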


I’m having similar issues and published my investigation under issue #6938.

Some items that might be worth sharing:

  • You can manually mount a NAS from within the NFS container.
  • It appears NFS v4 is the only working version. NFS v2 and v3 are not, as of December 2016, mountable.
  • I have been unable to redirect the NFS container’s shared volume: /var/lib/rancher/volumes.
  • Rancher does not allow me to use storage-pools created in the Rancher UI; it creates them automatically.
  • Automatically created storage-pools fail because the drive it is altering is protected.
  • Automatically created storage-pools fail because the folder does not exist.
  • I can manually create the folder but Rancher will have read-only permissions.

Is anybody else having a problem starting up the stack with the following errors in the nfs-driver service containers?

12/14/2016 1:34:12 PM time="2016-12-14T19:34:12Z" level=info msg=Running
12/14/2016 1:34:12 PM time="2016-12-14T19:34:12Z" level=info msg=Starting
12/14/2016 1:34:12 PM time="2016-12-14T19:34:12Z" level=fatal msg="Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?"
12/14/2016 1:34:28 PM time="2016-12-14T19:34:28Z" level=info msg=Running
12/14/2016 1:34:28 PM time="2016-12-14T19:34:28Z" level=info msg=Starting
12/14/2016 1:34:28 PM time="2016-12-14T19:34:28Z" level=fatal msg="Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?"
12/14/2016 1:34:58 PM time="2016-12-14T19:34:58Z" level=info msg=Running
12/14/2016 1:34:58 PM time="2016-12-14T19:34:58Z" level=info msg=Starting

Thanks for the example NFS server config, @NikFerios. I personally hate fsid=0, as that’s what presents your mount directory as / rather than the path you’re exporting. Also +1 on no_root_squash.

Thanks from me as well. After reading over your configuration, @NikFerios, I was able to see that I was missing the rw flag in my own configuration. What a simple thing to miss… oops ._.
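
Since a missing flag is easy to overlook, here is a minimal sketch for checking a single /etc/exports line for the options discussed above. `export_has_opt` is a hypothetical helper, and it assumes the simple Linux form with one client spec followed by options in parentheses, as in the example posted earlier.

```shell
#!/bin/sh
# export_has_opt LINE OPT: succeed if the parenthesised option list in an
# /etc/exports line contains OPT as a whole option (hypothetical helper).
export_has_opt() {
  # Pull out the text between the parentheses.
  opts=$(printf '%s\n' "$1" | sed -n 's/.*(\(.*\)).*/\1/p')
  # Wrap in commas so only whole options match.
  case ",$opts," in
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example line taken from the server config posted earlier in this thread.
line='/exports          *(rw,sync,no_root_squash,no_subtree_check,fsid=0)'
for opt in rw no_root_squash fsid=0; do
  if export_has_opt "$line" "$opt"; then
    echo "$opt: present"
  else
    echo "$opt: missing"
  fi
done
```

This only inspects the text of one line; the server still has to re-read its exports before a change takes effect.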

I updated and closed the ticket associated with the issue I was experiencing; here’s the link if anyone is interested or in need.

REF: https://github.com/rancher/rancher/issues/6938

@ragaar I haven’t gotten NFSv3 to work, after many hours of troubleshooting. Perhaps rancher-nfs doesn’t support NFSv3, as you say in your ticket. NFSv3 is pretty widely used, so Rancher-NFS should support it.

Do you have some good evidence that NFSv3 isn’t supported by Rancher-NFS?

In the Rancher UI, when selecting rancher-nfs from the community registry, the description says “Supported for NFS v4”.

Aside from the catalog description I have no supporting evidence.

I didn’t investigate the source code, and I would be thrilled to see an example someone got working because, like you said, v3 is so widely used.

I had been using NFSv3 prior to the Rancher 1.2 release and it was working beautifully!

Hey @Stefan_Lasiewski, can you mount the NFSv3 share successfully on the host outside of Rancher? If so, can I get some more info to help troubleshoot this? Namely:

  1. The exact command you run on the host to successfully mount an NFS share manually
  2. The relevant output from cat /proc/mounts on the host after you’ve successfully mounted a volume. You can grep for it using the dir name you mounted; it should look similar to:
    $ mount -t nfs 192.168.42.186:/var/nfs/two /root/mnttest
    $ grep mnttest /proc/mounts
    192.168.42.186:/var/nfs/two /root/mnttest nfs4 rw,relatime,vers=4.0,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.42.183,local_lock=none,addr=192.168.42.186 0 0
  3. Can you share the contents of the NFS server settings? Specifically the contents of /etc/exports. Make sure to mask any sensitive info.
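
To read the negotiated NFS version and transport out of such a /proc/mounts line, something like the following sketch may help. `nfs_mount_opt` is a hypothetical helper; the sample line is the one shown in item 2 above.

```shell
#!/bin/sh
# nfs_mount_opt LINE KEY: print the value of KEY from the options field
# (4th column) of a /proc/mounts line. Hypothetical helper.
nfs_mount_opt() {
  printf '%s\n' "$1" | awk -v key="$2" '{
    n = split($4, opts, ",")
    for (i = 1; i <= n; i++) {
      eq = index(opts[i], "=")
      # Compare the whole key so "proto" does not match "mountproto".
      if (eq > 0 && substr(opts[i], 1, eq - 1) == key) {
        print substr(opts[i], eq + 1)
        exit
      }
    }
  }'
}

line='192.168.42.186:/var/nfs/two /root/mnttest nfs4 rw,relatime,vers=4.0,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.42.183,local_lock=none,addr=192.168.42.186 0 0'
echo "vers=$(nfs_mount_opt "$line" vers) proto=$(nfs_mount_opt "$line" proto)"
# prints: vers=4.0 proto=tcp
```

On a real host the line would come from grep against /proc/mounts rather than a hard-coded string.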

Thanks for replying, @aemneina. Yes, I can mount the NFSv3 share on the hosts, outside of Rancher.

  • The exact command you run on the host to successfully mount an NFS share manually

It’s in /etc/fstab:

    nfs.example.org:/vol/rancher     /mnt/nfs/rancher       nfs     rw      0       0

And this works too:

# mount -t nfs nfs.example.org:/vol/rancher /mnt/nfs/ranchertest

  • The relevant output from cat /proc/mounts on the host after you’ve successfully mounted a volume.

[root@node4 ~]# grep /mnt/nfs/rancher /proc/mounts
nfs.example.org:/vol/rancher /mnt/nfs/rancher nfs rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.100.100,mountvers=3,mountport=4046,mountproto=udp,local_lock=none,addr=192.168.100.100 0 0
  • Can you share the contents of the NFS server settings? Specifically the contents of /etc/exports. Make sure to mask any sensitive info.

The NFS server has some fairly standard settings:

# cat exports
/vol/rancher    -sec=sys,rw=192.168.100.101,anon=0,nosuid

I see that the Rancher-NFS image is based on Ubuntu 16.04, and proto=udp is a supported option according to the manpage, so I’m a little confused about why I would get errors like these:

root@25967d14845c:/# mount --verbose -t nfs nfs.example.org:/vol/rancher /mnt/nfs/rancher -o proto=udp
mount.nfs: Failed to find 'udp' protocol
root@25967d14845c:/# mount --verbose -t nfs nfs.example.org:/vol/rancher /mnt/nfs/rancher -o nfsvers=3,proto=udp
mount.nfs: Failed to find 'udp' protocol
root@25967d14845c:/# mount --verbose -t nfs nfs.example.org:/vol/rancher /mnt/nfs/rancher -o nfsvers=3,proto=tcp
mount.nfs: Failed to find 'tcp' protocol

I can reach ports 2049 and 111 from the container:

root@25967d14845c:/# nc -z -w1 192.168.100.100 2049 ; echo $?
0
root@25967d14845c:/# nc -z -w1 192.168.100.100 111 ; echo $?
0

Additionally, this error is suspicious, and I don’t know what to make of it:

root@25967d14845c:/# rpcinfo -p 192.168.100.100
192.168.100.100: RPC: Unknown host
root@25967d14845c:/#

root@25967d14845c:/# showmount -e 192.168.100.100
clnt_create: RPC: Unknown host
root@25967d14845c:/#
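
One thing that may be worth ruling out, and this is an assumption rather than anything confirmed in this thread: the “Failed to find 'tcp' protocol” wording suggests a name lookup for tcp/udp is failing inside the container, which could happen if the protocols database (/etc/protocols, provided by the netbase package on Ubuntu) is missing from the image. A minimal sketch of that check, with `has_protocol` as a hypothetical helper:

```shell
#!/bin/sh
# has_protocol NAME [FILE]: succeed if FILE (default /etc/protocols) has an
# entry whose first column is NAME. Hypothetical helper; whether a missing
# entry explains the mount.nfs error is an unconfirmed assumption.
has_protocol() {
  file=${2:-/etc/protocols}
  [ -r "$file" ] || return 1
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }' "$file"
}

for p in tcp udp; do
  if has_protocol "$p"; then
    echo "$p: found in /etc/protocols"
  else
    echo "$p: not found in /etc/protocols"
  fi
done
```

If both entries are present inside the container, this theory can be dropped and the cause lies elsewhere.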

Do you pass options to the nfs driver when you fire it up?

I can see when you manually mount you get both proto=tcp and mountproto=udp as options. Does trying to mount, within the container, with those options produce any joy?

We don’t pass many options on the host. The defaults seem to work fine.

I seem to get the same errors when using proto=tcp and mountproto=udp.

Also, I’ve tried clientaddr=IP.of.the.host, because sometimes mount uses something like clientip=10.42.x.y.


It would be good if we had at least a recipe for tuning nfs4 properly; as it is, some operations are not authorized on the server.

@argent-smith, what operations are failing for you? What is your NFS server config?
I’m using /export *(rw,async,no_root_squash) and haven’t noticed any issues with that.