nfs-client-provisioner: random inability to mount volumes

Hi,

I've deployed the nfs-client-provisioner and it usually works fine, but sometimes when I redeploy a pod I get the following error and have no way to deploy a new pod:

MountVolume.SetUp failed for volume "pvc-dba9311c-a7c4-11e8-b39a-00505685234f" : mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs la-6pnasvmnfs02.internal.ieeeglobalspec.com:/k8s_vols/qt_cluster_max_test/zabbix-zabbix-4-web-nginx-nfs-pvc-dba9311c-a7c4-11e8-b39a-00505685234f /opt/rke/var/lib/kubelet/pods/46471c56-a7cc-11e8-b39a-00505685234f/volumes/kubernetes.io~nfs/pvc-dba9311c-a7c4-11e8-b39a-00505685234f
Output: mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
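For what it's worth, the two fixes the error message suggests can be checked on the node roughly like this. This is only a sketch: the commands assume a systemd-based distro with nfs-utils installed, and on RancherOS the service names and init system differ, so adapt accordingly.

```shell
# Sketch, assuming a systemd-based node with nfs-utils installed.
# Check whether statd is running; the kubelet mount fails when it is not.
pgrep rpc.statd || echo "rpc.statd is not running"

# Start it (and enable it across reboots) so remote NFS locking works.
sudo systemctl enable --now rpc-statd
```

The alternative the error offers is mounting with -o nolock, which keeps locks local to the client instead of requiring statd.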

On the node, running mount -t nfs and grepping for the volume shows the following:

la-6pnasvmnfs02.internal.ieeeglobalspec.com:/k8s_vols/qt_cluster_max_test/zabbix-zabbix-4-web-nginx-nfs-pvc-dba9311c-a7c4-11e8-b39a-00505685234f on /opt/rke/var/lib/kubelet/pods/ddacae69-a7c4-11e8-b39a-00505685234f/volumes/kubernetes.io~nfs/pvc-dba9311c-a7c4-11e8-b39a-00505685234f type nfs4 (rw,relatime,vers=4.0,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.85.175.19,local_lock=none,addr=10.241.42.10)
la-6pnasvmnfs02.internal.ieeeglobalspec.com:/k8s_vols/qt_cluster_max_test/zabbix-zabbix-4-web-nginx-nfs-pvc-dba9311c-a7c4-11e8-b39a-00505685234f on /opt/rke/var/lib/kubelet/pods/ddacae69-a7c4-11e8-b39a-00505685234f/volumes/kubernetes.io~nfs/pvc-dba9311c-a7c4-11e8-b39a-00505685234f type nfs4 (rw,relatime,vers=4.0,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.85.175.19,local_lock=none,addr=10.241.42.10)
la-6pnasvmnfs02.internal.ieeeglobalspec.com:/k8s_vols/qt_cluster_max_test/zabbix-zabbix-4-web-nginx-nfs-pvc-dba9311c-a7c4-11e8-b39a-00505685234f/etc-ssl-nginx on /opt/rke/var/lib/kubelet/pods/ddacae69-a7c4-11e8-b39a-00505685234f/volume-subpaths/pvc-dba9311c-a7c4-11e8-b39a-00505685234f/zabbix-4-web-nginx/1 type nfs4 (rw,relatime,vers=4.0,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.85.175.19,local_lock=none,addr=10.241.42.10)
la-6pnasvmnfs02.internal.ieeeglobalspec.com:/k8s_vols/qt_cluster_max_test/zabbix-zabbix-4-web-nginx-nfs-pvc-dba9311c-a7c4-11e8-b39a-00505685234f/etc-ssl-nginx on /opt/rke/var/lib/kubelet/pods/ddacae69-a7c4-11e8-b39a-00505685234f/volume-subpaths/pvc-dba9311c-a7c4-11e8-b39a-00505685234f/zabbix-4-web-nginx/1 type nfs4 (rw,relatime,vers=4.0,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.85.175.19,local_lock=none,addr=10.241.42.10)

I can cleanly unmount those and remove the pod without a problem.

If I try to mount the volume manually using the command in the error, I get:

sudo mount -t nfs la-6pnasvmnfs02.internal.ieeeglobalspec.com:/k8s_vols/qt_cluster_max_test/zabbix-zabbix-4-web-nginx-nfs-pvc-dba9311c-a7c4-11e8-b39a-00505685234f /opt/rke/var/lib/kubelet/pods/46471c56-a7cc-11e8-b39a-00505685234f/volumes/kubernetes.io~nfs/pvc-dba9311c-a7c4-11e8-b39a-00505685234f
mount: mounting la-6pnasvmnfs02.internal.ieeeglobalspec.com:/k8s_vols/qt_cluster_max_test/zabbix-zabbix-4-web-nginx-nfs-pvc-dba9311c-a7c4-11e8-b39a-00505685234f on /opt/rke/var/lib/kubelet/pods/46471c56-a7cc-11e8-b39a-00505685234f/volumes/kubernetes.io~nfs/pvc-dba9311c-a7c4-11e8-b39a-00505685234f failed: Permission denied

Any ideas? Other NFS volumes are working fine on this host. This is Rancher 2.0.7 running RancherOS on VMware, with a NetApp filer serving the NFS exports.

It looks like this happens when you bring up a second pod that needs to mount the same volume, even when the PVC access mode is ReadWriteMany. The NFS export has oplocks disabled.
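If starting statd on every node isn't practical, one workaround is to put nolock in the StorageClass mountOptions so every PV the provisioner creates is mounted with local locking. The sketch below is an assumption based on the chart's defaults: the class name and provisioner string are placeholders that need to match your actual deployment, and Kubernetes only honors mountOptions on dynamically provisioned PVs in reasonably recent versions.

```yaml
# Sketch of a StorageClass that mounts NFS PVs with local locking.
# The name and provisioner string below are assumptions; match them to
# the values your nfs-client-provisioner release actually uses.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client
provisioner: cluster.local/nfs-client-provisioner
mountOptions:
  - nolock      # keep locks local; avoids the rpc.statd requirement
```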

How did you install nfs-client-provisioner in 2.0? I ask because the UI doesn't provide fields for the required hostname and export path parameters, and I see no way to access Helm via the CLI. If I can get the provisioner installed, I'll be happy to check whether I hit the same issue as you.

I used the Helm version of nfs-client-provisioner. If you look at the chart's info details, it lists the fields you need to fill in; you have to add the answers manually.

One hint: when you get to the storageClass options, the answer keys need to be camelCase, whereas the directions show them in lower case. So storageClass.whatever, not storageclass.whatever.
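For anyone doing this from the CLI instead of the Rancher UI, the same answers map onto --set flags. This is a sketch using Helm 2 syntax (current at the time): the server and path values are placeholders for your own export, and the storageClass keys are the camelCase forms mentioned above.

```shell
# Sketch: installing the chart from the CLI with the camelCase answer keys.
# nfs.server and nfs.path are placeholders for your own NFS export.
helm install stable/nfs-client-provisioner \
  --name nfs-client \
  --set nfs.server=nfs.example.com \
  --set nfs.path=/exported/path \
  --set storageClass.name=nfs-client
```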