Convoy NFS stuck initializing w/ AWS EFS


#1

I’m relatively new to the Rancher ecosystem, but I have a three-node test cluster running on AWS and I’m trying to get convoy-nfs working w/ EFS. The nodes are all running the latest Amazon Linux AMI.

I saw the other topic related to this, but the discussion moved to GitHub, didn’t offer me any solutions, and appeared to be about the convoy command-line tool.

I’m simply trying to start convoy-nfs from the catalog. I’ve verified that I can mount the NFS share directly on one of my Rancher nodes.

Here are the logs from one of the convoy_nfs containers:

8/1/2016 10:19:34 AM Waiting for metadata
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=info msg="Execing [/usr/bin/nsenter --mount=/proc/6982/ns/mnt -F -- /var/lib/docker/devicemapper/mnt/7cf2e6672717fb1180964e523a84c48631dd8b64045cc34d40156f527686badc/rootfs/var/lib/rancher/convoy-agent/share-mnt --stage2 /var/lib/rancher/convoy/convoy-nfs-e71ce7eb-3a88-4e07-b1e6-7c4d8c01de6c -- /launch volume-agent-nfs-internal 6982]"
8/1/2016 10:19:35 AM Registering convoy socket at /var/run/convoy-convoy-nfs.sock
8/1/2016 10:19:35 AM Mounting at: /var/lib/rancher/convoy/convoy-nfs-e71ce7eb-3a88-4e07-b1e6-7c4d8c01de6c/mnt
8/1/2016 10:19:35 AM Mounting nfs. Command: mount -t nfs -o vers=4.1 us-east-1b.fs-4b26e702.efs.us-east-1.amazonaws.com:/ /var/lib/rancher/convoy/convoy-nfs-e71ce7eb-3a88-4e07-b1e6-7c4d8c01de6c/mnt
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=info msg="Listening for health checks on 0.0.0.0:10241/healthcheck"
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=info msg="Got: root /var/lib/rancher/convoy/convoy-nfs-e71ce7eb-3a88-4e07-b1e6-7c4d8c01de6c"
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=info msg="Got: drivers [vfs]"
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=info msg="Got: driver-opts [vfs.path=/var/lib/rancher/convoy/convoy-nfs-e71ce7eb-3a88-4e07-b1e6-7c4d8c01de6c/mnt]"
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=info msg="Got: ignore-docker-delete true"
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=info msg="Got: create-on-docker-mount true"
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=info msg="Launching convoy with args: [--socket=/host/var/run/convoy-convoy-nfs.sock daemon --root=/var/lib/rancher/convoy/convoy-nfs-e71ce7eb-3a88-4e07-b1e6-7c4d8c01de6c --drivers=vfs --driver-opts=vfs.path=/var/lib/rancher/convoy/convoy-nfs-e71ce7eb-3a88-4e07-b1e6-7c4d8c01de6c/mnt --ignore-docker-delete=true --create-on-docker-mount=true]"
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Creating config at /var/lib/rancher/convoy/convoy-nfs-e71ce7eb-3a88-4e07-b1e6-7c4d8c01de6c" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg= driver=vfs driver_opts=map[vfs.path:/var/lib/rancher/convoy/convoy-nfs-e71ce7eb-3a88-4e07-b1e6-7c4d8c01de6c/mnt] event=init pkg=daemon reason=prepare root="/var/lib/rancher/convoy/convoy-nfs-e71ce7eb-3a88-4e07-b1e6-7c4d8c01de6c"
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg= driver=vfs event=init pkg=daemon reason=complete
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering POST, /volumes/create" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering POST, /volumes/mount" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering POST, /volumes/umount" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering POST, /snapshots/create" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering POST, /backups/create" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering DELETE, /volumes/" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering DELETE, /snapshots/" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering DELETE, /backups" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering GET, /snapshots/" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering GET, /backups/list" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering GET, /backups/inspect" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering GET, /info" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering GET, /volumes/list" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering GET, /volumes/" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering plugin handler POST, /VolumeDriver.List" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering plugin handler POST, /Plugin.Activate" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering plugin handler POST, /VolumeDriver.Create" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering plugin handler POST, /VolumeDriver.Remove" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering plugin handler POST, /VolumeDriver.Mount" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering plugin handler POST, /VolumeDriver.Unmount" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering plugin handler POST, /VolumeDriver.Path" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=debug msg="Registering plugin handler POST, /VolumeDriver.Get" pkg=daemon
8/1/2016 10:19:35 AM time="2016-08-01T14:19:35Z" level=warning msg="Remove previous sockfile at /host/var/run/convoy-convoy-nfs.sock" pkg=daemon

And here are the logs from the storage pool container:

8/1/2016 10:19:33 AM Waiting for metadata
8/1/2016 10:19:33 AM time="2016-08-01T14:19:33Z" level=info msg="Listening for health checks on 0.0.0.0:10241/healthcheck"
8/1/2016 10:19:33 AM time="2016-08-01T14:19:33Z" level=info msg="Socket file: /host/var/run/convoy-convoy-nfs.sock"
8/1/2016 10:19:33 AM time="2016-08-01T14:19:33Z" level=info msg="Initializing event router" workerCount=10
8/1/2016 10:19:33 AM time="2016-08-01T14:19:33Z" level=info msg="Connection established"
8/1/2016 10:19:38 AM time="2016-08-01T14:19:38Z" level=debug msg="storagepool event [6b3d20f5-2954-4a19-8c46-38337f733cdd 6acdfda0-4552-4a7f-afd0-bcb6bad88838 65a06fc3-ae40-49ef-9c3b-f1e2568a9fb5]"

Am I missing a step here? Do I have to install something on the node before using convoy-nfs?

Thanks!


#2

Sorry, this is exactly the same issue as the one mentioned in the other post and in this GitHub issue. I commented there as well.

I’d post a link to the issue, but since my account is new, it won’t let me.

Issue #96


#3

Hi @gmeans

Even in this situation, can you check whether creating a container with a convoy-nfs volume works?
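
A check along those lines could look like the sketch below. It is only an illustration: the driver name `convoy-nfs` is assumed to match the stack name, the volume name and the `busybox` image are placeholders, and the `convoy_smoke_test` function is hypothetical. It relies on `create-on-docker-mount` being true (as the logs above show) so that the volume is created when the container mounts it.

```shell
# Hypothetical smoke test for a Docker volume driver.
# Assumptions: the driver is registered as "convoy-nfs" and the
# busybox image can be pulled on the host.
DOCKER="${DOCKER:-docker}"

convoy_smoke_test() {
    driver="${1:-convoy-nfs}"          # assumed driver name
    volume="${2:-convoy-smoke-test}"   # placeholder volume name

    # Start a throwaway container that mounts the volume through the
    # driver, writes a file to it, and reads the file back.
    $DOCKER run --rm --volume-driver "$driver" -v "$volume":/data busybox \
        sh -c 'echo ok > /data/smoke.txt && cat /data/smoke.txt'
}
```

Running `convoy_smoke_test convoy-nfs testvol` on one of the hosts should either print the file contents or surface the driver error directly, which is often more informative than the service state in the UI.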


#4

Hi @yasker,

I ended up switching to the Convoy-EFS service instead of Convoy-NFS. That has worked fine so far.

If I can find some time I can look at testing it again in a test environment.