[SOLVED] "invalid argument" error when attaching rancher-nfs storage

I’m having a problem bringing up a service with attached storage provided by the rancher-nfs driver. I’m 90% sure the problem is actually on the file server side (a NetApp appliance that I don’t administer), but since I have a well-constrained test case I’d like to discuss with the community to make sure I’m not barking up the wrong tree.

To reproduce the issue I’m seeing, follow these steps:

  1. Build a simple container that has a volume directive in it:

    $ cat Dockerfile
    FROM ubuntu:16.04
    RUN mkdir -p /mnt/rancher-nfs
    VOLUME /mnt/rancher-nfs
    $ docker build -t sometag .
    $ docker run -it --rm sometag
    root@c7c7a81b46ff:/# ll /mnt/rancher-nfs
    total 0
    drwxr-xr-x 2 root root 6 Nov 15 01:43 ./
    drwxr-xr-x 1 root root 24 Nov 15 01:34 ../

Observe that the container’s /mnt/rancher-nfs volume is owned by root:root.

  2. Launch a Rancher service using this image, with a rancher-nfs volume mapped to /mnt/rancher-nfs:

    $ cat docker-compose.yml
    version: '2'

    services:
      ubuntu:
        image: somerepo:sometag
        tty: true
        volume_driver: rancher-nfs
        volumes:
        - foovolume:/mnt/rancher-nfs

    volumes:
      foovolume:
        driver: rancher-nfs
        external: false

    $ rancher-compose up -d

Open a shell in one of the deployed containers and observe that /mnt/rancher-nfs is mounted to the NFS export, and that the /mnt/rancher-nfs directory is owned by root:root.
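One caveat when checking ownership from a shell: the names that ls prints are looked up in the container’s own /etc/passwd and /etc/group, so it can be worth looking at the raw numeric IDs as well (ls -ln does this too). Here is a minimal Python sketch of the same check, assuming a Python interpreter is available wherever you run it; the helper name numeric_owner is just something I made up for illustration.

```python
import os

def numeric_owner(path):
    """Return the (uid, gid) stored for `path`, without any name lookup."""
    st = os.stat(path)
    return st.st_uid, st.st_gid

if __name__ == "__main__":
    # Path from the example above; fall back to "." if it does not exist here.
    target = "/mnt/rancher-nfs" if os.path.exists("/mnt/rancher-nfs") else "."
    print(numeric_owner(target))
```

For the root-owned volume above, I would expect both numbers to be 0.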

So far, so good. But now let’s make one simple change: let’s make /mnt/rancher-nfs owned by a non-root user:

    $ cat Dockerfile
    FROM ubuntu:16.04
    # Note the added chown
    RUN mkdir -p /mnt/rancher-nfs && chown www-data /mnt/rancher-nfs
    VOLUME /mnt/rancher-nfs

Rebuild the image, push it to the registry, and tell the service to upgrade (or tear down the old one and start a new one; either way works).

In this case, the rancher-nfs driver generates an error when it tries to mount /mnt/rancher-nfs (see the attached screenshot), and the service fails to start.

Here is the error I get in the Rancher console. I should note I’m using Rancher v1.6.2.

[screenshot: error shown in the Rancher console]

I believe the root cause of this issue is that the NetApp is configured not to allow a chown of the /mnt/rancher-nfs mountpoint (after it is mounted) to the www-data user, most likely because the UID of www-data in the container doesn’t match what the filer thinks the www-data account should be (and/or the filer doesn’t have the UID of www-data in its own user map).

OK, so it looks like this is the result of disparate passwd/group namespaces between the container, host, and file server environments.

Within the container, there is an /etc/passwd and /etc/group. So let’s say within the container you have user “someuser” with uid 128 and group “users” with gid 100. Your container runs the application as “someuser” and in the Dockerfile you have a pair of lines like this:

    RUN mkdir -p /var/lib/data && chown someuser:users /var/lib/data && chmod 775 /var/lib/data
    VOLUME /var/lib/data

So /var/lib/data in the container’s filesystem is owned by UID 128 and GID 100.

You use Rancher to attach a Rancher-NFS volume to /var/lib/data. When Rancher-NFS starts the container, it first mounts the NFS path at /var/lib/data, and then it runs a chown (and a chmod?) so that the resulting mountpoint matches what was in the container’s filesystem before the NFS path was mounted.

So what happens when that chown command occurs?

The userland chown process in the container uses the chown(2) system call. This system call takes UIDs and GIDs, not names. So it first resolves the names (“someuser”, “users”) into IDs (128, 100) and then makes the chown(2) system call.
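To make the name-to-ID step concrete, here is a minimal Python sketch of the lookup that happens before chown(2) is called. The passwd and group contents are made up to match the example above (someuser = 128, users = 100), and the helper names are hypothetical; real tools go through the C library’s lookup functions rather than parsing the files by hand.

```python
# Hypothetical container-side account databases (passwd/group file format).
CONTAINER_PASSWD = """\
root:x:0:0:root:/root:/bin/bash
someuser:x:128:100::/home/someuser:/bin/bash
"""
CONTAINER_GROUP = """\
root:x:0:
users:x:100:
"""

def name_to_id(db_text, name):
    """Return the numeric ID (third field) for `name` in passwd/group-format text."""
    for line in db_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 3 and fields[0] == name:
            return int(fields[2])
    raise KeyError(name)

def resolve_owner(user, group):
    """Mimic the lookups a userland chown does before invoking chown(2)."""
    return name_to_id(CONTAINER_PASSWD, user), name_to_id(CONTAINER_GROUP, group)

uid, gid = resolve_owner("someuser", "users")
print(uid, gid)  # the numbers, not the names, are what reach the kernel
```

The point is that by the time the kernel sees the request, the names “someuser” and “users” are gone and only 128 and 100 remain.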

The kernel runs this system call and routes it to the NFSv4 handler. NFSv4’s chown code uses names, not numbers! So it has to take uid 128 and gid 100 and turn them into names. It does this inside the host’s context, not the container’s. Since these are low-numbered UIDs and GIDs, they probably map to system accounts. The resulting packet that goes over the wire to the file server will have something like “chown(libvirt-dnsmasq, kvm)” in it. You can see this if you use tcpdump or wireshark to monitor the traffic going to the NFS file server.

Now, on the fileserver side, that system needs to translate those names back into UIDs and GIDs and perform the actual operation. If your file server doesn’t have a mapping for system accounts like “libvirt-dnsmasq” or “kvm”, it will kick back an error, which the kernel raises back up to chown as an “invalid argument” error: the username (or group) is the invalid argument in question.
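The three translations can be sketched as a small Python simulation. This is only a toy model of the behavior I observed, not the kernel’s or the filer’s actual code; the tables, the numbers, and the function name chown_via_nfs4 are all made up for illustration.

```python
import errno

# Hypothetical tables; the names and numbers are illustrative only.
CONTAINER_USERS = {"someuser": 128}      # name -> uid inside the container
HOST_USERS = {128: "libvirt-dnsmasq"}    # uid -> name on the Rancher host
SERVER_USERS = {"root": 0}               # name -> uid on the file server

def chown_via_nfs4(user, container, host, server):
    """Follow one owner name through the three translations described above."""
    uid = container[user]                # 1. userland chown: name -> uid
    wire_name = host[uid]                # 2. NFSv4 client on the host: uid -> name
    if wire_name not in server:          # 3. file server: name -> uid, or reject
        raise OSError(errno.EINVAL, "server has no mapping for " + repr(wire_name))
    return wire_name, server[wire_name]

try:
    chown_via_nfs4("someuser", CONTAINER_USERS, HOST_USERS, SERVER_USERS)
except OSError as exc:
    print("chown failed:", exc)

# Adding the host's name to the server's table makes the same call succeed.
patched_server = {**SERVER_USERS, "libvirt-dnsmasq": 128}
print(chown_via_nfs4("someuser", CONTAINER_USERS, HOST_USERS, patched_server))
```

Note that the container’s name for UID 128 never reaches the server in this model; only the host’s name does.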

Ultimately, the way we solved this problem was really ugly. We had to create users on the NFS file server with name<->ID mappings that matched those on the Rancher host, not those that appeared in the container. It worked, but it makes troubleshooting painful, because when you examine the NFS export, you see that all the files are owned by weird system accounts rather than account names you might recognize from within your containers like “elasticsearch” and “rabbitmq”.

I doubt there is anything that Rancher can actually do to fix this problem. It’s just working the way that NFS in the kernel is designed to work.

If you have the flexibility to build your own containers from scratch and will be using Rancher-NFS, I suggest choosing a base image that matches the OS used by your Rancher hosts. That will help to ensure that you get the same UIDs and GIDs in the container as you do on the Rancher host. You can then extend those same names and IDs out to your file server, and everything should match and be a lot easier to maintain.
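If you want to check how far apart a container image and a host actually are, something like the following Python sketch could compare the two account databases. The sample passwd contents and the function names are made up for illustration; in practice you would feed it the real /etc/passwd (or /etc/group) text from each side.

```python
def id_to_name(db_text):
    """Parse passwd- or group-format text into {numeric_id: name}."""
    table = {}
    for line in db_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 3 and fields[2].isdigit():
            table.setdefault(int(fields[2]), fields[0])
    return table

def find_mismatches(container_text, host_text):
    """List (id, container_name, host_name) where the two sides disagree.

    host_name is None when the host has no entry for that ID.
    """
    container, host = id_to_name(container_text), id_to_name(host_text)
    return [
        (num, name, host.get(num))
        for num, name in sorted(container.items())
        if host.get(num) != name
    ]

# Made-up sample data standing in for each side's /etc/passwd.
CONTAINER_PASSWD = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "someuser:x:128:100::/home/someuser:/bin/bash\n"
)
HOST_PASSWD = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "libvirt-dnsmasq:x:128:136::/var/lib/libvirt/dnsmasq:/bin/false\n"
)

for num, in_container, on_host in find_mismatches(CONTAINER_PASSWD, HOST_PASSWD):
    print(num, in_container, on_host)
```

Any ID this reports is one where files written by the container will show up on the file server under the host’s name for that ID, not the container’s.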