ZFS on RancherOS troubles

Fresh install on a local HDD, but I’m booting via iPXE. I see ROS is mounting my HDD for persistence, though. First, I noticed that after following the docs for installing ZFS, my pool wasn’t being imported and mounted on reboot. I’m not sure if that’s an error or not, but I fixed it using the startup script in my cloud-config with the following code:

write_files:
  # /opt/rancher/bin/start.sh is executed on start before User Docker starts
  # /etc/rc.local is also executed on start but not guaranteed to be run before User Docker
  - path: /opt/rancher/bin/start.sh
    permissions: "0755"
    owner: root
    content: |
      #!/bin/bash
      until sudo lsmod | grep ^zfs
      do
        sleep 2
      done
      sudo zpool import dpool

If there are any optimizations I can make to that, please let me know, but on to the main point…

I noticed that after importing images and creating containers, my dataset (dpool/zdocker) was empty. So where is Docker storing all my stuff, I asked. It uses /var/lib/docker, which, according to the mount command, comes from my HDD (/dev/sda1). As per the docs, the mountpoint for the dataset should be /var/lib/docker, and Docker will automatically configure the storage driver to zfs, so the “sudo ros config set rancher.docker.args …” step shouldn’t technically be needed. Is there any way, maybe through ros config, that I can set the dataset to mount there, or stop the HDD from being mounted at /var/lib/docker at all and have zfs do the mount in my startup script?

I noticed that when I try to do this manually, it errors out because /var/lib/docker is not empty, so zfs won’t mount my dataset on it. I tried to move all the files over and delete them, but the zfs/graph folder can’t be deleted because it’s “busy”, and until it’s gone zfs will never mount the dataset there.

That was a mouthful. Thanks for reading and thanks in advance.


I had the same challenge with zfs. Followed the zfs instructions here:
http://docs.rancher.com/os/configuration/storage/

The docs worked perfectly and docker showed the zfs pool… until a reboot. Upon reboot, the user docker daemon would not start and the zfs pool was not detected. It does load the zfs kernel module correctly but can’t find the pool.

Your startup script ‘fixes’ the issue and the OS boots and loads properly. I have yet to try importing images or containers but I’m guessing I will hit the same issue.

OK, so you were close. Here’s how I changed things:

  1. Set the mountpoint for the ZFS dataset that was created. If you follow the Rancher docs, this would be called zpool/zdocker, but in my case I called the pool zpool3 and the dataset zdocker3.

$ sudo zfs set mountpoint=/var/lib/docker zpool3/zdocker3

  2. Change your startup script as follows:
$ sudo ros config merge
write_files:
- encoding: ""
  content: |+
    #!/bin/bash
    until sudo lsmod | grep ^zfs
    do
      sleep 2
    done
    sudo zpool import zpool3
    sudo zfs mount -O -a
  owner: root
  path: /opt/rancher/bin/start.sh
  permissions: "0755"
(CTRL-D to save/exit)
$ sudo reboot

The added zfs mount command has the “-O” argument, which tells it to perform an overlay mount so you don’t have to have a completely empty directory. (You could stop user-docker and rm /var/lib/docker but I went with the overlay mount for now.)
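One caveat with the startup script: the until loop waits forever if the zfs module never loads. A bounded wait is one way to guard against that. This is only a sketch; the function name wait_for_module, the MODULES_FILE variable, and the 60-second limit are made up for illustration and are not from the Rancher docs. It reads /proc/modules directly, which is the same information lsmod prints.

```shell
#!/bin/bash
# Sketch: wait for a kernel module with a time limit instead of forever.
# MODULES_FILE and wait_for_module are hypothetical names for this example.
MODULES_FILE="${MODULES_FILE:-/proc/modules}"

# Return 0 once module $1 is listed, or 1 after roughly $2 seconds.
wait_for_module() {
  local name="$1" timeout="${2:-60}" waited=0
  # Each line of /proc/modules starts with the module name and a space.
  until grep -q "^${name} " "$MODULES_FILE"; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 2
    waited=$((waited + 2))
  done
  return 0
}
```

In start.sh this could replace the until loop, for example: wait_for_module zfs 60 && zpool import zpool3 && zfs mount -O -a. If the module does not show up in time, the import is skipped and the rest of the boot carries on.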

Now when I boot, I see /var/lib/docker as mounted from zpool3/zdocker3 and Rancher was smart enough to rebuild that directory and pull any needed containers down.
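To confirm this after a reboot without eyeballing the whole mount output, something like the following might help. It is a rough sketch: mount_source and MOUNTS_FILE are hypothetical names, and it assumes that later lines in /proc/mounts sit on top of earlier ones for the same path, which is what an overlay mount via “zfs mount -O” should produce.

```shell
#!/bin/bash
# Sketch: report what currently backs a mountpoint.
# MOUNTS_FILE and mount_source are hypothetical names for this example.
MOUNTS_FILE="${MOUNTS_FILE:-/proc/mounts}"

# Print "source fstype" for the last mount listed on path $1.
# Exits non-zero if nothing is mounted exactly at that path.
mount_source() {
  awk -v mp="$1" '
    $2 == mp { src = $1; fs = $3 }   # remember the most recent match
    END { if (src != "") print src, fs; else exit 1 }
  ' "$MOUNTS_FILE"
}
```

Running mount_source /var/lib/docker should name the dataset (zpool3/zdocker3 in my case) with type zfs when the overlay mount worked, and the HDD (/dev/sda1) when it did not. “sudo zfs get mounted,mountpoint zpool3/zdocker3” gives the ZFS side of the same picture.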

I’m hoping this is the end of my zfs journey for now.