Error after migration

Hello, some time ago, after one of the upgrade migrations (I’m currently using 1.2.2), the Backup tab stopped working and now shows this error:

error listing backup volume names: Failed to execute: /var/lib/longhorn/engine-binaries/longhornio-longhorn-engine-v1.2.2/longhorn [backup ls --volume-only nfs://omv:/backup/longhorn], output Failed to execute: ls [-1 /var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/volumes], output ls: cannot access ‘/var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/volumes’: Permission denied , error exit status 2 , stderr, time=“2021-10-14T16:29:34Z” level=warning msg=“failed to list first level dirs for path: backupstore/volumes reason: Failed to execute: ls [-1 /var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/volumes], output ls: cannot access ‘/var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/volumes’: Permission denied\n, error exit status 2” pkg=backupstore time=“2021-10-14T16:29:34Z” level=error msg=“Failed to execute: ls [-1 /var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/volumes], output ls: cannot access ‘/var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/volumes’: Permission denied\n, error exit status 2” , error exit status 1

Besides that, everything works as expected.

Here’s my deployment: homelab/helm-release.yaml at main · Marx2/homelab · GitHub

Can you help me fix this problem?

Edit: I’m using 3 nodes.
The folder /var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/volumes exists on only one of them, and even as root I can’t access it:

root@longhorn-manager-chksw:/# ls -al /var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/
ls: cannot open directory ‘/var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/’: Permission denied
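
To narrow down which directory in that path actually denies access, one option is to print the mode and numeric owner of each path component in turn. This is only a sketch: `walk_perms` is a hypothetical helper name, and it assumes GNU `stat` is available where it is run (the node or the longhorn-manager container).

```shell
# Print mode, numeric uid:gid and path for every component of an absolute
# path, stopping at the first component that cannot be stat'ed.
# walk_perms is a hypothetical helper, not part of Longhorn.
walk_perms() {
    path=$1
    current=""
    # Split the path on "/" into positional parameters.
    old_ifs=$IFS
    IFS=/
    set -- $path
    IFS=$old_ifs
    for part in "$@"; do
        [ -n "$part" ] || continue      # skip the empty field before the leading "/"
        current="$current/$part"
        stat -c '%A %u:%g %n' "$current" || return 1
    done
}

# Example:
#   walk_perms /var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/volumes
```

The last line printed before the failure is the deepest directory that could still be inspected, and its mode and owner show who is allowed to enter it.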

Could you please go to the node and try to unmount the NFS mount point manually? Longhorn will then remount it when it needs to access the remote backup target.
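
A minimal sketch of that manual unmount, assuming it is run as root on the affected node and that the backup target is mounted under /var/lib/longhorn-backupstore-mounts as in the error message. `list_backupstore_mounts` is a hypothetical helper name. It only filters mount-table lines, so the actual `umount` stays an explicit, separate step.

```shell
# Print the mount points of NFS file systems mounted under Longhorn's
# backupstore mount directory. Reads lines in /proc/mounts format
# (device mountpoint fstype options ...) on stdin.
# list_backupstore_mounts is a hypothetical helper, not part of Longhorn.
list_backupstore_mounts() {
    awk '$2 ~ "^/var/lib/longhorn-backupstore-mounts/" && $3 ~ /^nfs/ { print $2 }'
}

# Usage on the node, as root:
#   list_backupstore_mounts < /proc/mounts | while read -r mp; do
#       umount "$mp"
#   done
```

Afterwards, `mount | grep longhorn-backupstore-mounts` should print nothing until Longhorn next accesses the backup target.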

Hi, I did that.
Before unmounting, the mount was:

/dev/mapper/pve-root on /var/lib/longhorn-setting type ext4 (ro,relatime,errors=remount-ro)
omv:/backup/longhorn on /var/lib/longhorn-backupstore-mounts/omv/backup/longhorn type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.42.1.61,local_lock=none,addr=192.168.1.230)

After unmounting I went to Dashboard and it got remounted automatically like this:

/dev/mapper/pve-root on /var/lib/longhorn-setting type ext4 (ro,relatime,errors=remount-ro)
omv:/backup/longhorn on /var/lib/longhorn-backupstore-mounts/omv/backup/longhorn type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,acregmin=1,acregmax=1,acdirmin=1,acdirmax=1,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.42.1.61,local_lock=none,addr=192.168.1.230)

Unfortunately, going to the Backup tab still shows the same error. It’s also visible in the manager’s logs:

time="2021-10-15T05:59:34Z" level=error msg="Error listing backup volumes from backup target" controller=longhorn-backup-target cred= error="error listing backup volume names: Failed to execute: /var/lib/longhorn/engine-binaries/longhornio-longhorn-engine-v1.2.2/longhorn [backup ls --volume-only nfs://omv:/backup/longhorn], output Failed to execute: ls [-1 /var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/volumes], output ls: cannot access '/var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/volumes': Permission denied\n, error exit status 2\n, stderr, time=\"2021-10-15T05:59:34Z\" level=warning msg=\"failed to list first level dirs for path: backupstore/volumes reason: Failed to execute: ls [-1 /var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/volumes], output ls: cannot access '/var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/volumes': Permission denied\\n, error exit status 2\" pkg=backupstore\ntime=\"2021-10-15T05:59:34Z\" level=error msg=\"Failed to execute: ls [-1 /var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/volumes], output ls: cannot access '/var/lib/longhorn-backupstore-mounts/omv/backup/longhorn/backupstore/volumes': Permission denied\\n, error exit status 2\"\n, error exit status 1" interval=5m0s node=wezyr url="nfs://omv:/backup/longhorn"

Can it be somehow connected with this bug?

I can’t test it, because I use Longhorn from the Helm chart and I don’t know how to pass the flag

    - --default-fstype=ext4

I’ve also made this change in the Helm chart, but it doesn’t seem to work:

  values:
    persistence:
      defaultFsType: ext4

I guess the NFS server is inside the Kubernetes cluster.

  • Can you check how your NFS was deployed?
  • Was the NFS server accessible by other in-cluster Pods after the upgrade?

The bug “[BUG] Longhorn 1.2.0 - wrong volume permissions inside container / broken fsGroup” (Issue #2964, longhorn/longhorn on GitHub) is not related to this issue.
That bug is fixed in Longhorn v1.2.2, which is the version you are using.