Update or edit workloads with Longhorn volumes in a multi-node cluster


#1

When we update or edit a workload, the updated pod fails to initialize if it’s scheduled on another host in a 3-node cluster.
This happens because Longhorn volumes are not multi-node read-write.

Isn’t it possible for Rancher to stop the old workload, reattach the volume to the new host, and start the pod there?
Is this a bug, a missing feature or am I doing something wrong?

Kind regards,
Hendrik


#2

Hi @Isotop7

Assuming you’re using a Deployment, you should be able to use the upgrade strategy Recreate instead of RollingUpdate. It will delete the old pod first before starting a new one, so the volume can be detached from the old host before it is attached to the new one. See https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy for details.
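
As a rough sketch of what the Recreate strategy could look like in a manifest: the Deployment name `my-app`, the image, the mount path and the PVC name `my-app-data` are all placeholders I made up for illustration, so adjust them to your setup.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                 # hypothetical name
spec:
  replicas: 1                  # a single pod, since the volume attaches to one node at a time
  strategy:
    type: Recreate             # terminate the old pod before creating the new one
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: nginx:stable  # placeholder image
          volumeMounts:
            - name: data
              mountPath: /data # placeholder mount path
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: my-app-data   # hypothetical PVC backed by a Longhorn volume
```

The trade-off is a short downtime on every update, because no new pod is started until the old one is gone.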


#3

Hi @yasker,

thanks for getting back. I found that option later on.

I noticed that pods backed by a Longhorn volume can’t be rescheduled when the host goes down, as longhorn-engine doesn’t release the volume. Is this a bug or as designed?
If this works as designed, Longhorn is not suitable for HA use IMHO.

Kind regards,
Hendrik


#4

It’s a bug.

We’re aware of some issues with it on CSI, e.g. https://github.com/rancher/longhorn/issues/294 . There are also some issues with Kubernetes CSI itself, which is not really fault tolerant, e.g. https://github.com/kubernetes-csi/external-provisioner/issues/130 . We’re currently working on the issue.