Longhorn - workload pod moved - storage did not

CrankyCoder · June 29, 2021, 10:52pm

I am having issues with the longhorn mounts not moving if a pod gets rescheduled to a new node.

Example: I have a pod that was on node kube-4. kube-4 came under heavy load and evicted the pods on it, and the pod got rescheduled to kube-3. Perfect, this is expected behavior. However, it’s been stuck in the container starting state for 20 minutes.

kubectl describe pod

Warning  FailedAttachVolume  <invalid> (x11 over 16m)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-0a6dea05-6954-41da-a7a9-6861461d4e2e" : rpc error: code = FailedPrecondition desc = The volume pvc-0a6dea05-6954-41da-a7a9-6861461d4e2e cannot be attached to the node kube-3 since it is already attached to the node kube-4
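For anyone triaging several stuck pods at once, the event message above can be picked apart mechanically. This is only a minimal Python sketch: the regular expression is built from the wording of the one message quoted here, so other Longhorn versions may phrase it differently, and the helper name `parse_attach_conflict` is made up for illustration.

```python
import re

# Pattern derived from the single event message quoted in this thread;
# treat the exact wording as an assumption, not a stable format.
ATTACH_CONFLICT = re.compile(
    r"The volume (?P<volume>\S+) cannot be attached to the node (?P<target>\S+) "
    r"since it is already attached to the node (?P<current>\S+)"
)


def parse_attach_conflict(message):
    """Return the volume, target node and current node, or None if no match."""
    match = ATTACH_CONFLICT.search(message)
    return match.groupdict() if match else None


event = (
    'AttachVolume.Attach failed for volume "pvc-0a6dea05-6954-41da-a7a9-6861461d4e2e" : '
    "rpc error: code = FailedPrecondition desc = The volume "
    "pvc-0a6dea05-6954-41da-a7a9-6861461d4e2e cannot be attached to the node kube-3 "
    "since it is already attached to the node kube-4"
)

info = parse_attach_conflict(event)
if info:
    print(f"{info['volume']}: wanted on {info['target']}, still attached to {info['current']}")
```

The `current` field is the node that still holds the attachment, which is the one that has to be detached before the new pod can start.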

I can manually detach said pvc from kube-4 and reattach to kube-3 and it’s ok.

Is there something I am missing that is preventing the attachdetach-controller from detaching when the workload is terminated or evicted? My deployment is set for “recreate” to ensure termination of the pod before starting the new one, so this is not a RWX collision.
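For reference, the “recreate” setting corresponds to `spec.strategy.type: Recreate` in a Deployment manifest; when the field is omitted, Kubernetes defaults to `RollingUpdate`. A small Python sketch of checking that, assuming the manifest has already been loaded into a dict (the example manifest and the helper name `uses_recreate_strategy` are hypothetical, not taken from my actual cluster):

```python
def uses_recreate_strategy(deployment):
    """True if the Deployment manifest (as a dict) sets spec.strategy.type to Recreate."""
    strategy = deployment.get("spec", {}).get("strategy", {})
    # Kubernetes falls back to RollingUpdate when no strategy type is given.
    return strategy.get("type", "RollingUpdate") == "Recreate"


# Hypothetical manifest fragment, for illustration only.
example = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "spec": {"replicas": 1, "strategy": {"type": "Recreate"}},
}

print(uses_recreate_strategy(example))
```

With Recreate, the old pod is terminated before the replacement is created, so two pods from the same Deployment should not be competing for the volume at the same time.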

c3y1huang · July 1, 2021, 6:43am

It looks similar to this issue and should be fixed in v1.1.1. What Longhorn version are you running?

CrankyCoder · July 1, 2021, 5:36pm

Ah, very cool. I was on 1.1.0 until literally just this morning; I went through the upgrade process today. It would be awesome if that fixes what is really the only issue I have had with Longhorn on my Pi cluster. I have been impressed so far. Fingers crossed!
