Missing engine image means broken images

jurgenweber · May 28, 2024, 11:23pm

I was looking at my cluster nodes, wondering why I had so little compute, when I realised that my little 3-node home cluster had 5 engine images running, so two nodes had two each.

I found the CR and deleted the older one to get that compute back, since I could tell it was old and no longer managed by the manager. It was still on 1.6.1 and had not been updated to 1.6.2 like the other one… I figured the other one would just take over.

Why I have 5, I have no idea. Maybe I changed my mind during installation and switched namespaces, but that was 2 months ago and I don’t remember. I saw some tickets on this.

Now, when the pods restart, a bunch of my PVCs will not reattach (though some still do). Most of the PVCs are just caches, so they are no big deal to lose, and the ones with important stuff on them I am backing up off-cluster.

So I decided to delete a few of these PVCs and let them be recreated. The new pods are back up and have new PVCs, but the old ones are no longer seen in k8s, no longer exist as directories in the replicas dir on the node disk, and are sitting in the UI in ‘deleting’ permanently, all with the same error:

longhorn-manager-zvnnb longhorn-manager time="2024-05-28T23:18:29Z" level=error msg="Dropping Longhorn engine out of the queue" func=controller.handleReconcileErrorLogging file="utils.go:67" controller=longhorn-engine engine=longhorn/pvc-0c96e083-d742-4d36-9963-0790c2aa6df4-e-0 error="failed to sync engine for longhorn/pvc-0c96e083-d742-4d36-9963-0790c2aa6df4-e-0: failed to list ready nodes containing engine image longhornio/longhorn-engine:v1.6.1: failed to get engine image longhornio/longhorn-engine:v1.6.1: engineimage.longhorn.io \"ei-5cefaf2b\" not found" node=warvm-node01
longhorn-manager-zvnnb longhorn-manager time="2024-05-28T23:18:29Z" level=error msg="Dropping Longhorn engine out of the queue" func=controller.handleReconcileErrorLogging file="utils.go:67" controller=longhorn-engine engine=longhorn/pvc-830f1c5c-fba2-4c81-b14d-108b48405de5-e-0 error="failed to sync engine for longhorn/pvc-830f1c5c-fba2-4c81-b14d-108b48405de5-e-0: failed to list ready nodes containing engine image longhornio/longhorn-engine:v1.6.1: failed to get engine image longhornio/longhorn-engine:v1.6.1: engineimage.longhorn.io \"ei-5cefaf2b\" not found" node=warvm-node01
longhorn-manager-zvnnb longhorn-manager E0528 23:18:29.866392       1 engine_controller.go:228] failed to sync engine for longhorn/pvc-239f74f6-0b80-4b7c-9c24-97952e5bbcdd-e-0: failed to list ready nodes containing engine image longhornio/longhorn-engine:v1.6.1: failed to get engine image longhornio/longhorn-engine:v1.6.1: engineimage.longhorn.io "ei-5cefaf2b" not found
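
To see at a glance which engines are affected and which EngineImage resource they still point at, the log lines above can be condensed. This is a minimal sketch, assuming the log format shown here; the function name `affected_engines` is made up for illustration and is not part of Longhorn or kubectl.

```shell
# Hypothetical helper: read longhorn-manager log lines on stdin and print
# "<engine> <missing EngineImage name>" once per distinct pair.
affected_engines() {
  # Only lines with "failed to sync engine for ... not found" produce output.
  # The quotes around ei-* may or may not be backslash-escaped in the log.
  sed -n 's/.*failed to sync engine for \([^:]*\):.*engineimage\.longhorn\.io [\\]*"\(ei-[^"\\]*\)[\\]*" not found.*/\1 \2/p' \
    | sort -u
}

# Example (needs cluster access; pod name taken from the log prefix above):
# kubectl logs -n longhorn longhorn-manager-zvnnb | affected_engines
```

Each output line pairs an engine such as `longhorn/pvc-…-e-0` with the `ei-…` name it cannot find, which makes it easier to confirm that every stuck volume refers to the same deleted engine image.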

How can I get rid of these things? How can I get the manager to forget these PVCs, since everything else seems to be gone?

I did find the Longhorn ‘volumes’:

kubectl delete -n longhorn volumes pvc-0c96e083-d742-4d36-9963-0790c2aa6df4
volume.longhorn.io "pvc-0c96e083-d742-4d36-9963-0790c2aa6df4" deleted

but the delete just hangs. Is it safe to remove the finalizer?
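
Before touching anything, it may help to look at what is actually set on the stuck volume. The sketch below is read-only and does not answer whether removal is safe. It assumes the `longhorn` namespace and the `volumes.longhorn.io` resource seen in the command above; the function name `volume_finalizers` and the `KUBECTL` override are made up for illustration.

```shell
# Hypothetical read-only helper: print the finalizers on a Longhorn volume CR.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the command
# without contacting a cluster.
volume_finalizers() {
  "${KUBECTL:-kubectl}" get -n longhorn volumes.longhorn.io "$1" \
    -o jsonpath='{.metadata.finalizers}'
}

# Example (needs cluster access):
# volume_finalizers pvc-0c96e083-d742-4d36-9963-0790c2aa6df4
```

A non-empty result would suggest the delete is waiting on a controller to finish its cleanup, which fits the engine sync errors in the log.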

jurgenweber · May 29, 2024, 10:58am

So it seems I was overzealous: I didn’t realise the ‘up arrow’ in the UI means upgrading the engine from 1.6.1 to 1.6.2… It kind of dawned on me while I was backing up stuff and recreating the PVCs…

and then deleting the replicas, and then deleting the engine image resources… I think I have it clean again now. :\
