I am trying to uninstall Longhorn from a high-availability K3s cluster with embedded etcd. I seem to have botched the uninstallation somehow: Longhorn and its associated namespaces have been stuck in the Terminating state for more than a day.
Attempted Solutions: I tried all three methods from the Longhorn documentation, and all of them fail. I cannot uninstall Longhorn with Helm, because I never installed it through Helm in the first place. Uninstalling with kubectl also fails, because job.batch/longhorn-uninstall cannot be created while the longhorn-system namespace is in the Terminating state. Editing the CRDs and finalizers, as described in the troubleshooting documentation and in the guide "Longhorn Namespace Stuck Terminating - Delete Longhorn from Kubernetes Cluster" (roughly the commands shown below), also does not fix the stuck termination; in both cases nothing changes. The cleanup.sh script from GitHub likewise fails to remove Longhorn, as it does not find any of the resources.
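For reference, the finalizer edits I attempted looked roughly like the following. This is a reconstruction rather than an exact transcript, and looping over all Longhorn CRDs this way is my own shorthand:

for crd in $(kubectl get crd -o name | grep longhorn.io); do
  # strip the customresourcedefinition.apiextensions.k8s.io/ prefix, leaving e.g. volumes.longhorn.io
  kind="${crd##*/}"
  for res in $(kubectl -n longhorn-system get "$kind" -o name); do
    # clear all finalizers (including longhorn.io) so the object can be garbage-collected
    kubectl -n longhorn-system patch "$res" --type merge -p '{"metadata":{"finalizers":[]}}'
  done
done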
Debugging information: The query kubectl get namespace longhorn-system -o json gives the following output:
{
    "apiVersion": "v1",
    "kind": "Namespace",
    "metadata": {
        "creationTimestamp": "2022-10-31T20:09:45Z",
        "deletionTimestamp": "2023-01-26T18:17:03Z",
        "labels": {
            "kubernetes.io/metadata.name": "longhorn-system",
            "name": "longhorn-system"
        },
        "name": "longhorn-system",
        "resourceVersion": "41929420",
        "uid": "f1c78184-4613-4f9d-939d-13947ac8befa"
    },
    "spec": {
        "finalizers": [
            "kubernetes"
        ]
    },
    "status": {
        "conditions": [
            {
                "lastTransitionTime": "2023-01-26T18:29:35Z",
                "message": "All resources successfully discovered",
                "reason": "ResourcesDiscovered",
                "status": "False",
                "type": "NamespaceDeletionDiscoveryFailure"
            },
            {
                "lastTransitionTime": "2023-01-26T18:17:27Z",
                "message": "All legacy kube types successfully parsed",
                "reason": "ParsedGroupVersions",
                "status": "False",
                "type": "NamespaceDeletionGroupVersionParsingFailure"
            },
            {
                "lastTransitionTime": "2023-01-26T18:17:51Z",
                "message": "All content successfully deleted, may be waiting on finalization",
                "reason": "ContentDeleted",
                "status": "False",
                "type": "NamespaceDeletionContentFailure"
            },
            {
                "lastTransitionTime": "2023-01-26T18:17:27Z",
                "message": "Some resources are remaining: engines.longhorn.io has 2 resource instances, nodes.longhorn.io has 5 resource instances, orphans.longhorn.io has 1 resource instances, replicas.longhorn.io has 4 resource instances, snapshots.longhorn.io has 3 resource instances, volumes.longhorn.io has 2 resource instances",
                "reason": "SomeResourcesRemain",
                "status": "True",
                "type": "NamespaceContentRemaining"
            },
            {
                "lastTransitionTime": "2023-01-26T18:17:27Z",
                "message": "Some content in the namespace has finalizers remaining: longhorn.io in 17 resource instances",
                "reason": "SomeFinalizersRemain",
                "status": "True",
                "type": "NamespaceFinalizersRemaining"
            }
        ],
        "phase": "Terminating"
    }
}
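As the conditions above indicate, the leftover objects still carry the longhorn.io finalizer. For completeness, the finalizers on the remaining objects can be inspected with something along these lines (shown here for volumes.longhorn.io; the other Longhorn resource types work the same way):

kubectl -n longhorn-system get volumes.longhorn.io \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.finalizers}{"\n"}{end}'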
The query kubectl api-resources --verbs=list --namespaced -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found -n longhorn-system yields the following output:
NAME STATE NODE INSTANCEMANAGER IMAGE AGE
engine.longhorn.io/pvc-1886524a-5ba0-459d-9d51-b8044fec3057-e-344e3d26 stopped 64d
engine.longhorn.io/pvc-5aca06f9-adf1-45c8-b11d-6e79d9719001-e-79f24cb7 stopped 64d
NAME READY ALLOWSCHEDULING SCHEDULABLE AGE
node.longhorn.io/master0 False true True 86d
node.longhorn.io/master1 True true True 14d
node.longhorn.io/master2 False true True 86d
node.longhorn.io/worker0 False true True 70d
node.longhorn.io/worker1 True true True 70d
NAME TYPE NODE
orphan.longhorn.io/orphan-010ee0d16422c151e7e039e27fe2306815361596fa3f8b6cccc8a601b673e429 replica master0
NAME STATE NODE DISK INSTANCEMANAGER IMAGE AGE
replica.longhorn.io/pvc-1886524a-5ba0-459d-9d51-b8044fec3057-r-89dfabab stopped master2 c5a7e70d-09d8-43a2-9ba3-d5b65eb12b34 13d
replica.longhorn.io/pvc-1886524a-5ba0-459d-9d51-b8044fec3057-r-a6881548 running worker1 2d7f16e8-f11b-40e8-8935-7f0559f7674e instance-manager-r-8ccf914f longhornio/longhorn-engine:v1.3.2 16d
replica.longhorn.io/pvc-5aca06f9-adf1-45c8-b11d-6e79d9719001-r-52b5a290 stopped worker1 2d7f16e8-f11b-40e8-8935-7f0559f7674e 31d
replica.longhorn.io/pvc-5aca06f9-adf1-45c8-b11d-6e79d9719001-r-8f0ae6c9 running master2 c5a7e70d-09d8-43a2-9ba3-d5b65eb12b34 instance-manager-r-672003dc longhornio/longhorn-engine:v1.3.2 13d
NAME VOLUME CREATIONTIME READYTOUSE RESTORESIZE SIZE AGE
snapshot.longhorn.io/887f9621-5417-40b3-8999-c2695d5585d7 pvc-1886524a-5ba0-459d-9d51-b8044fec3057 2023-01-12T21:07:46Z false 10737418240 312860672 13d
snapshot.longhorn.io/8f11c48b-da51-4124-80b8-1316db88eb01 pvc-5aca06f9-adf1-45c8-b11d-6e79d9719001 2023-01-12T21:21:54Z false 21474836480 20096512000 13d
snapshot.longhorn.io/b4c31fe7-5ff5-4881-9cd8-b22fc73798bb pvc-5aca06f9-adf1-45c8-b11d-6e79d9719001 2023-01-12T21:38:23Z true 21474836480 102400 13d
NAME STATE ROBUSTNESS SCHEDULED SIZE NODE AGE
volume.longhorn.io/pvc-1886524a-5ba0-459d-9d51-b8044fec3057 attaching unknown 10737418240 master0 64d
volume.longhorn.io/pvc-5aca06f9-adf1-45c8-b11d-6e79d9719001 detaching unknown 21474836480 64d
Attempting to manually delete any of the items listed above also fails; the commands I used were along the lines shown below. All API services report AVAILABLE as True.
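The manual deletion attempts looked roughly like this (shown here for one of the volumes listed above; the other resource types behave the same way):

# example of the kind of manual delete that never completes
kubectl -n longhorn-system delete volumes.longhorn.io pvc-1886524a-5ba0-459d-9d51-b8044fec3057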
What can I do to resolve this problem? I will provide any additional information needed.