Longhorn-system stuck Terminating

#1

I was experimenting with Longhorn. It is a cool technology, but I need some help uninstalling it. The namespace is currently stuck in the “Terminating” state.

I found the uninstall steps in the GitHub instructions, but they don’t work at this point because they expect the namespace to be in a normal state.

Any suggestions?

Output of kubectl get namespace longhorn-system -o json:

{
    "apiVersion": "v1",
    "kind": "Namespace",
    "metadata": {
        "annotations": {
            "cattle.io/status": "{\"Conditions\":[{\"Type\":\"InitialRolesPopulated\",\"Status\":\"True\",\"Message\":\"\",\"LastUpdateTime\":\"2019-01-10T17:18:05Z\"},{\"Type\":\"ResourceQuotaInit\",\"Status\":\"True\",\"Message\":\"\",\"LastUpdateTime\":\"2019-01-10T17:18:04Z\"}]}",
            "field.cattle.io/creatorId": "user-sw4mg",
            "field.cattle.io/projectId": "c-gkz6s:p-48tst",
            "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"v1\",\"kind\":\"Namespace\",\"metadata\":{\"annotations\":{\"cattle.io/status\":\"{\\\"Conditions\\\":[{\\\"Type\\\":\\\"InitialRolesPopulated\\\",\\\"Status\\\":\\\"True\\\",\\\"Message\\\":\\\"\\\",\\\"LastUpdateTime\\\":\\\"2019-01-10T17:18:05Z\\\"},{\\\"Type\\\":\\\"ResourceQuotaInit\\\",\\\"Status\\\":\\\"True\\\",\\\"Message\\\":\\\"\\\",\\\"LastUpdateTime\\\":\\\"2019-01-10T17:18:04Z\\\"}]}\",\"field.cattle.io/creatorId\":\"user-sw4mg\",\"field.cattle.io/projectId\":\"c-gkz6s:p-48tst\",\"lifecycle.cattle.io/create.namespace-auth\":\"true\"},\"creationTimestamp\":\"2019-01-10T17:17:20Z\",\"deletionTimestamp\":\"2019-02-22T12:30:14Z\",\"labels\":{\"cattle.io/creator\":\"norman\",\"field.cattle.io/projectId\":\"p-48tst\"},\"name\":\"longhorn-system\",\"resourceVersion\":\"10512979\",\"selfLink\":\"/api/v1/namespaces/longhorn-system\",\"uid\":\"9673789f-14fb-11e9-ba68-005056b171b1\"},\"spec\":{\"finalizers\":[]},\"status\":{\"phase\":\"Terminating\"}}\n",
            "lifecycle.cattle.io/create.namespace-auth": "true"
        },
        "creationTimestamp": "2019-01-10T17:17:20Z",
        "deletionTimestamp": "2019-02-22T12:30:14Z",
        "labels": {
            "cattle.io/creator": "norman",
            "field.cattle.io/projectId": "p-48tst"
        },
        "name": "longhorn-system",
        "resourceVersion": "15206257",
        "selfLink": "/api/v1/namespaces/longhorn-system",
        "uid": "9673789f-14fb-11e9-ba68-005056b171b1"
    },
    "spec": {
        "finalizers": [
            "kubernetes"
        ]
    },
    "status": {
        "phase": "Terminating"
    }
}

#2

Hi @jtstepan

You can try following the uninstallation instructions for v0.3 here: https://github.com/rancher/longhorn/tree/v0.3#uninstall-longhorn

The cleanup.sh script should still work. The reason the namespace is stuck is that some resources inside it have finalizers and need to be removed properly. If you delete the manager first and then the resources, the resources are left over and cannot be deleted by Kubernetes.

cleanup.sh should remove all the finalizers for you.
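
To see which resources are still left in the namespace, something like the following may help. It is only a sketch: the function name list_leftovers is made up for illustration, and it assumes kubectl is already pointed at the affected cluster.

```shell
#!/usr/bin/env bash
# Sketch: print every object that still exists in a namespace.
# "list_leftovers" is a hypothetical helper name, not part of cleanup.sh.
list_leftovers() {
    local namespace="$1" kind
    # Ask the API server for every namespaced resource type that can be listed.
    for kind in $(kubectl api-resources --verbs=list --namespaced -o name); do
        # Prints one "<kind>/<name>" line per remaining object, nothing if none.
        kubectl -n "$namespace" get "$kind" --ignore-not-found -o name
    done
}
```

Calling list_leftovers longhorn-system should then show the objects (typically the ones still carrying finalizers) that keep the namespace in Terminating.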

#3

Looks like that script is not in master anymore, but I found it here: https://raw.githubusercontent.com/rancher/longhorn-manager/revert-196-issue-273/deploy/scripts/cleanup.sh

Gets stuck removing the “engineimages”:

# bash -x ./cleanup.sh
+ NAMESPACE=longhorn-system
+ remove_crd_instances
+ remove_and_wait volumes.longhorn.rancher.io
+ local crd=volumes.longhorn.rancher.io
++ kubectl -n longhorn-system delete volumes.longhorn.rancher.io --all
+ out='error: the server doesn'\''t have a resource type "volumes"'
+ '[' 1 -ne 0 ']'
+ echo error: the server 'doesn'\''t' have a resource type '"volumes"'
error: the server doesn't have a resource type "volumes"
+ return
+ remove_and_wait engines.longhorn.rancher.io
+ local crd=engines.longhorn.rancher.io
++ kubectl -n longhorn-system delete engines.longhorn.rancher.io --all
+ out='error: the server doesn'\''t have a resource type "engines"'
+ '[' 1 -ne 0 ']'
+ echo error: the server 'doesn'\''t' have a resource type '"engines"'
error: the server doesn't have a resource type "engines"
+ return
+ remove_and_wait replicas.longhorn.rancher.io
+ local crd=replicas.longhorn.rancher.io
++ kubectl -n longhorn-system delete replicas.longhorn.rancher.io --all
+ out='error: the server doesn'\''t have a resource type "replicas"'
+ '[' 1 -ne 0 ']'
+ echo error: the server 'doesn'\''t' have a resource type '"replicas"'
error: the server doesn't have a resource type "replicas"
+ return
+ remove_and_wait engineimages.longhorn.rancher.io
+ local crd=engineimages.longhorn.rancher.io
++ kubectl -n longhorn-system delete engineimages.longhorn.rancher.io --all

Never goes any further.

#4

You can use
kubectl -n longhorn-system edit lhei

Then manually remove the finalizers field from each entry. That should let the script continue.
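
If there are many entries, the same thing can probably be done without opening an editor. This is a rough sketch, not taken from cleanup.sh: clear_finalizers is a made-up helper name, and it uses a merge patch to set metadata.finalizers to null on every instance of the given resource type.

```shell
#!/usr/bin/env bash
# Sketch: clear metadata.finalizers on every instance of a resource type.
# "clear_finalizers" is a hypothetical helper name, not part of cleanup.sh.
clear_finalizers() {
    local namespace="$1" kind="$2" name
    # "-o name" prints one "<kind>/<name>" per line.
    for name in $(kubectl -n "$namespace" get "$kind" -o name); do
        # A merge patch with null removes the finalizers field entirely.
        kubectl -n "$namespace" patch "$name" --type=merge \
            -p '{"metadata":{"finalizers":null}}'
    done
}
```

For example, clear_finalizers longhorn-system engineimages.longhorn.rancher.io. Note that this skips whatever cleanup the finalizer was meant to guarantee, so it only makes sense when Longhorn is being removed anyway.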

#5

It did get past the engineimages, but now it is hanging on the nodes removal:

# bash -x ./cleanup.sh
+ NAMESPACE=longhorn-system
+ remove_crd_instances
+ remove_and_wait volumes.longhorn.rancher.io
+ local crd=volumes.longhorn.rancher.io
++ kubectl -n longhorn-system delete volumes.longhorn.rancher.io --all
+ out='error: the server doesn'\''t have a resource type "volumes"'
+ '[' 1 -ne 0 ']'
+ echo error: the server 'doesn'\''t' have a resource type '"volumes"'
error: the server doesn't have a resource type "volumes"
+ return
+ remove_and_wait engines.longhorn.rancher.io
+ local crd=engines.longhorn.rancher.io
++ kubectl -n longhorn-system delete engines.longhorn.rancher.io --all
+ out='error: the server doesn'\''t have a resource type "engines"'
+ '[' 1 -ne 0 ']'
+ echo error: the server 'doesn'\''t' have a resource type '"engines"'
error: the server doesn't have a resource type "engines"
+ return
+ remove_and_wait replicas.longhorn.rancher.io
+ local crd=replicas.longhorn.rancher.io
++ kubectl -n longhorn-system delete replicas.longhorn.rancher.io --all
+ out='error: the server doesn'\''t have a resource type "replicas"'
+ '[' 1 -ne 0 ']'
+ echo error: the server 'doesn'\''t' have a resource type '"replicas"'
error: the server doesn't have a resource type "replicas"
+ return
+ remove_and_wait engineimages.longhorn.rancher.io
+ local crd=engineimages.longhorn.rancher.io
++ kubectl -n longhorn-system delete engineimages.longhorn.rancher.io --all
+ out='error: the server doesn'\''t have a resource type "engineimages"'
+ '[' 1 -ne 0 ']'
+ echo error: the server 'doesn'\''t' have a resource type '"engineimages"'
error: the server doesn't have a resource type "engineimages"
+ return
+ remove_and_wait settings.longhorn.rancher.io
+ local crd=settings.longhorn.rancher.io
++ kubectl -n longhorn-system delete settings.longhorn.rancher.io --all
+ out='error: the server doesn'\''t have a resource type "settings"'
+ '[' 1 -ne 0 ']'
+ echo error: the server 'doesn'\''t' have a resource type '"settings"'
error: the server doesn't have a resource type "settings"
+ return
+ remove_and_wait nodes.longhorn.rancher.io
+ local crd=nodes.longhorn.rancher.io
++ kubectl -n longhorn-system delete nodes.longhorn.rancher.io --all

#6

Apply the same method to nodes.longhorn.rancher.io as well.

E.g.

kubectl -n longhorn-system edit nodes.longhorn.rancher.io

#7

That worked, thank you! The script ran to completion and the namespace is gone.