My upgrade job in the kube-system namespace is stuck with the log messages below. The pods in the longhorn-system namespace still look intact, but the failed upgrade has degraded Longhorn and many PVCs no longer work.
Does anyone know a proper way out of this situation? This is not a production k3s installation, but I invested some time in getting it to where it was and would like not to lose the data stored in Longhorn. It is a cluster with 3 servers and 3 agents, and I am using kube-hetzner.
if [[ ${KUBERNETES_SERVICE_HOST} =~ .*:.* ]]; then
echo "KUBERNETES_SERVICE_HOST is using IPv6"
CHART="${CHART//%\{KUBERNETES_API\}%/[${KUBERNETES_SERVICE_HOST}]:${KUBERNETES_SERVICE_PORT}}"
else
CHART="${CHART//%\{KUBERNETES_API\}%/${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}}"
fi
set +v -x
+ [[ '' == \v\2 ]]
+ shopt -s nullglob
+ [[ -f /config/ca-file.pem ]]
+ [[ -f /tmp/ca-file.pem ]]
+ [[ false == \t\r\u\e ]]
+ [[ false == \t\r\u\e ]]
+ [[ -n '' ]]
+ helm_content_decode
+ set -e
+ ENC_CHART_PATH=/chart/longhorn.tgz.base64
+ CHART_PATH=/tmp/longhorn.tgz
+ [[ ! -f /chart/longhorn.tgz.base64 ]]
+ return
+ [[ install != \d\e\l\e\t\e ]]
+ helm_repo_init
+ grep -q -e 'https\?://'
+ [[ longhorn/longhorn == stable/* ]]
+ [[ -n https://charts.longhorn.io ]]
+ [[ -f /auth/username ]]
+ [[ -f /auth/tls.crt ]]
+ helm repo add longhorn https://charts.longhorn.io
"longhorn" already exists with the same configuration, skipping
+ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "longhorn" chart repository
Update Complete. ⎈Happy Helming!⎈
+ helm_update install --namespace longhorn-system --version '*'
++ helm ls --all -f '^longhorn$' --namespace longhorn-system --output json
++ jq -r '"\(.[0].chart),\(.[0].status)"'
++ tr '' ''
+ LINE=longhorn-1.7.1,uninstalling
+ IFS=,
+ read -r INSTALLED_VERSION STATUS _
+ VALUES=
+ for VALUES_FILE in /config/*.yaml
+ VALUES=' --values /config/values-01_HelmChart.yaml'
+ [[ install = \d\e\l\e\t\e ]]
+ [[ longhorn-1.7.1 =~ ^(|null)$ ]]
+ [[ uninstalling =~ ^(pending-install|pending-upgrade|pending-rollback|uninstalling)$ ]]
Previous helm job was interrupted, updating status from uninstalling to failed
+ echo Previous helm job was interrupted, updating status from uninstalling to failed
+ echo 'Resetting helm release status from '\''uninstalling'\'' to '\''failed'\'''
+ helm set-status longhorn failed --namespace longhorn-system
2024/09/24 05:12:14 release longhorn status updated
+ [[ uninstalling == \p\e\n\d\i\n\g\-\u\p\g\r\a\d\e ]]
+ STATUS=failed
+ [[ failed =~ ^deployed$ ]]
+ [[ failed =~ ^(deleted|failed|null|unknown)$ ]]
+ [[ reinstall == \r\e\i\n\s\t\a\l\l ]]
+ echo 'Uninstalling failed helm chart'
+ helm uninstall longhorn --namespace longhorn-system --wait
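
To make the trace easier to follow, here is how I read the path it takes: the release is found in status uninstalling, that status is reset to failed, and because the failure policy compares equal to reinstall, the job runs helm uninstall. The sketch below is only my reconstruction from the xtrace output above, not the job's actual script. The function name decide_action and its argument names are hypothetical, and branches the trace does not show are reported as unknown rather than guessed.

```shell
#!/usr/bin/env bash
# Hypothetical reconstruction of the decision path visible in the trace.
# decide_action and its argument names are illustrative; they do not
# come from the job's real script.
decide_action() {
  local status="$1" failure_policy="$2"

  # An interrupted release (as in the trace: "uninstalling") is first
  # reset to "failed" via "helm set-status".
  if [[ "$status" =~ ^(pending-install|pending-upgrade|pending-rollback|uninstalling)$ ]]; then
    status="failed"
  fi

  # A failed release combined with the "reinstall" policy leads to
  # "helm uninstall ... --wait", which is the last line of the trace.
  if [[ "$status" =~ ^(deleted|failed|null|unknown)$ && "$failure_policy" == "reinstall" ]]; then
    echo "uninstall"
  else
    # Every other branch is not visible in the trace, so it is not modelled.
    echo "unknown"
  fi
}

# The combination seen in the trace above.
decide_action uninstalling reinstall
```

If this reading is right, the job is waiting on helm uninstall with --wait, which would explain why it appears stuck rather than failing outright.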