Upgrading a three-node K3OS cluster

Hello,

I’m a bit lost with upgrading our test K3OS cluster:

NAME               STATUS   ROLES         AGE    VERSION
fra-corp-node-01   Ready    etcd,master   117d   v1.19.5+k3s2
fra-corp-node-02   Ready    etcd,master   117d   v1.19.5+k3s2
fra-corp-node-03   Ready    etcd,master   117d   v1.19.5+k3s2

On the K3OS site, the latest version is v0.20.4-k3s1r0.
I am now trying to upgrade our cluster to this version, but I do not understand how the upgrade basically works. I’ve read something about the system-upgrade-controller, so I found k3os/overlay/share/rancher/k3s/server/manifests at master · rancher/k3os · GitHub and tried to apply the manifests from there. The first node was upgraded after a while, but with warnings that I can’t find anymore. I think it was:

namespaces "kube-system" is forbidden: User "system:serviceaccount:system-upgrade:system-upgrade" cannot get resource "namespaces"

After a day, a second node was upgraded too, but now it looks like this:

NAME               STATUS   ROLES                       AGE   VERSION
fra-test-node-01   Ready    control-plane,etcd,master   69d   v1.20.4+k3s1
fra-test-node-02   Ready    etcd,master                 69d   v1.19.5+k3s2
fra-test-node-03   Ready    control-plane,etcd,master   69d   v1.20.4+k3s1
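
A quick way to see how many nodes are on each version during such a rolling upgrade might be something like the following. It is only a sketch: `versions_summary` is a made-up helper name, and it assumes the default `kubectl get nodes` table layout with VERSION as the last column.

```shell
# Count nodes per version from `kubectl get nodes` output read on stdin.
# Assumes the default table layout: one header row, VERSION in the last column.
versions_summary() {
  awk 'NR > 1 && NF > 0 { count[$NF]++ } END { for (v in count) print v, count[v] }' | sort
}

# Usage (needs access to the cluster):
#   kubectl get nodes | versions_summary
```

More than one line in the output would mean that the nodes are not all on the same version yet.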

Also, Rancher (hosted on a different K3s / K3OS cluster) often shows nodes, deployments and other resource types as unknown, maybe because of the load balancer.

Anyway, I ran:

kubectl delete -f system-upgrade-controller.yaml
kubectl delete -f k3os_upgrade_plan.yaml

After a while I applied them again, and now I have:

NAME               STATUS   ROLES                       AGE   VERSION
fra-test-node-01   Ready    control-plane,etcd,master   69d   v1.20.4+k3s1
fra-test-node-02   Ready    control-plane,etcd,master   69d   v1.20.4+k3s1
fra-test-node-03   Ready    control-plane,etcd,master   69d   v1.20.4+k3s1

Everything except pods shows as “unknown” in Rancher, but from the command line it looks fine:

$ kubectl get -A deployments.apps
NAMESPACE            NAME                                                 READY   UP-TO-DATE   AVAILABLE   AGE
cattle-system        cattle-cluster-agent                                 1/1     1            1           69d
default              nginx-test                                           3/3     3            3           68d
fleet-system         fleet-agent                                          1/1     1            1           69d
ingress-controller   haproxy-ingress-kubernetes-ingress-default-backend   2/2     2            2           69d
k3os-system          system-upgrade-controller                            1/1     1            1           20m
kube-system          coredns                                              1/1     1            1           69d
kube-system          local-path-provisioner                               1/1     1            1           69d
kube-system          metrics-server                                       1/1     1            1           69d
longhorn-system      csi-attacher                                         3/3     3            3           9m7s
longhorn-system      csi-provisioner                                      3/3     3            3           8m58s
longhorn-system      csi-resizer                                          3/3     3            3           8m48s
longhorn-system      csi-snapshotter                                      3/3     3            3           8m37s
longhorn-system      longhorn-driver-deployer                             1/1     1            1           69d
longhorn-system      longhorn-nfs-provisioner                             1/1     1            1           68d
longhorn-system      longhorn-ui                                          1/1     1            1           69d

So I’m pretty sure that I did something wrong, and I don’t want to try it again on my other K3s / K3OS cluster :)

Can somebody explain how I should upgrade the cluster?

cu denny