K3s HA - Did it ever work?

Hi,

I know, the title is not really funny, but after two days of trying to set up a simple two-node k3s HA cluster to deploy Rancher on it, I'm asking myself wtf I'm missing. I'm close to moving back to RKE.

Basically, I mean the following configuration: Rancher in a k3s HA cluster using an external datastore.

What I have:

  • A working Postgres-compatible cluster (CockroachDB, to be precise), reachable over the local network.
    k3s can access it and creates the kine table, so no problems here.
  • Two servers with Ubuntu 20.04.2 LTS and unique hostnames (Hetzner Cloud nodes with 2 CPUs and 4 GB RAM).
    IP forwarding is active, the firewall is off.
  • A load balancer pointing to the two servers.

If I understand it correctly, the simplest HA installation should go like this:

curl -sfL https://get.k3s.io | sudo INSTALL_K3S_VERSION="v1.19.11+k3s1" sh -s - server \
  --datastore-endpoint "postgres://[redacted]/k3s_rancher" \
  --datastore-cafile /etc/rancher/datastore-ca.crt \
  --token-file /etc/rancher/k3s-token \
  --write-kubeconfig-mode 0644 \
  --tls-san "..." \
  --tls-san "..." \
  ...

The service is running.

The node is Ready (but why does it have no master role?). I also made an attempt with --docker as the container runtime, with the same results.

After adding the second server I see two nodes in the cluster, but not a single pod is deployed. No CNI, no Traefik, nothing.
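For reference, the state described above can be checked roughly like this. It is only a minimal sketch, assuming a default install where kubectl is bundled as `k3s kubectl`, the systemd unit is named `k3s`, and the data directory is the default /var/lib/rancher/k3s. The function name check_cluster is just illustrative.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative): print the basic state
# of a k3s server so that it can be pasted into a forum post.
check_cluster() {
  # Bail out early if k3s is not installed on this host.
  if ! command -v k3s >/dev/null 2>&1; then
    echo "k3s binary not found on this host"
    return 1
  fi

  # Nodes with their roles, internal/external IPs and container runtime.
  k3s kubectl get nodes -o wide

  # Pods in all namespaces; the packaged components (CoreDNS, Traefik, ...)
  # would normally appear in kube-system.
  k3s kubectl get pods --all-namespaces

  # Manifests the server deploys automatically on startup
  # (assumes the default data directory).
  ls -l /var/lib/rancher/k3s/server/manifests

  # Last lines of the server log (assumes the default unit name "k3s").
  journalctl -u k3s --no-pager -n 50
}
```

Call check_cluster as root on each server; the manifests directory and the journal are usually not readable by a normal user.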

Does anyone have an idea what’s missing here?
I also tried it with the Hetzner cloud provider (hcloud-cloud-controller-manager, deploy_with_networks.md in the hetznercloud/hcloud-cloud-controller-manager repository on GitHub).
With this, the servers get the right internal and external IPs, but I'm not able to install a CNI (Calico, Cilium, Weave, whatever) on the cluster. It fails for some reason, and it's impossible to view the logs because of some weird kube-api permissions.

Kind regards,
Michael