How are folks approaching HA with k8s clusters in production?

Is there a ‘Rancher way’ of deploying HA k8s clusters? By way of documentation I see this for Rancher itself:
https://rancher.com/docs/rancher/v2.x/en/installation/ha/

But I don’t see any docs on how to deploy the k8s clusters themselves HA. Does such documentation exist?

Are y’all using HA k8s clusters in production, and if so, what’s your approach/setup?

We use RKE to create our clusters and then import them into Rancher. In the cluster.yaml file you specify 3 nodes with role: [controlplane,etcd], and RKE will configure them in HA mode. You can optionally add the worker role if you want those same 3 nodes to also serve workloads.
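As a rough sketch (the addresses, SSH user, and worker count here are made up, and you would adjust things like kubernetes_version and ssh_key_path to your environment), a cluster.yaml along these lines gives you three combined controlplane/etcd nodes plus two dedicated workers:

nodes:
  # three nodes share the controlplane and etcd roles for HA
  - address: 10.0.0.11
    user: rke
    role: [controlplane, etcd]
  - address: 10.0.0.12
    user: rke
    role: [controlplane, etcd]
  - address: 10.0.0.13
    user: rke
    role: [controlplane, etcd]
  # dedicated workers; alternatively add the worker role to the three nodes
  # above if you want the control plane nodes to also serve workloads
  - address: 10.0.0.21
    user: rke
    role: [worker]
  - address: 10.0.0.22
    user: rke
    role: [worker]

Running rke up --config cluster.yaml brings the cluster up and writes out a kube_config_cluster.yml, which you can use with kubectl to apply the import manifest Rancher generates when you add the cluster as an imported cluster.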

Thanks for the reply, shubbard343.

When RKE creates 3 nodes with the role ‘controlplane, etcd’, is it truly HA with regard to the API server? My understanding of HA k8s clusters is that the scheduler and controller follow an ‘active-passive’ pattern where they negotiate who’s active at any time, so that just works out of the box, but that the API server is active across all nodes and needs a fronting load balancer; otherwise you can lose cluster control when the clients (kubelet, etc.) are pointed at a single API server that goes down.

Also, is creating an HA cluster from within the Rancher UI substantially different from using RKE, or is RKE just being used underneath? I’ve successfully deployed an HA cluster from the UI where etcd seems to be replicating properly, and I keep cluster control if I bring down one of the three HA members (i.e. the scheduler and controller seem to be properly bouncing between nodes when they need to), with the exception of bringing down the node hosting the API server that kubelet is talking to.

Yes, the API server runs on all 3 nodes, while the scheduler and controller are only active on one node each; leader election decides which node that is at any given time.
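If you want to verify which node currently holds each lock, the leader election state lives in the kube-system namespace. Depending on your Kubernetes version it is stored as a Lease object or as an annotation on an Endpoints object, so this is just a sketch; check which of these resources your version actually maintains:

# newer clusters: holderIdentity in the Lease shows the active instance
kubectl -n kube-system get lease kube-scheduler -o yaml
kubectl -n kube-system get lease kube-controller-manager -o yaml

# older clusters: the leader is recorded in the
# control-plane.alpha.kubernetes.io/leader annotation on the Endpoints
kubectl -n kube-system describe endpoints kube-scheduler
kubectl -n kube-system describe endpoints kube-controller-manager

If the node holding a lock goes down, one of the other controlplane nodes should pick it up once the lease expires, which is the “bouncing between nodes” behaviour described above.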

Regarding the load-balancer, yes, you would need a load balancer to send traffic to all 3 nodes for any Ingress traffic to the cluster. We use an HAProxy that points to all of the nodes in each cluster for 80 & 443, since RKE/Rancher automatically creates an Ingress for those ports on all of the nodes. See Ingress and Failover for my example of how to point to your cluster nodes. You would create a frontend and backends for your main Rancher cluster, AND a set for each additional cluster you import into Rancher. I use keepalived to manage VIPs that run on the HAProxy nodes.
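To make that concrete, here is a minimal sketch of the HAProxy side. The hostnames, IPs, and VIP are invented, and the balance/check settings are just reasonable defaults rather than the exact config described above; TCP mode passes TLS straight through to the ingress controllers on the nodes:

# frontends bound to the keepalived VIP for the Rancher cluster
frontend rancher_http
    bind 10.0.0.100:80
    mode tcp
    default_backend rancher_nodes_http

frontend rancher_https
    bind 10.0.0.100:443
    mode tcp
    default_backend rancher_nodes_https

# one server line per cluster node; 'check' stops sending traffic to a node
# that goes down
backend rancher_nodes_http
    mode tcp
    balance roundrobin
    server rancher1 10.0.0.11:80 check
    server rancher2 10.0.0.12:80 check
    server rancher3 10.0.0.13:80 check

backend rancher_nodes_https
    mode tcp
    balance roundrobin
    server rancher1 10.0.0.11:443 check
    server rancher2 10.0.0.12:443 check
    server rancher3 10.0.0.13:443 check

Each additional cluster you import gets its own keepalived-managed VIP plus a matching frontend/backend pair, so rancher.example.com and the per-cluster ingress hostnames all resolve to VIPs on the HAProxy hosts rather than to any single node.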

For managing the clusters, i.e. with kubectl, you don’t point your kube config directly at the cluster. Once the cluster is imported into Rancher, you can create a cluster-specific kube config file from the Rancher UI, but the server endpoint is your Rancher cluster, not the actual cluster where your workloads will run.

For example, the kube config file for one of my clusters has the following:

clusters:
- name: "dev1"
  cluster:
    server: "https://rancher.example.com/k8s/clusters/c-st3pu"

And a second cluster has a kube config like:

clusters:
- name: "cicd"
  cluster:
    server: "https://rancher.example.com/k8s/clusters/c-f1gk8"

The rancher.example.com hostname resolves to the VIP for my main Rancher cluster on the HAProxy/Keepalived hosts, not the actual cluster nodes.


Question about this.

Does Rancher internally run a load balancer within its API proxy that sends API connections to a surviving API server (in the case where 1 of the 3 nodes is down, and that node happens to be the first one), or is it also afflicted by the problem of one API server being the “designated” master?

I believe that Rancher does some sort of health checking of the nodes in the cluster. If any of the nodes are down, then Rancher will know to talk to the surviving node(s).