Deploying HA Kubernetes clusters

I’m not seeing any mention of how to deploy HA Kubernetes clusters in the 2.x docs (only how to deploy HA Rancher). I currently have a single node with the ‘Control Plane’ and ‘etcd’ roles; is it as simple as re-running the docker run command on a few more nodes with the ‘etcd’ and ‘Control Plane’ params added? e.g.

sudo docker run -d --privileged --restart=unless-stopped --net=host -v /etc/kubernetes:/etc/kubernetes -v /var/run:/var/run rancher/rancher-agent:v2.0.8 --server --token blahblah --etcd --controlplane

Any gotchas or caveats I should be aware of? This seems deceptively simple…

Also, how can I determine when a new control plane or etcd node is fully up, with its data synced, and functional? (i.e. the point at which I could take the other control plane nodes down and the cluster would still function)
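One rough way to approach the "is it safe yet" question is to count how many etcd-role nodes report Ready and compare that against etcd quorum (a majority of members). The sketch below is only that: it assumes the ROLES column of `kubectl get nodes` lists roles as a comma-separated string such as `controlplane,etcd,worker`, and the function name `etcd_ready_summary` is made up for illustration. Note that Ready reflects kubelet health, not whether etcd data has finished syncing.

```shell
#!/bin/sh
# Sketch: summarise etcd-role nodes from `kubectl get nodes --no-headers`
# output on stdin (columns: NAME STATUS ROLES AGE VERSION).
# etcd_ready_summary is a hypothetical helper name, not a Rancher tool.
etcd_ready_summary() {
  awk '
    # Only consider nodes whose ROLES column contains the etcd role.
    $3 ~ /(^|,)etcd(,|$)/ {
      total++
      # "Ready" and "Ready,SchedulingDisabled" both count; "NotReady" does not.
      if ($2 ~ /^Ready/) ready++
    }
    END {
      quorum = int(total / 2) + 1   # majority of etcd members
      printf "etcd nodes: %d, ready: %d, quorum: %d\n", total, ready, quorum
      # Exit 0 only when at least one etcd node exists and quorum is met.
      exit !(total > 0 && ready >= quorum)
    }'
}

# Usage (needs a working kubeconfig):
#   kubectl get nodes --no-headers | etcd_ready_summary
```

With three etcd members the quorum is two, so one member can be lost. To look at etcd itself rather than the node status, `etcdctl member list` and `etcdctl endpoint health` can be run inside the etcd container on a node; the container name and any certificate flags it needs depend on the setup, so treat those as something to verify locally.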

“High Availability” in Rancher 2.0 means installing Rancher into a Kubernetes cluster that is already “HA”, meaning 3 or more nodes. The way I have done it is to use RKE to create a 3-node Kubernetes cluster and then use Helm to install Rancher into that cluster.

Once you have the 3-node cluster, follow steps 3 (Helm) and 4 (Install Rancher) and you have HA Rancher in Kubernetes.

Hi shubbard343, thanks for your response.
I am familiar with that documentation, but it’s not quite what I was looking for.

I had basically deployed a cluster via Rancher that looked like this:

nodeA - Control Plane, etcd, worker
nodeB - worker
nodeC - worker

And want to turn it into an HA cluster:

nodeA - Control Plane, etcd, worker
nodeB - Control Plane, etcd, worker
nodeC - Control Plane, etcd, worker

To get to this state, I tried running the docker run command with the added args on nodeB and nodeC so they would take on the etcd and Control Plane roles; this didn’t work. etcd never took, though it looked like the control plane components were set up successfully.

What did work for me was to delete nodeB and nodeC from the cluster, run a cleanup script on the nodes with:

And then add those nodes again with the docker run command, with all the desired roles from the get-go. Not sure if the cleanup script was needed, since I didn’t try this approach without it.

Based on my experience, it seems like it’s problematic to add roles after the initial ‘install’, for etcd at least. I wonder if that’s expected, or should it be possible to add and remove etcd after the initial install?

I used RKE to create the cluster, and RKE has the option to specify the roles on each node. So if you use RKE, it’s just a matter of adding the controlplane and etcd roles to the nodes and running rke up again to apply the new roles.
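As a rough sketch of what that looks like for the nodeA/nodeB/nodeC layout above: the `nodes` list in the RKE cluster config gives each node a `role` list, and `rke up` reconciles the cluster against it. The IP addresses, the SSH user, and the file name `cluster-ha-example.yml` below are placeholders, not values from this thread. In practice you would edit the existing cluster.yml that created the cluster rather than write a fresh one, and this only applies to clusters that RKE itself manages.

```shell
#!/bin/sh
# Sketch: write an example RKE config in which all three nodes carry the
# controlplane, etcd and worker roles. Addresses and user are placeholders.
set -eu

cat > cluster-ha-example.yml <<'EOF'
nodes:
  - address: 10.0.0.1        # nodeA (placeholder IP)
    user: ubuntu             # placeholder SSH user
    role: [controlplane, etcd, worker]
  - address: 10.0.0.2        # nodeB (placeholder IP)
    user: ubuntu
    role: [controlplane, etcd, worker]
  - address: 10.0.0.3        # nodeC (placeholder IP)
    user: ubuntu
    role: [controlplane, etcd, worker]
EOF

# Applying is opt-in so the script is safe to run as a dry example:
#   APPLY=1 sh this-script.sh
if [ "${APPLY:-0}" = "1" ]; then
  rke up --config cluster-ha-example.yml
else
  echo "Wrote cluster-ha-example.yml; set APPLY=1 to run rke up against it."
fi
```

Keeping the etcd member count odd (3 here) is the usual choice, since a majority of members has to stay up for the cluster to keep working.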