HA for master nodes

Hi, I am very new to Rancher and Kubernetes. How do I create an HA setup with multiple master nodes using the Rancher GUI?
Please let me know the procedure.


The HA install for Rancher is documented at https://rancher.com/docs/rancher/v2.x/en/installation/ha/, and creating production-ready clusters within Rancher is documented at https://rancher.com/docs/rancher/v2.x/en/cluster-provisioning/production/

Just want to confirm: does a Layer 4 (TCP) load balancer work fine on VMs? I read somewhere that it isn’t supported, and that only a Layer 7 load balancer works for this.

Create your cluster using RKE, and in your cluster.yml declare three nodes that are control plane. It’s that easy.



Then declare all your worker nodes.
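For example, a minimal RKE cluster config sketch along those lines (the addresses and SSH user below are placeholders, not from this thread):

```yaml
# Hypothetical cluster.yml sketch for RKE - addresses and user are placeholders.
nodes:
  - address: 10.0.0.1
    user: rancher
    role: [controlplane, etcd]
  - address: 10.0.0.2
    user: rancher
    role: [controlplane, etcd]
  - address: 10.0.0.3
    user: rancher
    role: [controlplane, etcd]
  # then declare all your worker nodes
  - address: 10.0.0.4
    user: rancher
    role: [worker]
```

You would then run `rke up` against this file from your workstation to provision the cluster.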

An L4 load balancer definitely works and is recommended. We deploy to AWS and use an NLB.
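For a VM/on-prem setup without a cloud NLB, a plain TCP (L4) passthrough can be sketched with NGINX’s stream module — the IPs below are placeholders, and the Rancher docs carry the full recommended config:

```nginx
# Hypothetical L4 (TCP) passthrough in front of a 3-node Rancher HA cluster.
stream {
    upstream rancher_http {
        server 10.0.0.1:80;
        server 10.0.0.2:80;
        server 10.0.0.3:80;
    }
    upstream rancher_https {
        server 10.0.0.1:443;
        server 10.0.0.2:443;
        server 10.0.0.3:443;
    }
    server {
        listen 80;
        proxy_pass rancher_http;
    }
    server {
        listen 443;
        proxy_pass rancher_https;
    }
}
```

Because this is L4 passthrough, TLS terminates on the cluster’s ingress, not on the load balancer.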

hello yeti,
thanks for the immediate reply.
As of now I have created 2 master nodes and 1 worker node.
Now I have to test certain cases, like:

  1. If my first master node goes down, will the second master node be able to take the entire load? Moreover, I need to confirm: is it OK to test this with 2 master nodes, or does it require 3 master nodes?

Please read the documentation linked; https://rancher.com/docs/rancher/v2.x/en/cluster-provisioning/production/#count-of-etcd-nodes clearly states that 2 etcd nodes do not give you fault tolerance.

@kamlesh It is generally good practice to always use an odd number of masters, as the control-plane nodes perform leader elections.

Leader election is the mechanism that guarantees that only one instance of the kube-scheduler — or one instance of the kube-controller-manager — is actively making decisions, while all the other instances are inactive, but ready to take leadership if something happens to the active one.


I thought so but all the K8s docs say you only need two Masters… Do you have any reference to validate that the Masters perform leader election?

There you go! https://rancher.com/docs/rancher/v2.x/en/troubleshooting/kubernetes-resources/#kubernetes-controller-manager-leader

https://medium.com/michaelbi-22303/deep-dive-into-kubernetes-simple-leader-election-3712a8be3a99 & others. Just google it

Words are getting conflated here. There is nothing we call a “master” in Rancher, nodes have the “control plane” or “etcd” role.

etcd has leader election and a "master" inside of itself. You should always have an odd number of etcd nodes. There is no reason to ever have an even number except temporarily during a failure or on the way up (or down) to the next odd number; even is strictly worse than odd. And 2 is the absolute worst number to have, because you still have no fault tolerance (if either goes down you have no quorum) but have introduced twice as many hard drives, power supplies, NICs, DIMMs, CPUs, etc. that could fail.
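The quorum arithmetic behind that advice is simple to sketch: quorum = floor(n/2) + 1, so the number of tolerated failures is n minus quorum:

```shell
# Fault tolerance of an etcd cluster with n members:
# quorum = n/2 + 1 (integer division), tolerated failures = n - quorum.
for n in 1 2 3 4 5; do
  quorum=$(( n / 2 + 1 ))
  echo "members=$n quorum=$quorum tolerates=$(( n - quorum )) failure(s)"
done
```

Two members need both alive for quorum (zero fault tolerance), three tolerate one failure, and a fourth member still only tolerates one — which is why the even counts buy you nothing.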

Control plane nodes talk to etcd, provide the API, and tell worker nodes to do things. More than one provides redundancy in case one fails (and can sometimes horizontally scale load). You do not need an odd number of them. If you have more than one then you need a load balancer or DNS round-robin to distribute requests from users/nodes to the healthy control plane nodes.



I am a little bit confused regarding the number of control plane and etcd nodes required for HA. Currently I have updated my cluster with 3 master nodes (each having the etcd and control plane roles) and 1 worker node (which has the worker role only).
Is it right to move forward?
Or are some ground-level changes still required before starting the installation?

Again, there is nothing called a “master”. To survive the failure of any one node, you want:

  • 3 or 5 nodes with the etcd role
  • 2 or more with the control plane role
  • 2 or more with the worker role

A single node can have one or more of those roles (i.e. 3 nodes with all 3 roles satisfies the above). Combining etcd and control plane together is common.
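In config terms, the 3-nodes-with-all-3-roles layout looks roughly like this (addresses and user are placeholders):

```yaml
# Hypothetical sketch: three nodes, each carrying etcd, control plane, and worker roles.
nodes:
  - address: 10.0.0.1
    user: rancher
    role: [etcd, controlplane, worker]
  - address: 10.0.0.2
    user: rancher
    role: [etcd, controlplane, worker]
  - address: 10.0.0.3
    user: rancher
    role: [etcd, controlplane, worker]
```

This satisfies all three bullets above with only three machines, at the cost of everything sharing hardware.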


Can we put the roles (etcd, control plane, and worker) on the same node? Will they work fine?

Yes, that will work. You may want to think about the potential consequences though, i.e. you have less resilience, and the possibility that problems with one component will adversely impact the others. There is also clearly a difference in how you scale this setup if you were to find that any of the components have different resource usage profiles than the others (hint: they do).

Anyway, your requirements are your own so that’s what should inform your choices. Technically speaking multi-role nodes are definitely supported.

Hey, I am using this link:
for Rancher HA.
Here it is mentioned that it is required to install the tools RKE, kubectl, and helm. As per the doc, we are installing Kubernetes using RKE, so below are my queries regarding the same:

  1. On which nodes (I have 1 load balancer node, 3 ingress controller nodes, and 1 worker node) are these tools (RKE, kubectl, helm) to be installed?
  2. If Kubernetes is installed using RKE, is it necessary to install kubectl separately on each node?

Those are client-side tools, so whilst you may choose to install them on your worker or management nodes, more typically you will use whatever your CI/CD platform of choice is to create deployment pipelines that use helm, vanilla kubectl, and rke. Helm 2 is slightly different in the sense that you can install Tiller on your nodes. However, that’s not a requirement, and many people today regard Tiller as a security vulnerability (although it is possible to mitigate that in a number of ways). Personally speaking, we have already moved over to Helm 3, which has recently moved to release candidate status.
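A quick way to sanity-check that those client tools are present on the workstation (or CI runner) you will drive the install from — this touches nothing on the cluster nodes, and the tool names are just the three from the docs:

```shell
# Check that each client-side tool is on PATH before starting the install.
for tool in rke kubectl helm; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done
```

Whichever machine passes this check is the one you run `rke up` and the Helm install of Rancher from; the nodes themselves never need these binaries.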

Also, it is important to understand whether you are referring to HA for Rancher Server itself (the “local” cluster) or you already have Rancher running and are creating a workload cluster.

For the Rancher HA cluster, you will have all three roles on each node (and should have three nodes, or 5, 7, 9 if you wanna get crazy). But only Rancher Server runs on this cluster (plus the K8s components).

For a workload cluster managed by Rancher, a common config is 3 nodes with “etcd” and “control plane” and then additional nodes with only “worker”.

hey, thanks guys for your support. :slight_smile:
I have deployed HA Rancher successfully. Let me tell you about the cluster that I formed:

  1. 1 load balancer, which is a separate node.
  2. 3 ingress nodes (having the etcd and control plane roles) that I configured.
  3. 1 worker node.

Now, as per the docs, the setup is done successfully and you can view the status of the pods:

[high@loadbalancer creating_cluster]$ kubectl -n cattle-system get pods
NAME                       READY   STATUS    RESTARTS   AGE
rancher-85498c4d67-jncjx   1/1     Running   8          7d15h
rancher-85498c4d67-mtvb2   1/1     Running   8          7d15h
rancher-85498c4d67-trmtw   1/1     Running   9          7d15h

*Note: I have made the below changes:

  1. Disabled and stopped the firewalld service on all 5 nodes.
  2. Changed the web port for nginx from 80 to another random port.

Now I need to know how I can open the Rancher web portal.
I am trying the IP address of one of the ingress controller nodes,
but getting the error: connection refused.

please help.