Rancher2.X running on AWS EC2(Multi-AZ k8s clusters).

bnaob · July 6, 2018, 12:48am

Hi all.

I have a question about Rancher 2.x running on AWS EC2 (multi-AZ k8s clusters).

Consider the following sample configuration (number of AWS instances per group),
using two AZs, AZ-a and AZ-c:

etcd-group-a : AZ-a × 1
etcd-group-c : AZ-c × 2
node-group-a : AZ-a × 1
node-group-c : AZ-c × 1
master-group-a : AZ-a × 1
master-group-c : AZ-c × 1

With this configuration,
if AZ-c goes down (for example, if the AZ-c network fails),
do all nodes fail over from AZ-c to AZ-a? Or are the instances re-created after AZ-c recovers?
(By failover I mean creating new instances in AZ-a.)

vincent · July 6, 2018, 1:22am

Nodes are not automatically replaced. What you’ve described is not a good configuration. For the cluster to operate, a strict majority of the etcd nodes must be available. As you’ve laid it out, if “az-c” goes down you have no quorum for etcd because 2 of 3 are down.
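
The quorum arithmetic behind this can be checked with a minimal sketch. It assumes the usual majority rule for etcd (a cluster of n members needs n // 2 + 1 of them available). The function names and the three-AZ layout with "AZ-b" are hypothetical, added only for comparison; they are not part of the original question or of any Rancher API.

```python
def quorum_size(members: int) -> int:
    """Smallest strict majority of an etcd cluster with `members` nodes."""
    return members // 2 + 1


def survives_az_loss(etcd_per_az: dict[str, int]) -> dict[str, bool]:
    """For each AZ, report whether etcd keeps quorum if that AZ alone is lost."""
    total = sum(etcd_per_az.values())
    needed = quorum_size(total)
    return {az: (total - count) >= needed for az, count in etcd_per_az.items()}


# Layout from the question: 1 etcd node in AZ-a, 2 in AZ-c.
two_az = {"AZ-a": 1, "AZ-c": 2}
print(survives_az_loss(two_az))    # {'AZ-a': True, 'AZ-c': False}

# Hypothetical alternative: one etcd node in each of three AZs.
three_az = {"AZ-a": 1, "AZ-b": 1, "AZ-c": 1}
print(survives_az_loss(three_az))  # {'AZ-a': True, 'AZ-b': True, 'AZ-c': True}
```

With only two AZs, one of them always holds at least half of the etcd members, so no split of etcd nodes across two AZs keeps quorum through the loss of either AZ.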

bnaob · July 6, 2018, 8:44am

Hi, vincent.

Thank you for your reply.

“Nodes are not automatically replaced.”

I understood that.

Are the downed instances re-created after the AZ-c network recovers?

By the way, is my understanding of this correct?

I mistakenly wrote a bad example configuration…
Thank you for pointing that out.

“no quorum for etcd because 2 of 3 are down.”

Fraser_Goffin · December 5, 2018, 9:09am

For your worker nodes, I would expect you to run these under an ASG/ALB, so that if the health check fails, the node is terminated and replaced by AWS. The replacement node will register itself to the cluster as part of your user-data launch configuration. Is that what you have?

Etcd might need to be handled a bit differently.

sjairam · November 19, 2023, 8:38pm

Hi… I have a separate topic… but how did you add additional nodes to the main server node?
Just curious if I'm missing a command.
