Rancher 2.x running on AWS EC2 (Multi-AZ k8s clusters).


#1

Hi all.

I have a question about Rancher 2.x running on AWS EC2 (Multi-AZ k8s clusters).

Consider the following sample configuration (number of AWS instances per group).
Two AZs are used, AZ-a and AZ-c:

etcd-group-a : AZ-a × 1
etcd-group-c : AZ-c × 2
node-group-a : AZ-a × 1
node-group-c : AZ-c × 1
master-group-a : AZ-a × 1
master-group-c : AZ-c × 1

With this configuration,
if AZ-c goes down (for example, the AZ-c network goes down),
do all nodes fail over from AZ-c to AZ-a? Or are the instances re-created after AZ-c recovers?
(By failover I mean creating new instances in AZ-a.)


#2

Nodes are not automatically replaced. What you’ve described is not a good configuration. For the cluster to operate, a strict majority of the etcd nodes must be available. As you’ve laid it out, if AZ-c goes down you have no quorum for etcd, because 2 of the 3 etcd nodes are down.
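
The quorum arithmetic can be checked with a small sketch. This is only an illustration of the strict-majority rule applied to the etcd counts from the first post; the function names `quorum_size` and `survives_az_loss` are made up for this example and are not part of Rancher or etcd.

```python
from typing import Dict


def quorum_size(members: int) -> int:
    """Smallest strict majority of a cluster with `members` etcd nodes."""
    return members // 2 + 1


def survives_az_loss(etcd_per_az: Dict[str, int]) -> Dict[str, bool]:
    """For each AZ, report whether etcd keeps quorum if that AZ is lost."""
    total = sum(etcd_per_az.values())
    needed = quorum_size(total)
    return {az: (total - count) >= needed for az, count in etcd_per_az.items()}


# etcd layout from the sample configuration: 1 node in AZ-a, 2 nodes in AZ-c.
layout = {"AZ-a": 1, "AZ-c": 2}
result = survives_az_loss(layout)
for az, ok in result.items():
    print(f"lose {az}: quorum {'kept' if ok else 'lost'}")
```

With 3 etcd nodes the quorum is 2, so losing AZ-a leaves 2 nodes (quorum kept) while losing AZ-c leaves only 1 (quorum lost).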


#3

Hi, vincent.

Thank you for your reply.

"Nodes are not automatically replaced."

I understood that.

Are the downed instances re-created after the AZ-c network recovers?

By the way, is my understanding of this correct?

I mistakenly wrote a bad example configuration…
Thank you for pointing that out.

"no quorum for etcd because 2 of 3 are down."


#4

For your worker nodes, I would expect that you would run these under an ASG/ALB, and if the health check fails, a node would be terminated and replaced by AWS. The replacement node will register itself to the cluster as part of your user-data launch configuration. Is that what you have?
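
As a rough illustration of that replace-on-failed-health-check idea, here is a toy model in plain Python. It does not call AWS or Rancher; `launch_node` and `reconcile` are hypothetical names, and the assumption that replacements land in a still-healthy AZ is a simplification for the sketch, not a statement of exactly how an ASG places instances.

```python
import itertools

_ids = itertools.count(1)


def launch_node(az):
    # Stand-in for AWS launching an instance; in a real setup the instance's
    # user-data would register the node with the cluster.
    return {"id": f"worker-{next(_ids)}", "az": az, "healthy": True}


def reconcile(nodes, desired, healthy_azs):
    """Drop unhealthy nodes, then launch replacements up to `desired`."""
    kept = [n for n in nodes if n["healthy"]]
    if not healthy_azs:
        return kept  # nowhere to launch replacements
    az_cycle = itertools.cycle(sorted(healthy_azs))
    while len(kept) < desired:
        kept.append(launch_node(next(az_cycle)))
    return kept


# One worker per AZ, as in the sample configuration.
nodes = [launch_node("AZ-a"), launch_node("AZ-c")]

# Simulate the AZ-c network outage: its worker fails the health check.
for n in nodes:
    if n["az"] == "AZ-c":
        n["healthy"] = False

nodes = reconcile(nodes, desired=2, healthy_azs={"AZ-a"})
print([(n["id"], n["az"]) for n in nodes])
```

In this model the failed AZ-c worker is dropped and a new worker is launched in AZ-a, so the worker count returns to the desired value without waiting for AZ-c to recover.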

Etcd might need to be handled a bit differently.