Need for Rancher HA / Impact if down?

Well, I’m finding some very strange behavior in some very simple tests.

I have:

1 x VM running Rancher 2.0.1
3 x VM nodes, each running all roles (worker, etcd, controlplane)

I have a few workloads deployed, mostly NGINX web servers. All of those workloads are running on node1.

Everything is working well: the Web UI is responsive, kubectl answers quickly, and the workloads are up. Both kubectl and the Web UI go through the API on the VM hosting the Rancher server.

I then disconnect the network of one of the two nodes that currently has no workloads, say node2. This takes down one etcd member and one controlplane.
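For context, my expectation here is based on etcd's quorum rule: with three members, losing one still leaves a majority of two, so the cluster should stay readable and writable. Here is a quick sketch of that arithmetic (plain Python, just to illustrate the failure-tolerance math, nothing Rancher-specific):

```python
# etcd quorum math: a cluster of n members needs a majority
# (floor(n/2) + 1) to stay writable, so it tolerates n - quorum failures.
def quorum(members: int) -> int:
    return members // 2 + 1

def tolerated_failures(members: int) -> int:
    return members - quorum(members)

for n in (1, 3, 5):
    print(f"{n} etcd members: quorum={quorum(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")

# 3 etcd members: quorum=2, tolerates 1 failure(s)
# -> losing node2 alone should not cost the cluster its quorum
```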

I would expect the API to remain available and everything to keep working nicely… But no, that's not what happens. Instead, kubectl and the Web UI both hang.

After a while, kubectl starts getting answers again, and a bit later the Web UI comes back.

In one of my tests, just after the Web UI came back, the pods showed as "updating" and the ingress responded with a 503 Service Unavailable.

Is that the expected behavior?