I've been trying out Rancher 2. We've been using Rancher 1.x for more than a year now, and I wanted to see how well it would suit us and what it's capable of.
I've deployed the latest Rancher on 3 Rancher servers behind 1 front nginx load balancer, as suggested in the documentation and a few articles I read. All of them are bare metal with a fresh install of Ubuntu 16.
The cluster comes up fine. When I test HA failover by stopping Docker on a node, it works as expected for two of the nodes. But when the first node in the cluster's node list goes down or is powered off, it takes the whole Rancher cluster down with it, and the nginx load balancer returns 503 errors.
I had a look online but did not find anything quite similar. I assume I've set something up slightly wrong.
I followed the docs at https://rancher.com/docs/rancher/v2.x/en/installation/ha/rke-add-on/layer-7-lb/ and used this template for the configuration: https://raw.githubusercontent.com/rancher/rancher/master/rke-templates/3-node-externalssl-certificate.yml
Looking at the pod list, I noticed that the default HTTP backend has only one pod running:
kubectl --kubeconfig kube_config_rancher-cluster.yml get pods --all-namespaces
NAMESPACE       NAME                                      READY   STATUS      RESTARTS   AGE
cattle-system   cattle-559cc76db-v27jl                    1/1     Running     1          1h
cattle-system   cattle-cluster-agent-57f687dbb5-bw8hj     1/1     Running     0          37m
cattle-system   cattle-node-agent-2bcxv                   1/1     Running     0          37m
cattle-system   cattle-node-agent-bmv7p                   1/1     Running     0          37m
cattle-system   cattle-node-agent-n4slk                   1/1     Running     3          37m
ingress-nginx   default-http-backend-797c5bc547-2f72m     1/1     Running     1          1h
ingress-nginx   nginx-ingress-controller-78gq6            1/1     Running     0          1h
ingress-nginx   nginx-ingress-controller-qtxwn            1/1     Running     0          1h
ingress-nginx   nginx-ingress-controller-vkxpl            1/1     Running     1          1h
kube-system     canal-4vvsp                               3/3     Running     0          1h
kube-system     canal-97rkh                               3/3     Running     5          1h
kube-system     canal-rwpfv                               3/3     Running     0          1h
kube-system     kube-dns-7588d5b5f5-6zqgh                 3/3     Running     4          1h
kube-system     kube-dns-autoscaler-5db9bbb766-z9x6r      1/1     Running     2          1h
kube-system     metrics-server-97bc649d5-fff2s            1/1     Running     1          1h
kube-system     rke-ingress-controller-deploy-job-lp45d   0/1     Completed   0          1h
kube-system     rke-kubedns-addon-deploy-job-6dhth        0/1     Completed   0          1h
kube-system     rke-metrics-addon-deploy-job-2g959        0/1     Completed   0          1h
kube-system     rke-network-plugin-deploy-job-f2nnm       0/1     Completed   0          1h
kube-system     rke-user-addon-deploy-job-chllf           0/1     Completed   0          30m
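To make it easier to spot which workloads run as a single pod, here is a rough sketch of a filter for that listing. The function name single_running_pod_workloads is made up for illustration and is not part of kubectl or Rancher. The way it strips the pod and ReplicaSet suffixes from pod names is a heuristic based on the names in the output above, so treat its results as a hint rather than a definitive answer.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl or Rancher): reads the output of
# `kubectl get pods --all-namespaces` on stdin and prints every
# namespace/workload that has exactly one Running pod.
single_running_pod_workloads() {
  awk '
    NR > 1 && $4 == "Running" {
      n = split($2, part, "-")
      n--    # drop the per-pod suffix, e.g. "v27jl"
      # Heuristic: Deployment pods also carry a ReplicaSet hash such as
      # "559cc76db" (8+ characters, contains a digit); drop that as well.
      if (n > 1 && length(part[n]) >= 8 && part[n] ~ /[0-9]/) n--
      name = part[1]
      for (i = 2; i <= n; i++) name = name "-" part[i]
      count[$1 "/" name]++
    }
    END {
      for (k in count) if (count[k] == 1) print k
    }
  ' | LC_ALL=C sort
}
```

It would be used as: kubectl --kubeconfig kube_config_rancher-cluster.yml get pods --all-namespaces | single_running_pod_workloads. Adding -o wide to the kubectl command also shows a NODE column, which tells which server each of those single pods is scheduled on.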
I'm pretty sure an HA setup should not have a single point of failure in one of the cluster servers.
What I want to end up with is an understanding of how to configure Rancher in a real HA setup, and then add another cluster for our apps.
Any advice would be really helpful. As I mentioned, we come from Rancher 1.x and have only just started looking into Rancher 2 and Kubernetes…