Rancher HA: general question about a single point of failure

I just followed the main Rancher HA installation guide, and I see that the cluster is only useful with my load balancer in front of it. I am using Nginx for this, but it is hosted on its own machine, so if that machine goes down, everything goes down. Is there a way to make this more resilient, such as running multiple Nginx instances or handling it through the DNS records? Any advice would help!

Thanks in advance for any help or advice!

The way we do it is to run 3 servers with HAProxy and Keepalived. As long as one of them is running, the VIP (virtual IP) stays up and HAProxy forwards traffic to our real servers.

Could you explain a little more?

Is this something that could be done with “on premise” servers?
How do the DNS records “know” where to go?

Sorry if these are dumb questions, but I’m still getting into this stuff.

Yes, this can be done on premise; that is how we run it.

For example:
Say you have 3 servers for Rancher: 10.0.1.11, 10.0.1.12, and 10.0.1.13.
You also have 3 servers for your load balancers: 10.0.1.21, 10.0.1.22, and 10.0.1.23.
You run Keepalived on the load balancers and configure it to serve a VIP of 10.0.1.31. This is the address you point your DNS entry for the Rancher UI to, so the DNS record itself never has to change. The 3 load balancers talk to each other, and one of them holds the VIP at any given time. If that LB node dies or reboots, the other two decide which of them takes over, and that node then assumes the IP.
Each of the LB nodes also runs HAProxy or Nginx, configured to listen on IP 10.0.1.31 and port 443. The “backend” is the 3 IP addresses of your Rancher servers. The Nginx frontend should be configured as a TCP frontend, not HTTP, so that the SSL certs are provided by Rancher and not by Nginx.
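
To make the example above a bit more concrete, here is a rough Python sketch that prints a Keepalived config per LB node and a matching Nginx TCP config. Only the IP addresses come from the example; the interface name (eth0), the virtual_router_id, the priorities, the /24 prefix, and the names rancher_vip and rancher_servers are assumptions for illustration, so adjust them to your network. It also assumes your Nginx build includes the stream module.

```python
# Sketch only: interface, router ID, priorities, prefix length and the
# instance/upstream names are assumptions, not values from this thread.
VIP = "10.0.1.31"
LB_NODES = ["10.0.1.21", "10.0.1.22", "10.0.1.23"]
RANCHER_NODES = ["10.0.1.11", "10.0.1.12", "10.0.1.13"]


def keepalived_conf(node_index: int, interface: str = "eth0",
                    router_id: int = 51, prefix: int = 24) -> str:
    """Render a keepalived.conf for one LB node.

    Every node starts as BACKUP; among the nodes that are alive, the one
    with the highest priority wins the VRRP election and holds the VIP.
    """
    # 150, 140, 130: distinct values keep the takeover order predictable.
    priority = 150 - 10 * node_index
    return f"""vrrp_instance rancher_vip {{
    state BACKUP
    interface {interface}
    virtual_router_id {router_id}
    priority {priority}
    advert_int 1
    virtual_ipaddress {{
        {VIP}/{prefix}
    }}
}}
"""


def nginx_stream_conf(port: int = 443) -> str:
    """Render an Nginx stream (plain TCP) block.

    The stream block sits at the top level of nginx.conf, not inside http,
    so TLS passes through to Rancher untouched.
    """
    servers = "\n".join(f"        server {ip}:{port};" for ip in RANCHER_NODES)
    return f"""stream {{
    upstream rancher_servers {{
{servers}
    }}
    server {{
        listen {VIP}:{port};
        proxy_pass rancher_servers;
    }}
}}
"""


if __name__ == "__main__":
    for i, node in enumerate(LB_NODES):
        print(f"# keepalived.conf for {node}")
        print(keepalived_conf(i))
    print("# nginx.conf fragment (same on every LB node)")
    print(nginx_stream_conf())
```

The Keepalived config differs between nodes only in its priority, while the Nginx part is identical everywhere, which is why every node listens on the VIP whether or not it currently holds it.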

You will also need to enable non-local bind on the load balancers (on Linux, the net.ipv4.ip_nonlocal_bind sysctl), so that Nginx won’t complain when it tries to bind to an IP that the host doesn’t currently own, which is the case on the two servers that do NOT hold the VIP.
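
If you want to verify that setting on each LB node, a small check along these lines might help. It is a sketch that assumes a Linux host, where the net.ipv4.ip_nonlocal_bind sysctl is exposed under /proc/sys; the function name is made up for this example.

```python
from pathlib import Path

# Standard procfs location of the net.ipv4.ip_nonlocal_bind sysctl on Linux.
NONLOCAL_BIND = Path("/proc/sys/net/ipv4/ip_nonlocal_bind")


def nonlocal_bind_enabled(path: Path = NONLOCAL_BIND) -> bool:
    """Return True if the kernel allows binding to an IP this host does not own."""
    try:
        return path.read_text().strip() == "1"
    except OSError:
        # Missing file (e.g. a non-Linux host) or unreadable: treat as disabled.
        return False


if __name__ == "__main__":
    if nonlocal_bind_enabled():
        print("non-local bind is enabled")
    else:
        print("non-local bind is off; try: sysctl -w net.ipv4.ip_nonlocal_bind=1")
```

Note that sysctl -w only lasts until the next reboot; to make it permanent, the same key would normally go into /etc/sysctl.conf or a file under /etc/sysctl.d/.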
