I’m running rancher HA, I set this up using rancherOS as the underlaying OS.
I have three rancher nodes, and a pfsense FW which is load balancing. This all seems to work great, until the load balancer restarted, then the load on the ranchers hits the roof and never recovers without a reboot.
I think Rancher it’s self is trying to talk to it’s self via the load balancer and getting into a mess as the load balancer is reporting the nodes are down and so not routing to them.
The pfsense Load balancer is set to monitor the node by HTTPS, checking for path “/” and host “rancher.” and should have a return status code of “200 OK”
While the nodes are screwed up the load balancer obviously and correctly reports it’s down which means it’s not able to talk to its self.
Setting the load balancer to always see the node as up, while might work, defeats the value of load balancer