There are plenty of ways the monitoring by Keepalived can be achieved; that’s really something that will be specific to each user’s case IMHO. We rely on DNS for the Rancher LB rather than a health check, so our traffic flows Keepalived > Custom HAProxy > Rancher HAProxy/Service.
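For anyone wanting to picture the "custom HAProxy finds the Rancher LB via DNS" part, here is a minimal sketch. It assumes HAProxy 1.6+ (for runtime DNS resolution), that Rancher's internal DNS answers on 169.254.169.250, and a placeholder service name — adjust all of these to your own setup.

```
# haproxy.cfg (sketch) - forward to the Rancher LB by DNS name and let
# HAProxy keep re-resolving it, instead of hard-coding container IPs.
resolvers rancher
    nameserver dns1 169.254.169.250:53   # Rancher internal DNS (assumed address)
    resolve_retries 3
    timeout retry 1s
    hold valid 10s

frontend fe_http
    bind *:80
    default_backend be_rancher_lb

backend be_rancher_lb
    # "rancher-lb.mystack.rancher.internal" is a placeholder service name
    server lb1 rancher-lb.mystack.rancher.internal:80 resolvers rancher check
```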
We did consider integrating the two as well, but didn’t, as ‘handling’ the inter-dependency within the container is actually harder. Do you kill the container if only one dies, or both? You have to ensure both services are started, check that the first is still running once the second succeeds, and so on. It’s easier to handle each discretely IMHO and, of course, restart them separately.
FYI: we have Keepalived as a sidekick of HAProxy to do some of this for us.
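For reference, the sidekick pattern looks roughly like this in a Rancher 1.x (Cattle) compose file. This is only a sketch: the image names, host networking and mounted config path are placeholders/assumptions, not the poster's actual stack.

```yaml
# docker-compose.yml (sketch, Rancher 1.x / Cattle)
haproxy:
  image: my/haproxy:latest               # placeholder image
  labels:
    io.rancher.sidekicks: keepalived     # schedule keepalived alongside haproxy
    io.rancher.scheduler.global: 'true'
  net: host                              # expose the VIP/ports directly on the host
keepalived:
  image: my/keepalived:latest            # placeholder image
  net: host
  cap_add:
    - NET_ADMIN                          # needed to add the virtual IP to the host interface
  volumes:
    - /path/to/keepalived.conf:/etc/keepalived/keepalived.conf
```

Each service still gets its own restart handling, which is part of the "handle them discretely" argument above.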
Cool, thanks for your insight! Can you share why you don’t use 0.0.0.0 or 127.0.0.1 as the IP for the check?
Adding an HAProxy “layer”, especially in a sidekick combination with Keepalived, would probably achieve the same, but I’m trying to remove possible points of failure and simplify the traffic and infrastructure as much as possible. I guess a custom HAProxy wouldn’t add too much overhead, but if I can get away without it, that would be a bit better IMO…
I’ll try a couple of options, as well as trying an IP pool across different hosts. It’s unlikely that HAProxy is really exhausted, as we run the balancers on pretty big hosts with good network, etc., but thinking in terms of scalability, adding HAProxies in a master-master fashion could be desirable.
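One common way to get that master-master behaviour with Keepalived (sketch only; interface names, router IDs and addresses are placeholders) is two VRRP instances with mirrored priorities, so each host owns one VIP of the pool under normal operation and takes over the other one on failure. DNS then round-robins across the VIPs.

```
# keepalived.conf on host A (sketch) - host B uses the same two instances
# with the MASTER/BACKUP states and priorities swapped.
vrrp_instance VIP_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    virtual_ipaddress {
        192.0.2.10/24
    }
}

vrrp_instance VIP_2 {
    state BACKUP
    interface eth0
    virtual_router_id 52
    priority 100
    advert_int 1
    virtual_ipaddress {
        192.0.2.11/24
    }
}
```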
We run multiple environments and, rather than have a central stack for Keepalived and HAProxy, we run both per environment, each with its own dedicated IP. We need the dedicated IP so we can serve a different SSL/TLS certificate over port 443 for each environment. If we just bound to 0.0.0.0 we’d need to use non-standard ports for each cert we wanted to serve, which isn’t an option due to our client profile.
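To illustrate the point about dedicated IPs (the VIPs and cert paths below are placeholders, not the poster's real config), each environment's HAProxy binds its own VIP on 443 with its own certificate, which keeps every environment on the standard port:

```
# haproxy.cfg fragment for one environment (sketch)
frontend env_a_https
    bind 192.0.2.10:443 ssl crt /etc/haproxy/certs/env-a.pem
    default_backend env_a_rancher_lb

# the other environment's stack binds its own VIP with its own cert, e.g.
#    bind 192.0.2.11:443 ssl crt /etc/haproxy/certs/env-b.pem
```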
I’ve added some improvements and variable value checks and created a pull request on GitHub. I know I have merge rights, but I thought it best you give it the once-over. Cheers
Glad it’s already helpful to someone. Who knows, someday we may find a way of making this a plugin to the Rancher agent or something, so it’s easier to configure from the UI or CLI.
I’m still in the middle of the war between Rancher and our requirements, but I do like Rancher a lot, and once we can get enough stability in the environment etc. I’m pretty confident we will be using it…
With great interest I stumbled over this thread in my search for a resilient load balancing solution. I also found @etlweather’s post. Thanks a lot for all the information.
The installation through the Dockerfile worked fine, but I’m missing at least one piece needed to understand the whole setup, and I hope someone can help me shed light on it.
I have two droplets on DigitalOcean with two HAProxy services (through Rancher) and two Keepalived services on each droplet. Now I took a floating IP on DigitalOcean and assigned it to one droplet. The minute I turn off this droplet, the other HAProxy will not take over, so I guess I’m missing a basic point here. I found a solution where one can re-assign the floating IP in DigitalOcean (e.g. from master to slave), but that is not part of this solution.
Can someone tell me how I can choose and set a floating IP (or Elastic IP) to make this work? My hope was to use one instance on AWS with an HAProxy and another one on DigitalOcean to be more resilient, but I guess that is not possible. Thanks for any hints in advance!
I never tried this on anything other than my private servers (VMs on premise), so I can’t totally say.
A couple of things I can say to help you:
Set a notification email and see if Keepalived is switching owner properly, i.e. the node remaining alive should send a notification saying it’s now the MASTER node (see the config sketch at the end of this reply).
If it is, then it’s a matter of how you move an IP address on DigitalOcean. What Keepalived does is send gratuitous ARP notifications: the master Keepalived tells the switch that its MAC address now holds the floating IP address, and the switch updates its ARP table.
If you don’t get a MASTER/BACKUP change notification, then that means the node that is alive isn’t able to detect that the other node is gone. That would be because they can’t exchange information. In normal operation, the two Keepalived instances constantly send VRRP advertisements to each other so they know which nodes are alive and who is the master.
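A minimal keepalived.conf sketch covering both points above (email notifications and explicit peer-to-peer adverts). The addresses, recipients and SMTP relay are placeholders; smtp_alert sends a mail on state transitions, and unicast_peer is handy where multicast VRRP is filtered by the network (often the case in clouds).

```
# keepalived.conf fragment (sketch)
global_defs {
    notification_email {
        ops@example.com               # placeholder recipient
    }
    notification_email_from keepalived@example.com
    smtp_server 127.0.0.1             # assumes a local MTA/relay
    smtp_connect_timeout 30
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    smtp_alert                        # email on MASTER/BACKUP transitions
    unicast_src_ip 10.0.0.2           # this node
    unicast_peer {
        10.0.0.3                      # the other node - adverts go here directly
    }
    virtual_ipaddress {
        10.0.0.100/24
    }
}
```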
Hi @ckreutz, I haven’t tried this on DigitalOcean… I also don’t know how their floating IP works, but from the gist of what you said, their floating IP may already do what you need?
It may be possible that you don’t need Keepalived at all: just HAProxy on each host, and if DigitalOcean can float the IP between them, that should be all…
If the floating IP isn’t something “managed”, i.e. you must “latch on” to the floating IP through your service, then Keepalived would be a viable solution. (I actually skimmed through the link you posted and apparently they do rely on you configuring Keepalived…)
I don’t understand why you have two HAProxy and two Keepalived services on each droplet? Unless I’m missing something…
It should also be possible to use AWS and DigitalOcean together, as long as you do some DNS automation… You would then have both the AWS and DigitalOcean IPs on your DNS record (which would round-robin them, unless your DNS provider offers other options), and a low TTL should allow a script run from your “alive” host to remove the “dead” host’s record…
This is however more involved than what we were proposing with this implementation… I will be coming back to this project in the next weeks/month (or two), but you surely (AFAIK) won’t be able to float an IP between AWS and DigitalOcean… you would have different IPs for each, and you can float the IPs within each datacenter: on DO using something along the lines of this Dockerfile plus some changes detailed in the link, and the AWS IP with ELB.
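To make the DO side a bit more concrete: since DigitalOcean’s floating IP is moved via their API rather than by gratuitous ARP, one approach (roughly what the linked changes describe, as I understand it) is to have Keepalived call a notify_master script that reassigns the floating IP. A rough sketch, with the token, IP and droplet ID as placeholders and the endpoint/payload taken from DO’s v2 API docs:

```sh
#!/bin/sh
# notify_master.sh (sketch) - run by keepalived when this node becomes MASTER;
# reassigns the DigitalOcean floating IP to this droplet via the DO v2 API.
DO_TOKEN="your-api-token"        # placeholder
FLOATING_IP="203.0.113.50"       # placeholder
DROPLET_ID="12345678"            # placeholder

curl -s -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DO_TOKEN}" \
  -d "{\"type\":\"assign\",\"droplet_id\":${DROPLET_ID}}" \
  "https://api.digitalocean.com/v2/floating_ips/${FLOATING_IP}/actions"
```

You would then point `notify_master` in the vrrp_instance block at this script on each droplet.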
I hope this clears it up a bit… I haven’t worked with either myself, as the goal of this project was to provide “ELB-like” features on colocation/private clouds…
@etlweather and @RVN_BR, thanks so much for your replies and sorry for my delay. I wanted to wait until I had tangible results, but I am struggling to implement it for my purposes (e.g. even getting mail notifications working for Keepalived is surprisingly difficult). I will post back here again when I have a decent solution, which might help someone else too.