Rancher load balancing questions

A few questions about Rancher load balancing using HAProxy…

  1. Is there a way to specify a default fall-through? My thought was to create an ELB in AWS in front of the 3 nodes I have running Rancher’s LB service, and each would route traffic to the right container based on the host. However, the ELB health checks are currently returning 503s, I suspect because the ELB sends a generic HTTP/1.0 GET / request that doesn’t match any of the host rules.

  2. Is there a way to route requests without having both a somedomain.com and a *.somedomain.com rule for each site? With a lot of sites, needing two rules per site is a bit of a pain.

  3. What about health checks? If I route traffic equally to all 3 HAProxy instances, can I ensure they only send traffic to containers that are in a healthy state? I’ve explicitly named the backends so I can reference them in the custom haproxy.cfg (see the sketch after this list), but I don’t know what to add there to do it.
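
For context, this is roughly the direction I’ve been poking at in the custom haproxy.cfg box. The backend name is just what I called it in my port rules, /healthz is a hypothetical endpoint, and while the directives are standard HAProxy, I don’t know if this is the right approach in Rancher, so treat it as a sketch:

```
# Sketch only: 'web_backend' is the name I gave the backend in my port rules,
# and /healthz is a hypothetical health endpoint. option httpchk / http-check
# are standard HAProxy directives, but they only take effect if the generated
# server lines have 'check' on them, which I haven't verified Rancher adds.
backend web_backend
    option httpchk GET /healthz HTTP/1.0
    http-check expect status 200
```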

If I’ve got some 20 different sites, should I look at doing NGINX Plus load balancing, or do folks here think that the native HAProxy LBs will be sufficient?

A follow-up question about the typical use case: do you set up an LB service for each service stack (i.e., a site stack has a WordPress service and a load balancer service), or is it cleaner to have one load balancer stack with all the LB services underneath it?

I’m running somewhere between 12 and 20 ELBs, and it’s a huge cost I need to get under control without sacrificing reliability.

  1. Port rules apply in the order given; make the last one have no host/path so it acts as the default fall-through (see the sketch after this list). If you have multiple source ports you’d need one catch-all for each, I suppose.

  2. There wasn’t intended to be, but the code is actually wrong, so “*.host.com” turns into “ends with host.com”, which matches both host.com and any subdomain of it (and also someotherhost.com). That will have to be fixed, but we can probably add some other syntax, like “**.host.com”, to mean both are allowed.

  3. The health checks for the target services are used to determine if they should be in the rotation.
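
As a sketch of points 1 and 3 together, in rancher-compose terms (service and stack names here are made up, and the field names are from the 1.2-era docs, so double-check against your version):

```yaml
# rancher-compose.yml for the balancer service (sketch)
lb:
  scale: 3
  lb_config:
    port_rules:
      # Most-specific rules first; 'priority' sets the evaluation order.
      - source_port: 80
        hostname: somedomain.com
        target_port: 80
        service: site1/wordpress
        priority: 1
      # No hostname/path on the last rule, so it's the default fall-through;
      # the ELB's generic "GET /" health probe lands here instead of a 503.
      - source_port: 80
        target_port: 80
        service: default/web
        priority: 2
```

Nothing extra is needed in the custom haproxy.cfg for point 3; containers that fail their service’s health check are pulled out of the backend automatically.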

If you’re only doing HTTP/S, you probably want one balancer to limit the number of IPs needed. It doesn’t matter what stack it’s in, but it typically goes in its own stack since it’s not part of any one of the others.

(https://github.com/rancher/rancher/issues/7145)

Thanks for the feedback. Knowing that I can ditch all the somedomain.com entries and just use *.somedomain.com instead will make things simpler for me.

Is there an easy way to re-arrange the rules, or do I need to adjust them manually? I.e., if I have a default catch-all and then go to add a new site, can I just drag the catch-all down, or do I need to copy it over to the new blank rule and then put the new site in the spot I just freed up?

As for health checks, the last time I tried Rancher health checks on services, upgrades seemed really flaky: things would get into an unhappy state and I’d find myself needing to delete the service/stack and recreate it to get everything working again. Is there detailed documentation about how best to implement health checks in a way that’s upgrade-friendly? When I was setting things up initially, recreating things was fine, but now that they’re actually in service… not so much.

There are up/down arrow buttons in the UI to reorder; drag & drop didn’t make it in. With a bunch of services & rules you might want to look into selector rules, which let you define the routing rules on the individual services instead of editing the balancer every time (also not configurable in the UI yet; probably 1.4).
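
Roughly, the compose side of selector rules would look like this (labels and names are made up, and the exact fields may shift before it’s in the UI):

```yaml
# Balancer's rancher-compose.yml: one selector rule instead of a rule per site
lb:
  lb_config:
    port_rules:
      - source_port: 80
        selector: lb.group=web   # match any service carrying this label

# A target service's docker-compose.yml: opt in via the label...
web:
  image: wordpress
  labels:
    lb.group: web

# ...and its rancher-compose.yml: keep the routing detail with the service
web:
  lb_config:
    port_rules:
      - hostname: '*.somedomain.com'
        target_port: 80
```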

I dunno when that was or what to tell you without specifics, but health checks are the only way containers get replaced or pulled out of balancers when they go bad. They run on (up to) 3 hosts other than the one the target container is on, so you need the full mesh of cross-host communication working (every host in the environment should be able to reach every other one using their registered public IPs).
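
In compose terms the check is declared per target service; a minimal sketch (the endpoint and thresholds are made up, field names as in the 1.x docs):

```yaml
# rancher-compose.yml for the target service (sketch)
wordpress:
  health_check:
    port: 80
    request_line: GET /healthz HTTP/1.0   # hypothetical endpoint
    interval: 2000            # ms between checks
    response_timeout: 2000    # ms before a check counts as failed
    healthy_threshold: 2      # consecutive successes to become healthy
    unhealthy_threshold: 3    # consecutive failures to become unhealthy
    strategy: recreate        # replace containers that go unhealthy
```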

Might have been part of the weirdness I kept running into using t2.medium nodes, then. I’ll cautiously start trying it out on the new cluster I have that is m3.mediums.

Another LB question… right now I’ve been using fixed ports on services, like 8002:80, 8003:80, 8015:8088, etc., which means I can only scale each one up to the number of hosts I have. If I’m using the HAProxy load balancer, do I even need to specify the external port, or can I leave it auto-assigned and then be able to scale up to, say, 6 containers across 3 hosts for a high-volume web service?

If services are behind a balancer then there’s no need to publish host ports for each individual container. The balancer uses the overlay network, so a service does not have to expose port 80 as 808x to be a target. With no host ports published, those services can be scaled as much as you want.
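
As a sketch (image and service names assumed; rancher/lb-service-haproxy is the balancer image the 1.2+ catalog uses, so double-check for your version):

```yaml
# docker-compose.yml (sketch): no "ports:" on web, so nothing is published
# on the hosts and there are no port conflicts when scaling.
web:
  image: wordpress              # listens on 80 inside the container
lb:
  image: rancher/lb-service-haproxy
  ports:
    - 80:80                     # the only host port in the stack

# rancher-compose.yml: scale past the host count; the balancer reaches the
# containers over the overlay network on their internal port.
web:
  scale: 6
lb:
  scale: 3
  lb_config:
    port_rules:
      - source_port: 80
        target_port: 80         # container port, not a host port
        service: web
```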