Rancher HA - HAProxy Configuration

I am trying to start up an HA cluster for Rancher, but my configuration is producing significant errors.

At the moment, the HA installation documentation does not spell out exactly what is needed from the load balancer. I am deploying on DigitalOcean droplets and have a set of load balancers attached, pointing to my three Rancher servers. Should the Host Registration URL be the FQDN of this load balancer? Do I need to load SSL certs on the load balancer, or are the self-signed certs used for this purpose? I have seen several mentions of needing to enable the proxy protocol for AWS ELB; is there any guidance for HAProxy?

The most common errors I am encountering are:

  • "Container agent is not running in state &types.ContainerState{Status:\"exited\", Running:false, Paused:false, Restarting:false, OOMKilled:false, Dead:false, Pid:0, ExitCode:0, Error:\"\", StartedAt:\"2016-05-03T03:30:47.529858519Z\", FinishedAt:\"2016-05-03T03:31:02.052640391Z\"}" component=docker

  • Failed to read project: Unsupported config option for rancher-compose-executor service: 'health_check'\nUnsupported config option for go-machine-service service: 'health_check'\nUnsupported config option for websocket-proxy service: 'health_check'\nUnsupported config option for websocket-proxy-ssl service: 'health_check'\nUnsupported config option for cattle service: 'health_check'

  • Could not parse config for project management : Unsupported config option for cattle service: 'health_check'\nUnsupported config option for go-machine-service service: 'health_check'\nUnsupported config option for websocket-proxy-ssl service: 'health_check'\nUnsupported config option for rancher-compose-executor service: 'health_check'\nUnsupported config option for websocket-proxy service: 'health_check'

If I can get past these questions/errors, I think it should work. Port 18080 comes up even with the errors, but the project/cluster never comes online on port 80.

I am also wondering about the LB tier for HA. In our case everything is private (ELB, for example, is not an option), so we are currently opting for HAProxy and terminating TLS with HAProxy as well. Will this cause issues with web sockets? Is any specific configuration required?

I am getting identical errors.

I am running on AWS:

  • ELB using a signed cert for SSL termination on the ELB.
  • ELB config:
    ** 80 -> 80
    ** 443 -> 443
    ** 18080 -> 18080
  • The rancher-ha.sh config script was generated using self-signed certificates.
  • The ELB has a proxy protocol policy enabled for ports 80 and 443.
  • When looking at the stack in the management console the rancher compose and docker compose are both empty and no containers appear to be created… it stays like this forever.
  • All nodes are in an autoscaling group and have permission to talk to each other over all ports both TCP and UDP.
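
For anyone on HAProxy rather than ELB, a rough equivalent of the listener setup above might look like the sketch below. It is untested and rests on assumptions: the backend addresses (10.0.0.11 to 10.0.0.13), the section names, and the timeout values are placeholders, and it assumes Rancher expects the PROXY protocol on ports 80 and 443 in the same way as with the ELB policy listed above. It passes TCP straight through instead of terminating TLS on the proxy.

```
# Sketch only: addresses, names, and timeouts are placeholders (assumptions).
defaults
    mode tcp
    timeout connect 5s
    # Long idle timeouts so long-lived connections (e.g. web sockets) are not cut off.
    timeout client  1h
    timeout server  1h

# Port 80: TCP passthrough, PROXY protocol header sent to the backends.
listen rancher_80
    bind *:80
    server rancher1 10.0.0.11:80 check send-proxy
    server rancher2 10.0.0.12:80 check send-proxy
    server rancher3 10.0.0.13:80 check send-proxy

# Port 443: same as port 80; TLS is not terminated here.
listen rancher_443
    bind *:443
    server rancher1 10.0.0.11:443 check send-proxy
    server rancher2 10.0.0.12:443 check send-proxy
    server rancher3 10.0.0.13:443 check send-proxy

# Port 18080: plain TCP, no PROXY protocol (the ELB policy above covers only 80 and 443).
listen rancher_18080
    bind *:18080
    server rancher1 10.0.0.11:18080 check
    server rancher2 10.0.0.12:18080 check
    server rancher3 10.0.0.13:18080 check
```

Running `haproxy -c -f <file>` checks the syntax of a config like this before reloading.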

I’m trying different things; I’ll update if I find a solution.

Looks like the health_check issues are related to this bug: https://github.com/rancher/rancher-compose/pull/159

I am still unable to deploy a three-node cluster using 1.1.0-dev or 1.0.1, though I was able to get a single 1.0.1 node working with no load balancer.

1.1.0-dev is still using v0.8.0 of rancher-compose. There isn’t a version of 0.8.1 available yet on the releases page.

–edit–

I updated the Dockerfile for rancher-server to use rancher-compose 0.8.1 from the GitHub releases page. Still getting the same health_check errors.

Do you have your HAProxy configuration available, even if it doesn’t work?

There is an issue we are tracking that indicates the dev1 build is not working for HA deployments.

https://github.com/rancher/rancher/issues/4733