Assistance needed on setting up HA when using AWS AutoScaling groups for hosts

I have been using Rancher with AWS to evaluate container management, and I am beginning to build my production environment. I have gone through the HA setup instructions and have Rancher successfully running on 3 hosts with an ELB in front. Everything seems to be working properly. I have now reached the point where I will be adding hosts, and I saw this note at the end of the HA instructions:

If you are using AWS, you will need to specify the IP of the hosts that you are adding into Rancher. If you are adding a custom host, you can specify the public IP in the UI and the command to launch Rancher agent will be edited to specify the IP. If you are adding a host through the UI, after the host has been added into Rancher, you will need to ssh into the host to re-run the custom command to re-launch Rancher agent so that the IP is correct.

In my testing environment I have three autoscaling groups and launch configurations, one for each type of host I need. Each launch configuration does the following:

  1. Launches an instance from an AMI that I pre-created with Docker installed. (This instance also has the certificate and settings for my Docker registry.)
  2. During launch, it installs the correct Rancher agent, configured with the host labels I want for each group of instances.
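For reference, the agent-launch step in my user data looks roughly like this. (The registration URL, token, and label values below are placeholders; the real ones come from Infrastructure > Hosts > Add Host in the Rancher UI.)

```shell
#!/bin/bash
# Launch the Rancher agent with the host labels for this autoscaling group.
# CATTLE_HOST_LABELS attaches key=value labels to the host at registration.
# The ELB hostname and REGISTRATION_TOKEN are placeholders -- copy the real
# registration command from the Rancher UI.
sudo docker run -d --privileged \
  -e CATTLE_HOST_LABELS='role=frontend&env=production' \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /var/lib/rancher:/var/lib/rancher \
  rancher/agent:v1.0.1 http://my-rancher-elb.example.com/v1/scripts/REGISTRATION_TOKEN
```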

This has been working very smoothly. I have my containers configured to run on particular hosts. If a host dies, it is automatically replaced with a new host that carries the proper label, and the containers redistribute to the new host properly.

My question is this:

Judging by what the HA instructions are saying, it appears that something will need to be done after each host is added, related to registering the host's IP address. Can you please explain this process and help me find a way to automate it in my autoscaling group/launch configuration?
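In case it helps to see what I'm imagining: a sketch where the instance looks up its own public IP from the EC2 metadata service during launch and passes it to the agent via CATTLE_AGENT_IP (the registration URL and token are placeholders):

```shell
#!/bin/bash
# Sketch: fetch this instance's public IP from the EC2 instance metadata
# service, then pass it to the Rancher agent so the host registers with
# the correct IP instead of requiring a manual ssh + re-run afterwards.
PUBLIC_IP=$(curl -s http://169.254.169.254/latest/meta-data/public-ipv4)

sudo docker run -d --privileged \
  -e CATTLE_AGENT_IP="${PUBLIC_IP}" \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /var/lib/rancher:/var/lib/rancher \
  rancher/agent:v1.0.1 http://my-rancher-elb.example.com/v1/scripts/REGISTRATION_TOKEN
```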

Also, this is the command that my new Rancher HA server generates when I click to add a custom host:
sudo docker run -d --privileged -e CA_FINGERPRINT="AF:6C:75:7B:BC:C1:36:23:F8:0F:78:D4:01:65:B3:18:7D:43:C6:DC" -v /var/run/docker.sock:/var/run/docker.sock -v /var/lib/rancher:/var/lib/rancher rancher/agent:v1.0.1

As of right now I cannot connect to the server over https://, so this will not work. I suspect my load balancer is not pointing to the correct port. I have it set to direct 80 > 18080, but I do not know which port to direct https to. These are the ports that the Rancher server has opened:

Any help is appreciated.

This is the result of netstat -ln. Nothing is listening on port 443 or 444.

Any help? The documentation for the HA setup does not really state which ports to use for the load balancer, but I found another post that said it should be 80 > 81 and 443 > 444. That did not work. I can access the HA server only because I have pointed 80 > 18080.

Something is clearly not right with the networking.

I found this post:

The post mentioned that there was a bug in Rancher v1.1.0-dev1: "Rancher HA does not work with Rancher v1.1.0-dev1 due to rancher-compose issue" (rancher/rancher#4733 on GitHub). I checked and realized that I was using that version.

I followed the directions, removed all the containers, and restarted, specifying to use v1.0.1.

remove all containers

$ docker rm -f $(docker ps -a -q)

launch rancher server

$ ./ rancher/server:v1.0.1

I can indeed connect over port 80 now with my LoadBalancer set to 80 > 80.
But when I try to connect over https, I get an error that the certificate is invalid. When I look at the certificate, this is what I see:

There are some known issues with adding hosts in a HA AWS/ELB setup. We are looking to fix this for v1.1.0-dev2.

As denise mentioned, there are existing issues pertaining to SSL and AWS ELB.

Your current config (80->80) will work for HTTP traffic, but not web sockets. We use web sockets for log tailing and shell access to containers.

If you’d like to get Rancher HA functioning in AWS without SSL, I’d recommend mapping 80->81 and enabling proxy protocol on port 81 of your ELB.
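In case it's useful, enabling proxy protocol on a Classic ELB backend port can be scripted with the AWS CLI along these lines (the load balancer name here is a placeholder):

```shell
# Create a ProxyProtocol policy on the ELB and attach it to backend port 81.
# "my-rancher-elb" is a placeholder for your load balancer's name.
aws elb create-load-balancer-policy \
  --load-balancer-name my-rancher-elb \
  --policy-name rancher-proxy-protocol \
  --policy-type-name ProxyProtocolPolicyType \
  --policy-attributes AttributeName=ProxyProtocol,AttributeValue=true

aws elb set-load-balancer-policies-for-backend-server \
  --load-balancer-name my-rancher-elb \
  --instance-port 81 \
  --policy-names rancher-proxy-protocol
```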


Thanks LLParse and Denise.

I really want to stick with the stable release, v1.0.1, if I can, since I am building a production environment. I am also on Docker 1.10.3; I encountered problems with 1.11 when trying to implement ConvoyFS. So I would like to know the best way to get this working in that scenario.

Is it possible with the instructions you gave me, or is HA not possible with this version?

@cloudlady911 We just released v1.0.2 (the latest stable release), which has some improvements around HA, AWS, and ELB, so you might want to check that out!

Thanks @denise! What is the risk level in deploying this in my production environment? I have everything very stable right now.

Has the documentation been updated yet?

HA + SSL + ELB is not possible in 1.0.1 because the proxy-protocol port (444) returns responses that say http/ws instead of https/wss. This is fixed in 1.0.2, along with a short list of other changes.