I have been using Rancher with AWS for testing container management, and I am beginning to build my production environment. I have gone through the HA setup instructions and have Rancher successfully running on 3 hosts with an ELB in front. All seems to be working properly. I have now come to the point where I am going to be adding hosts, and I saw this note at the end of the HA instructions:
NOTE:
If you are using AWS, you will need to specify the IP of the hosts that you are adding into Rancher. If you are adding a custom host, you can specify the public IP in the UI and the command to launch the Rancher agent will be edited to specify the IP. If you are adding a host through the UI, then after the host has been added into Rancher, you will need to SSH into the host and re-run the custom command to re-launch the Rancher agent so that the IP is correct.
In my testing environment I have 3 autoscaling groups and launch configurations for the three different types of host I want. These do the following:
Launches an instance from an AMI that I pre-created with docker installed (this instance also has the certificate and settings for my docker registry).
During launch, it installs the correct Rancher agent, which is configured to have the host labels I want for each group of instances.
This has been working very smoothly. I have my containers configured to run on particular hosts. If a host dies, it is automatically replaced with a new host that has the proper label, and the containers redistribute to the new host properly.
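For reference, the agent-launch step in each launch configuration's user data looks roughly like this (a sketch, not my exact script; the registration URL and label values are placeholders, and CATTLE_HOST_LABELS is the environment variable I use to attach the labels):

```shell
#!/bin/sh
# Sketch of the user-data step that launches the Rancher agent with
# host labels. The URL/token and labels below are placeholders.
REGISTRATION_URL="https://rch.example.com/v1/scripts/PLACEHOLDER"

# CATTLE_HOST_LABELS carries the labels this autoscaling group should
# have; multiple labels are joined with '&'.
sudo docker run -d --privileged \
  -e CATTLE_HOST_LABELS='group=web&env=prod' \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /var/lib/rancher:/var/lib/rancher \
  rancher/agent:v1.0.1 "$REGISTRATION_URL"
```

Each of the three launch configurations uses the same script with a different label set.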
My question is this:
Judging by what the HA instructions are saying, it appears that something will need to be done after each host is added, something related to registering the host's IP address. Can you please explain this process and help me find a way to automate it in my autoscaling group/launch configuration?
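One approach I was considering, as a sketch: pass the instance's public IP to the agent explicitly via the CATTLE_AGENT_IP environment variable (my understanding of the variable the agent reads for this; the URL/token here are placeholders), with the IP pulled from the EC2 metadata service in user data:

```shell
#!/bin/sh
# Sketch: build the agent launch command with an explicit registration IP.
# CATTLE_AGENT_IP tells the agent which IP to register with the server
# (my understanding; the URL and token below are placeholders).
build_agent_cmd() {
  ip="$1"
  url="$2"
  printf 'docker run -d --privileged -e CATTLE_AGENT_IP=%s -v /var/run/docker.sock:/var/run/docker.sock -v /var/lib/rancher:/var/lib/rancher rancher/agent:v1.0.1 %s\n' "$ip" "$url"
}

# In the launch configuration's user data this could be wired up as:
#   IP=$(curl -s http://169.254.169.254/latest/meta-data/public-ipv4)
#   sh -c "$(build_agent_cmd "$IP" "https://rch.example.com/v1/scripts/PLACEHOLDER")"
```

If that works, no post-registration SSH step should be needed, since the agent would register with the correct IP from the start.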
Also, this is the custom command that is generated by my new Rancher HA server when I click "Add a custom host":
$ sudo docker run -d --privileged -e CA_FINGERPRINT="AF:6C:75:7B:BC:C1:36:23:F8:0F:78:D4:01:65:B3:18:7D:43:C6:DC" -v /var/run/docker.sock:/var/run/docker.sock -v /var/lib/rancher:/var/lib/rancher rancher/agent:v1.0.1 https://rch.domain.com/v1/scripts/CAFBBFUGJA996B76E2:1463504400000:8aiktUisghh3QbKOfi5FwyyfX8A
As of this time I cannot connect to the server over https://, so this will not work. I suspect my load balancer is not pointing to the correct port. I have it set to direct 80 > 18080, but I do not know which port to direct to for https. These are the ports that the Rancher server has opened:
Any help? The documentation for the HA setup does not really state what ports to set on the load balancer, but I found another post that said it should be 80 > 81 and 443 > 444. This did not work. I can access the HA server only because I have pointed 80 > 18080.
Something is clearly not right with the networking.
I followed the directions, removed all the containers, and restarted, specifying to use v1.0.1:
remove all containers
$ docker rm -f $(docker ps -a -q)
launch rancher server
$ ./rancher-ha.sh rancher/server:v1.0.1
I can indeed connect over port 80 now with my load balancer set to 80 > 80.
But when I try to connect over https I get an error that the certificate is invalid. When I look at the certificate, this is what I see.
I really want to stick with the stable release 1.0.1 if I can, since I am trying to build a production environment. I am also on docker version 1.10.3; I encountered problems with 1.11 when trying to implement ConvoyFS. So I would like to know the best way to get this working in that scenario.
Is it possible with the instructions you gave me, or is HA not possible with this version?
HA + SSL + ELB is not possible in 1.0.1 because the proxy-protocol port (444) returns responses that say http/ws instead of https/wss. This is fixed in 1.0.2, along with a short list of other changes.
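Once you are on 1.0.2, the classic-ELB side of this can be sketched with the AWS CLI roughly as follows (the load balancer name is a placeholder; 444 is the proxy-protocol port mentioned above, and the proxy-protocol policy must be attached to that backend port):

```shell
#!/bin/sh
# Sketch: terminate-at-backend TCP listener 443 > 444 on a classic ELB,
# with proxy protocol enabled on the backend port. "rancher-ha" is a
# placeholder load balancer name.
aws elb create-load-balancer-listeners \
  --load-balancer-name rancher-ha \
  --listeners Protocol=TCP,LoadBalancerPort=443,InstanceProtocol=TCP,InstancePort=444

aws elb create-load-balancer-policy \
  --load-balancer-name rancher-ha \
  --policy-name rancher-proxy-protocol \
  --policy-type-name ProxyProtocolPolicyType \
  --policy-attributes AttributeName=ProxyProtocol,AttributeValue=true

aws elb set-load-balancer-policies-for-backend-server \
  --load-balancer-name rancher-ha \
  --instance-port 444 \
  --policy-names rancher-proxy-protocol
```

This is configuration against a live AWS account, so treat it as a starting point and verify the ports against your own setup.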