Rancher HA setup - not working (?)

Hi.

I am trying to follow the documentation to set up the HA configuration.

I have an external mysql database where I have created the “rancher” db.

I can start the “script generating” container as the first step. I generate the script in the HA tab of the UI on port 8080. I can see that the mysql database has been populated with the proper tables in the “rancher” db.

I can launch the script on the first host (the same host where I ran the “script generating” container, which I deleted before running the script).

I can see the containers coming up on this node:

       mreferre@rancher-ha1:~$ sudo docker ps -a
       [sudo] password for mreferre: 
       CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                                                                                                                                   NAMES
       c1ee168a1ac4        rancher/server      "cattle"                 20 minutes ago      Up 20 minutes                                                                                                                                               rancher-ha-cattle
       f804c6695a54        rancher/server      "redis"                  20 minutes ago      Up 20 minutes                                                                                                                                               rancher-ha-redis
       6220976b4054        rancher/server      "zk"                     20 minutes ago      Up 20 minutes                                                                                                                                               rancher-ha-zk
       b8606efd413b        rancher/server      "tunnel -d -s [0.0.0."   20 minutes ago      Up 20 minutes                                                                                                                                               rancher-ha-tunnel-zk-client-1
       a8ae96f06718        rancher/server      "tunnel -d -s [0.0.0."   20 minutes ago      Up 20 minutes                                                                                                                                               rancher-ha-tunnel-zk-leader-1
       cff31993ea1a        rancher/server      "tunnel -d -s [0.0.0."   20 minutes ago      Up 20 minutes                                                                                                                                               rancher-ha-tunnel-zk-quorum-1
       0591023d4f17        rancher/server      "tunnel -d -s [0.0.0."   20 minutes ago      Up 20 minutes                                                                                                                                               rancher-ha-tunnel-redis-1
       12a2564cca85        rancher/server      "parent"                 20 minutes ago      Up 20 minutes       3306/tcp, 0.0.0.0:18080->8080/tcp, 0.0.0.0:2181->12181/tcp, 0.0.0.0:2888->12888/tcp, 0.0.0.0:3888->13888/tcp, 0.0.0.0:6379->16379/tcp   rancher-ha-parent
       e4d6e15bcf2e        rancher/server      "ha"                     20 minutes ago      Up 20 minutes                                                                                                                                               rancher-ha

I am not even sure what to do now. It appears that the only ports being mapped are those listed above (which does not include things like 8080, 80, 443, or whatever the user is intended to hit). The documentation doesn’t seem to properly cover this point. I am not even getting to the point where I need to configure the load balancer as it’s not even clear what ports I need to balance.

I have tried with all possible ports (80, 443, 8080, 18080) but the browser always says it can’t connect with the server.

Thoughts?

Hi @mreferre, welcome to the forums.

There is another container that launches once you’ve started the entire HA cluster and Zookeeper achieves quorum. This container creates network tunnels into the relevant containers and exposes traffic to the ports which you are expecting (80, 443).

But let’s back up. You should be able to access your single Rancher server on port 18080 for debugging/learning purposes. Some things such as log tailing and container shell access will not work from this port, but it is useful for inspecting the HA System Stack. If you can’t access the server on this port, please take a look at rancher-ha-cattle logs via docker logs -f rancher-ha-cattle and verify your firewall configuration.
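As a rough way to run that check from another machine, something like the sketch below may help. It assumes curl is installed; probe_ui is a made-up helper name and rancher-ha1 is only a stand-in for your own host.

```shell
#!/bin/sh
# Hypothetical helper (not part of Rancher): report whether anything answers
# HTTP on the given host and port. The port defaults to 18080, the debug port
# published by the rancher-ha-parent container.
probe_ui() {
    host=$1
    port=${2:-18080}
    # -s: quiet, -o /dev/null: discard the body, --max-time: give up after 5s
    if curl -s -o /dev/null --max-time 5 "http://${host}:${port}/"; then
        echo "reachable: ${host}:${port}"
    else
        echo "unreachable: ${host}:${port}"
    fi
}

# Example usage (run by hand on or against an HA node):
#   probe_ui rancher-ha1
#   docker logs -f rancher-ha-cattle
```

If the probe reports unreachable while the containers are up, the firewall between you and the host is the first thing I would look at.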

I hope this helps,
James

Thanks James (@LLParse).

I have run the script on ALL 3 hosts and I noticed that new containers have been launched. Now each host looks like this:

CONTAINER ID        IMAGE                           COMMAND                  CREATED             STATUS              PORTS                                                                                                                                   NAMES
05ee0a87a022        rancher/agent-instance:v0.8.1   "/etc/init.d/agent-in"   37 minutes ago      Up 37 minutes       0.0.0.0:500->500/udp, 0.0.0.0:4500->4500/udp                                                                                            7205e6a2-a904-4b70-9fc8-2384df1519a1
3e7cfbceefd0        rancher/agent:v1.0.1            "/run.sh run"            38 minutes ago      Up 38 minutes                                                                                                                                               rancher-agent
4920d80469f7        rancher/server                  "tunnel -e -s [127.0."   39 minutes ago      Up 39 minutes                                                                                                                                               rancher-ha-tunnel-zk-client-3
1a73d99a348b        rancher/server                  "tunnel -e -s [127.0."   39 minutes ago      Up 39 minutes                                                                                                                                               rancher-ha-tunnel-zk-leader-3
2ba9ce047d75        rancher/server                  "tunnel -e -s [127.0."   39 minutes ago      Up 39 minutes                                                                                                                                               rancher-ha-tunnel-zk-quorum-3
2c848a33e267        rancher/server                  "tunnel -e -s [127.0."   39 minutes ago      Up 39 minutes                                                                                                                                               rancher-ha-tunnel-redis-3
2daf51a904e7        rancher/server                  "tunnel -e -s [127.0."   39 minutes ago      Up 39 minutes                                                                                                                                               rancher-ha-tunnel-zk-client-2
5d03c9388a41        rancher/server                  "tunnel -e -s [127.0."   39 minutes ago      Up 39 minutes                                                                                                                                               rancher-ha-tunnel-zk-leader-2
c0e423d3910c        rancher/server                  "tunnel -e -s [127.0."   39 minutes ago      Up 39 minutes                                                                                                                                               rancher-ha-tunnel-zk-quorum-2
2989b6bc589c        rancher/server                  "tunnel -e -s [127.0."   39 minutes ago      Up 39 minutes                                                                                                                                               rancher-ha-tunnel-redis-2
2714c11e3307        rancher/server                  "tunnel -d -s [0.0.0."   40 minutes ago      Up 40 minutes                                                                                                                                               rancher-ha-tunnel-zk-client-1
c96d26e5a392        rancher/server                  "tunnel -d -s [0.0.0."   40 minutes ago      Up 40 minutes                                                                                                                                               rancher-ha-tunnel-zk-leader-1
c7b52df4c16f        rancher/server                  "tunnel -d -s [0.0.0."   40 minutes ago      Up 40 minutes                                                                                                                                               rancher-ha-tunnel-zk-quorum-1
baa17c1985ec        rancher/server                  "tunnel -d -s [0.0.0."   40 minutes ago      Up 40 minutes                                                                                                                                               rancher-ha-tunnel-redis-1
c1ee168a1ac4        rancher/server                  "cattle"                 14 hours ago        Up 51 minutes                                                                                                                                               rancher-ha-cattle
f804c6695a54        rancher/server                  "redis"                  14 hours ago        Up 51 minutes                                                                                                                                               rancher-ha-redis
6220976b4054        rancher/server                  "zk"                     14 hours ago        Up 51 minutes                                                                                                                                               rancher-ha-zk
12a2564cca85        rancher/server                  "parent"                 14 hours ago        Up 51 minutes       3306/tcp, 0.0.0.0:18080->8080/tcp, 0.0.0.0:2181->12181/tcp, 0.0.0.0:2888->12888/tcp, 0.0.0.0:3888->13888/tcp, 0.0.0.0:6379->16379/tcp   rancher-ha-parent
e4d6e15bcf2e        rancher/server                  "ha"                     14 hours ago        Up 38 minutes                                                                                                                                               rancher-ha

The good news is that I can now open :18080 against all three hosts (the UI comes up).

The bad news is that I still don’t see any container exposed on port 443 / 80.

Also, the log on the first two nodes of the cluster seems to be “clean”. The log on the third node keeps spitting this message:


time="2016-05-12T07:28:15Z" level=error msg="Could not parse config for project management : Unsupported config option for cattle service: 'health_check'\nUnsupported config option for rancher-compose-executor service: 'health_check'\nUnsupported config option for websocket-proxy-ssl service: 'health_check'\nUnsupported config option for go-machine-service service: 'health_check'\nUnsupported config option for websocket-proxy service: 'health_check'"
time="2016-05-12T07:28:15Z" level=fatal msg="Failed to read project: Unsupported config option for cattle service: 'health_check'\nUnsupported config option for rancher-compose-executor service: 'health_check'\nUnsupported config option for websocket-proxy-ssl service: 'health_check'\nUnsupported config option for go-machine-service service: 'health_check'\nUnsupported config option for websocket-proxy service: 'health_check'"
time="2016-05-12T07:28:15Z" level=info msg="Can not launch agent right now: exit status 1" component=service


Needless to say all 3 hosts have been deployed from the same template, they are all identical and they all live on the same flat port group.

Not sure if this is the reason why the containers on port 80 / 443 don’t come up.

Thanks.

This is a known issue in v1.1.0-dev1, which will be addressed in our next release. https://github.com/rancher/rancher/issues/4733

Could you please try shutting down your nodes and restarting them with the previous official release?

# remove all containers
$ docker rm -f $(docker ps -a -q)
# launch rancher server
$ ./rancher-ha.sh rancher/server:v1.0.1
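
Once the nodes are back up, one way to confirm that nothing is still running from an untagged image is a small filter like the sketch below. untagged_containers is a hypothetical helper, not something Rancher ships; it only looks at the image string that docker ps reports, so treat it as a sanity check rather than proof of the running version.

```shell
#!/bin/sh
# Hypothetical helper: read "NAME IMAGE" pairs on stdin and print the names of
# containers whose image reference carries no explicit tag or digest.
untagged_containers() {
    while read -r name image; do
        [ -n "$name" ] || continue
        # Only inspect the part after the last "/", so a registry port such as
        # registry.local:5000/rancher/server is not mistaken for a tag.
        last=${image##*/}
        case "$last" in
            *:*) ;;               # tagged (or pinned by digest): nothing to report
            *) echo "$name" ;;    # untagged: Docker resolves this to "latest"
        esac
    done
}

# Example usage on an HA node:
#   docker ps --format '{{.Names}} {{.Image}}' | untagged_containers
```

An empty result means every container was started from an explicitly tagged image.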

Thanks @LLParse, that did it. It took me only 1 day but I am up and running now :wink:

I think what happened is that I launched the script creation container with the v1.0.1 version specified (per the manual), but the manual doesn’t say to launch the script with an image version. Looking at the script:

IMAGE=$1
if [ "$IMAGE" = "" ]; then
     IMAGE=rancher/server
fi  

I guess it just downloaded the latest on the HA nodes (and hence the problems).

I suggest you fix the documentation to have users launch “./rancher-ha.sh rancher/server:v1.0.1”
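
To illustrate the idea, here is a minimal sketch of a stricter check. This is not the generated script, just an assumption about how it could refuse to fall back to the latest image; resolve_image is a name I made up.

```shell
#!/bin/sh
# Hypothetical guard, NOT part of the generated rancher-ha.sh: print the image
# to use, or fail when no explicit tag was supplied.
resolve_image() {
    image=$1
    if [ -z "$image" ]; then
        echo "error: pass a tagged image, e.g. rancher/server:v1.0.1" >&2
        return 1
    fi
    # Look only at the part after the last "/", so a registry port is not
    # mistaken for a tag.
    last=${image##*/}
    case "$last" in
        *:*)
            echo "$image"
            ;;
        *)
            echo "error: '$image' has no tag and would resolve to latest" >&2
            return 1
            ;;
    esac
}

# Example usage inside a launch script:
#   IMAGE=$(resolve_image "$1") || exit 1
```

With something like this, running the script without a version would stop with an error instead of silently pulling whatever is newest.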

Thanks.

Glad you got it working!

As for updating the docs, that is a good suggestion, thanks. We are releasing a new version soon and will address this.

I am struggling with this as well. See my post here: Assistance needed on setting up HA when using AWS AutoScaling groups for hosts

I have it working correctly, but it now says my certificate is invalid.