Hi.
I am trying to follow the documentation to set up the HA configuration.
I have an external mysql database where I have created the “rancher” db.
I can start the “script generating” container as the first step. I generate the script in the HA tab of the UI on port 8080. I can see that the mysql database has been populated with the proper tables in the “rancher” db.
I can launch the script on the first host (where I was running the “script generating” container that I have deleted before running the script).
I can see the containers coming up on this node:
mreferre@rancher-ha1:~$ sudo docker ps -a
[sudo] password for mreferre:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c1ee168a1ac4 rancher/server "cattle" 20 minutes ago Up 20 minutes rancher-ha-cattle
f804c6695a54 rancher/server "redis" 20 minutes ago Up 20 minutes rancher-ha-redis
6220976b4054 rancher/server "zk" 20 minutes ago Up 20 minutes rancher-ha-zk
b8606efd413b rancher/server "tunnel -d -s [0.0.0." 20 minutes ago Up 20 minutes rancher-ha-tunnel-zk-client-1
a8ae96f06718 rancher/server "tunnel -d -s [0.0.0." 20 minutes ago Up 20 minutes rancher-ha-tunnel-zk-leader-1
cff31993ea1a rancher/server "tunnel -d -s [0.0.0." 20 minutes ago Up 20 minutes rancher-ha-tunnel-zk-quorum-1
0591023d4f17 rancher/server "tunnel -d -s [0.0.0." 20 minutes ago Up 20 minutes rancher-ha-tunnel-redis-1
12a2564cca85 rancher/server "parent" 20 minutes ago Up 20 minutes 3306/tcp, 0.0.0.0:18080->8080/tcp, 0.0.0.0:2181->12181/tcp, 0.0.0.0:2888->12888/tcp, 0.0.0.0:3888->13888/tcp, 0.0.0.0:6379->16379/tcp rancher-ha-parent
e4d6e15bcf2e rancher/server "ha" 20 minutes ago Up 20 minutes rancher-ha
I am not even sure what to do now. It appears that the only ports being mapped are those listed above (which do not include things like 8080, 80, 443, or whatever the user is intended to hit). The documentation doesn’t seem to properly cover this point. I am not even getting to the point where I need to configure the load balancer, as it’s not clear which ports I need to balance.
I have tried with all possible ports (80, 443, 8080, 18080) but the browser always says it can’t connect with the server.
Thoughts?
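For anyone checking the same thing, here is a minimal sketch for listing which host ports a node actually publishes. The published_ports helper is a hypothetical name, not part of Rancher or Docker; it only parses the PORTS text that docker ps prints.

```shell
# Hypothetical helper: reads docker's PORTS text on stdin and prints each
# published host port (the number before "->"), one per line.
# Entries without "->" (for example 3306/tcp) are exposed but not published,
# so they are skipped.
published_ports() {
  tr ',' '\n' | sed -n 's/.*:\([0-9][0-9]*\)->.*/\1/p'
}

# On a node with Docker running:
# docker ps --format '{{.Ports}}' | published_ports | sort -n -u
```

Fed the PORTS value of rancher-ha-parent from the listing above, this prints 18080, 2181, 2888, 3888 and 6379, which matches the observation that nothing is published on 80, 443 or 8080 at this stage.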
Hi @mreferre, welcome to the forums.
There is another container that launches once you’ve started the entire HA cluster and Zookeeper achieves quorum. This container creates network tunnels into the relevant containers and exposes traffic to the ports which you are expecting (80, 443).
But let’s back up. You should be able to access your single Rancher server on port 18080 for debugging/learning purposes. Some things such as log tailing and container shell access will not work from this port, but it is useful for inspecting the HA System Stack. If you can’t access the server on this port, please take a look at the rancher-ha-cattle logs via docker logs -f rancher-ha-cattle and verify your firewall configuration.
I hope this helps,
James
Thanks James (@LLParse).
I have run the script on ALL 3 hosts and I noticed that new containers have been launched. Now each host looks like this:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
05ee0a87a022 rancher/agent-instance:v0.8.1 "/etc/init.d/agent-in" 37 minutes ago Up 37 minutes 0.0.0.0:500->500/udp, 0.0.0.0:4500->4500/udp 7205e6a2-a904-4b70-9fc8-2384df1519a1
3e7cfbceefd0 rancher/agent:v1.0.1 "/run.sh run" 38 minutes ago Up 38 minutes rancher-agent
4920d80469f7 rancher/server "tunnel -e -s [127.0." 39 minutes ago Up 39 minutes rancher-ha-tunnel-zk-client-3
1a73d99a348b rancher/server "tunnel -e -s [127.0." 39 minutes ago Up 39 minutes rancher-ha-tunnel-zk-leader-3
2ba9ce047d75 rancher/server "tunnel -e -s [127.0." 39 minutes ago Up 39 minutes rancher-ha-tunnel-zk-quorum-3
2c848a33e267 rancher/server "tunnel -e -s [127.0." 39 minutes ago Up 39 minutes rancher-ha-tunnel-redis-3
2daf51a904e7 rancher/server "tunnel -e -s [127.0." 39 minutes ago Up 39 minutes rancher-ha-tunnel-zk-client-2
5d03c9388a41 rancher/server "tunnel -e -s [127.0." 39 minutes ago Up 39 minutes rancher-ha-tunnel-zk-leader-2
c0e423d3910c rancher/server "tunnel -e -s [127.0." 39 minutes ago Up 39 minutes rancher-ha-tunnel-zk-quorum-2
2989b6bc589c rancher/server "tunnel -e -s [127.0." 39 minutes ago Up 39 minutes rancher-ha-tunnel-redis-2
2714c11e3307 rancher/server "tunnel -d -s [0.0.0." 40 minutes ago Up 40 minutes rancher-ha-tunnel-zk-client-1
c96d26e5a392 rancher/server "tunnel -d -s [0.0.0." 40 minutes ago Up 40 minutes rancher-ha-tunnel-zk-leader-1
c7b52df4c16f rancher/server "tunnel -d -s [0.0.0." 40 minutes ago Up 40 minutes rancher-ha-tunnel-zk-quorum-1
baa17c1985ec rancher/server "tunnel -d -s [0.0.0." 40 minutes ago Up 40 minutes rancher-ha-tunnel-redis-1
c1ee168a1ac4 rancher/server "cattle" 14 hours ago Up 51 minutes rancher-ha-cattle
f804c6695a54 rancher/server "redis" 14 hours ago Up 51 minutes rancher-ha-redis
6220976b4054 rancher/server "zk" 14 hours ago Up 51 minutes rancher-ha-zk
12a2564cca85 rancher/server "parent" 14 hours ago Up 51 minutes 3306/tcp, 0.0.0.0:18080->8080/tcp, 0.0.0.0:2181->12181/tcp, 0.0.0.0:2888->12888/tcp, 0.0.0.0:3888->13888/tcp, 0.0.0.0:6379->16379/tcp rancher-ha-parent
e4d6e15bcf2e rancher/server "ha" 14 hours ago Up 38 minutes rancher-ha
The good news is that I can now open :18080 against all three hosts (the UI comes up).
The bad news is that I still don’t see any container exposed on port 443 / 80.
Also, the log on the first two nodes of the cluster seems to be “clean”. The log on the third node keeps spitting this message:
time="2016-05-12T07:28:15Z" level=error msg="Could not parse config for project management : Unsupported config option for cattle service: 'health_check'\nUnsupported config option for rancher-compose-executor service: 'health_check'\nUnsupported config option for websocket-proxy-ssl service: 'health_check'\nUnsupported config option for go-machine-service service: 'health_check'\nUnsupported config option for websocket-proxy service: 'health_check'"
time="2016-05-12T07:28:15Z" level=fatal msg="Failed to read project: Unsupported config option for cattle service: 'health_check'\nUnsupported config option for rancher-compose-executor service: 'health_check'\nUnsupported config option for websocket-proxy-ssl service: 'health_check'\nUnsupported config option for go-machine-service service: 'health_check'\nUnsupported config option for websocket-proxy service: 'health_check'"
time="2016-05-12T07:28:15Z" level=info msg="Can not launch agent right now: exit status 1" component=service
Needless to say all 3 hosts have been deployed from the same template, they are all identical and they all live on the same flat port group.
Not sure if this is the reason why the containers on port 80 / 443 don’t come up.
Thanks.
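To compare nodes quickly, a small sketch like the following can pull the affected service names out of log lines such as the ones above. unsupported_services is a hypothetical helper, and which container emits these messages is an assumption here (rancher-ha is used in the usage line only as an example from the listing above).

```shell
# Hypothetical helper: reads log text on stdin and prints, once each, every
# service named in an "Unsupported config option for <name> service" message.
unsupported_services() {
  grep -o 'Unsupported config option for [^ ]* service' |
    sed 's/^Unsupported config option for \(.*\) service$/\1/' |
    sort -u
}

# On a node (container name is an assumption; docker logs writes to stderr too):
# docker logs rancher-ha 2>&1 | unsupported_services
```

An empty result on a node means none of these messages appear in the log that was piped in.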
This is a known issue in v1.1.0-dev1, which will be addressed in our next release. https://github.com/rancher/rancher/issues/4733
Could you please try shutting down your nodes and restarting them with the previous official release?
# remove all containers
$ docker rm -f $(docker ps -a -q)
# launch rancher server
$ ./rancher-ha.sh rancher/server:v1.0.1
Thanks @LLParse, that did it. It took me only 1 day, but I am up and running now.
I think what happened is that I launched the script creation container with the v1.0.1 version specified (per the manual), but the manual doesn’t say to pass an image version when launching the script. Looking at the script:
IMAGE=$1
if [ "$IMAGE" = "" ]; then
IMAGE=rancher/server
fi
I guess it just downloaded the latest on the HA nodes (and hence the problems).
I suggest you fix the documentation to have users launch “./rancher-ha.sh rancher/server:v1.0.1”.
Thanks.
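A minimal sketch of why the missing argument matters, assuming Docker's usual rule that an image reference without a tag resolves to :latest. resolve_image is a hypothetical helper that mirrors the IMAGE=$1 fallback quoted above; it is not part of rancher-ha.sh.

```shell
# Hypothetical helper: prints the image reference that would effectively be
# used for a given first argument. The body runs in a subshell, via ( ),
# so the "image" variable does not leak into the caller.
resolve_image() (
  image="${1:-rancher/server}"       # same default as the script's fallback
  case "${image##*/}" in             # look only at the part after the last "/"
    *:*) ;;                          # a tag (or digest) is already present
    *) image="${image}:latest" ;;    # untagged references mean :latest
  esac
  printf '%s\n' "$image"
)

resolve_image                          # prints rancher/server:latest
resolve_image rancher/server:v1.0.1    # prints rancher/server:v1.0.1
```

Passing the tag explicitly, as in ./rancher-ha.sh rancher/server:v1.0.1, keeps every HA node on the same release as the script-generating container.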
Glad you got it working!
As far as updating the docs, that is a good suggestion, thanks. We are releasing a new version soon and will address this.
I am struggling with this as well. See my post here: Assistance needed on setting up HA when using AWS AutoScaling groups for hosts
I have it working correctly, but it now says my certificate is invalid.