Has anyone successfully managed to set up Consul on Rancher?
We have tried the Catalog version, which fails with a parse error in server.json.
We have tried to run it manually, which fails because the nodes are unable to communicate.
First, if you don’t pass -advertise a.b.c.d to Consul in Rancher, each server uses the IP of the Rancher Agent,
which is the same on all nodes, so Consul fails, complaining that the nodes cannot all have the same IP.
If I specify the IP manually for each server and pass it as the value of -advertise, it does work.
A better approach would be to fetch the primary IP from the Rancher metadata. Is that possible w/o creating new Docker images?
The next issue is similar: I want to spin up a Consul agent on each node. If I do this by setting the scale to run one instance on each host, we are back to the IP issue above: all agents try to join with the same IP (that of the network agent).
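For what it’s worth, here is a rough Python sketch of the "fetch the primary IP from metadata, then advertise it" idea. It assumes the Rancher 1.x metadata service is reachable at `http://rancher-metadata/latest/self/container/primary_ip` and that a Python interpreter is available in the container, which is not the case for a stock busybox-based image. The helper names are made up for illustration, and other flags Consul needs (data directory and so on) are left out.

```python
import os
import urllib.request

# Assumption: Rancher 1.x metadata endpoint for this container's primary IP.
METADATA_URL = "http://rancher-metadata/latest/self/container/primary_ip"


def fetch_primary_ip(url=METADATA_URL, timeout=5):
    """Return the primary IP reported by the metadata service as a string."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


def consul_server_args(advertise_ip):
    """Build a (partial) `consul agent` command line advertising the given IP."""
    return ["consul", "agent", "-server", f"-advertise={advertise_ip}"]


def main():
    """Hypothetical entrypoint: look up the IP, then replace this process with Consul."""
    args = consul_server_args(fetch_primary_ip())
    os.execvp(args[0], args)
```

Calling `main()` from a wrapper entrypoint would replace the hard-coded a.b.c.d, but the wrapper still has to get into the container somehow (a mounted script or a derived image).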
This seems completely messy to me, is there a simpler way with less friction?
I brought up a cluster and it seemed to work without advertising, but I think this is going to depend massively on the topology you want. It sounds a bit like you want separate Rancher hosts to port-forward the Consul ports into separate Docker stacks?
I brought up three nodes on a single server, the IPs were all allocated correctly and the nodes talk (annoyingly, no UI, but that’s another topic). Having them on the same hardware is obviously not desirable, but I think the way to do it is to add additional hosts to Rancher and then use anti-affinity to separate the containers physically - but still all under the same stack?
Running it all on a single node works fine; each container gets its own IP.
However, when running on multiple nodes, the first container on each node gets the same IP (the IP of the Rancher agent).
The Metadata API is good; however, containers running busybox with wget cannot call the API.
So, to make a Consul container advertise itself using either the host IP or the container’s primary IP,
I would have to
create my own Docker images containing a custom installation of Consul
and make that image curl the metadata and set variables from it.
Or…
Run the Consul images with host networking.
Figure out which IP to bind to on each machine, because Consul exits complaining that there are multiple IP addresses (which brings us back to the same issue as above: the busybox containers can’t talk to the metadata service).
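For the host-network option, the "which IP do I bind to" step could look something like the Python sketch below. It assumes you already know the subnet your hosts use to talk to each other (the CIDR and the addresses in the example are invented) and simply picks the first local address inside it. The result would then be passed to Consul’s -bind flag.

```python
import ipaddress


def pick_bind_ip(candidates, cidr):
    """Return the first candidate address inside `cidr`, or None if none match."""
    network = ipaddress.ip_network(cidr)
    for addr in candidates:
        if ipaddress.ip_address(addr) in network:
            return addr
    return None


# Example with made-up addresses: a Docker bridge IP, a LAN IP, an overlay IP.
host_ips = ["172.17.0.1", "192.168.1.20", "10.42.0.1"]
bind_ip = pick_bind_ip(host_ips, "192.168.1.0/24")
```

How the list of candidate addresses is gathered on each host is left open here; that part depends on what tooling the image has.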
That’s a lot of work just to adapt the images to Rancher.
Rancher is a tool that should make it easier to use containers, not force you to re-create existing containers…
[Edit] It looks like this post fixes the wget problem.
AFAIK HashiCorp Consul is NOT cluster aware at present; that’s why you have to set up the specific IPs or DNS names of your Consul server(s) and agents explicitly rather than just allowing the Docker mesh to resolve them. A few more people going to HashiCorp’s site to request that enhancement might help.
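To make "explicitly" concrete, here is a small Python sketch that turns a list of known server addresses into -retry-join flags for `consul agent`. The addresses and the function name are placeholders, not something from a real setup.

```python
def join_args(server_addresses):
    """Build one -retry-join flag per known Consul server address."""
    return [f"-retry-join={addr}" for addr in server_addresses]


# Placeholder addresses; IPs or DNS names both work as join targets.
flags = join_args(["10.42.0.7", "consul-2.example.internal"])
```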