How are people handling DNS for services?

Example: Say I’ve got 2 web applications - a frontend and an api. The frontend hits the api at http://api.example.com. How do I configure this within Rancher in a clean way? If I have a large cluster of containers in my API service, I don’t really know specifically what host or IP any particular instance has, and I don’t want to have to manually update DNS if I have to make changes to the load balancer.

To make things more difficult, I also want access to my api from external hosts. How can I make sure to have current DNS for my api service that is transparent enough for me to not have to worry what hosts have my load balancers on them?

And again, if both of these services listen on port 80 and want external access, do I need to have at least two underlying host machines to facilitate this? Or do people recommend pointing all DNS at a single instance with a load balancer configured to route based on request hostname?

I run an LB service on all nodes in the environment. Then I point a CNAME to an A record in our DNS that lists all nodes that are part of the environment (thus, it will hit an LB whichever IP the client uses).

Then the LB is configured to route based on request host (and/or port number). For non-HTTP traffic, you'd need to allocate a unique port for each different service (so if you have multiple separate instances of the same service, you'd have to use non-standard ports for all but one of them in this setup).
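For reference, the host-header routing part ends up looking roughly like this in the LB's docker-compose.yml. This is only a sketch: the `io.rancher.loadbalancer.target.<service>` label syntax differs a bit between Rancher versions (check the docs for yours), and the service/image names are made up.

```yaml
lb:
  image: rancher/load-balancer-service
  ports:
    - 80:80                  # single public port on every host running the LB
  links:
    - frontend:frontend
    - api:api
  labels:
    # route by request host header; both names arrive on port 80
    io.rancher.loadbalancer.target.frontend: "frontend.example.com:80"
    io.rancher.loadbalancer.target.api: "api.example.com:80"

frontend:
  image: example/frontend    # hypothetical image
api:
  image: example/api         # hypothetical image
```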

Not sure if this is the best approach, but it works well for our use case, as I only have a single record to update whenever a node is added/removed from the environment, regardless of how many services point to that environment (due to the extra level of indirection with service CNAME -> A -> Nodes’ IP).

Also look at this issue: How to locate a container using the Rancher API

I wrote a short Python script to locate the external IP:PORT from the Rancher API.

But how do you handle internal/external lookups? I want my frontend to hit my api via http://api.example.com, and I want that to go directly between containers. If I have other non-Docker services running and I want those to hit api.example.com, I want them to reach my Rancher hosts on their internal IPs. At the same time, if I want to hit api.example.com from my machine, I need a different view that points at my host from the outside.

How do I manage that?

The thing is you have two networks. Internal routing is best done using Rancher's linking mechanism, and Rancher's internal DNS will find api.domain.com for you on the internal IP. If you need access from outside, then you have an external DNS entry which points at the exported port on the Rancher host. This may move, which is why I built the Python script (and ported it to Lua so Nginx can find which host a particular container is on). In my case api.domain.com would point to my Nginx instance, and it will find the external IP.
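To make the two paths concrete, here's a minimal compose sketch of the idea (service and image names are made up): the frontend reaches the api internally through a Rancher link, while external clients come in through a port published on the host.

```yaml
api:
  image: example/api        # hypothetical image
  ports:
    - 8080:80               # external path: published on the Rancher host's IP

frontend:
  image: example/frontend   # hypothetical image
  links:
    - api:api               # internal path: 'api' resolves via Rancher's internal DNS
```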

@kiboro can you expand on how I can set this up within Rancher’s internal DNS? I had tried in the past to link containers as ‘api.example.com’ but the links did not get added. I feel like I must be missing something.

I can get the DNS set up outside of Rancher pretty easily; it's the inter-container stuff that I'm beating my head against right now.

In your front end container simply set up a link:

```yaml
links:
  - api_service_name:api
```

then `api` should resolve directly, so `ping api` from the front end container should just work.

Ah, yes… but that doesn’t really solve my problem. I need my applications to be able to reach each other at their full URLs - as in, api.example.com rather than just api.

Seems like I am going to have to run my own internal DNS within the Rancher network.

@tprice I believe what you are asking for is covered in this GitHub issue, which has been labeled as an enhancement: being able to set a domain name and use that for the DNS.

https://github.com/rancher/rancher/issues/2769

That does look like what I’m looking for, although this statement doesn’t seem to be true either - ‘For rancher-dns service, you can specify the domain name in the name of the links.’

Whenever I have attempted to link a container as a name with periods in it, Rancher doesn’t seem to actually add the link.

Expanding further on the issue (or driving this off topic :p), one thing comes to mind…

Considering a large number of Rancher hosts, there needs to be a way (or "should be a way" is maybe a better term) to route requests coming from the outside… i.e. an externally facing load balancer of sorts, which would assume a specific virtual IP, for example, and fail over between multiple hosts…

I guess this sort of setup can be emulated with ELBs if you use Amazon, but with private hosting it would make sense to have some mechanism that can aggregate a pool of external load balancers behind a series of "outside-facing" IPs…

When considering adding Rancher to our stack, one of the things we haven't really figured out is how to do this… In initial tests we will probably stick with an existing HAProxy, which today has application server VMs as backend servers, and change those backends to LBs within Rancher… but that still leaves a load balancer pair "outside" of Rancher doing virtual-IP failover, etc., between the hosts and delivering requests to each Rancher LB…

Replying to myself, but updating to 0.49.1 seems to have resolved my inability to add links with periods in them. That sort of resolves the basic need for me, but I am still excited for the feature @denise mentioned.
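For anyone else hitting this: the alias side of the link is what carries the domain-style name, so (as a rough sketch, with made-up service and image names) the frontend's docker-compose.yml ends up looking like:

```yaml
frontend:
  image: example/frontend    # hypothetical image
  links:
    # alias the api service under its full domain name so the app can keep
    # calling http://api.example.com and resolve it via Rancher's internal DNS
    - api:api.example.com
```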

In our case, we are planning to have an 'External LB' resource within Rancher that is configured to run an instance on every node, and then an ELB or HAProxy running in front of that to pool in external traffic. I want to keep things as transparent as possible, and I think a setup like that means I don't have to worry about changing DNS after the initial setup.
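As a sketch of the "run an instance on every node" part (the label name is as I understand it; double-check it against your Rancher version):

```yaml
lb:
  image: rancher/load-balancer-service
  ports:
    - 80:80
  labels:
    # schedule one LB container on every host in the environment,
    # so the ELB in front can target all hosts on port 80
    io.rancher.scheduler.global: 'true'
```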

Hi @tprice, understood… in the case of HAProxy I suppose you will be running two instances for HA, outside of the Rancher cluster, is that correct? (with a standard VRRP setup in conjunction with keepalived or something similar)…

I guess I am looking for something that would allow a single external IP to be assigned to a load balancer but have it managed within Rancher… i.e. a fixed external IP is set up as the "floating" IP and a pair of containers (or more) are scheduled on different hosts; if a host (or container) goes down, another one will "pull" the IP and start responding…

I understand this is less of an issue when you put something like an ELB in front, but for colo setups and other larger deployments that aren't receiving their traffic into AWS, for example, it makes things a bit tricky…

Dedicated HAProxy machines (or even VMs) could be installed (as we do today), but if we are diving into Rancher it would be so much sexier to have it all "under one roof"… Another interesting feature that comes to mind is "multiple entry points" to the same application… Suppose you have an app with 2 hosts (or sets of hosts, for the sake of simplicity), one running on Amazon, one running in a private datacenter. You could have one LB (or more) on the Amazon hosts and one on the private datacenter hosts, each advertising (or actually listening/answering on) a public IP of its area, which can be geo-balanced or just distributed to different users, all while communication between the hosts stays transparent thanks to Rancher's VPN (for failover or disaster recovery, for example)…

For production, we have a mix of AWS and a physical datacenter. In our physical datacenter, yes we will have a failover pair of HAProxy boxes working outside of Rancher. Currently we have ancient F5s doing that, but they are dying so we are looking to switch over. I think we will probably use a set of physical boxes for HAProxy.

In AWS, where most of our containers will live, we will probably just use ELBs.

For my use case, I'm thinking that I will have Rancher launch an LB container on every host, and then all my hosts in AWS will sit behind an ELB. All my external site DNS points at the ELB, and then all the routing happens internally inside Rancher. Our physical datacenter is mostly mass-emailing infrastructure, so that is staying mostly physical machines. We will have a few Rancher containers living there, but I imagine they will be mostly internal.

So far that seems like a pretty good plan - I’m not sure if there are any downsides to that method.

Cool…

I think I'll probably experiment with something similar… Initially I'll keep the HAProxys outside the Rancher infrastructure, balancing into one of the hosts… But ideally it would be cool to be able to launch a pair of "external" HAProxys to handle that… I'd tag them so they only run on the physical datacenter machines, and pass one or more "VIPs" for them to attach to (one at a time)… They'd probably need to run as privileged containers so they can "grab on" to the host network… I wonder how I would achieve this without breaking the Rancher LB, though…
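A rough sketch of what I mean, if anyone wants to try it. The image and the VIP environment variable are hypothetical, and the scheduling label should be checked against your Rancher version:

```yaml
keepalived-lb:
  image: example/keepalived-haproxy    # hypothetical image running keepalived + HAProxy
  privileged: true                     # needed to manage the floating IP on the host interface
  net: host                            # attach straight to the host network
  environment:
    VIRTUAL_IP: 203.0.113.10           # the "floating" external IP to fail over (made up)
  labels:
    # only run on hosts tagged as physical datacenter LB machines
    io.rancher.scheduler.affinity:host_label: datacenter=physical
```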

Thanks for your info, though… At least I know I'm not completely off base in putting Rancher behind some HAProxys…

I’m working on a similar setup: AWS ELB -> Docker hosts
Rancher load balancer running on each host, proxying the external web requests to the appropriate containers.

Overall, this is working slick, but I’m struggling with rolling upgrades. Ideally, an upgrade needs to:

  1. Pull the new image (we’re bundling the source in the container) onto the new target machine
  2. Start the new instance, and add to the load balancer
  3. When the new version is up and active, remove the old instance from the load balancer and stop that instance

As of 0.47, this was kinda working, but #3 wasn't exactly smooth. It appears to start too early, so there would be 503 errors as the old instance was pulled down before being removed from the LB.

I haven’t started experimenting with 0.50 yet, will be trying that shortly.

@drmikecrowe If you set the following in your rancher-compose.yml file, then I expect it will behave as you want.

  upgrade_strategy:
    start_first: true

I have been toying around with having Rancher integrate with an external service discovery system (Consul) and an LB (HAProxy) that gets its updates from Consul, while Consul gets its updates from Rancher dynamically.

I put together a small microservice POC container for dynamic integration between Rancher and Consul, using the websocket events from Rancher. This is in no way production ready, but I think the concept works.
