DNS Service does not randomize IP addresses

It seems that the DNS service does not work the way the documentation describes:

When queried by the service name, the DNS service returns a randomized list of IP addresses of the healthy containers implementing that service.

I’ve created an Nginx stack, added an nginx service, and created an lb-test load balancer for this service.

I then created a different stack, added a debian service, and linked lb-test from the nginx stack as nginx-lb in the debian service.

Below are my results. The IPs do not appear to be randomized and are always returned in the same order:

root@4a398d57a8ab:~# for _ in {1..4}; do echo '~/~'; getent hosts nginx-lb; sleep 1; done
~/~
10.42.67.172    nginx-lb.rancher.internal
10.42.180.18    nginx-lb.rancher.internal
~/~
10.42.67.172    nginx-lb.rancher.internal
10.42.180.18    nginx-lb.rancher.internal
~/~
10.42.67.172    nginx-lb.rancher.internal
10.42.180.18    nginx-lb.rancher.internal
~/~
10.42.67.172    nginx-lb.rancher.internal
10.42.180.18    nginx-lb.rancher.internal
root@4a398d57a8ab:~# for _ in {1..4}; do echo '~/~'; getent hosts lb-test.nginx; sleep 1; done
~/~
10.42.67.172    lb-test.nginx
10.42.180.18    lb-test.nginx
~/~
10.42.67.172    lb-test.nginx
10.42.180.18    lb-test.nginx
~/~
10.42.67.172    lb-test.nginx
10.42.180.18    lb-test.nginx
~/~
10.42.67.172    lb-test.nginx
10.42.180.18    lb-test.nginx

Compared with AWS, I expected the returned list of IP addresses to be rotated in round-robin fashion.

A simple application simulation:

root@4a398d57a8ab:~# for _ in {1..30}; do curl -sv lb-test.nginx 2>&1 | grep 'Connected to'; done
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
* Connected to lb-test.nginx (10.42.180.18) port 80 (#0)
root@4a398d57a8ab:~# for _ in {1..30}; do curl -sv nginx-lb 2>&1 | grep 'Connected to'; done
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)
* Connected to nginx-lb (10.42.180.18) port 80 (#0)

Every time, my application (the curl script) connected to the same nginx container (10.42.180.18).
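To see the imbalance without scrolling through 30 near-identical lines, the curl output can be tallied per backend IP. This is only a rough sketch, assuming a POSIX shell with grep, sed, sort, uniq and awk available; the function name `tally_backends` is made up for illustration.

```shell
# tally_backends (hypothetical helper): read `curl -sv` output on stdin and
# print "IP COUNT" for every address curl reported connecting to.
tally_backends() {
    # Keep only the "Connected to HOST (IP)" lines, pull out the IP inside the
    # parentheses, then count how often each IP appears.
    grep 'Connected to' \
        | sed -E 's/.*\(([0-9.]+)\).*/\1/' \
        | sort \
        | uniq -c \
        | awk '{print $2, $1}'
}

# Usage, inside the client container:
#   for _ in $(seq 30); do curl -sv lb-test.nginx 2>&1; done | tally_backends
```

A single output line with a count of 30 would match what the loops above show: every request went to one container.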

Any ideas?

BTW, I’m using the following components:

Rancher	v1.4.1
Cattle	v0.176.9
User Interface	v1.4.6
Rancher CLI	v0.4.1
Rancher Compose	v0.12.2

Just FYI, I tested this in my environment and saw the same behaviour. What I did notice is that a single container always seems to get the same response and record order, while another container gets a different order (but again, always the same one). Perhaps a subtle change in behaviour that hasn’t been documented? It might be worth raising an issue on GitHub.

I just opened a new issue; hopefully we will soon see what the devs think about it.

The same people read the forum and GitHub. Creating an issue with no info in it besides a link to the forum is just a waste of time and effort for all.

Responses are always randomized. Client software does not necessarily preserve that order, and some clients specifically un-randomize (sort) the results. DNS is not an effective load balancer; if you care about the balance of requests, you need an internal balancer.

https://github.com/rancher/rancher/issues/3495
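If a client cannot sit behind a balancer and the order of the DNS answer cannot be relied on, one workaround is to resolve all addresses and pick one at random for each request. The following is a minimal sketch, not a replacement for a real load balancer. It assumes GNU coreutils `shuf` and glibc `getent` exist in the container, and `pick_random_ip` is a hypothetical helper name.

```shell
# pick_random_ip (hypothetical helper): read lines whose first field is an IP
# address on stdin and print one of the distinct addresses at random.
pick_random_ip() {
    # Skip blank lines, keep the first occurrence of each address,
    # then let shuf choose a single line.
    awk 'NF && !seen[$1]++ {print $1}' | shuf -n 1
}

# Usage: `getent ahostsv4` prints each address several times (one line per
# socket type), which is why the helper de-duplicates first.
#   ip=$(getent ahostsv4 lb-test.nginx | pick_random_ip)
#   curl -s -H 'Host: lb-test.nginx' "http://$ip/"
```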

Hopefully the following test shows that, regardless of the client, Rancher DNS responses are not randomized.

Previously I tested on a debian image; this time I used busybox:

Note that when I use the AWS DNS server, responses are rotated round-robin. When I restore the original resolv.conf, so that the internal Rancher DNS server is used, the responses are always returned in the same order. The client is the same in both tests; the only component that changes is the DNS server.

/ # resolv=$(cat /etc/resolv.conf)
/ # echo 'nameserver 172.30.0.2' > /etc/resolv.conf
/ # for _ in $(seq 4); do echo '~/~'; nslookup $ADDR | grep -A2 Name: | awk '/Address/{split($3,ip,".");printf "X.X.X.%s\n",ip[3]}'; done
~/~
X.X.X.13
X.X.X.128
~/~
X.X.X.128
X.X.X.13
~/~
X.X.X.13
X.X.X.128
~/~
X.X.X.128
X.X.X.13
/ # echo "$resolv" > /etc/resolv.conf
/ # for _ in $(seq 4); do echo '~/~'; nslookup $ADDR | grep -A2 Name: | awk '/Address/{split($3,ip,".");printf "X.X.X.%s\n",ip[3]}'; done
~/~
X.X.X.13
X.X.X.128
~/~
X.X.X.13
X.X.X.128
~/~
X.X.X.13
X.X.X.128
~/~
X.X.X.13
X.X.X.128
/ # ADDR=nginx.nginx
/ # for _ in $(seq 4); do echo '~/~'; nslookup $ADDR | grep -A2 Name: | awk '/Address/{split($3,ip,".");printf "X.X.X.%s\n",ip[3]}'; done
~/~
X.X.X.139
X.X.X.7
~/~
X.X.X.139
X.X.X.7
~/~
X.X.X.139
X.X.X.7
~/~
X.X.X.139
X.X.X.7
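The comparison above can also be summarized mechanically by counting how many distinct answer orderings appear across the lookups. This is a sketch only: it assumes each lookup is preceded by a `~/~` separator line as in the loops above, and `count_orderings` is a hypothetical helper name.

```shell
# count_orderings (hypothetical helper): read lookup output on stdin, where
# every lookup starts with a '~/~' separator line, and print the number of
# distinct address orderings that were seen.
count_orderings() {
    awk '
        # A separator closes the previous lookup (if any) and starts a new one.
        /^~\/~$/ { if (cur != "") seen[cur] = 1; cur = ""; next }
        # Any other non-blank line is appended to the current ordering.
        NF       { cur = cur $0 ";" }
        END      { if (cur != "") seen[cur] = 1
                   n = 0; for (k in seen) n++
                   print n }
    '
}

# Usage: pipe the output of either lookup loop into the helper, e.g.
#   for _ in $(seq 4); do echo '~/~'; nslookup "$ADDR" | grep -A2 Name:; done | count_orderings
```

A result of 1 over many lookups means the order never changed, while a rotating server should give more than 1. Passing the server as a second argument, as in `nslookup "$ADDR" 172.30.0.2`, queries a specific DNS server without editing /etc/resolv.conf.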

What’s your lease time?

How is the DHCP lease time relevant?

Apparently adding caching broke this a while ago, and the cache is client-specific.