External DNS Lookups Fail

Rancher 1.4.1
Network Services .16
Docker 1.12.6

After upgrading to 1.4.1, external DNS lookups fail for all containers/stacks on the Rancher managed network. The lookups are forwarded to our DNS servers, and the forwarding times out (see the network-services-metadata-dns logs).

When we manually edit resolv.conf in a managed container and add our nameserver above the Rancher nameserver 169.254.169.250, lookups work fine. However, Rancher automatically comments out the manual changes whenever the container restarts.

As a temporary fix, we insert our nameserver 192.168.99.19 into resolv.conf from the entrypoint scripts of the images we build. That fixes those images, but it does not fix external lookups in public images that we don't build ourselves, so we are looking for a better way to fix this issue.
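For anyone who wants to reproduce the workaround, here is a minimal sketch of the resolv.conf edit in Python. It is not our actual entrypoint script: the function name prepend_nameserver and the sample "search" line are made up for illustration, and it only shows the text transformation (put our nameserver ahead of 169.254.169.250 without listing it twice).

```python
def prepend_nameserver(resolv_conf: str, nameserver: str) -> str:
    """Return resolv.conf text with `nameserver` as the first nameserver entry.

    Hypothetical helper for illustration; it does not touch any file itself.
    """
    entry = f"nameserver {nameserver}"
    lines = resolv_conf.splitlines()
    # Drop an existing entry for the same server so it is not listed twice.
    lines = [ln for ln in lines if ln.split() != ["nameserver", nameserver]]
    # Insert before the first active nameserver line, or append if there is none.
    for i, ln in enumerate(lines):
        if ln.split()[:1] == ["nameserver"]:
            lines.insert(i, entry)
            break
    else:
        lines.append(entry)
    return "\n".join(lines) + "\n"


# Illustrative input only; the search domain is an assumption.
sample = "search example.internal\nnameserver 169.254.169.250\n"
patched = prepend_nameserver(sample, "192.168.99.19")
```

An entrypoint would read /etc/resolv.conf, pass the text through this function, and write the result back before starting the main process. Because the edit is idempotent, running it on every container start does not pile up duplicate entries.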

Is there a better fix for this?

See the network-services-metadata-dns logs. All external DNS lookups look like this:

3/7/2017 10:18:07 PM time="2017-03-08T03:18:07Z" level=warning msg="Recurser error: read udp 172.17.0.2:38674->192.168.1.29:53: i/o timeout" fqdn=www.terracotta.org. resolver=192.168.99.19
3/7/2017 10:18:07 PM time="2017-03-08T03:18:07Z" level=warning msg="Recurser error: read udp 172.17.0.2:58319->192.168.1.29:53: i/o timeout" fqdn=www.terracotta.org. resolver=192.168.99.19
3/7/2017 10:18:09 PM time="2017-03-08T03:18:09Z" level=warning msg="Recurser error: read udp 172.17.0.2:48249->192.168.1.28:53: i/o timeout" fqdn=www.terracotta.org. resolver=192.168.99.18
3/7/2017 10:18:09 PM time="2017-03-08T03:18:09Z" level=info msg="No answer found" client=10.42.87.202 question=www.terracotta.org. type=A
3/7/2017 10:18:09 PM time="2017-03-08T03:18:09Z" level=warning msg="Recurser error: read udp 172.17.0.2:57484->192.168.1.28:53: i/o timeout" fqdn=www.terracotta.org. resolver=192.168.99.18
3/7/2017 10:18:09 PM time="2017-03-08T03:18:09Z" level=info msg="No answer found" client=10.42.87.202 question=www.terracotta.org. type=AAAA

The normal behavior is that containers are configured to point to 169.254.169.250, which the DNS container listens on. It answers service discovery names locally and recurses to the public resolver configured on the host for all other names.

All of that is happening, but the DNS container can't talk to the public server. It's not clear why it can't when your own container can.
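One way to narrow this down is to send the same kind of query straight to the upstream resolver from each place and compare. Below is a rough sketch of such a probe using only the Python standard library. The names build_query and probe are hypothetical, it assumes Python is available wherever you run it (which may not be true inside the DNS container), and it only reports whether a reply with a matching ID came back within the timeout, not what the answer was.

```python
import socket
import struct


def build_query(name: str, qtype: int = 1, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet (qtype 1 = A record, class IN)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # Name is encoded as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def probe(resolver: str, name: str, timeout: float = 2.0) -> bool:
    """Return True if `resolver` replies over UDP port 53 within `timeout`."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (resolver, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # includes timeouts and unreachable-network errors
            return False
    # A real reply echoes the transaction ID from the query.
    return reply[:2] == query[:2]
```

For example, calling probe("192.168.99.19", "www.terracotta.org.") from an application container and again from wherever the DNS container's network namespace is reachable would show whether the i/o timeout is specific to the path the DNS container uses.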

https://github.com/rancher/rancher/issues/7243