DNS broken for single workload: can't resolve '(null)'

DomiStyle · October 10, 2018, 10:26am

Hey,

while upgrading my stacks from Rancher 1.6 to 2.1 I ran into a strange issue.

I deployed nextcloud:13-fpm-alpine as workload + a sidecar for the cronjob and all the other services needed (Redis, MariaDB, …) and everything was looking good.

However, external DNS is broken inside the Nextcloud container only. The Redis and MariaDB containers resolve rancher.com just fine, but the Nextcloud container and its sidecar fail to resolve anything that isn’t a discoverable service.

The output inside of the Nextcloud container looks like this:

/var/www/html # nslookup rancher.com
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'rancher.com': Try again

Removing options ndots:5 from resolv.conf will lead to a result but it takes 5-10 seconds:

/var/www/html # nslookup rancher.com
nslookup: can't resolve '(null)': Name does not resolve

Name:      rancher.com
Address 1: 104.24.16.51
Address 2: 104.24.17.51
Address 3: 2606:4700:20::6818:1033
Address 4: 2606:4700:20::6818:1133
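
For readers unfamiliar with ndots, here is a minimal sketch of the lookup order described in resolv.conf(5): a name with fewer dots than ndots is tried against each search domain before it is tried as-is. The search list below is an assumption (typical for a pod in a namespace called default; the actual resolv.conf is not shown in this thread), and candidate_queries is a hypothetical helper, not part of any resolver library. libc implementations differ in the details, so treat this as an illustration only.

```python
def candidate_queries(name, search, ndots=1):
    """Return the names a conventional resolver would try, in order."""
    # A trailing dot marks the name as fully qualified: no search list.
    if name.endswith("."):
        return [name]
    expanded = [f"{name}.{domain}" for domain in search]
    absolute = name + "."
    # Enough dots: try the name as-is first, then the search list.
    if name.count(".") >= ndots:
        return [absolute] + expanded
    # Too few dots: walk the search list first, the name as-is last.
    return expanded + [absolute]


# Assumed search list, not taken from the thread.
SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

for query in candidate_queries("rancher.com", SEARCH, ndots=5):
    print(query)
```

With ndots:5, rancher.com (one dot) would go through three cluster-internal names before the real one. Without the option the default is ndots:1, so the name is tried as-is first.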

This is the only workload where this happens, with the difference being that it has a sidecar and it uses the Nextcloud image. This works fine in both Rancher 1.6 and any other workload in this project/namespace.

Any ideas what could cause this? Does anybody else see this with the image nextcloud:13-fpm-alpine?

DomiStyle · October 10, 2018, 11:45am

Switching to nextcloud:13-fpm solves this issue, any idea why?

vincent · October 10, 2018, 3:53pm

Alpine uses a different libc than most other base images, and its DNS resolver behaves differently and leaves many features unimplemented.

A delay of 5-10 seconds does suggest one or more timeouts while trying to contact a resolver, though.
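
One way to check the timeout theory is to time a lookup from inside the affected container and compare it with a working one. This is only a rough sketch, assuming Python is available in the container; timed_lookup is a hypothetical helper name, and the resolve parameter exists only so the function can be exercised without network access.

```python
import socket
import time


def timed_lookup(name, resolve=socket.getaddrinfo):
    """Resolve `name` and return (seconds_elapsed, addresses, error)."""
    start = time.monotonic()
    try:
        results = resolve(name, None)
        # Each result is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the address string.
        addresses = sorted({info[4][0] for info in results})
        error = None
    except socket.gaierror as exc:
        addresses = []
        error = str(exc)
    elapsed = time.monotonic() - start
    return elapsed, addresses, error


# Example call (performs a real DNS lookup):
# print(timed_lookup("rancher.com"))
```

An elapsed time that is a rough multiple of the resolver timeout would point toward queries going unanswered rather than being answered slowly.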

DomiStyle · October 10, 2018, 4:57pm

Just tried more Alpine containers and that is indeed the cause.

Is there a way to resolve this? Since there is just a single nameserver the only timeout that could happen is to the Kubernetes internal DNS. Unless Alpine uses something outside of resolv.conf for DNS resolving.
