We’re using the traefik load balancer in some of our environments, which uses the rancher-metadata service to update its config dynamically.
We’ve now had two incidents where the load balancer has stopped updating. The first time, migrating to a new Rancher environment fixed the issue, but recreating the infra stacks and the load balancer stack did not.
This time we’ve done a bit more investigation and it seems that none of the containers can resolve rancher-metadata anymore. Restarting the containers and the infrastructure stacks seems to have made a difference.
We can resolve rancher-metadata.rancher.internal though.
/ # curl http://rancher-metadata
curl: (6) Couldn't resolve host 'rancher-metadata'
/ # curl http://rancher-metadata.rancher.internal
We’re running Rancher Server v1.5.1
That would suggest that the DNS search domain isn’t set, but it looks like it is:
/ # cat /etc/resolv.conf
search nginx.rancher.internal test.nginx.rancher.internal rancher.internal ntj
/ # ping rancher-metadata.rancher.internal
PING rancher-metadata.rancher.internal (169.254.169.250): 56 data bytes
64 bytes from 169.254.169.250: seq=0 ttl=63 time=0.082 ms
--- rancher-metadata.rancher.internal ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.082/0.082/0.082 ms
/ # ping rancher-metadata
ping: bad address 'rancher-metadata'
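To make the search-domain expectation concrete, here is a minimal Python sketch of how a resolver that honours the `search` line would typically expand a short name. It follows the usual `ndots` convention (default 1) and is only an illustration: the function names `parse_search_domains` and `candidate_names` are hypothetical, and it does not claim to reproduce what the resolver inside these containers actually does.

```python
def parse_search_domains(resolv_conf_text):
    """Return the domains from the last 'search' line in resolv.conf text."""
    domains = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "search":
            domains = fields[1:]  # a later 'search' line replaces an earlier one
    return domains


def candidate_names(name, search_domains, ndots=1):
    """List the names a search-aware resolver would try, in order (assumed behaviour)."""
    if name.endswith("."):
        return [name]  # already absolute, so the search list is not applied
    expanded = [f"{name}.{domain}" for domain in search_domains]
    if name.count(".") >= ndots:
        return [name] + expanded  # enough dots: try the name as given first
    return expanded + [name]  # short name: try the search list first


# The search line copied from the resolv.conf output above.
resolv_conf = "search nginx.rancher.internal test.nginx.rancher.internal rancher.internal ntj\n"

for candidate in candidate_names("rancher-metadata", parse_search_domains(resolv_conf)):
    print(candidate)
```

With the search list above, the third candidate is rancher-metadata.rancher.internal, which is the name that does resolve. So if the short name fails, the resolver in the container is presumably not applying the search list, or is giving up before it reaches that entry.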
@adamgraves-choices, is your container based on Alpine?
I have Alpine containers that are not resolving http://rancher-metadata. Do you know why this is?
It is the same situation @adamgraves-choices describes: http://rancher-metadata.rancher.internal will resolve.