Ingresses, External DNS and fault tolerance

Waya · October 16, 2018, 2:43pm

Hi,

With Rancher 1.6 we were using the external-dns route53 service to sync the IPs of our load balancers with our DNS records stored on AWS. It worked well. For instance, if one of the nodes running a load balancer had a critical failure, its IP would automatically be removed from the external DNS records and clients would stop trying to reach it.

We are trying to do much the same thing on Rancher 2.x, using kubernetes-incubator/external-dns. It works (which is great already :p), but when a node fails its IP stays listed in the DNS records.
I guess kubernetes-incubator/external-dns doesn’t do any health checks: it just reads the configuration of the ingresses and their external IPs from K8s, and that’s it.
Since the IP of the failed node is still in our DNS records, clients will keep trying to connect to it.

Are we doing something wrong with kubernetes-incubator/external-dns?
Or is there another way to reach the same goal: automatically removing a node’s external IP from the DNS records when the node is not responding or is in a failed state?

From reading the docs, it seems that external/cloud load balancers are the way to go. Should we go that way and stop spending time on DNS record synchronization? Are most Rancher 2.x deployments configured that way?

Thanks

Waya · October 22, 2018, 9:52am

I’m talking to myself here, but maybe this can help someone else:
In the end we went for the ‘health check’ option on Route53. Route53 checks the /healthz URL on our nodes every 10 seconds, and if the result is not HTTP 200, the IP address of the failed node is removed from the DNS entries. It can then take up to another minute for clients to stop trying to reach the failed node, since that’s the TTL of our DNS entries.
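
For anyone who wants to reason about this outside of Route53, here is a rough Python sketch of the same idea: probe /healthz on each node, keep only the IPs that answer HTTP 200, and estimate how long a dead IP can stay visible to clients. It is only an illustration, not what Route53 does internally. The function names, the plain-HTTP URL on the default port, the 2-second timeout and the `failures_needed` parameter are all assumptions of mine.

```python
import http.client
import urllib.request
from typing import Callable, Iterable, List


def probe_healthz(ip: str, timeout: float = 2.0) -> bool:
    """Return True only if http://<ip>/healthz answers with HTTP 200.

    Assumption: the check is served over plain HTTP on the default port.
    """
    url = f"http://{ip}/healthz"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, non-2xx status, etc.
        return False


def healthy_ips(ips: Iterable[str],
                probe: Callable[[str], bool] = probe_healthz) -> List[str]:
    """Keep only the IPs whose probe succeeds; these would stay in DNS."""
    return [ip for ip in ips if probe(ip)]


def worst_case_stale_seconds(check_interval: int,
                             failures_needed: int,
                             ttl: int) -> int:
    """Rough upper bound on how long clients may still see a dead IP.

    Detection takes about check_interval * failures_needed seconds
    (failures_needed is a hypothetical threshold of consecutive failed
    checks), and resolvers may then cache the old answer for up to ttl.
    """
    return check_interval * failures_needed + ttl
```

With the numbers from this post (a check every 10 seconds and a 60-second TTL) and assuming a single failed check is enough, `worst_case_stale_seconds(10, 1, 60)` gives 70 seconds. The `probe` argument can be swapped for a stub to try the filter without touching the network.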
