Seemingly random 503 Service Unavailable after some time


#1

Version: 2.1.3
Behavior
After some time, Rancher’s NGINX ingress starts returning 503. It works initially but stops working after one to a few days of running. When I check the service, it is up and running and shows no errors, the ingresses are all ready (not initializing), and there are no errors in the ingress service itself.


#2

You are not alone. I have seen this too, but I don’t yet know why. If I find anything out, I’ll let you know.


#3

There is not enough information in this post to tell where the problem is. Is “Rancher’s NGINX ingress” the ingress controller deployed into an RKE-built Rancher HA cluster? Or are the failing ingresses ones created in a Rancher Launched Kubernetes cluster? What are you creating or accessing, and what is the response? If the NGINX ingress is responding with 503 and nothing shows up in the log, you should raise the logging verbosity and check the logs at the moment you receive the 503s.
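
Since the 503s only show up after a day or more, it may help to record exactly when they happen so you can look at the ingress controller log for the same moments. Below is a minimal sketch of a probe, not an official Rancher tool. It assumes plain Python 3 with only the standard library; the function names `fetch_status` and `probe` are made up for illustration, and the URL you pass in is whatever host your ingress serves.

```python
import time
import urllib.error
import urllib.request
from datetime import datetime, timezone


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for a GET on url, or None if no HTTP response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is on the exception.
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout (URLError is an OSError subclass):
        # there is no HTTP status to report.
        return None


def probe(url, attempts, interval=0.0, fetch=fetch_status):
    """Poll url `attempts` times; return a list of (UTC timestamp, status) for each 503 seen."""
    failures = []
    for _ in range(attempts):
        status = fetch(url)
        if status == 503:
            # Timestamp in UTC so it can be lined up with the controller's log entries.
            stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
            failures.append((stamp, status))
            print(f"{stamp} 503 from {url}")
        time.sleep(interval)
    return failures
```

For example, `probe("http://app.example.com/", attempts=2880, interval=30.0)` would check a hypothetical host every 30 seconds for roughly a day and print a UTC timestamp for each 503, which you can then compare against the ingress controller log once verbosity is raised.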