Auto restart sometimes not working

Hello there,

I am posting here to share a problem I had a few days ago. I have a Rancher stack, and all of my containers are set to restart always (at the Rancher level, not the Docker level). For some reason I don’t know, one container stayed crashed for hours and Rancher did not restart it.

It does not happen all the time. Most of the time my container does restart, but on rare occasions it gets stuck and won’t restart unless I manually ask Rancher to do so.
The container was marked in Rancher as “stopped” and shown in red (as expected); it was just not restarting.

Has anyone seen this kind of behavior? I will provide more information next time I see it.

If you have any advice on what I should look at next time it happens, please tell me.


Component Version
Rancher v1.6.5
Cattle v0.182.1
User Interface v1.6.9
Rancher CLI v0.6.2
Rancher Compose v0.12.5

Thanks for your help

I’m still having the issue.

I have looked at the Docker daemon log: nothing.

I have now set up an HTTP health check, but I still have the same issue. Sometimes my container dies and its status is marked as unhealthy, but it is neither recreated nor restarted.

Where should I look for clues? I have looked at the Rancher logs, the agent, the health check and so on. I could not find a single clue.
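
For reference, here is a minimal sketch of the kind of endpoint an HTTP health check can poll. The path `/healthz`, port 8080 and the `app_is_healthy` function are assumptions made up for this illustration; they are not taken from my actual setup or from Rancher itself.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def health_response(healthy):
    """Map a health flag to an (HTTP status, body) pair."""
    if healthy:
        return 200, b"ok\n"
    return 503, b"unhealthy\n"


def app_is_healthy():
    # Hypothetical placeholder: replace with a real check
    # (database reachable, worker thread alive, and so on).
    return True


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the assumed health path is served; anything else is a 404.
        if self.path != "/healthz":
            self.send_error(404)
            return
        status, body = health_response(app_is_healthy())
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Port 8080 is an arbitrary choice for this sketch.
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```

The health check would then be pointed at `GET /healthz` on that port. A non-2xx answer, or no answer at all, should show up as unhealthy.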

Any help appreciated

Thanks

I’m also seeing this issue in Rancher 1.6.15.

Did you ever find a way to work around this?

I think there is some kind of weird race condition.
If your container starts and stops multiple times in a very short time, it sometimes gets stuck (this can happen when you have network issues, for instance).
I now wait X seconds before my container actually starts, and I don’t have the issue anymore.
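
To illustrate the workaround, here is a rough sketch of a wrapper that could be used as the container entrypoint. It sleeps for a configurable time and then replaces itself with the real command. The script name `delayed_start.py`, the `START_DELAY_SECONDS` variable and the 10-second default are all assumptions for this example, not something Rancher provides.

```python
import math
import os
import sys
import time


def startup_delay(env=os.environ, default=10.0):
    """Read the delay in seconds from the hypothetical START_DELAY_SECONDS variable."""
    raw = env.get("START_DELAY_SECONDS", "")
    try:
        delay = float(raw)
    except ValueError:
        # Missing or non-numeric value: fall back to the default.
        return default
    if not math.isfinite(delay) or delay < 0:
        return default
    return delay


def main(argv):
    if len(argv) < 2:
        print("usage: delayed_start.py COMMAND [ARGS...]", file=sys.stderr)
        return 2
    # Wait so that rapid start/stop cycles are spaced out.
    time.sleep(startup_delay())
    # Replace this process with the real command so it receives signals directly.
    os.execvp(argv[1], argv[1:])


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

For example, an entrypoint of `python delayed_start.py myapp --serve` (with `myapp` standing in for your own command) would wait before launching the application. This only spaces out the restarts; it does not fix whatever makes the container get stuck in the first place.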