Stagger Container Startup

Hi,

We are currently on Rancher 1.6.9 and have a number of containers running on a host. When we bring up a new host and terminate the old one, Rancher correctly tries to move the containers over to the new host. Occasionally, though, so many containers try to start on the new host at the same time that the initialization timeout elapses before they finish starting, and Rancher recreates them. This puts us in a timeout/recreate loop in which nothing ever starts. To recover, we have to manually stop services in Rancher and let the containers come up one or two at a time.

I know we can increase the initialization timeout, but does Rancher have any way of staggering container startup so that the host doesn’t get overloaded like this?
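
To make the manual workaround concrete, here is a rough Python sketch of the behaviour we are after: start services a couple at a time and wait for each batch to become healthy before starting the next. This is only an illustration. The `start` and `is_healthy` callables are hypothetical placeholders for whatever actually starts a service and checks its health; they are not Rancher API calls, and the batch size and timeouts are made-up defaults.

```python
import time
from typing import Callable, Iterable, List


def start_in_batches(
    services: Iterable[str],
    start: Callable[[str], None],        # hypothetical: starts one service
    is_healthy: Callable[[str], bool],   # hypothetical: True once it is up
    batch_size: int = 2,
    poll_interval: float = 5.0,
    batch_timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Start services a few at a time so the host is never flooded."""
    pending = list(services)
    started: List[str] = []
    for i in range(0, len(pending), batch_size):
        batch = pending[i:i + batch_size]
        # Kick off only this batch; later services stay stopped for now.
        for name in batch:
            start(name)
        # Wait until every service in the batch reports healthy.
        deadline = time.monotonic() + batch_timeout
        while not all(is_healthy(name) for name in batch):
            if time.monotonic() >= deadline:
                raise TimeoutError(f"batch {batch} did not become healthy")
            sleep(poll_interval)
        started.extend(batch)
    return started
```

With `batch_size=2` this mirrors what we do by hand today; ideally Rancher would offer something equivalent so that we don't have to script it ourselves.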
