Auto-Restart and reconnecting host

Xavier_Baude · July 19, 2016, 8:25am

Dear all,

We are facing an issue with Rancher. I created a service with auto restart in a two-host environment. When I shut down one VM, the infrastructure view shows the host in a “reconnecting…” state and my service is not available, but the stack status is still “active”. Is there a timeout setting for the host? The behavior is the same even if I deactivate the host.

I understand there is a “health check” option, but how can I perform a health check on a daemon that has no TCP/UDP connection (e.g. ntp)?

denise · July 26, 2016, 6:15pm

The behavior that you described is what is expected for a reconnecting host.

A couple of notes:

We don’t automatically delete any hosts from the Rancher setup. If a host is in the reconnecting state, we expect users to either fix the host or remove it from the UI.
When a host is in an inactive state, none of the services/containers on the host will be moved off the host unless there is a health check. Rancher cannot tell whether the container is still running on the host and has simply lost its connection to Rancher.
If you have no health check, the only way to move the container to a different host is to delete it.

Currently, we don’t have support for health checks for what you’re asking.

Xavier_Baude · July 27, 2016, 9:21am

Thanks for your response, @denise.

What disturbs me most is that the stack stays in the active/green state. Maybe an “Unknown” state would be more appropriate?

I developed a script that connects to the Rancher API to get the status of a stack. In this case I am never notified of a problem.

denise · August 3, 2016, 6:30am

Technically, there is nothing wrong with the stack. Rancher doesn’t know the state of that container and assumes it is healthy. The reconnecting state of a host doesn’t indicate that the container is no longer up and running. It only indicates that the connection between the rancher/agent container and the rancher/server container has failed.

Without a health check, Rancher does not know the state of the container and would not be able to report that something is unhealthy with the state of the service.
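
For a monitoring script like the one mentioned above, one possible workaround is to look at host states alongside the stack state, and to treat an “active” stack as “unknown” whenever a host is not connected. The following is only a minimal sketch under stated assumptions: it works on data that has already been fetched and parsed, the function name effective_stack_state is hypothetical, and the "name" and "state" fields and the exact state strings are assumed from the states named in this thread rather than confirmed against the Rancher API.

```python
from typing import Iterable, List, Mapping, Tuple

# Host states treated as "connected". The state names come from this thread
# ("active", "reconnecting", "inactive"); the exact strings returned by the
# Rancher API are an assumption.
CONNECTED_HOST_STATES = {"active"}


def effective_stack_state(
    stack_state: str, hosts: Iterable[Mapping[str, str]]
) -> Tuple[str, List[str]]:
    """Downgrade an "active" stack to "unknown" if any host is not connected.

    `hosts` is an iterable of dicts with assumed "name" and "state" keys.
    Returns the effective state and the names of the unconnected hosts.
    """
    unreachable = [
        host.get("name", "?")
        for host in hosts
        if host.get("state") not in CONNECTED_HOST_STATES
    ]
    if stack_state == "active" and unreachable:
        return "unknown", unreachable
    return stack_state, unreachable


# Example: a two-host environment where one VM has been shut down.
hosts = [
    {"name": "host-1", "state": "active"},
    {"name": "host-2", "state": "reconnecting"},
]
print(effective_stack_state("active", hosts))  # ('unknown', ['host-2'])
```

This does not tell you whether the containers on the disconnected host are actually down. It only surfaces the fact that Rancher cannot currently confirm their state, which is the situation described in this thread.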
