Healthchecks causing upgrades to fail

davalex · August 22, 2017, 8:40pm

Hi,

After implementing healthchecks on most of our services, we are having issues upgrading them. We typically run stacks with 10-12 services each. One stack represents what we used to call an application: usually 2-3 backend services, 2-3 frontend services, and load balancers between them. Failures during upgrade usually happen on the frontend services that are linked to a backend service. The healthcheck running on the frontend also checks that the backend application is responding as it should. If it is not, the frontend service is recreated.
During upgrades the backend services are upgraded first. Trying to upgrade the frontend service often fails with the following error message:

ERRO[0049] Failed to start: frontend-portal : Service frontend-portal must be state=active or inactive to upgrade, currently: state=updating-active

This is probably because the frontend service has detected that the backend was upgraded and is now reinitializing due to a changed IP (or whatever) on the backend service.

Upgrades are done automatically from our CI-system using the rancher binary:

rancher up --upgrade -c --interval 10000 --batch-size=1
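One possible workaround on the CI side is to wait until the service has settled before starting the upgrade, since the error says the service must be active or inactive. Below is a minimal sketch of such a gate, not a confirmed Rancher feature. The function `wait_for_upgradeable` and its `get_state` callable are hypothetical; how the state is actually fetched (Rancher API, CLI output, etc.) is left to the caller and is an assumption here.

```python
import time
from typing import Callable, Iterable


def wait_for_upgradeable(
    get_state: Callable[[], str],
    allowed: Iterable[str] = ("active", "inactive"),
    timeout_s: float = 120.0,
    poll_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_state() until it returns one of the allowed states.

    get_state is a hypothetical, caller-supplied function that returns the
    current service state as a string (e.g. "updating-active").
    Raises TimeoutError if no allowed state is seen within timeout_s.
    """
    allowed_set = set(allowed)
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state in allowed_set:
            return state
        if clock() >= deadline:
            raise TimeoutError(
                f"service still in state={state} after {timeout_s}s"
            )
        sleep(poll_s)
```

The CI job would call this for frontend-portal after the backend upgrade finishes and only then run the upgrade command for the frontend. The timeout would need to be at least as long as the reinitializing_timeout configured on the service.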

Our healthcheck typically looks like this:

frontend-portal:
  scale: 2
  health_check:
    port: 80
    interval: 2000
    request_line: 'GET "/" "HTTP/1.1\r\nHost: bla\r\nUser-Agent: healthcheck"'
    unhealthy_threshold: 3
    healthy_threshold: 1
    response_timeout: 2000
    initializing_timeout: 20000
    reinitializing_timeout: 20000
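To get a feel for how quickly this configuration reacts to a backend that is briefly down during its own upgrade, here is a rough back-of-the-envelope calculation. It assumes that interval is in milliseconds and that unhealthy_threshold counts consecutive failed checks spaced one interval apart; the exact scheduling inside the health checker is an assumption, so treat the result as an order of magnitude only. The helper name `approx_unhealthy_after_ms` is made up for illustration.

```python
def approx_unhealthy_after_ms(interval_ms: int, unhealthy_threshold: int) -> int:
    """Rough time until a container is marked unhealthy.

    Assumes unhealthy_threshold consecutive failures, one per interval.
    Response timeouts and scheduling jitter are ignored.
    """
    return interval_ms * unhealthy_threshold


# Values taken from the health_check block above.
print(approx_unhealthy_after_ms(interval_ms=2000, unhealthy_threshold=3))
```

Under these assumptions, a backend that stops answering for more than about 6 seconds during its upgrade would be enough to mark the frontend containers unhealthy and have them recreated.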

So what is the correct way of upgrading services when healthchecks are configured? Is there a way to disable healthchecks during upgrades, or to ignore the state "state=updating-active" and upgrade anyway?

Best regards,
Alexander
