Rancher 2+ node cluster dies overnight - reproducible with multiple OSes

Hey all, I’ve got a strange but reproducible issue. I’m running the latest RancherOS and Rancher Server in VMs on my Proxmox host. Overnight, the healthcheck container on one or more VMs goes unhealthy. The screenshots show it healthy one day and unhealthy the next.

All I do is spin up the VMs, set up the Rancher Server on one node, and enable local auth. Then I add all the VMs to the Rancher Server via the link that the “Add Hosts” page spits out. I wait overnight, and by morning the healthcheck is unhealthy.

Here are the logs from the unhealthy healthcheck container.

9/19/2017 8:44:38 PMtime=“2017-09-20T01:44:38Z” level=info msg=“Starting haproxy listener”
9/19/2017 8:44:38 PMtime=“2017-09-20T01:44:38Z” level=info msg=“healthCheck – starting haproxy\n * Starting haproxy haproxy\n …done.\n”
9/19/2017 8:44:38 PMtime=“2017-09-20T01:44:38Z” level=info msg=“Scheduling apply config”
9/19/2017 8:44:38 PMtime=“2017-09-20T01:44:38Z” level=info msg=“healthCheck – reloading haproxy config with the new config changes\n[WARNING] 262/014438 (31) : config : ‘option forwardfor’ ignored for proxy ‘web’ as it requires HTTP mode.\n[WARNING] 262/014438 (31) : config : ‘option forwardfor’ ignored for backend ‘cattle-6242cd69-3ef8-4abb-bac4-de6b2252d6e5_b26bacb7-6aca-4e7c-8344-cff85b35880a_1’ as it requires HTTP mode.\n”
9/19/2017 8:44:39 PMtime=“2017-09-20T01:44:39Z” level=info msg=“Scheduling apply config”
9/19/2017 8:44:39 PMtime=“2017-09-20T01:44:39Z” level=info msg=“healthCheck – no changes in haproxy config\n”
9/19/2017 8:44:39 PMtime=“2017-09-20T01:44:39Z” level=info msg=“Scheduling apply config”
9/19/2017 8:44:39 PMtime=“2017-09-20T01:44:39Z” level=info msg=“healthCheck – no changes in haproxy config\n”
9/19/2017 8:44:39 PMtime=“2017-09-20T01:44:39Z” level=info msg=“Scheduling apply config”
9/19/2017 8:44:39 PMtime=“2017-09-20T01:44:39Z” level=info msg=“healthCheck – no changes in haproxy config\n”
9/19/2017 8:44:40 PMtime=“2017-09-20T01:44:40Z” level=info msg=“Scheduling apply config”
9/19/2017 8:44:40 PMtime=“2017-09-20T01:44:40Z” level=info msg=“healthCheck – no changes in haproxy config\n”
9/19/2017 8:44:40 PMtime=“2017-09-20T01:44:40Z” level=info msg=“Monitoring 2 backends”
9/19/2017 8:44:41 PMtime=“2017-09-20T01:44:41Z” level=info msg=“Scheduling apply config”
9/19/2017 8:44:41 PMtime=“2017-09-20T01:44:41Z” level=info msg=“healthCheck – reloading haproxy config with the new config changes\n[WARNING] 262/014441 (50) : config : ‘option forwardfor’ ignored for proxy ‘web’ as it requires HTTP mode.\n[WARNING] 262/014441 (50) : config : ‘option forwardfor’ ignored for backend ‘cattle-6242cd69-3ef8-4abb-bac4-de6b2252d6e5_b26bacb7-6aca-4e7c-8344-cff85b35880a_1’ as it requires HTTP mode.\n”
9/19/2017 8:44:41 PMtime=“2017-09-20T01:44:41Z” level=info msg=“Scheduling apply config”
9/19/2017 8:44:41 PMtime=“2017-09-20T01:44:41Z” level=info msg=“healthCheck – no changes in haproxy config\n”
9/19/2017 8:44:41 PMtime=“2017-09-20T01:44:41Z” level=info msg=“Scheduling apply config”
9/19/2017 8:44:41 PMtime=“2017-09-20T01:44:41Z” level=info msg=“healthCheck – no changes in haproxy config\n”
9/19/2017 8:44:41 PMtime=“2017-09-20T01:44:41Z” level=info msg=“Scheduling apply config”
9/19/2017 8:44:41 PMtime=“2017-09-20T01:44:41Z” level=info msg=“healthCheck – reloading haproxy config with the new config changes\n[WARNING] 262/014441 (63) : config : ‘option forwardfor’ ignored for proxy ‘web’ as it requires HTTP mode.\n[WARNING] 262/014441 (63) : config : ‘option forwardfor’ ignored for backend ‘cattle-6242cd69-3ef8-4abb-bac4-de6b2252d6e5_b26bacb7-6aca-4e7c-8344-cff85b35880a_1’ as it requires HTTP mode.\n”
9/19/2017 8:44:42 PMtime=“2017-09-20T01:44:42Z” level=info msg=“Scheduling apply config”
9/19/2017 8:44:42 PMtime=“2017-09-20T01:44:42Z” level=info msg=“healthCheck – reloading haproxy config with the new config changes\n[WARNING] 262/014442 (70) : config : ‘option forwardfor’ ignored for proxy ‘web’ as it requires HTTP mode.\n[WARNING] 262/014442 (70) : config : ‘option forwardfor’ ignored for backend ‘cattle-6242cd69-3ef8-4abb-bac4-de6b2252d6e5_b26bacb7-6aca-4e7c-8344-cff85b35880a_1’ as it requires HTTP mode.\n”
9/19/2017 8:44:42 PMtime=“2017-09-20T01:44:42Z” level=info msg=“Scheduling apply config”
9/19/2017 8:44:42 PMtime=“2017-09-20T01:44:42Z” level=info msg=“healthCheck – no changes in haproxy config\n”
9/19/2017 8:44:43 PMtime=“2017-09-20T01:44:43Z” level=info msg=“Scheduling apply config”
9/19/2017 8:44:43 PMtime=“2017-09-20T01:44:43Z” level=info msg=“healthCheck – no changes in haproxy config\n”
9/19/2017 8:44:44 PMtime=“2017-09-20T01:44:44Z” level=info msg=“6242cd69-3ef8-4abb-bac4-de6b2252d6e5_b26bacb7-6aca-4e7c-8344-cff85b35880a_1=DOWN”
9/19/2017 8:44:46 PMtime=“2017-09-20T01:44:46Z” level=info msg=“Scheduling apply config”
9/19/2017 8:44:46 PMtime=“2017-09-20T01:44:46Z” level=info msg=“healthCheck – no changes in haproxy config\n”
9/19/2017 8:44:47 PMtime=“2017-09-20T01:44:47Z” level=info msg=“6242cd69-3ef8-4abb-bac4-de6b2252d6e5_4dc157cb-504b-46c6-bf0e-887a836c890b_3=DOWN”
9/20/2017 8:44:55 AMtime=“2017-09-20T13:44:55Z” level=info msg=“Scheduling apply config”
9/20/2017 8:44:55 AMtime=“2017-09-20T13:44:55Z” level=info msg=“healthCheck – reloading haproxy config with the new config changes\n[WARNING] 262/134455 (86) : config : ‘option forwardfor’ ignored for proxy ‘web’ as it requires HTTP mode.\n[WARNING] 262/134455 (86) : config : ‘option forwardfor’ ignored for backend ‘cattle-6242cd69-3ef8-4abb-bac4-de6b2252d6e5_b26bacb7-6aca-4e7c-8344-cff85b35880a_1’ as it requires HTTP mode.\n”
9/20/2017 8:44:55 AMtime=“2017-09-20T13:44:55Z” level=info msg=“Scheduling apply config”
9/20/2017 8:44:55 AMtime=“2017-09-20T13:44:55Z” level=info msg=“healthCheck – no changes in haproxy config\n”
9/20/2017 8:44:55 AMtime=“2017-09-20T13:44:55Z” level=info msg=“Scheduling apply config”
9/20/2017 8:44:55 AMtime=“2017-09-20T13:44:55Z” level=info msg=“healthCheck – no changes in haproxy config\n”
9/20/2017 8:44:55 AMtime=“2017-09-20T13:44:55Z” level=info msg=“Scheduling apply config”
9/20/2017 8:44:55 AMtime=“2017-09-20T13:44:55Z” level=info msg=“healthCheck – reloading haproxy config with the new config changes\n[WARNING] 262/134455 (99) : config : ‘option forwardfor’ ignored for proxy ‘web’ as it requires HTTP mode.\n[WARNING] 262/134455 (99) : config : ‘option forwardfor’ ignored for backend ‘cattle-6242cd69-3ef8-4abb-bac4-de6b2252d6e5_b26bacb7-6aca-4e7c-8344-cff85b35880a_1’ as it requires HTTP mode.\n”
9/20/2017 8:44:56 AMtime=“2017-09-20T13:44:56Z” level=info msg=“Scheduling apply config”
9/20/2017 8:44:56 AMtime=“2017-09-20T13:44:56Z” level=info msg=“healthCheck – no changes in haproxy config\n”
9/20/2017 8:44:56 AMtime=“2017-09-20T13:44:56Z” level=info msg=“Scheduling apply config”
9/20/2017 8:44:56 AMtime=“2017-09-20T13:44:56Z” level=info msg=“healthCheck – reloading haproxy config with the new config changes\n[WARNING] 262/134456 (109) : config : ‘option forwardfor’ ignored for proxy ‘web’ as it requires HTTP mode.\n[WARNING] 262/134456 (109) : config : ‘option forwardfor’ ignored for backend ‘cattle-6242cd69-3ef8-4abb-bac4-de6b2252d6e5_b26bacb7-6aca-4e7c-8344-cff85b35880a_1’ as it requires HTTP mode.\n”
9/20/2017 8:44:57 AMtime=“2017-09-20T13:44:57Z” level=info msg=“Scheduling apply config”
9/20/2017 8:44:57 AMtime=“2017-09-20T13:44:57Z” level=info msg=“healthCheck – no changes in haproxy config\n”
9/20/2017 8:45:00 AMtime=“2017-09-20T13:45:00Z” level=info msg=“6242cd69-3ef8-4abb-bac4-de6b2252d6e5_4dc157cb-504b-46c6-bf0e-887a836c890b_4=DOWN”
9/20/2017 8:45:01 AMtime=“2017-09-20T13:45:01Z” level=info msg=“Scheduling apply config”
9/20/2017 8:45:01 AMtime=“2017-09-20T13:45:01Z” level=info msg=“healthCheck – no changes in haproxy config\n”
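
Most of the log above is "Scheduling apply config" noise. The parts that matter are the three `=DOWN` messages and the jump from 01:44:47Z straight to 13:44:55Z with nothing logged in between. For anyone triaging similar output, here is a rough Python sketch that pulls out those two things. The function names are made up for illustration, and the regex only assumes the `time=... level=... msg=...` layout seen in the paste (it accepts curly quotes because the forum converted them); it is a triage aid, not a diagnosis.

```python
import re
from datetime import datetime

# Accept straight or curly double quotes around each field, since the pasted
# log was rendered with typographic quotes.
QUOTE = '["\u201c\u201d]'
LINE_RE = re.compile(
    r'time=' + QUOTE + r'(?P<ts>[^"\u201c\u201d]+)' + QUOTE +
    r'\s+level=(?P<level>\w+)\s+msg=' + QUOTE + r'(?P<msg>.*)' + QUOTE + r'\s*$'
)


def parse_line(line):
    """Return (timestamp, level, msg) for a log line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%SZ")
    return ts, m.group("level"), m.group("msg")


def down_events(lines):
    """Return (timestamp, backend_name) for every message ending in '=DOWN'."""
    events = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, _level, msg = parsed
        if msg.endswith("=DOWN"):
            events.append((ts, msg[: -len("=DOWN")]))
    return events


def largest_gap(lines):
    """Return (gap, before, after) for the longest silence between consecutive lines."""
    stamps = [p[0] for p in map(parse_line, lines) if p is not None]
    gaps = [(b - a, a, b) for a, b in zip(stamps, stamps[1:])]
    return max(gaps, key=lambda g: g[0]) if gaps else None
```

To use it, save the container log to a file, pass `open(path).read().splitlines()` to `down_events` and `largest_gap`, and compare the timestamps of the `=DOWN` entries against the silent window.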

Any ideas? Should I open a bug report? I can also reproduce this with openSUSE 42.3.

Here are some more logs:

Rancher agent on trouble node: https://hastebin.com/acopohazon.sql

Rancher server on “healthy” node: https://hastebin.com/uvupimilib.hs