Intermittent Failure of Managed Network causing critical issues for some containers

slemaire · April 7, 2017, 8:06pm

Hey there,

My org is having a persistent issue with Rancher/Docker’s managed network layer. We have three physical hosts in one datacenter, and when the network fails, the affected containers are not able to communicate with each other, and one of the admins has to manually restart the IPSec router container on the affected host to fix it. It usually happens on one specific host, but it has occurred intermittently on others and has also occurred in our staging environment, which is a mix of physical and virtual hosts in a separate datacenter.

Honestly, Googling the issue seems to indicate that it’s a ‘bug’ and that there isn’t a solution, which is kind of ridiculous. Has anyone else experienced this problem? Were you able to correct it without ditching the managed network altogether?

leodotcloud · April 9, 2017, 6:00am

@slemaire can you please find/ping me (leodotcloud) on https://slack.rancher.io the next time this issue happens? I would like to jump on a call to debug this further.

