I have the following setup:
host1 is in the cloud; host2 and host3 are on-premises (home network) behind a Wi-Fi router/firewall.
Rancher server is deployed on host1.
The problem: it seems that only one of host2 or host3 can be part of the Rancher IPSec network at a time. host1 has no such problem, so it only affects the hosts running on-premises.
Symptom: the Rancher healthcheck container running on the affected host is stuck in the ‘Initializing’ state. Cross-host container ping doesn’t work for containers deployed on this host.
I can ‘fix’ a host that is in this state (let’s say host2) by rebooting it. The reboot produces the following sequence (I can observe it on the /env/1a5/infra/hosts page):
- host2 status: Active -> Reconnecting -> Active
- host2 healthcheck status: Initializing -> Stopped -> Initializing -> Running (green)
- host3 healthcheck status: Running -> Stopped -> Initializing
- host3 becomes non-operational and host2 is ‘fixed’, i.e. I can ping its containers from host1
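
The cross-host ping test could be repeated from host1 after each reboot with something like the script below. This is only a sketch under assumptions: the container IPs are hypothetical placeholders (not the real addresses from this setup), and it assumes a Linux `ping` that accepts `-c` (count) and `-W` (timeout in seconds).

```python
import subprocess

# Hypothetical placeholder addresses; substitute the real container IPs
# shown in the Rancher UI for a container on each host.
CONTAINERS = {
    "host2": "10.42.0.2",
    "host3": "10.42.0.3",
}


def ping_command(ip, count=1, timeout_s=2):
    """Build a Linux ping command line (-c count, -W timeout in seconds)."""
    return ["ping", "-c", str(count), "-W", str(timeout_s), ip]


def run_ping(ip):
    """Return True if ping to the given IP exits with status 0."""
    result = subprocess.run(
        ping_command(ip),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_hosts(containers, ping=run_ping):
    """Map each host name to True/False: does its container answer ping?"""
    return {host: ping(ip) for host, ip in containers.items()}


if __name__ == "__main__":
    for host, reachable in check_hosts(CONTAINERS).items():
        state = "reachable" if reachable else "NOT reachable"
        print(f"{host}: container is {state} from this host")
```

Run on host1 before and after rebooting host2, the output should show whether host2 and host3 really swap places, i.e. whether exactly one of them is reachable at any given time.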