IPSec network fails silently on a host

Hi, I have the same problem. Today communication between two EC2 servers went down.
Some feedback:

  • swanctl --list-sas: the connection between servers 10.30.0.185 and 10.30.1.213 had no child SA and was in the “CONNECTING” state

  • swanctl --log shows a lot of output, but I found a “delete job” that could not delete a child because it was not found

  • the other way around: on server 10.30.1.213 there is no connection to 10.30.0.185, although swanctl --list-conns shows that one should exist

  • however, on server 10.30.1.213 there were three connections in the “CONNECTING” state with another server
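
In case it helps anyone else check for this state, here is a rough sketch of how the output of swanctl --list-sas could be scanned for IKE SAs that are in CONNECTING and have no child. The line layout the regexes expect (an unindented "name: #id, STATE, ..." line per IKE SA, with indented "name: #id, ..." lines for its children) is my assumption and may differ between strongSwan versions. The function names stuck_ike_sas and check_host are made up for this example.

```python
import re
import subprocess

# ASSUMPTION: an IKE SA starts with an unindented line "name: #id, STATE, ...".
IKE_LINE = re.compile(r"^(\S+): #\d+, ([A-Z_]+)")
# ASSUMPTION: a child SA is an indented line "name: #id, ...".
CHILD_LINE = re.compile(r"^\s+\S+: #\d+,")


def stuck_ike_sas(list_sas_output):
    """Return names of IKE SAs that are CONNECTING and list no child SA."""
    sas = []
    for line in list_sas_output.splitlines():
        match = IKE_LINE.match(line)
        if match:
            # New IKE SA block: remember its name and state.
            sas.append({"name": match.group(1),
                        "state": match.group(2),
                        "children": 0})
        elif sas and CHILD_LINE.match(line):
            # Indented child line belongs to the most recent IKE SA.
            sas[-1]["children"] += 1
    return [sa["name"] for sa in sas
            if sa["state"] == "CONNECTING" and sa["children"] == 0]


def check_host():
    """Hypothetical helper: run swanctl locally and report stuck SAs."""
    result = subprocess.run(["swanctl", "--list-sas"],
                            capture_output=True, text=True, check=True)
    return stuck_ike_sas(result.stdout)
```

Calling check_host() inside the IPsec container should return a list of connection names that look stuck. It only detects the state and does not repair anything.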

This happens to me:

  • between Kafka servers after 4-7 days (cluster chatter?)
  • between servers where Prometheus fetches data from other servers

After restarting the IPsec container, the connections are fine again for a few days.
