Recently upgraded a working Rancher 0.63 setup to v1.0.0 (GA). The first thing I noticed was that cross-host networking was broken: pings between network agents/containers on different hosts were failing. I decided to do a clean install of both host servers (coreos-1 and coreos-2, latest versions) and of the Rancher server instance. However, it still fails after the clean install. Containers on the same host can ping each other, but pings between different hosts fail.
I noticed in the host process list a bunch of spawned /etc/init.d/rancher-net start processes, and their number seems to keep growing. I shelled into the rancher-agent container and took a look around. You can see a loop happening; here is the full dump:
Probing around the init script for rancher-net, I tried running some of these ip xfrm commands by hand, and they fail with:
$ ip xfrm state add src 22.214.171.124 dst 126.96.36.199 spi 42 proto esp mode tunnel aead "rfc4106(gcm(aes))" 0x0000000000000000000000000000000000000001 128 sel src 188.8.131.52 dst 184.108.40.206
RTNETLINK answers: Function not implemented
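
For anyone trying to narrow this down: my working assumption (not confirmed here) is that "Function not implemented" means the running kernel cannot provide something the command asks for, such as ESP support or the rfc4106(gcm(aes)) AEAD algorithm. Below is a rough sketch of two checks against /proc/crypto and /proc/modules. The helper names has_crypto_alg and has_module are made up for this post, and the module names in the usage line are only guesses at what might be relevant.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Rancher or iproute2.

# has_crypto_alg ALG [FILE]
# Succeeds if FILE (default /proc/crypto) has a "name : ALG" entry,
# i.e. the kernel currently has that algorithm registered.
has_crypto_alg() {
    awk -F': ' -v alg="$1" '
        $1 ~ /^name/ && $2 == alg { found = 1 }
        END { exit !found }
    ' "${2:-/proc/crypto}"
}

# has_module MOD [FILE]
# Succeeds if MOD is listed in FILE (default /proc/modules).
# Note: code built into the kernel does not show up there, so a miss
# is a hint to look further, not proof that support is absent.
has_module() {
    awk -v mod="$1" '
        $1 == mod { found = 1 }
        END { exit !found }
    ' "${2:-/proc/modules}"
}
```

On the host, something like has_crypto_alg 'rfc4106(gcm(aes))' || echo "AEAD algorithm not registered" and has_module esp4 || echo "esp4 not loaded" would show whether either piece is missing. Some algorithms are only registered after first use, so a miss here is worth double-checking before drawing conclusions.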