Kubernetes environment

I have been playing with Rancher for the past week, and now I’m hitting these two problems…

I have a 3-host HA Rancher setup (no issue there; I was even able to update it from 1.1.3 to 1.2.0-pre2 to check whether the issue happened there too) with a Kubernetes environment running on 3 hosts.

  • Whenever I lose a Kubernetes host, etcd never starts on that node again, whether it reconnects or I add a replacement.
  • When adding a new Kubernetes host afterwards, I needed to restart the Kubernetes stack for etcd to start (going from 1 host to 3 hosts); see the etcdctl sketch below.
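One thing I still need to check (just an assumption on my part) is whether the lost host’s etcd member stays registered in the cluster, which would explain why a replacement never starts cleanly. Roughly:

```
# Find the etcd container on a surviving host (name assumed to contain "etcd").
ETCD=$(docker ps --filter "name=etcd" --format "{{.ID}}" | head -n 1)

# List the members etcd still knows about (etcdctl v2 API).
docker exec "$ETCD" etcdctl member list

# If the lost host is still listed, removing its member ID might let the
# replacement join:
# docker exec "$ETCD" etcdctl member remove <member-id>
```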

Any idea why?

Can you share what version of etcd your Kubernetes stack is running?
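If it helps, something like this on one of the hosts should print it (assuming the etcd container name contains "etcd"):

```
# Find the etcd container and ask the binary for its version.
ETCD=$(docker ps --filter "name=etcd" --format "{{.ID}}" | head -n 1)
docker exec "$ETCD" etcd --version
```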

I ended up doing LOTS of stuff and it worked…

The only thing left is that the third etcd still shows as WARNING in the Hosts UI, but the container is healthy and behaving well.

I’m pretty sure it’s related to the required UDP ports, but I’ll have to dig more! If you don’t mind, I’ll leave this post open for a couple of days to post my findings; they might help other people in my situation.
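As far as I know, Rancher’s IPsec overlay needs UDP 500 and 4500 open between hosts, so that’s what I’ll probe first (nc’s UDP check is best-effort, not definitive; the target IP is a placeholder):

```
# Rough probe of the IPsec overlay ports from one host to another.
for port in 500 4500; do
  nc -vzu <other-host-private-ip> "$port"
done
```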

So I started to realize that my hosts were joining with their public IPs instead of their private ones (no idea how the Rancher agent actually determines it), so I changed my bootstrap logic to include -e CATTLE_AGENT_IP=${PRIVATE_IP}, and now everything works flawlessly so far!
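In case it helps others, my bootstrap now looks roughly like this (the metadata lookup assumes EC2; the agent tag, server URL and registration token are placeholders):

```
# Grab the instance's private IP from the EC2 metadata service.
PRIVATE_IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)

# Register the host with Rancher, forcing the agent to advertise the private IP.
# Agent version, server URL and token below are placeholders.
sudo docker run -d --privileged \
  -e CATTLE_AGENT_IP="${PRIVATE_IP}" \
  -v /var/run/docker.sock:/var/run/docker.sock \
  rancher/agent:v1.0.2 \
  http://<rancher-server>:8080/v1/scripts/<registration-token>
```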

For some reason, AWS Security Group rules don’t seem to apply to traffic arriving over public IPs (at least the allow-from-security-group feature), which was my issue. I found it by allowing everything from 0.0.0.0/0 instead of from my kubehost security group.
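For completeness, the rule that was not matching looked roughly like this (sg-xxxxxxxx is a placeholder for my kubehost security group ID, ports per the IPsec note above):

```
# Allow the IPsec overlay ports from other members of the same security group.
# sg-xxxxxxxx is a placeholder for the kubehost security group ID.
for port in 500 4500; do
  aws ec2 authorize-security-group-ingress \
    --group-id sg-xxxxxxxx \
    --protocol udp \
    --port "$port" \
    --source-group sg-xxxxxxxx
done
```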

I’ll do some more testing and let you know if it works as expected.

@Yann_David, would you mind filing an issue for the bug you observed regarding IP traffic being routed over public IPs? We get a lot of complaints about this, but some extra context would be helpful.

Thanks

Sure, once I get something running on the setup I will… Still unable to expose RCs/Pods to the outside world :confused:
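I’m trying things along these lines without luck so far (the RC name and ports are just my examples):

```
# Expose a replication controller named "my-rc" via a NodePort service, so it
# should be reachable on each host at the allocated node port.
kubectl expose rc my-rc --port=80 --target-port=80 --type=NodePort

# Check which node port was allocated.
kubectl describe svc my-rc
```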

Github issue: https://github.com/rancher/rancher/issues/5909