Unable to deploy to EC2

So I’ve been trying to replicate this setup while I familiarise myself with Rancher, and I’ve noticed that since deploying 0.51 (on ami-0dd6bd6d) again after the Christmas break, cross-host communication no longer works.

After deploying three hosts into us-west-1 and allowing Rancher to create the SG, the hosts can’t communicate via their Rancher network addresses.

None of the machines appear to be listening on :22, so I can’t connect to diagnose. My Rancher server in the same SG is reachable via SSH, which rules out the Amazon network as the cause, at least for that machine. Does Rancher change something in the Ubuntu image and move SSH to another port?

Hi

According to the link to the WordPress setup you’re replicating, you have to add the rules to the SG yourself. Did you add UDP ports 500 and 4500?
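If it helps, here is a minimal sketch of adding those two rules with the AWS CLI. It assumes the AWS CLI is installed and configured for the right region. The function name `open_ipsec_ports`, the group ID, and the CIDR are placeholders of mine, not anything Rancher provides, so substitute whatever source range your hosts actually reach each other from.

```shell
# open_ipsec_ports GROUP_ID SOURCE_CIDR
# Hypothetical helper: adds inbound UDP 500 and 4500 rules to an
# existing security group using the AWS CLI.
open_ipsec_ports() {
    group_id=$1
    source_cidr=$2
    for port in 500 4500
    do
        aws ec2 authorize-security-group-ingress \
            --group-id "$group_id" \
            --protocol udp \
            --port "$port" \
            --cidr "$source_cidr" || return 1
    done
}

# Example with placeholder values (not real IDs):
# open_ipsec_ports sg-xxxxxxxx 203.0.113.0/24
```

I used a CIDR rather than a source-group reference because group references only match traffic arriving from the instances’ private addresses, which may not be what your hosts use to reach each other.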

Rancher doesn’t move SSH to another port. Have you checked the security group to make sure there have been no changes to it? If you re-use the rancher-machine security group but someone else has changed its settings, we don’t update it to fix them.

Which AMI are you using to deploy the hosts? I noticed you used RancherOS for Rancher, so I’m not sure if you used that for the hosts as well. If you used RancherOS, you need to SSH in as the rancher user.

Did you also download the machine config to SSH into the hosts? Please make sure to use the correct user for your Linux distribution.
http://docs.rancher.com/rancher/rancher-ui/infrastructure/hosts/#accessing-hosts-from-the-cloud-providers

Yep, in the end the solution was to configure Rancher to use the 172.x addresses. This still resulted in intermittent connectivity. After restarting a host I needed to restart some services several times before they were able to contact their peers.

The RancherOS machine is in the same SG, so being able to SSH to it suggests, I guess, that SG policy isn’t what’s blocking access to the others. The other machines were deployed using Rancher. Judging by the defaults in the fields, that means they were Ubuntu?

I was able to SSH to the RancherOS box, but not the machines it deployed.

We’ve halted work on this for now and plan to review it next week.

Could you post your SG configuration? I’ve never had to reconfigure Rancher address ranges.

Regarding services not being able to contact their peers, you may need to actually wait until a service is available, since Docker starts the container(s) very quickly. I have loops doing something like:

until ping -c 1 my-required-service
do
    echo "waiting for required service to appear (1 sec)"
    sleep 1
done
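
The loop above waits forever if the peer never appears. A variant with a timeout might look like the sketch below. The `wait_for` function name and the 60-second limit are just illustrative choices, and note that a successful ping only shows the container is reachable, not that the service inside it is ready.

```shell
# wait_for TIMEOUT_SECONDS COMMAND [ARGS...]
# Retries COMMAND once per second until it succeeds, giving up
# once TIMEOUT_SECONDS retries have been used.
wait_for() {
    timeout_s=$1
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1
    do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            echo "gave up after ${timeout_s}s waiting for: $*" >&2
            return 1
        fi
        echo "waiting for required service to appear (1 sec)"
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example: wait up to 60 seconds for the peer to answer a ping.
# wait_for 60 ping -c 1 my-required-service
```

Because the check is passed in as a command, the same function could wrap a port probe or an HTTP request instead of ping if you need a stricter readiness test.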