SSH timeout on AWS on specific instance types


We have created a new environment on Rancher 1.5 for testing purposes.

We’ve been able to spin up “t2.micro” instances (eu-west-2) using the AMI ami-51776335 (RancherOS 0.9.1) without any problem.

Now we are trying to clone one of these micro instances, changing only the instance type to a bigger one (e.g. m4.xlarge) and keeping the rest of the options the same. This results in a “Last error: Maximum number of retries (60) exceeded” error in the Rancher logs.

Digging further into the Rancher logs, we see Java errors like this:

2017-04-11 13:26:11,186 ERROR [8a16a01b-1881-45ab-85f4-73c6fb8dc8c5:3556] [host:26] [host.provision] [] [utorService-113] [c.p.e.p.i.DefaultProcessInstanceImpl] Unknown exception io.cattle.platform.util.exception.ExecutionException: Error detecting OS: Too many retries waiting for SSH to be available. Last error: Maximum number of retries (60) exceeded

The biggest instance we could deploy was a “t2.large”. It seems that only “t2” machines work. It could also be related to size, as none of the xlarge types worked (we also tried “c4.xlarge”).
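
To narrow this down, a quick probe like the one below could show whether port 22 ever opens on the failing instances. It is only a rough sketch, not the code Rancher runs: the function name wait_for_ssh and its parameters are made up for illustration, and the default of 60 retries simply mirrors the number in the error message.

```python
import socket
import time


def wait_for_ssh(host, port=22, retries=60, delay=1.0, timeout=3.0):
    """Try to open a TCP connection to host:port up to `retries` times.

    Returns the attempt number that succeeded, or None if every attempt
    failed. This only checks that the port accepts connections; it does
    not authenticate or run any command over SSH.
    """
    for attempt in range(1, retries + 1):
        try:
            # create_connection raises OSError on refusal or timeout.
            with socket.create_connection((host, port), timeout=timeout):
                return attempt
        except OSError:
            time.sleep(delay)
    return None


if __name__ == "__main__":
    import sys

    # Usage: python probe.py <instance-ip>
    if len(sys.argv) > 1:
        result = wait_for_ssh(sys.argv[1])
        if result is None:
            print("port 22 never became reachable")
        else:
            print("port 22 reachable on attempt", result)
```

If the probe fails from a machine that can reach the t2 instances, the instance is probably not bringing up networking or sshd at all, rather than Rancher timing out too early.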

Have you ever experienced this issue?

Thanks a lot!

Worked for us with the new AMI.