I’m trying to create a vSphere Kubernetes cluster using Rancher with a custom-made VMware template. The VMware template runs Ubuntu 20.04.
Rancher connects to vSphere and creates the VM, but then gets stuck waiting for SSH to become available.
SSH works properly on the created VM, and SSH key authentication to it works perfectly from outside.
The cloud-init file is also passed with credentials (SSH key and username) that can be used to make the SSH connection, and the Ubuntu template has the cloud-init utility installed. The following error message appears during provisioning:
Error creating machine: Error detecting OS: Too many retries waiting for SSH to be available. Last error: Maximum number of retries (60) exceeded
If Rancher was able to provision the VM, why is it not able to gain SSH access to it?
Can anyone suggest a workaround to fix the issue?
Thanks in advance
When Rancher deploys a VM, it creates a dedicated SSH key pair that is used only to connect to this node. It connects as the user docker; this is implicit. You can look at the cloud-init file on your node by mounting /dev/sr0:
mount /dev/sr0 /mnt
Depending on your cloud-init config, this file could be skipped. If so, the SSH public key is not added to the authorized_keys file of the docker user.
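To check whether the mounted file actually carries the docker user and its key, a small shell helper along these lines can be used. It is only a sketch: the function name check_user_data is made up for illustration, and the file name user-data is an assumption about what the mounted ISO contains, so adjust the path to what you really see under /mnt.

```shell
# check_user_data FILE: report whether a cloud-config file mentions the
# docker user and contains at least one SSH public key.
# (Hypothetical helper; the patterns are simple text matches, not a YAML parser.)
check_user_data() {
    file="$1"
    if [ ! -r "$file" ]; then
        echo "missing: $file"
        return 2
    fi
    if grep -Eq 'name:[[:space:]]*docker' "$file" \
        && grep -Eq '(ssh-rsa|ssh-ed25519|ecdsa-sha2-)' "$file"; then
        echo "ok: docker user and SSH key present"
        return 0
    fi
    echo "incomplete: docker user or SSH key not found"
    return 1
}
```

After mounting the ISO as shown above, it would be called as check_user_data /mnt/user-data. An "ok" result only says the data was delivered to the VM, not that cloud-init applied it.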
Also, make sure that the IP address obtained by the node is reachable from Rancher. open-vm-tools must be running.
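A quick way to test reachability is to try a TCP connection to the node's SSH port from the machine where Rancher runs. The helper below is a sketch under assumptions: port_reachable is a made-up name, and it relies on bash's /dev/tcp redirection and the coreutils timeout command being available there.

```shell
# port_reachable HOST [PORT]: try to open a TCP connection (default port 22)
# with a 3 second timeout and report the result.
# (Hypothetical helper; needs bash and the coreutils timeout command.)
port_reachable() {
    host="$1"
    port="${2:-22}"
    if timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
        echo "reachable: $host:$port"
        return 0
    fi
    echo "unreachable: $host:$port"
    return 1
}
```

For example, port_reachable 192.0.2.10 would test port 22, where 192.0.2.10 is a placeholder for the node's IP. On the node itself, systemctl status open-vm-tools shows whether the tools service is running.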
I hope this helps.
Thanks for the valuable input. I was able to find the SSH key, and provisioning worked once I copied the SSH key over to the docker user manually.
I suspect the issue is with the cloud-init config that comes along with the OS provisioning.
To get rid of the default cloud-init config, I ran cloud-init clean before converting the VM to a template, so that Rancher’s cloud-init config would be used during cluster provisioning. Still, the SSH key for docker is not created in the docker user’s home directory (/home/docker/.ssh).
Any idea why the docker SSH key is not copied over to the home directory so that the login goes through?
Thanks in advance.
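One thing that may be worth ruling out for the suspicion above: as far as I know, cloud-init clean resets cloud-init's saved state, but by default it does not remove configuration snippets under /etc/cloud/cloud.cfg.d, and a snippet left there by the OS installer can restrict which datasources cloud-init looks at. Whether that is the cause here is an assumption, not something confirmed in this thread. The helper name find_datasource_overrides is made up for illustration.

```shell
# find_datasource_overrides [DIR]: print every .cfg file in DIR that sets
# datasource_list, so it can be reviewed before the VM becomes a template.
# (Hypothetical helper; DIR defaults to cloud-init's drop-in directory.)
find_datasource_overrides() {
    dir="${1:-/etc/cloud/cloud.cfg.d}"
    grep -l 'datasource_list' "$dir"/*.cfg 2>/dev/null
}
```

If a listed file pins datasource_list to a value that excludes the datasource reading the attached ISO (for example [ None ]), cloud-init would not pick up the user data Rancher supplies, which would match the missing key in /home/docker/.ssh.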
OK, thanks for this. Here is something for you.
The “Too many retries waiting for SSH to be available” error suggests that Rancher is struggling to SSH into the VM during provisioning. Check that the cloud-init data is accurate, that the SSH key permissions are correct, and that network/firewall settings allow the connection.
Verify that the SSH daemon allows key-based authentication and isn’t blocking access. Check network connectivity between the VM and the Rancher server. Look through the cloud-init log in /var/log/cloud-init.log for clues.
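The permission and log checks above can be sketched in shell as follows. This is an illustration only: check_ssh_perms is a made-up helper, it assumes GNU stat (as shipped with Ubuntu), and it compares against the conventional 700/600 modes, so a "check" result is a hint to look closer rather than proof of a problem.

```shell
# check_ssh_perms HOME_DIR: report the modes of HOME_DIR/.ssh and its
# authorized_keys file, comparing them with the conventional 700 and 600.
# (Hypothetical helper; uses GNU stat's -c option.)
check_ssh_perms() {
    home_dir="$1"
    ssh_dir="$home_dir/.ssh"
    keys="$ssh_dir/authorized_keys"
    if [ ! -f "$keys" ]; then
        echo "missing: $keys"
        return 2
    fi
    dir_mode=$(stat -c '%a' "$ssh_dir")
    key_mode=$(stat -c '%a' "$keys")
    if [ "$dir_mode" = "700" ] && [ "$key_mode" = "600" ]; then
        echo "ok: $ssh_dir is 700 and authorized_keys is 600"
        return 0
    fi
    echo "check: $ssh_dir is $dir_mode, authorized_keys is $key_mode"
    return 1
}

# Related checks to run as root on the node (left commented out here):
#   check_ssh_perms /home/docker
#   sshd -T | grep -i pubkeyauthentication
#   cloud-init status --long
#   grep -iE 'error|fail|traceback' /var/log/cloud-init.log
```

A "missing" result for /home/docker points back at cloud-init not having applied the user data, which is where the log output is most useful.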