Unable to SSH into VMware nodes

I am unable to SSH into any cluster node. I downloaded the keys from the UI and ran the following from an OS X terminal, with the key in the current directory:
ssh -i id_rsa rancher@

My Rancher version is v2.0.7.

OS ISO URL:
https://releases.rancher.com/os/latest/rancheros-vmware.iso

I also tried using a cloud-config.yml (repurposed from my RancherOS install of Rancher v2), served from an internal URL.
Example:
cloud init (from node template)
http://repo/confs/cloud-config.yml

I found an old post on here about a similar issue, but with AWS. I tried the same steps with no luck.

The nodes are in same subnet as my Mac. I am prompted for a password.

Any questions or suggestions are welcome.

I guess you should place your cloud-config.yml on a web server that the node itself can reach, and use the corresponding HTTP URL instead of an internal URL.

It works for my nodes (also VMware).
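The suggestion above depends on the node being able to fetch the file and on the file being recognised as a cloud-config. A minimal Python sketch of that check follows. The function names are hypothetical, and the rule that the first line must be exactly #cloud-config is an assumption taken from the general cloud-config convention, not something confirmed in this thread.

```python
import urllib.request


def fetch_text(url: str, timeout: float = 5.0) -> str:
    """Download the URL body as text.

    Raises urllib.error.URLError (or HTTPError) if the URL is unreachable.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def has_cloud_config_header(body: str) -> bool:
    """Return True when the very first line is exactly '#cloud-config'."""
    lines = body.splitlines()
    return bool(lines) and lines[0].rstrip() == "#cloud-config"
```

Calling has_cloud_config_header(fetch_text(url)) from a machine on the same network as the nodes is the useful test, since a URL that resolves from a workstation may still be unreachable from the VM.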

Thanks for your response! I cloned the node template, pointed it at a URL that is publicly exposed, and created a new cluster. The cluster now gets stuck while creating, so you might be on to something. Strange, though, as this internal URL worked for the RancherOS install (used to install Rancher 2.x). Additionally, it appears that the cloud-config used to stand up Rancher 2.x is much different from what you would need for rancheros-vmware.iso to stand up etcd, control-plane, and worker nodes (servers in VMware).

"
Error with pre-create check: “ServerFaultCode: Cannot complete login due to an incorrect user name or password.”; Timeout waiting for ssh key
"

Would you mind sharing the cloud-config you use to stand up VMs so I can have a reference?

In the meantime, I will start considering this:
https://pcocc.readthedocs.io/en/latest/manpages/man7/newvm-tutorial.html#newvm

Thanks!

Which vSphere version do you use?

My cloud-config:

#cloud-config

ssh_authorized_keys:

rancher:
  console: ubuntu
  resize_device: /dev/sda

I use vSphere 6.7, which sits on top of an ESXi host running version 6.5. My nodes are running the default VMware ISO for RancherOS (https://releases.rancher.com/os/latest/rancheros-vmware.iso).

"
Error with pre-create check: “ServerFaultCode: Cannot complete login due to an incorrect user name or password.”; Timeout waiting for ssh key
"

Are you still getting this error?

No, because I dropped the cloud-config I was using previously and tried what you posted here. So it seems the config is not being carried over to the VMs it creates.

For example, I used your cloud-config stanza, removed the other parts, and it built the node. Some errors showed in red, but they flashed by before I could read them all or take a screenshot, so I guess I need to find the log for the cluster creation process. I jumped onto the box it created using the VMware console (the same as using a crash cart in front of a physical server) and checked /home/rancher and /root for a .ssh folder, to see whether an authorized_keys file had been created with my public key in it, but no.

/home/rancher has no .ssh folder
/root does have a .ssh folder but no authorized_keys file.

(attached image is from the node it created, the cloud-config file present there)

I assume whatever you put in your cloud init field should push to the node (VM) you create, right?

Rancher V2 -> Node Templates: Rancher_Template_2GB_Workers_Cloud-Init-On
edit > Account Access

  • IP/account/PW port for vcenter

edit > Instance Options

edit > Scheduling

  • Data Center ->
  • Pool ->
  • Host ->
  • Network -> Home10G_PG (This is pulled from vcenter)
  • Data Store -> SSD_RAID5 (This is pulled from vcenter)
  • Folder ->

Rancher: add cluster > vSphere > cluster options >

  • Require a supported Docker version
  • Kubernetes Version v1.11.1-rancher1-1
  • Network provider -> Canal
  • Nginx Ingress -> Enabled
  • Cloud Provider -> None (for now, will add one later if this other piece works)
  • Docker Root Directory -> /var/lib/docker
  • Metrics Server Monitoring -> Enabled
  • Project Network Isolation -> Disabled
  • Pod Security Policy Support -> Disabled
  • Default Pod Security Policy -> None (no options to choose)

So I tailed the rancher docker container to get the logs for the cluster creation:
2018/08/29 19:56:33 [INFO] [mgmt-cluster-rbac-delete] Creating namespace c-jfqk9
2018/08/29 19:56:33 [INFO] [mgmt-cluster-rbac-delete] Creating Default project for cluster c-jfqk9
2018/08/29 19:56:33 [INFO] [mgmt-cluster-rbac-delete] Creating System project for cluster c-jfqk9
2018/08/29 19:56:33 [INFO] [mgmt-project-rbac-create] Creating namespace p-2z8ws
2018/08/29 19:56:33 [INFO] [mgmt-project-rbac-create] Creating creator projectRoleTemplateBinding for user user-pkw9c for project p-2z8ws
2018/08/29 19:56:33 [INFO] [mgmt-project-rbac-create] Creating namespace p-5zp4m
2018/08/29 19:56:33 [INFO] [mgmt-cluster-rbac-delete] Updating cluster c-jfqk9
2018/08/29 19:56:33 [INFO] [mgmt-project-rbac-create] Creating creator projectRoleTemplateBinding for user user-pkw9c for project p-5zp4m
2018/08/29 19:56:33 [INFO] [mgmt-project-rbac-create] Creating creator clusterRoleTemplateBinding for user user-pkw9c for cluster c-jfqk9
2018/08/29 19:56:33 [INFO] [mgmt-auth-crtb-controller] Creating clusterRole c-jfqk9-clusterowner
2018/08/29 19:56:33 [INFO] [mgmt-auth-crtb-controller] Setting InitialRolesPopulated condition on project p-5zp4m
2018/08/29 19:56:33 [INFO] [mgmt-auth-crtb-controller] Creating clusterRoleBinding for membership in cluster c-jfqk9 for subject user-pkw9c
2018/08/29 19:56:33 [INFO] [mgmt-auth-crtb-controller] Setting InitialRolesPopulated condition on cluster
2018/08/29 19:56:33 [INFO] [mgmt-cluster-rbac-delete] Updating cluster c-jfqk9
2018/08/29 19:56:33 [INFO] [mgmt-auth-prtb-controller] Creating clusterRole p-5zp4m-projectowner
2018/08/29 19:56:33 [INFO] [mgmt-project-rbac-create] Updating project p-5zp4m
2018/08/29 19:56:33 [INFO] [mgmt-auth-crtb-controller] Creating role cluster-owner in namespace c-jfqk9
2018/08/29 19:56:33 [INFO] [mgmt-auth-crtb-controller] Setting InitialRolesPopulated condition on project p-2z8ws
2018/08/29 19:56:33 [INFO] [mgmt-auth-prtb-controller] Creating roleBinding for membership in project p-5zp4m for subject user-pkw9c
2018/08/29 19:56:33 [INFO] [mgmt-auth-crtb-controller] Creating roleBinding for subject user-pkw9c with role cluster-owner in namespace
2018/08/29 19:56:33 [INFO] [mgmt-project-rbac-create] Updating project p-2z8ws
2018/08/29 19:56:33 [INFO] [mgmt-auth-prtb-controller] Creating clusterRole p-2z8ws-projectowner
2018/08/29 19:56:33 [INFO] [mgmt-auth-prtb-controller] Creating clusterRole c-jfqk9-clustermember
2018/08/29 19:56:33 [INFO] [mgmt-auth-crtb-controller] Creating role cluster-owner in namespace p-5zp4m
2018/08/29 19:56:33 [INFO] [mgmt-auth-prtb-controller] Creating roleBinding for membership in project p-2z8ws for subject user-pkw9c
2018/08/29 19:56:33 [INFO] [mgmt-auth-prtb-controller] Creating clusterRoleBinding for membership in cluster c-jfqk9 for subject user-pkw9c
2018/08/29 19:56:33 [INFO] [mgmt-project-rbac-create] Updating project p-5zp4m
2018/08/29 19:56:33 [INFO] [mgmt-auth-crtb-controller] Creating roleBinding for subject user-pkw9c with role cluster-owner in namespace
2018/08/29 19:56:33 [INFO] [mgmt-auth-prtb-controller] Creating role project-owner in namespace c-jfqk9
2018/08/29 19:56:33 [INFO] [mgmt-auth-prtb-controller] Updating clusterRoleBinding clusterrolebinding-nfzsm for cluster membership in cluster c-jfqk9 for subject user-pkw9c
2018/08/29 19:56:33 [INFO] [mgmt-auth-crtb-controller] Creating role cluster-owner in namespace p-2z8ws
2018/08/29 19:56:33 [INFO] [mgmt-project-rbac-create] Updating project p-2z8ws
2018/08/29 19:56:33 [INFO] [mgmt-auth-prtb-controller] Creating role project-owner in namespace c-jfqk9
2018/08/29 19:56:33 [INFO] [mgmt-auth-prtb-controller] Creating roleBinding for subject user-pkw9c with role project-owner in namespace
2018/08/29 19:56:33 [INFO] [mgmt-auth-crtb-controller] Creating roleBinding for subject user-pkw9c with role cluster-owner in namespace
2018/08/29 19:56:34 [INFO] [mgmt-auth-prtb-controller] Creating role project-owner in namespace p-5zp4m
2018/08/29 19:56:34 [ERROR] ProjectRoleTemplateBindingController p-2z8ws/creator-project-owner [mgmt-auth-prtb-controller] failed with : couldn’t create role project-owner: roles.rbac.authorization.k8s.io “project-owner” already exists
2018/08/29 19:56:34 [INFO] [mgmt-auth-prtb-controller] Creating roleBinding for subject user-pkw9c with role project-owner in namespace
2018/08/29 19:56:34 [INFO] [mgmt-auth-prtb-controller] Creating role project-owner in namespace p-2z8ws
2018/08/29 19:56:34 [INFO] [mgmt-auth-prtb-controller] Creating role admin in namespace p-5zp4m
2018/08/29 19:56:34 [INFO] [mgmt-cluster-rbac-delete] Updating cluster c-jfqk9
2018/08/29 19:56:34 [INFO] [mgmt-auth-prtb-controller] Creating roleBinding for subject user-pkw9c with role admin in namespace
2018/08/29 19:56:34 [INFO] [mgmt-auth-prtb-controller] Creating role admin in namespace p-2z8ws
2018/08/29 19:56:34 [INFO] [mgmt-auth-prtb-controller] Creating roleBinding for subject user-pkw9c with role project-owner in namespace
2018/08/29 19:56:34 [INFO] [mgmt-auth-prtb-controller] Creating roleBinding for subject user-pkw9c with role project-owner in namespace
2018/08/29 19:56:34 [INFO] [mgmt-cluster-rbac-delete] Updating cluster c-jfqk9
2018/08/29 19:56:34 [INFO] [mgmt-auth-prtb-controller] Creating roleBinding for subject user-pkw9c with role admin in namespace
2018/08/29 19:56:34 [INFO] [mgmt-cluster-rbac-delete] Updating cluster c-jfqk9
2018/08/29 19:56:35 [INFO] stdout: Creating CA: management-state/node/nodes/box01/certs/ca.pem
2018/08/29 19:56:35 [INFO] stdout: Creating client certificate: management-state/node/nodes/box01/certs/cert.pem
2018/08/29 19:56:35 [INFO] stdout: Running pre-create checks…
2018/08/29 19:56:36 [INFO] stdout: Creating machine…
2018/08/29 19:56:36 [INFO] stdout: (box01) Image cache directory does not exist, creating it at management-state/node/nodes/box01/cache…
2018/08/29 19:56:36 [INFO] stdout: (box01) Downloading management-state/node/nodes/box01/cache/boot2docker.iso from https://releases.rancher.com/os/latest/rancheros-vmware.iso
2018/08/29 19:56:39 [INFO] stdout: (box01) 0%!.(MISSING)…10%!.(MISSING)…20%!.(MISSING)…30%!.(MISSING)…40%!.(MISSING)…50%!.(MISSING)…60%!.(MISSING)…70%!.(MISSING)…80%!.(MISSING)…90%!.(MISSING)…100%!(NOVERB)
2018/08/29 19:56:39 [INFO] stdout: (box01) Generating SSH Keypair…
2018/08/29 19:56:39 [INFO] stdout: (box01) Creating VM…
2018/08/29 19:56:40 [INFO] stdout: (box01) Uploading Boot2docker ISO …
2018/08/29 19:56:45 [INFO] stdout: (box01) adding network: Home10G_PG
2018/08/29 19:56:45 [INFO] stdout: (box01) Reconfiguring VM
2018/08/29 19:56:45 [INFO] stdout: (box01) Setting disk.enableUUID to TRUE
2018/08/29 19:56:45 [INFO] stdout: (box01) setting guestinfo.cloud-init.data.url to https://repo.[external-facing-URL]/confs/cloud-config.yml
2018/08/29 19:56:45 [INFO] stdout: (box01)
2018/08/29 19:56:47 [INFO] stdout: (box01) Waiting for VMware Tools to come online…


E0829 20:02:47.973871 6 reflector.go:322] github.com/rancher/rancher/vendor/github.com/rancher/norman/controller/generic_controller.go:139: Failed to watch *v1.ClusterRoleBinding: Get https://10.0.0.230:6443/apis/rbac.authorization.k8s.io/v1/watch/clusterrolebindings?resourceVersion=705&timeout=1h0m0s&timeoutSeconds=517: tunnel disconnect
E0829 20:02:47.973930 6 reflector.go:322] github.com/rancher/rancher/vendor/github.com/rancher/norman/controller/generic_controller.go:139: Failed to watch *v1beta2.ReplicaSet: Get https://10.0.0.230:6443/apis/apps/v1beta2/watch/replicasets?resourceVersion=795&timeout=1h0m0s&timeoutSeconds=484: tunnel disconnect
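Since the INFO lines bury the few entries that matter, here is a small Python sketch for pulling the [ERROR] entries and the cloud-init URL out of a log like the one above. The function names are hypothetical, and the regular expressions are only inferred from the lines pasted in this post. The E0829 lines use a different format and are deliberately not matched.

```python
import re

# Matches lines such as: 2018/08/29 19:56:34 [ERROR] some message
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>[A-Z]+)\] (?P<msg>.*)$"
)
# Matches the line where the cloud-init URL is written into the VM settings.
CLOUD_INIT_URL = re.compile(r"setting guestinfo\.cloud-init\.data\.url to (\S+)")


def filter_log(lines, levels=("ERROR",)):
    """Return (timestamp, level, message) tuples for lines at the given levels."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line.rstrip("\n"))
        if m and m.group("level") in levels:
            hits.append((f"{m['date']} {m['time']}", m["level"], m["msg"]))
    return hits


def cloud_init_urls(lines):
    """Return every URL that the log reports as the cloud-init data URL."""
    return [m.group(1) for line in lines if (m := CLOUD_INIT_URL.search(line))]
```

Both functions accept any iterable of lines, so they can be fed sys.stdin when the container log is piped in. The URL helper is mainly a way to confirm which cloud-config address the node was actually given.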

I don’t see anything in the log that points to a problem with the cloud-config. Here it is for reference:
#cloud-config
ssh-authorized-keys:
- [pub-key-bla-bla-bla…]

rancher:
  console: ubuntu
  resize_device: /dev/sda

SOLVED:
I had dashes instead of underscores =/

I had:
ssh-authorized-keys

Should have been:
ssh_authorized_keys

I should have had it correct at one time, and probably missed it the last time because of bad multitasking. I copied the file from my Apache server to my RancherOS host (the Rancher UI server) and ran:
ros config validate -i cloud-config.yml
which yielded:
ERRO[0000] ssh-authorized-keys: Additional property ssh-authorized-keys is not allowed

When I read that on-screen, I was like… why dashes?? Got it!
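For anyone who wants to catch this class of typo before the file reaches a node, here is a rough Python sketch that flags top-level keys which would match a known key if their dashes were underscores. The function name is hypothetical, and KNOWN_KEYS is only an illustrative subset, not the full RancherOS schema. ros config validate remains the authoritative check.

```python
import re

# Illustrative subset of top-level cloud-config keys (assumption, not exhaustive).
KNOWN_KEYS = {"ssh_authorized_keys", "write_files", "hostname", "runcmd", "rancher"}

# Only unindented "key:" lines match, so list items and nested keys are skipped.
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z0-9_-]+):")


def suspicious_keys(text):
    """Return (found, suggested) pairs for top-level keys written with dashes."""
    findings = []
    for line in text.splitlines():
        m = TOP_LEVEL_KEY.match(line)
        if not m:
            continue
        key = m.group(1)
        fixed = key.replace("-", "_")
        if key != fixed and fixed in KNOWN_KEYS:
            findings.append((key, fixed))
    return findings
```

An empty result only means no known key was misspelled with dashes. It does not prove the file is valid.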

Embarrassing, but this will be worth it to have this functionality in the long run.

Cheers!