Node creation stuck with message WaitingForNodeRef

Hi!

I am going crazy trying to work out why new nodes provisioned by Rancher are stuck with the message “Waiting for Node Ref”. The host is up and SSH is ready; I can log in remotely, but nothing happens to continue the installation and configuration of RKE2. I do not know what I can do or check to find out what is failing.

status:
  bootstrapReady: true
  conditions:
    - lastTransitionTime: '2024-10-03T12:19:56Z'
      status: 'True'
      type: Ready
    - lastTransitionTime: '2024-10-03T12:19:46Z'
      status: 'True'
      type: BootstrapReady
    - lastTransitionTime: '2024-10-03T12:23:45Z'
      status: 'True'
      type: InfrastructureReady
    - lastTransitionTime: '2024-10-03T12:19:46Z'
      reason: WaitingForNodeRef
      severity: Info
      status: 'False'
      type: NodeHealthy
  lastUpdated: '2024-10-03T12:19:46Z'
  observedGeneration: 2
  phase: Provisioning
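
For anyone who wants to read a status block like this programmatically, here is a minimal Python sketch that lists the conditions whose status is not 'True'. The dictionary is copied by hand from the output above, and the function name `blocking_conditions` is just a name I made up for illustration.

```python
# Conditions copied by hand from the machine status shown above.
STATUS = {
    "phase": "Provisioning",
    "conditions": [
        {"type": "Ready", "status": "True"},
        {"type": "BootstrapReady", "status": "True"},
        {"type": "InfrastructureReady", "status": "True"},
        {"type": "NodeHealthy", "status": "False",
         "reason": "WaitingForNodeRef", "severity": "Info"},
    ],
}


def blocking_conditions(status):
    """Return (type, reason) for every condition whose status is not 'True'."""
    return [
        (cond["type"], cond.get("reason", ""))
        for cond in status.get("conditions", [])
        if cond.get("status") != "True"
    ]


if __name__ == "__main__":
    for cond_type, reason in blocking_conditions(STATUS):
        print(f"{cond_type}: {reason}")
```

With the status above, the only condition it reports is NodeHealthy with reason WaitingForNodeRef; everything else (bootstrap, infrastructure) is already 'True'.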

My setup:

Aye,

The problem is at the CPI (cloud provider interface) layer, in its communication with the infrastructure provider.
What do you mean by Cloud VMware?

Sorry, I explained it badly. I mean that Rancher uses the vSphere cloud provider to deploy RKE2 nodes in VMware.

I am using the “Default RKE2 Embedded” CPI. At the beginning I used the vSphere CSI/CPI, but later I migrated to Longhorn for PVs and modified the cluster to use the default RKE2 CPI provider, following your recommendations.

Something seems to be wrong with the cluster, because new nodes are not deployed:

Are there any logs I could check, or any checks I could run, to find out what is wrong?

You can check the Rancher logs (RMS); try changing the verbosity.
Show what the Provider ID is for all of this cluster’s machines (Cluster Management → Machines).
From what you have attached, it appears that the provider for this cluster is VMware.
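
Once you have the Provider IDs, comparing their prefixes is a quick way to see whether some machines were registered by a different provider than others, or have no ID at all. A rough Python sketch of that comparison follows. The machine names and ID values are made-up placeholders, not taken from this cluster, and I am assuming the IDs have the usual `scheme://value` shape.

```python
from collections import defaultdict


def group_by_scheme(provider_ids):
    """Group machine names by the scheme (text before '://') of their provider ID.

    Machines with an empty or malformed ID are collected under '(none)'.
    """
    groups = defaultdict(list)
    for machine, provider_id in provider_ids.items():
        if provider_id and "://" in provider_id:
            scheme = provider_id.split("://", 1)[0]
        else:
            scheme = "(none)"
        groups[scheme].append(machine)
    return dict(groups)


# Hypothetical values for illustration only; replace them with the
# Provider IDs shown under Cluster Management -> Machines.
EXAMPLE = {
    "worker-1": "vsphere://placeholder-uuid-1",
    "worker-2": "vsphere://placeholder-uuid-2",
    "worker-3": "rke2://worker-3",
    "worker-4": "",
}

if __name__ == "__main__":
    for scheme, machines in group_by_scheme(EXAMPLE).items():
        print(f"{scheme}: {', '.join(machines)}")
```

If the old machines and the stuck ones land in different groups, that would be consistent with the cloud provider having been changed after the cluster was created, but treat it as a hint to investigate, not a diagnosis.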

Thanks @R2D2 for your comments! I will try to check it next week. As soon as I find something, I will post feedback in case it is useful for other users!