Unable to use user-assigned managed identity authentication to set up an RKE cluster

Hi, we have been provisioning RKE clusters with service principals for a while, but due to security constraints from Azure we are now switching to managed identity, following the RKE documentation. We have assigned a user-assigned managed identity with Contributor access to each VM, but we still get the errors below even though our configuration follows the RKE documentation.


time="2024-06-26T14:59:23Z" level=info msg="Running RKE version: v1.5.8"
2024-06-26T14:59:23.7608503Z time="2024-06-26T14:59:23Z" level=info msg="Initiating Kubernetes cluster"
2024-06-26T14:59:23.8718073Z time="2024-06-26T14:59:23Z" level=info msg="[dialer] Setup tunnel for host []"
2024-06-26T14:59:23.8718772Z time="2024-06-26T14:59:23Z" level=info msg="[dialer] Setup tunnel for host []"
2024-06-26T14:59:23.8719351Z time="2024-06-26T14:59:23Z" level=info msg="[dialer] Setup tunnel for host []"
2024-06-26T15:06:09.2047067Z time="2024-06-26T15:06:09Z" level=warning msg="Failed to set up SSH tunneling for host []: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Failed to dial ssh using address []: dial tcp connect: connection timed out"
2024-06-26T15:06:09.2049609Z time="2024-06-26T15:06:09Z" level=warning msg="Failed to set up SSH tunneling for host []: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Failed to dial ssh using address []: dial tcp connect: connection timed out"
2024-06-26T15:06:09.2051069Z time="2024-06-26T15:06:09Z" level=warning msg="Failed to set up SSH tunneling for host []: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Failed to dial ssh using address []: dial tcp connect: connection timed out"
2024-06-26T15:06:09.2051948Z time="2024-06-26T15:06:09Z" level=warning msg="Removing host [] from node lists"
2024-06-26T15:06:09.2052539Z time="2024-06-26T15:06:09Z" level=warning msg="Removing host [] from node lists"
2024-06-26T15:06:09.2053479Z time="2024-06-26T15:06:09Z" level=warning msg="Removing host [] from node lists"
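
All three warnings in the log are SSH dial timeouts, so one check that may help narrow this down is whether the machine running RKE can open a TCP connection to each node's SSH port at all. The following is only a minimal sketch under stated assumptions: the nodes are assumed to accept SSH on the default port 22, the addresses are passed on the command line, and `ssh_port_reachable` is a hypothetical helper name that is not part of RKE.

```python
import socket
import sys


def ssh_port_reachable(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; port 22 is an assumption.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Usage: python check_ssh.py <node-ip> [<node-ip> ...]
    for host in sys.argv[1:]:
        status = "reachable" if ssh_port_reachable(host) else "NOT reachable"
        print(f"{host}:22 {status}")
```

If a node is reported as not reachable from the machine that runs `rke up`, the timeout would occur regardless of the authentication method configured for the cloud provider.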

The cluster configuration we used is as follows:

ssh_key_path: rkeKey
nodes:
    - address: $controlPlaneIp
      hostname_override: rke-control
      user: rancher
      role:
        - controlplane
        - etcd
    - address: $workerIp1
      hostname_override: rke-worker-1
      user: rancher
      role:
        - worker
    - address: $workerIp2
      hostname_override: rke-worker-2
      user: rancher
      role:
        - worker
cloud_provider:
    name: azure
    azureCloudProvider:
      useManagedIdentityExtension: true
      userAssignedIdentityID: $AZ_WORKLOAD_IDENTITY_ID
      aadClientId: $AZ_INFRA_CLIENT_ID
      location: eastus
      resourceGroup: $RESOURCE_GROUP_RKE
      subnetName: rke-subnet
      subscriptionId: $AZ_SUBSCRIPTION_ID_PROV
      vnetName: rke-vnet
      tenantId: $AZ_TENANT_ID_PROV
      vmType: $RKE_VM_SIZE
      LoadBalancerSku: $RKE_LOADBALANCER_SKU
ignore_docker_version: true
kubernetes_version: $RKE_K8S_VERSION

Could you please help us resolve this issue as soon as possible?