Hi, we have been provisioning RKE clusters using service principals for a while, but due to security constraints on the Azure side we are now switching to managed identities, following the RKE documentation. We have assigned a user-assigned managed identity with Contributor access to each VM, but we still get the errors below even though our configuration follows the RKE documentation.
Error:
time="2024-06-26T14:59:23Z" level=info msg="Running RKE version: v1.5.8"
2024-06-26T14:59:23.7608503Z time="2024-06-26T14:59:23Z" level=info msg="Initiating Kubernetes cluster"
2024-06-26T14:59:23.8718073Z time="2024-06-26T14:59:23Z" level=info msg="[dialer] Setup tunnel for host [20.12.8.32]"
2024-06-26T14:59:23.8718772Z time="2024-06-26T14:59:23Z" level=info msg="[dialer] Setup tunnel for host [20.12.15.244]"
2024-06-26T14:59:23.8719351Z time="2024-06-26T14:59:23Z" level=info msg="[dialer] Setup tunnel for host [20.7.36.242]"
2024-06-26T15:06:09.2047067Z time="2024-06-26T15:06:09Z" level=warning msg="Failed to set up SSH tunneling for host [20.12.15.244]: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Failed to dial ssh using address [20.12.15.244:22]: dial tcp 20.12.15.244:22: connect: connection timed out"
2024-06-26T15:06:09.2049609Z time="2024-06-26T15:06:09Z" level=warning msg="Failed to set up SSH tunneling for host [20.7.36.242]: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Failed to dial ssh using address [20.7.36.242:22]: dial tcp 20.7.36.242:22: connect: connection timed out"
2024-06-26T15:06:09.2051069Z time="2024-06-26T15:06:09Z" level=warning msg="Failed to set up SSH tunneling for host [20.12.8.32]: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Failed to dial ssh using address [20.12.8.32:22]: dial tcp 20.12.8.32:22: connect: connection timed out"
2024-06-26T15:06:09.2051948Z time="2024-06-26T15:06:09Z" level=warning msg="Removing host [20.12.15.244] from node lists"
2024-06-26T15:06:09.2052539Z time="2024-06-26T15:06:09Z" level=warning msg="Removing host [20.7.36.242] from node lists"
2024-06-26T15:06:09.2053479Z time="2024-06-26T15:06:09Z" level=warning msg="Removing host [20.12.8.32] from node lists"
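All three warnings above are TCP connection timeouts on port 22. To check this independently of RKE, a plain TCP reachability test could be run from the machine that runs RKE. The following is only a minimal sketch: the function names `check_tcp_port` and `report` are hypothetical helpers written for this post, not part of RKE or any Azure tooling, and the 5-second timeout is an arbitrary assumption.

```python
import socket


def check_tcp_port(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests TCP reachability,
    not SSH authentication or Docker access.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False


def report(hosts, port: int = 22, timeout: float = 5.0) -> dict:
    """Check each host in turn and return a {host: reachable} mapping."""
    results = {}
    for host in hosts:
        results[host] = check_tcp_port(host, port, timeout)
        state = "reachable" if results[host] else "NOT reachable"
        print(f"{host}:{port} {state}")
    return results
```

Calling `report(["20.12.8.32", "20.12.15.244", "20.7.36.242"])` from the same machine would show whether port 22 can be reached at all. If it cannot, the cause may lie in network access to the VMs rather than in the `cloud_provider` settings, although that is an assumption we have not confirmed.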
The cluster configuration we used is as follows:
ssh_key_path: rkeKey
nodes:
  - address: $controlPlaneIp
    hostname_override: rke-control
    user: rancher
    role:
      - controlplane
      - etcd
  - address: $workerIp1
    hostname_override: rke-worker-1
    user: rancher
    role:
      - worker
  - address: $workerIp2
    hostname_override: rke-worker-2
    user: rancher
    role:
      - worker
cloud_provider:
  name: azure
  azureCloudProvider:
    useManagedIdentityExtension: true
    userAssignedIdentityID: $AZ_WORKLOAD_IDENTITY_ID
    aadClientId: $AZ_INFRA_CLIENT_ID
    location: eastus
    resourceGroup: $RESOURCE_GROUP_RKE
    subnetName: rke-subnet
    subscriptionId: $AZ_SUBSCRIPTION_ID_PROV
    vnetName: rke-vnet
    tenantId: $AZ_TENANT_ID_PROV
    vmType: $RKE_VM_SIZE
    LoadBalancerSku: $RKE_LOADBALANCER_SKU
ignore_docker_version: true
kubernetes_version: $RKE_K8S_VERSION
Could you please help us resolve this issue as soon as possible?
Thanks!