I’m running a Rancher Server for testing purposes. I have already provisioned a custom cluster on 3 custom nodes (all CentOS 7.7 1908 VMs with Docker 18.09.9), and it runs fine. However, provisioning a cluster with the vSphere driver does not work.
- Single Node Rancher Test Environment
- Running Rancher v2.3.2 on CentOS 7.7 1908 and Docker 18.09.9 installed on a VM
- Our vSphere environment is vSphere 6.7
- I’m using a recognized CA signed certificate
- The whole system is running behind a Proxy
- CloudCredentials for vSphere
- Node Template with vSphere Credentials
This is the Rancher Template section from my Node Template:
My cloud-config.yml file is hosted on a local web server. I also tried without the cloud-config, but the problem stays the same. The file looks like this:
#cloud-config
rancher:
  network:
    http_proxy: http://192.168.100.101:8080
    https_proxy: http://192.168.100.101:8080
    no_proxy: localhost,127.0.0.1,0.0.0.0,.our.domain,172.16.0.0/21
runcmd:
- echo 172.16.1.245 ranchermgmt.our.domain >> /etc/hosts
ssh_authorized_keys:
- ssh-rsa <my key> me@there
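Since everything runs behind a proxy, it may be worth confirming that the node address Rancher tries to reach is actually covered by the no_proxy list. The following is only a sketch, assuming Python 3 is available somewhere convenient; the function name `covered_by_no_proxy` is made up for illustration. It treats entries as CIDR ranges, exact hosts, or leading-dot domain suffixes. Whether a given client honours CIDR entries in no_proxy varies, so a True result here does not prove that Rancher or Docker bypasses the proxy.

```python
import ipaddress

# The no_proxy value from the cloud-config above.
NO_PROXY = "localhost,127.0.0.1,0.0.0.0,.our.domain,172.16.0.0/21"


def covered_by_no_proxy(host, no_proxy):
    """Return True if host matches an entry of a comma-separated no_proxy list.

    Hypothetical helper: IP addresses are matched against IP or CIDR entries,
    hostnames against exact entries or leading-dot domain suffixes.
    """
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        addr = None  # host is a name, not an IP address

    for entry in (e.strip() for e in no_proxy.split(",")):
        if not entry:
            continue
        if addr is not None:
            try:
                if addr in ipaddress.ip_network(entry, strict=False):
                    return True
            except ValueError:
                pass  # entry is a hostname or domain suffix, not an IP range
        elif entry.startswith("."):
            if host.endswith(entry):
                return True
        elif host == entry:
            return True
    return False
```

For the values in this post, `covered_by_no_proxy("172.16.3.19", NO_PROXY)` returns True, because 172.16.0.0/21 spans 172.16.0.0 to 172.16.7.255.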
The deployment process runs without any errors until it reaches this point:
2019/11/29 13:28:45 [ERROR] cluster [c-nt6mr] provisioning: Failed to apply the ServiceAccount needed for job execution: Post https://172.16.3.19:6443/apis/rbac.authorization.k8s.io/v1/clusterrolebindings?timeout=30s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2019/11/29 13:28:46 [INFO] kontainerdriver rancherkubernetesengine stopped
2019/11/29 13:28:46 [ERROR] ClusterController c-nt6mr [cluster-provisioner-controller] failed with : Failed to apply the ServiceAccount needed for job execution: Post https://172.16.3.19:6443/apis/rbac.authorization.k8s.io/v1/clusterrolebindings?timeout=30s: net/http: request canceled while waiting for connection (Client
Afterwards, Rancher retries the cluster deployment over and over until I stop it.
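Because the error is a timeout while connecting to 172.16.3.19:6443, one thing that can be checked from the Rancher host is whether a direct TCP connection to that port succeeds at all. This is a minimal sketch, assuming Python 3 is installed on the host; `can_connect` is a hypothetical helper name. Python's socket module does not read the http_proxy or https_proxy variables, so this exercises only the direct network path and says nothing about TLS or the Kubernetes API itself.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a direct TCP connection to host:port can be opened.

    Hypothetical helper: it only completes the TCP handshake and then
    closes the socket. No TLS or HTTP request is sent.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

Calling `can_connect("172.16.3.19", 6443)` on the Rancher host would show whether the port from the log is reachable without the proxy. If it returns False, a firewall or routing issue between the Rancher VM and the new node would be one possible explanation, though not the only one.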
I’m totally stuck with this. Thanks in advance for your help.