Waiting for cluster agent to connect

Hello,
I'm getting these errors after restarting the Rancher UI container (rancher/rancher:latest):
{"log":"2024/01/26 16:17:20 [ERROR] ClusterController c-q9whh [cluster-provisioner-controller] failed with : Cluster must have at least one etcd plane host: failed to connect to the following etcd host(s) [192.168.254.141]\r\n","stream":"stdout","time":"2024-01-26T16:17:20.399205485Z"}
{"log":"2024/01/26 16:17:20 [ERROR] ClusterController c-q9whh [user-controllers-controller] failed with : failed to start user controllers for cluster c-q9whh: failed to contact server: Get https://192.168.254.152:6443/api/v1/namespaces/kube-system?timeout=30s: waiting for cluster agent to connect\r\n","stream":"stdout","time":"2024-01-26T16:17:20.399925559Z"}
{"log":"2024/01/26 16:17:35 [ERROR] ClusterController c-q9whh [user-controllers-controller] failed with : failed to start user controllers for cluster c-q9whh: failed to contact server: Get https://192.168.254.152:6443/api/v1/namespaces/kube-system?timeout=30s: waiting for cluster agent to connect\r\n","stream":"stdout","time":"2024-01-26T16:17:35.381182729Z"}

The Rancher GUI states: "This cluster is currently Updating; areas that interact directly with it will not be available until the API is ready.

Cluster must have at least one etcd plane host: failed to connect to the following etcd host(s) x.x.x.x"

Originally, everything was running fine months ago. Then I was not able to log in to the GUI until I restarted it, and I had to run the commands below to get new certs:
docker exec -it practical_kowalevski sh -c "ls -l /var/lib/rancher/k3s/server/tls/"
docker exec -it 78d81c71d254 sh -c "rm -rf /var/lib/rancher/k3s/server/tls/*"
docker exec -it practical_kowalevski sh -c "ls -l /var/lib/rancher/k3s/server/tls/"

docker stop 78d81c71d254
docker start 78d81c71d254
docker ps

docker exec -it practical_kowalevski sh -c "ls -l /var/lib/rancher/k3s/server/tls/"

Now I have the errors shown above. Thanks.

  • Which type of downstream cluster do you have? RKE? RKE2? K3s?
  • Do you have a kubeconfig to connect to it directly, without going through Rancher?
  • If yes, check the status of the cattle-cluster-agent pods in the cattle-system namespace, and check their logs. That should tell you more about why your downstream/application cluster cannot connect to Rancher. I suspect a certificate issue: the Rancher ingress now presents a different certificate, which the cattle-cluster-agent pods do not recognize.
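
The checks suggested above could look roughly like the sketch below. It assumes kubectl and openssl are installed, that KUBECONFIG points at the downstream cluster (not at the Rancher container), and that the agent pods carry the label app=cattle-cluster-agent. The function names and the host argument are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Sketch only: run on a machine with a kubeconfig for the downstream cluster.
# Assumption: the agent pods are labelled app=cattle-cluster-agent.

# List the cluster agent pods and print their most recent log lines.
check_cluster_agent() {
  kubectl -n cattle-system get pods -l app=cattle-cluster-agent -o wide
  kubectl -n cattle-system logs -l app=cattle-cluster-agent --tail=50
}

# Print the SHA-256 fingerprint and validity dates of the certificate
# served at host:port.
#   $1 = host (hypothetical placeholder for your Rancher server address)
#   $2 = port (defaults to 443)
show_server_cert() {
  host="$1"
  port="${2:-443}"
  openssl s_client -connect "${host}:${port}" -servername "${host}" </dev/null 2>/dev/null \
    | openssl x509 -noout -fingerprint -sha256 -dates
}
```

Calling check_cluster_agent and then show_server_cert with your Rancher address should show whether the agent logs complain about an unknown or changed certificate, and which certificate Rancher is serving now. If the fingerprint changed after the tls directory was wiped, that would be consistent with the certificate mismatch suspected above.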