Failed to start cluster controllers c-4jnqb: context canceled

Kuo_Hugo · March 20, 2023, 2:55am

This happened again, and there is no way to bring it back. The only option is to reinstall the Rancher server and register the existing k8s cluster with it. The downside is that the k8s cluster can’t be upgraded afterwards.

This is annoying, and there seem to be no updates on similar cases in the forum.

2023/03/20 02:26:23 [INFO] Stopping cluster agent for c-4jnqb
2023/03/20 02:26:23 [ERROR] failed to start cluster controllers c-4jnqb: context canceled

jaiser · June 20, 2023, 1:51pm

Just in case anyone else hits this problem.

Description:
The Rancher server can no longer access the downstream clusters. In the Rancher GUI you’ll see an HTTP 500 error. In the Rancher pod logs (Rancher cluster, namespace cattle-system) you’ll see messages like this:
2023/06/20 12:02:51 [INFO] Stopping cluster agent for
2023/06/20 12:02:51 [ERROR] failed to start cluster controllers : context canceled
2023/06/20 12:02:52 [ERROR] Failed to connect to peer wss:///v3/connect [local ID=]: websocket: bad handshake
2023/06/20 12:02:53 [ERROR] Failed to handle tunnel request from remote address : response 400: cluster not found
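
If you want to check quickly whether your own Rancher pod logs show the same pattern, here is a rough sketch in Python. The three patterns are taken straight from the messages above; the names SIGNATURES, classify and summarize are just made up for this example and are not part of Rancher.

```python
import re

# Error signatures copied from the log messages quoted in this thread.
# The dictionary keys are hypothetical labels chosen for this sketch.
SIGNATURES = {
    "controllers_canceled": re.compile(
        r"failed to start cluster controllers.*context canceled", re.IGNORECASE
    ),
    "bad_handshake": re.compile(r"websocket: bad handshake"),
    "cluster_not_found": re.compile(r"response 400: cluster not found"),
}


def classify(line):
    """Return the labels of all signatures that match one log line."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(line)]


def summarize(lines):
    """Count how often each signature appears in an iterable of log lines."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name in classify(line):
            counts[name] += 1
    return counts
```

Feed summarize() the lines of a saved log file, e.g. summarize(open("rancher.log")), where rancher.log is a placeholder file name. Seeing all three counters above zero would suggest you are hitting the same situation, though that is only a guess based on this one case.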

Solution:
What worked for me was to delete all rancher pods in the Rancher cluster, in namespace cattle-system. I also deleted the rancher-webhook, but that was before I restarted the rancher pods, so maybe it’s not necessary.
Afterwards the downstream clusters were accessible again.
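
For anyone who wants to script the pod deletion, a minimal sketch in Python that builds the kubectl command follows. Note the assumptions: the label selector app=rancher is what I would expect from a Helm-based install, but it is not confirmed anywhere in this thread, so check it first with kubectl get pods -n cattle-system --show-labels. The function names are made up for this example, and it defaults to a dry run so nothing is deleted by accident.

```python
import shlex
import subprocess

NAMESPACE = "cattle-system"   # namespace named in this thread
POD_SELECTOR = "app=rancher"  # ASSUMPTION: label on the rancher pods; verify before use


def delete_pods_command(namespace=NAMESPACE, selector=POD_SELECTOR, context=None):
    """Build the kubectl argument list that deletes the matching pods.

    `context` is an optional kubeconfig context name for the Rancher
    (local) cluster; leave it as None to use the current context.
    """
    cmd = ["kubectl"]
    if context:
        cmd += ["--context", context]
    cmd += ["delete", "pod", "-n", namespace, "-l", selector]
    return cmd


def restart_rancher(dry_run=True, **kwargs):
    """Return the command as a string; run it only when dry_run is False."""
    cmd = delete_pods_command(**kwargs)
    if not dry_run:
        # The Deployment recreates the pods once they are deleted.
        subprocess.run(cmd, check=True)
    return shlex.join(cmd)
```

Calling restart_rancher() only returns the command string so you can review it; restart_rancher(dry_run=False) actually runs it against whatever cluster your kubeconfig points at, so make sure that is the Rancher cluster and not a downstream one.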
