Hi guys. I lost one of the control-plane nodes, unfortunately the leading one, so I no longer have access to what used to be the apiEndpoint, say 222.222.232.232, but I still have another control-plane node. However, when I try to access anything via the kube API through Rancher, I get an error like "error trying to reach service: 222.222.232.232:52792: i/o timeout" has prevented the request from succeeding, or similar, because something is still pointing to the old node. In the GUI the node is stuck in Removing with "Waiting on node-controller", and the cluster shows "This cluster is currently Error; areas that interact directly with it will not be available until the API is ready." I'm not sure how to proceed. Is there a way to manually force usage of the other server to restore Rancher's connection to the cluster? Thank you in advance.
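For context, I can still reach the API directly through the surviving node, so the control plane itself is alive. Something like this works (111.111.111.111 is a hypothetical IP for my remaining control-plane node):

```bash
# Hypothetical IP for the surviving control-plane node; 6443 is the default
# kube-apiserver port. Reuses the credentials from my existing kubeconfig.
kubectl --server=https://111.111.111.111:6443 get nodes
```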
Any possibility that your kubeconfig file in ~/.kube/config is pointing to the missing node?
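You can check which API server your current context points at with:

```bash
# Print the server URL used by the active kubeconfig context
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
```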
No, the problem is not in my local config but in the Rancher GUI/API only. There is no problem accessing the cluster directly.
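My guess is that Rancher keeps the old endpoint on the cluster object in its own local (management) cluster. Something like this might show where it's pointing (untested sketch; c-xxxxx is a placeholder cluster ID, and the exact field name may vary by Rancher version):

```bash
# Run against the Rancher *local* cluster, not the downstream one.
# List Rancher's cluster objects (IDs look like "c-xxxxx"):
kubectl get clusters.management.cattle.io

# Inspect the endpoint Rancher has recorded for the downstream cluster:
kubectl get clusters.management.cattle.io c-xxxxx -o yaml | grep -i apiEndpoint
```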
Did you ever solve your problem? I’m having the same issue.
I’m having a different issue with downstream connectivity to Rancher (the downstream cluster is badly horked, not connecting at all, and just shows up as Unavailable in the Rancher UI).
When I tried to look on the Rancher side to see where it connected, I only got a Kubernetes-internal IP (10.43.0.1), which was also in the local cluster. When I asked how that worked, a Rancher engineer on Slack (Brad Davidson) told me that the way Rancher knows what systems the downstream clusters are on is via the cattle-cluster-agent on the downstream clusters talking back to Rancher. (In my case, part of being horked is that DNS isn't resolving the FQDN for Rancher that the cattle-cluster-agent uses, so my downstream is marked Unavailable.)
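A quick way to check this from the downstream side (assuming the standard cattle-system namespace; rancher.example.com is a stand-in for your actual Rancher hostname):

```bash
# Check the agent's logs for connection or DNS errors
kubectl -n cattle-system logs deploy/cattle-cluster-agent --tail=50

# Test name resolution of the Rancher FQDN from inside the agent pod
# (getent is usually present in glibc-based images; nslookup may not be)
kubectl -n cattle-system exec deploy/cattle-cluster-agent -- \
  getent hosts rancher.example.com
```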
Not quite what you’re asking, but hopefully helpful info to debug a bit further?