Hi guys. I lost one of the control-plane nodes, unfortunately the leading one, so I no longer have access to what used to be the apiEndpoint, say 222.222.232.232, but I still have another control-plane node. However, when I try to access anything via the kube API through Rancher, I get an error like "error trying to reach service: 222.222.232.232:52792: i/o timeout" has prevented the request from succeeding, or similar, because something is still pointing to the old node. In the GUI the node is stuck in Removing with "Waiting on node-controller", and the cluster shows "This cluster is currently Error; areas that interact directly with it will not be available until the API is ready." I'm not sure how to proceed. Is there a way to manually force usage of the other server to restore Rancher's connection to the cluster? Thank you in advance.
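For context, I can still reach the API directly through the surviving node, so the control plane itself is alive. Something like this works (111.111.111.111 is a hypothetical IP for my remaining control-plane node):

```bash
# Hypothetical IP for the surviving control-plane node; 6443 is the default
# kube-apiserver port. Reuses the credentials from my existing kubeconfig.
kubectl --server=https://111.111.111.111:6443 get nodes
```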
Any possibility that your kubeconfig file in ~/.kube/config is pointing to the missing node?
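You can check which API server your current context points at with:

```bash
# Print the server URL used by the active kubeconfig context
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
```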
No, the problem is not in my local config but in the Rancher GUI/API only. There is no problem accessing the cluster directly.
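My guess is that Rancher keeps the old endpoint on the cluster object in its own local (management) cluster. Something like this might show where it's pointing (untested sketch; c-xxxxx is a placeholder cluster ID, and the exact field name may vary by Rancher version):

```bash
# Run against the Rancher *local* cluster, not the downstream one.
# List Rancher's cluster objects (IDs look like "c-xxxxx"):
kubectl get clusters.management.cattle.io

# Inspect the endpoint Rancher has recorded for the downstream cluster:
kubectl get clusters.management.cattle.io c-xxxxx -o yaml | grep -i apiEndpoint
```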
Did you ever solve your problem? I’m having the same issue.
I’m having a different issue with downstream connectivity to Rancher (the downstream cluster is badly horked, not connecting at all, and just shows up as Unavailable in the Rancher UI).
When I tried to look on the Rancher side to see where it connected, I only got a Kubernetes-internal IP (10.43.0.1), which was also in the local cluster. When I asked how that worked, a Rancher engineer on Slack (Brad Davidson) told me that the way Rancher knows what systems the downstream clusters are on is via the cattle-cluster-agent on the downstream clusters talking back to Rancher. (In my case, part of being horked is that DNS isn't resolving the FQDN for Rancher that the cattle-cluster-agent uses, so my downstream is marked Unavailable.)
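A quick way to check this from the downstream side (assuming the standard cattle-system namespace; rancher.example.com is a stand-in for your actual Rancher hostname):

```bash
# Check the agent's logs for connection or DNS errors
kubectl -n cattle-system logs deploy/cattle-cluster-agent --tail=50

# Test name resolution of the Rancher FQDN from inside the agent pod
# (getent is usually present in glibc-based images; nslookup may not be)
kubectl -n cattle-system exec deploy/cattle-cluster-agent -- \
  getent hosts rancher.example.com
```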
Not quite what you’re asking, but hopefully helpful info to debug a bit further?