Solving a Customer Cluster problem: Failed to reconcile etcd plane: Etcd plane nodes are replaced

Shaun.Westy · April 24, 2020, 12:47pm

Hello again.

We have a 4-node Custom cluster and are having issues with snapshots. Rancher was reporting that it could not find one of the nodes (which was no longer in the cluster).

So we ssh’d into the cluster and removed the offending etcd member that no longer existed (remove member <node_id>). All 3 remaining etcd nodes claim to be ‘healthy’, but the cluster can’t recover: Rancher is unhappy and the 4th node never spins up, so we reduced the node count. We tried rebooting the etcd leader to force a leader change to see if that would resolve the problem, but Rancher is still not happy.
<Red Herring?> When I click on the Kubeconfig File for the cluster in Rancher, it still shows 4 nodes: 3 have the correct IPs, and the 4th is the one that we removed.</Red Herring?>
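
For anyone wanting to cross-check what etcd itself reports against the node list Rancher still shows, here is a minimal sketch. It is an illustration only: the sample output, member names, and IP addresses below are made up, and it assumes the comma-separated text that `etcdctl member list` prints with the v3 API, with one peer URL per member.

```python
from urllib.parse import urlparse


def parse_member_list(text):
    """Parse comma-separated `etcdctl member list` output into dicts.

    Assumes at least five fields per line:
    ID, status, name, peer URL, client URL.
    Lines with fewer fields are skipped.
    """
    members = []
    for line in text.strip().splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 5:
            continue
        members.append({
            "id": fields[0],
            "status": fields[1],
            "name": fields[2],
            "peer": fields[3],
            "client": fields[4],
        })
    return members


def stale_nodes(known_ips, members):
    """Return IPs that are listed elsewhere (e.g. in the kubeconfig)
    but that no longer appear as an etcd peer."""
    member_hosts = {urlparse(m["peer"]).hostname for m in members}
    return sorted(set(known_ips) - member_hosts)


# Hypothetical sample data, not taken from the cluster in this post.
SAMPLE_OUTPUT = """
1a2b3c4d5e6f7a8b, started, etcd-node1, https://10.0.0.1:2380, https://10.0.0.1:2379
2b3c4d5e6f7a8b9c, started, etcd-node2, https://10.0.0.2:2380, https://10.0.0.2:2379
3c4d5e6f7a8b9c0d, started, etcd-node3, https://10.0.0.3:2380, https://10.0.0.3:2379
"""
KUBECONFIG_IPS = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]

if __name__ == "__main__":
    members = parse_member_list(SAMPLE_OUTPUT)
    print("etcd members:", [m["name"] for m in members])
    print("listed but not in etcd:", stale_nodes(KUBECONFIG_IPS, members))
```

Any address it reports as listed but not in etcd is one that the kubeconfig (or Rancher) still knows about even though etcd does not.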

Is there something we need to do to re-configure the information that Rancher has about the cluster? I could not see a way to do this (through the UI or in searching for info).
Should we follow the guide to scale down to one node and back up again (sorry, I do not have the link available) and just accept a small outage?
Or do we spin up a new cluster, get Rancher to add a Custom cluster again, and then delete the offending Rancher cluster?
Would rebooting the Rancher HA help (did you turn it off and on again)? Sorry, it is getting late.

The error/messages from within Rancher.
From the existing nodes (Custom) running on RancherOS (1.5.4 and 1.5.5), with 3 nodes running etcd, worker and control plane: `Failed to reconcile etcd plane: Etcd plane nodes are replaced. Stopping provisioning. Please restore your cluster from backup.`

Any guidance would be appreciated.
