Reinstall after cluster teardown not working

I had to tear down and rebuild my cluster after Kubernetes became unresponsive.

After that, I went to each node and:

  • Removed all Docker objects
  • Deleted /etc/kubernetes and /var/lib/kubelet
  • Cleared out all Longhorn fragments in the mounted volume

Then I created a new cluster and ran the Docker commands to install Kubernetes again.

The pods started running, but I got this error in the Rancher UI:

Error while applying agent YAML, it will be retried automatically: exit status 1, error: You must be logged in to the server (the server has asked for the client to provide credentials)

I checked the container logs for an agent container, and saw the following:

time="2020-12-10T20:00:25Z" level=info msg="Connecting to wss://rancher.eresources.com/v3/connect/register with token 89vhg8pdqz4rw4rlcn7nv4q826skj99qhcmxm4mj4sm5lc5l8kfqkn"
time="2020-12-10T20:00:25Z" level=info msg="Connecting to proxy" url="wss://rancher.eresources.com/v3/connect/register"
time="2020-12-10T20:00:25Z" level=error msg="Failed to connect to proxy. Response status: 400 - 400 Bad Request. Response body: Operation cannot be fulfilled on nodes.management.cattle.io \"m-4470b5e3f6a5\": the object has been modified; please apply your changes to the latest version and try again" error="websocket: bad handshake"
time="2020-12-10T20:00:25Z" level=error msg="Remotedialer proxy error" error="websocket: bad handshake"
time="2020-12-10T20:00:35Z" level=info msg="Connecting to wss://rancher.eresources.com/v3/connect/register with token 89vhg8pdqz4rw4rlcn7nv4q826skj99qhcmxm4mj4sm5lc5l8kfqkn"
time="2020-12-10T20:00:35Z" level=info msg="Connecting to proxy" url="wss://rancher.eresources.com/v3/connect/register"
time="2020-12-10T20:00:35Z" level=warning msg="Error while getting agent config: invalid response 500: Operation cannot be fulfilled on nodes.management.cattle.io \"m-4470b5e3f6a5\": the object has been modified; please apply your changes to the latest version and try again"

Versions:
Rancher 2.5.2, latest Kubernetes, default cluster settings


Previous installs leave traces on nodes. Installing and scrubbing a Rancher node became less intrusive with Rancher 2.5.x, but there is still no guarantee that a node is completely clean after removal. On physical hosts I use K3D for OTA deployments and, where possible, clusters provisioned by a service provider for production environments.

The resources to remove are listed under https://rancher.com/docs/rancher/v2.x/en/cluster-admin/cleaning-cluster-nodes/#docker-containers-images-and-volumes
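
As a quick check before re-registering a node, something like the following could report which leftover directories are still present. This is only a sketch: the directory list is a subset I am assuming from the linked page (verify it there), and `CANDIDATE_DIRS` and `find_leftovers` are names made up for this example. It only reads the filesystem and deletes nothing.

```python
import os

# Assumed subset of the directories named on the linked cleanup page.
# Treat the page itself as the authoritative list.
CANDIDATE_DIRS = [
    "/etc/cni",
    "/etc/kubernetes",
    "/opt/cni",
    "/opt/rke",
    "/var/lib/calico",
    "/var/lib/cni",
    "/var/lib/etcd",
    "/var/lib/kubelet",
]


def find_leftovers(candidates, root="/"):
    """Return the candidate paths that still exist under root."""
    found = []
    for path in candidates:
        # Re-anchor the absolute path under root so the check can also be
        # pointed at a mounted or copied filesystem tree.
        full = os.path.join(root, path.lstrip("/"))
        if os.path.lexists(full):
            found.append(path)
    return found


if __name__ == "__main__":
    leftovers = find_leftovers(CANDIDATE_DIRS)
    if leftovers:
        print("Still present on this node:")
        for path in leftovers:
            print("  " + path)
    else:
        print("None of the checked directories exist.")
```

Run it on each node after the cleanup. Anything it prints is a candidate for removal according to the linked page. An empty result does not prove the node is clean, since the list here is partial.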
