I had to tear down and rebuild my cluster because Kubernetes became unresponsive.
After that, I went to each node and:
- Removed all Docker objects
- Deleted /etc/kubernetes and /var/lib/kubelet
- Cleared out all Longhorn fragments in the mounted volume
Then I created a new cluster and ran the Docker commands to install Kubernetes again.
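
For reference, the per-node cleanup above could be scripted roughly as follows. This is only a sketch, not the exact commands that were run: the `LONGHORN_DIR` variable and its placeholder path are hypothetical (the actual mount point is not given here), and the `run` and `cleanup_node` helpers are made up for illustration. It defaults to a dry run that only prints what it would do; set `DRY_RUN=0` to actually execute the commands, which are destructive.

```shell
#!/bin/sh
# Sketch of the per-node cleanup. Dry run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical placeholder: replace with the real Longhorn mount point on the node.
LONGHORN_DIR="${LONGHORN_DIR:-/path/to/longhorn/mount}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    sh -c "$*"
  fi
}

cleanup_node() {
  # Remove all containers, then unused images, networks, and volumes.
  run 'docker ps -aq | xargs -r docker rm -f'
  run 'docker system prune --all --force --volumes'
  # Remove Kubernetes state left on the node.
  run 'rm -rf /etc/kubernetes /var/lib/kubelet'
  # Remove leftover Longhorn data (assumed location, see above).
  run "rm -rf \"${LONGHORN_DIR:?}\"/*"
}

cleanup_node
```

Run it once as-is to review the printed commands, then rerun with `DRY_RUN=0` and a real `LONGHORN_DIR` on each node. Depending on the Docker version, `docker system prune --volumes` may leave named volumes behind, so `docker volume ls` is worth checking afterwards.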
The pods started running, but I got this error in the Rancher UI:
Error while applying agent YAML, it will be retried automatically: exit status 1, error: You must be logged in to the server (the server has asked for the client to provide credentials)
I checked the logs of one of the agent containers and saw the following:
time="2020-12-10T20:00:25Z" level=info msg="Connecting to wss://rancher.eresources.com/v3/connect/register with token 89vhg8pdqz4rw4rlcn7nv4q826skj99qhcmxm4mj4sm5lc5l8kfqkn"
time="2020-12-10T20:00:25Z" level=info msg="Connecting to proxy" url="wss://rancher.eresources.com/v3/connect/register"
time="2020-12-10T20:00:25Z" level=error msg="Failed to connect to proxy. Response status: 400 - 400 Bad Request. Response body: Operation cannot be fulfilled on nodes.management.cattle.io \"m-4470b5e3f6a5\": the object has been modified; please apply your changes to the latest version and try again" error="websocket: bad handshake"
time="2020-12-10T20:00:25Z" level=error msg="Remotedialer proxy error" error="websocket: bad handshake"
time="2020-12-10T20:00:35Z" level=info msg="Connecting to wss://rancher.eresources.com/v3/connect/register with token 89vhg8pdqz4rw4rlcn7nv4q826skj99qhcmxm4mj4sm5lc5l8kfqkn"
time="2020-12-10T20:00:35Z" level=info msg="Connecting to proxy" url="wss://rancher.eresources.com/v3/connect/register"
time="2020-12-10T20:00:35Z" level=warning msg="Error while getting agent config: invalid response 500: Operation cannot be fulfilled on nodes.management.cattle.io \"m-4470b5e3f6a5\": the object has been modified; please apply your changes to the latest version and try again"
Versions:
Rancher 2.5.2, latest Kubernetes version, default options selected for the cluster