Rancher api-server self heal

gdm · April 23, 2019, 7:39pm

I have provisioned a Kubernetes cluster via Rancher: 3 nodes with all roles, just as a test.
I stopped one of the api-server Docker containers and expected it to be restarted (by a systemd unit?).
However, it didn’t start again.

Is the only option to stop all the containers on the master node and re-run the agent registration command (rancher/rancher-agent:v2.2.1)?

gdm · April 23, 2019, 9:17pm

Re-running the agent didn’t bring up the Rancher api-server Docker container.

gdm · April 23, 2019, 10:13pm

By reconstructing the docker run command for the apiserver, I was able to bring it up, but it throws a lot of errors:

E0423 22:11:14.240110       1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.metrics.k8s.io": the object has been modified; please apply your changes to the latest version and try again
E0423 22:11:18.324333       1 memcache.go:134] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0423 22:11:19.244952       1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.metrics.k8s.io": the object has been modified; please apply your changes to the latest version and try again
E0423 22:11:24.350793       1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Get https://10.83.205.117:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

gdm · April 24, 2019, 7:20pm

Rancher advised something like this:

The kube-apiserver container in a Rancher- or RKE-provisioned cluster is set to restart automatically unless you stop it manually, i.e. it should restart unless you have run docker stop on the container. If you removed the container altogether with docker rm, you would need to re-run RKE (if the cluster was provisioned by RKE) or trigger a cluster reconciliation in Rancher (if the cluster was provisioned in Rancher). You can trigger this reconciliation by making an impactless change to the cluster config YAML in Rancher, such as adding an extra environment variable MYKEY=VALUE to one of the Kubernetes component services (see "Extra Args, Extra Binds, and Extra Environment Variables" in the RKE1 docs).
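
For anyone who wants to check what restart behaviour their own node is actually configured with, here is a rough Python sketch that reads the restart policy out of the JSON printed by docker inspect. The container name kube-apiserver, the helper name restart_policy, and the sample data are my own assumptions for illustration and are not taken from a real cluster; the HostConfig.RestartPolicy and State fields are the ones docker inspect normally reports. Keep in mind that Docker ignores the restart policy after a manual docker stop until the daemon restarts or the container is started by hand.

```python
import json


def restart_policy(inspect_output: str) -> dict:
    """Summarise restart-related fields from `docker inspect <container>` JSON.

    `inspect_output` is the raw text printed by docker inspect, which is a
    JSON array with one object per inspected container.
    """
    containers = json.loads(inspect_output)
    if not containers:
        raise ValueError("docker inspect returned no containers")
    info = containers[0]
    policy = info.get("HostConfig", {}).get("RestartPolicy", {})
    state = info.get("State", {})
    return {
        # Older Docker versions report an empty name when no policy is set.
        "policy": policy.get("Name") or "no",
        "max_retries": policy.get("MaximumRetryCount", 0),
        "status": state.get("Status", "unknown"),
    }


# Trimmed, made-up sample; real output has many more fields and the
# policy on your node may differ.
SAMPLE = """
[{"Name": "/kube-apiserver",
  "State": {"Status": "exited"},
  "HostConfig": {"RestartPolicy": {"Name": "always", "MaximumRetryCount": 0}}}]
"""

if __name__ == "__main__":
    print(restart_policy(SAMPLE))
```

To use it against a real node, you could save the output of docker inspect kube-apiserver to a file and pass the file contents to restart_policy, assuming the container on your node has that name.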

I verified only the cluster reconciliation approach via the Rancher web console, which worked very well for me.
