I have provisioned a Kubernetes cluster via Rancher: 3 nodes, each with all roles, just as a test.
I stopped one of the kube-apiserver Docker containers and expected it to be restarted (by a systemd unit?).
However, it didn’t start.
Is the only option to stop all the containers on the master node and re-run the agent registration command (rancher/rancher-agent:v2.2.1)?
By reconstructing the docker run command for the apiserver, I was able to bring it up, but it throws a lot of errors:
E0423 22:11:14.240110 1 available_controller.go:311] v1beta1.metrics.k8s.io failed with:
Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.metrics.k8s.io": the object
has been modified; please apply your changes to the latest version and try again
E0423 22:11:18.324333 1 memcache.go:134] couldn't get resource list for
metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0423 22:11:19.244952 1 available_controller.go:311] v1beta1.metrics.k8s.io failed with:
Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.metrics.k8s.io": the object
has been modified; please apply your changes to the latest version and try again
E0423 22:11:24.350793 1 available_controller.go:311] v1beta1.metrics.k8s.io failed with: Get
https://10.83.205.117:443: net/http: request canceled while waiting for connection (Client.Timeout
exceeded while awaiting headers)
The kube-apiserver container in a Rancher- or RKE-provisioned cluster is set to restart automatically unless you stop it manually; that is, it comes back on its own unless you have run docker stop on it. If you removed the container altogether with docker rm, it has to be recreated: re-run RKE if the cluster was provisioned by RKE, or trigger a cluster reconciliation if it was provisioned in Rancher. You can trigger the reconciliation by making an impactless change to the cluster config YAML in Rancher, such as adding an extra environment variable MYKEY=VALUE to one of the Kubernetes component services (see "Extra Args, Extra Binds, and Extra Environment Variables" in the RKE1 documentation).
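As a rough illustration of the impactless change described above, the cluster config YAML could gain a fragment like the one below. This is only a sketch: it assumes the RKE1 `services.kube-api.extra_env` key, and `MYKEY=VALUE` is a placeholder with no meaning to Kubernetes. Depending on the Rancher version, the `services` block may be nested under `rancher_kubernetes_engine_config` instead of sitting at the top level, so check where your existing `services` section lives before editing.

```yaml
# Sketch only: a throwaway variable added to trigger reconciliation.
# MYKEY=VALUE is a placeholder; any unused name should do.
services:
  kube-api:
    extra_env:
      - "MYKEY=VALUE"
```

Saving the edited config should cause Rancher to reconcile the cluster and recreate missing component containers. The variable can be removed again afterwards, which counts as another config change and triggers one more reconciliation.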
I verified only the cluster reconciliation approach via the Rancher web console, which worked very well for me.