Rancher suddenly stopped working

kobozo · November 10, 2019, 9:06am

My Rancher stopped working from one day to the next, without me changing anything.

The actual logs are not showing any problems, but when I run docker logs on the container I get a lot of output that is not visible in the logs.

I run my docker with:
docker run -d --restart=unless-stopped -p 80:80 -p 443:443 -v /opt/rancher:/var/lib/rancher -v /opt/rancher/ssl/cert.pem:/etc/rancher/ssl/cert.pem -v /opt/rancher/ssl/key.pem:/etc/rancher/ssl/key.pem -v /opt/rancher/ssl/ca_bundle.pem:/etc/rancher/ssl/cacerts.pem rancher/rancher:stable

The log file can be downloaded here: https://www.kobozo.be/rancher.log
The form does not allow me to post an error log with links inside.

I have already updated from 2.2.x to the current stable release. It is running on RancherOS.

superseb · November 11, 2019, 4:34pm

Can you describe in more detail what "stopped working" means? Is Rancher crashing, with the container continuously starting -> stopping -> starting, etc.? Can you post the log to a gist or some other service that lets you paste logs (pastebin/0bin etc.)?
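
A minimal sketch of one way to check for such a restart loop from the Docker host. The container name rancher is a placeholder (the thread does not say what the container is called); substitute the name or ID shown by docker ps -a.

```shell
# Sketch only: "rancher" below is a placeholder container name, not taken from the thread.

# Print restart count, current state, and last exit code for a container.
restart_info() {
  docker inspect --format '{{.RestartCount}} {{.State.Status}} {{.State.ExitCode}}' "$1"
}

# Example usage:
#   restart_info rancher
# Show the last lines the container wrote (stdout and stderr) before it went down:
#   docker logs --tail 100 rancher 2>&1
```

A restart count that keeps climbing together with a non-zero exit code would point to the server process crashing rather than to a network problem.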

kobozo · November 11, 2019, 5:04pm

Well, Rancher was working until last Friday. I changed nothing on it (I know, that is the default end-user answer), and tried accessing it again this weekend. That is when I saw that I was getting a DNS_NOT_FOUND error in the browser.
I restored Rancher from Friday's backup, but still got the same error. Then I returned to the last running state and updated Rancher to the latest stable Docker image. Again, the same result.
After the container starts, and only if I’m fast, I can reach the login page of my Rancher. But then it suddenly restarts.
I checked the certificates in the meantime and they are still valid until December. I also checked the permissions of the folders, as these are mounted from the host system. They have not changed compared with when everything was working. I’m guessing something happened inside the /opt/rancher/ folder, but I cannot seem to find what.
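
For reference, a small sketch of how the certificate check could be repeated from the host, assuming openssl is installed there. The default path is the host-side certificate path from the docker run command in the first post; the function names are made up for this example.

```shell
# Sketch only: function names are illustrative; the default path is the
# host-side cert.pem from the docker run command earlier in the thread.

# Print the notAfter (expiry) date of a PEM certificate.
cert_expiry() {
  openssl x509 -noout -enddate -in "${1:-/opt/rancher/ssl/cert.pem}"
}

# Exit 0 if the certificate is still valid N seconds from now (default 0 = right now).
cert_valid_for() {
  openssl x509 -noout -checkend "${2:-0}" -in "$1"
}

# Example usage:
#   cert_expiry
#   cert_valid_for /opt/rancher/ssl/cert.pem 604800   # still valid in 7 days?
# Ownership and mode of the mounted folders can be listed with:
#   ls -ld /opt/rancher /opt/rancher/ssl
```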

When I run docker logs on the container I get more information from the system than what is actually in the logs when I look inside the container with docker exec.

https://pastebin.com/4Ca9Ssxd

vincent · November 11, 2019, 6:13pm

If the name you’re pointing at is not resolving then no amount of tinkering with Rancher is going to fix that. You need the DNS record you’re trying to get to to resolve to the IP(s) the server containers are running on.
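
A quick way to test this from a client machine, sketched under the assumption of a Linux system where getent is available. The hostname rancher.example.com is a placeholder, since the thread does not give the real name.

```shell
# Sketch only: rancher.example.com is a placeholder hostname, not from the thread.

# Print the address(es) the system resolver returns for a name.
# Exits non-zero if the name does not resolve.
resolve_name() {
  getent hosts "$1"
}

# Example usage:
#   resolve_name rancher.example.com
```

The address printed should match the IP of the host that publishes ports 80 and 443 for the Rancher container.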

kobozo · November 12, 2019, 6:53am

This has nothing to do with DNS, as it only happens while Rancher is restarting. When it has just started, I can access the web page. The DNS records are OK.

kobozo · November 12, 2019, 7:47am

I did some more testing today and found out that the problem is in my etcd. The strange thing, however, is that I do not understand why it still fails after I restore a previous version.

The easiest solution for me at the moment would be to set up a new Rancher, but in that case I need to be able to reconnect the existing clusters, which were created with Rancher, to the new Rancher without losing the containers already running on these clusters.
