SH is booting us out because boot loop

Paul1 · April 1, 2021, 9:12pm

We recently found that our Rancher UI was down. The browser reported a proxy/firewall issue, but there was no 504 or similar error and the subdomain is fine, so we don’t think that is the actual cause (it is just what the browser says).

Our first hunch was that the Let’s Encrypt cert had expired and that was the cause. We don’t think that is it either.

The Docker container is there, but when we go in we get booted out after 10 or so seconds. It looks like etcd is constantly restarting, in a boot loop or something. There are plenty of resources.

We tried powering down all the component droplets, but that didn’t do any good either.

Since we are unable to connect remotely and use the CLI, any ideas on what the cause might be, or what we should try in order to:

a) determine what happened and why
b) fix

Thanks, would love any guidance.

vincent · April 2, 2021, 9:19am

The UI is just static HTML/JS files, so not being able to get to it is just a symptom and not your actual problem.

Etcd restarting constantly basically means you have no cluster, in which Rancher is supposed to be running, which provides the API, which serves up the UI assets.

But there’s not much anyone can tell you in detail given just that “a boot loop or something” is happening.

Paul1 · April 3, 2021, 1:12am

Thanks for the response Vincent.

The only visible symptom of the issue is etcd restarting.

Would you have a suggested next couple of steps to troubleshoot when you have no cluster? “etcd is restarting constantly” is a better way to define it technically, but we are feeling rather directionless on how to correct it and determine the cause. Here is a video of the logs; we can’t capture them properly because they scroll by so fast, due to the restarting I think. We see all 3 VMs that form the cluster. Maybe the VMs are unable to “talk” to each other, which prevents them from forming a cluster?

Hopefully I caught enough of this, hard to tell where it loops.

Thank you.
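
One way to test the hunch that the VMs cannot reach each other is a plain TCP check between the nodes. The following is only a minimal sketch, not a Rancher or etcd tool: it assumes etcd is listening on its default ports (2379 for clients, 2380 for peers), and the helper names `port_open` and `check_peers` are made up for illustration.

```python
import socket

# etcd's default ports: 2379 for client traffic, 2380 for peer traffic.
# Assumption: the cluster was not configured to use different ports.
ETCD_PORTS = (2379, 2380)


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_peers(hosts, ports=ETCD_PORTS, timeout=3.0):
    """Map each (host, port) pair to whether it accepted a TCP connection."""
    return {(h, p): port_open(h, p, timeout) for h in hosts for p in ports}
```

Run from each VM against the other two, for example `check_peers(["10.0.0.2", "10.0.0.3"])` (placeholder addresses, substitute the droplets' real private IPs). A `True` result shows the network path is open. A `False` result is ambiguous while etcd is crash-looping, since the port is also closed whenever etcd itself is down on that node.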
