Rancher eating all the CPU, is it overloaded?

Valentin_Odier · June 8, 2018, 5:47pm

Hello there,

We have been running Rancher for about 2 years. We just upgraded to v1.6.17 and it almost crashed everything.

Currently Rancher is running alone on a server with 8 CPUs and 32 GB of RAM (yeah, I know it’s way too much) and we’ve set the JVM heap to 4 GB (about 1.5 GB is actually used, sometimes about 2 GB, but not more). We have about 80 hosts managed by Rancher.

During the upgrade we had to upgrade the internals (ipsec / scheduler …). This is when everything went wrong. We first updated some small environments with 5 hosts max, no problem. Then we had to upgrade an environment with 60 hosts. While doing it, Rancher went absolutely crazy. The load on the Rancher server was at 14 - 26 (we have 8 cores and the DB is external, so WTF), CPU was maxed, and the UI was barely responding.
All our hosts started to disconnect and reconnect; the stacks were mostly fine.
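
To put those load numbers in perspective, here is a minimal sketch that divides the load average by the core count. The helper name `load_per_core` is purely illustrative and is not part of Rancher. Roughly speaking, a value well above 1.0 per core means tasks are queuing for CPU.

```python
import os


def load_per_core(load_average: float, cores: int) -> float:
    """Return the load average divided by the number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_average / cores


# Figures from the post: load between 14 and 26 on an 8-core server.
for load in (14, 26):
    print(f"load {load} on 8 cores -> {load_per_core(load, 8):.2f} per core")

# On the server itself (Unix only), the live 1-minute value could be read with:
# one_minute, _, _ = os.getloadavg()
# print(load_per_core(one_minute, os.cpu_count() or 1))
```

With the numbers above, this gives 1.75 to 3.25 per core, which matches the barely responsive UI.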

There was nothing in the Rancher logs except errors on the ping to the agents (which explains the disconnects / reconnects).
To stop it we just deactivated about 25 servers. Rancher finally made it and went back to normal. We then added our hosts back one by one.
Currently it is barely working: any big action and Rancher goes crazy again, everything disconnects and reconnects, and we need to deactivate hosts until Rancher calms down.

At this point we still have no idea what happened.

We have updated our Docker version from 1.12.6 to 17.03.2-ce: no change. We have moved our database to a server with NVMe storage, 8 CPUs and way more RAM than necessary: no change.

Has anyone encountered this before?
Maybe 1 Rancher server is not enough? We will be trying a 3-node HA setup, but we are not confident it will solve the issue.

Thank you for your help,

Rancher	v1.6.17
Cattle	v0.183.49
User Interface	v1.6.42
Rancher CLI	v0.6.9
Rancher Compose	v0.12.5

Valentin_Odier · June 18, 2018, 2:15pm

Tuning up the SQL database did the job.
