Downstream cluster unable to scale up past 5 VMs

charlie1 · January 2, 2025, 11:34pm

I posted this in Slack as well, but figured I would ask here too.

I’m running into an interesting problem with my Rancher/Harvester setup. When deploying a downstream cluster using the Harvester cloud provider, I’m able to provision 5 VMs that all come up correctly and work as expected. However, when I scale up beyond that, the new VM cannot connect to my Rancher domain: requests continually time out, so it never joins the downstream cluster. The VM can ping the IP of the Rancher server and curl any other domain without issue.

I’ve tried a couple of different configurations, and I know it’s not a CPU, memory, or storage limit, because the new VMs always come up and function properly apart from being unable to curl the Rancher server.

I’ve also tried this with 2 different downstream clusters, each with 3 nodes, and the 6th VM to be provisioned always fails with the same issue. Is there some kind of configuration I’m missing?

Looking for any insight into what could be causing this. Thanks in advance.

charlie1 · January 3, 2025, 7:22pm

For anyone running into a similar issue: it turned out to be related to a bug in Harvester 1.4.0 (I’m not sure whether it is present in earlier versions). The bug report linked below describes a different high-level problem from the one I was running into, but it had the solution.

TLDR: run sysctl net.bridge.bridge-nf-call-iptables=0 as root on all Harvester nodes in the cluster.

[BUG] You can’t import a single node Harvester cluster into Rancher running on the same cluster · Issue #7210 · harvester/harvester
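
For anyone applying the workaround above, here is a minimal sketch of how the setting might be checked before and after the change on a single node. The helper name bridge_nf_state and the APPLY and PROC_FILE variables are hypothetical and only used for illustration. The /proc path is the standard Linux location for this sysctl key and is assumed to exist on Harvester nodes. Nothing is changed unless the script is run with APPLY=1.

```shell
#!/bin/sh
# Sketch only: report the current value of the sysctl from the post and,
# if explicitly requested with APPLY=1, set it to 0 on this node.
KEY="net.bridge.bridge-nf-call-iptables"
# Assumption: standard Linux /proc location for the key above.
PROC_FILE="${PROC_FILE:-/proc/sys/net/bridge/bridge-nf-call-iptables}"

# Hypothetical helper: prints "enabled", "disabled", or "unavailable"
# depending on the contents of the given proc file.
bridge_nf_state() {
    file="$1"
    if [ ! -r "$file" ]; then
        # The key is typically absent when the br_netfilter module is not loaded.
        echo "unavailable"
        return 0
    fi
    case "$(cat "$file")" in
        0) echo "disabled" ;;
        *) echo "enabled" ;;
    esac
}

echo "before: $(bridge_nf_state "$PROC_FILE")"

# Apply the change only when asked to, and only as root.
if [ "${APPLY:-0}" = "1" ] && [ "$(id -u)" -eq 0 ]; then
    sysctl -w "$KEY=0"
    echo "after: $(bridge_nf_state "$PROC_FILE")"
fi
```

Run it once per Harvester node, for example as APPLY=1 sh check-bridge-nf.sh (the file name is arbitrary). A value set with sysctl at runtime does not survive a reboot, and the thread does not cover how to make it persistent on Harvester.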
