Install monitoring crashing cluster

I’ve got a new cluster and clicked “install monitoring”. 24 hours later, it’s still not installed, the nodes are reporting disk pressure, and node loads are high.

kubectl get pod -n cattle-prometheus -o wide
NAME                                                       READY   STATUS              RESTARTS   AGE     IP   NODE             NOMINATED NODE   READINESS GATES
exporter-kube-state-cluster-monitoring-79c667fdc9-n2kpl    0/1     ContainerCreating   0          9m32s        vva-er-k8s0
exporter-node-cluster-monitoring-6jwns                     0/1     Pending             0          9m32s
exporter-node-cluster-monitoring-crgvr                     0/1     Evicted             0          45s          vva-er-k8s1
exporter-node-cluster-monitoring-m92dw                     0/1     Evicted             0          45s          vva-er-k8s-sw1
exporter-node-cluster-monitoring-s2p6s                     0/1     Evicted             0          45s          vva-er-k8s2
grafana-cluster-monitoring-84b97685c6-5llxd                0/2     Pending             0          9m32s
prometheus-cluster-monitoring-0                            0/5     ContainerCreating   0          9m31s        vva-er-k8s1
prometheus-operator-monitoring-operator-7948c99b8b-b25hq   0/1     ContainerCreating   0          34m          vva-er-k8s0

kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.0", GitCommit:"e19964183377d0ec2052d1f1fa930c4d7575bd50", GitTreeState:"clean", BuildDate:"2020-08-26T14:30:33Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.4", GitCommit:"d360454c9bcd1634cf4cc52d1867af5491dc9c5f", GitTreeState:"clean", BuildDate:"2020-11-11T13:09:17Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}

Rancher 2.5.2

To try to get my cluster back, I went to Apps and deleted cluster-monitoring and monitoring-operator; both get deleted and then come back.

My objective is to get the cluster back without rebuilding it; I don’t care whether monitoring works or not.

Disabling monitoring, waiting for the uninstall to finish on all nodes, and then re-enabling monitoring sometimes helps. DiskPressure is never good, though: it looks like you’ve got a lot of latency going on. The questions to ask here are: is the storage SSD or faster (HDD RAID works too), and is there enough free space?
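
To answer the free-space question, something along these lines might help. It is only a rough sketch: low_space is a made-up helper name, the 15% default is my assumption (chosen because the kubelet's default hard-eviction thresholds are nodefs.available<10% and imagefs.available<15%), and the paths in the usage comments are guesses that depend on your container runtime.

```shell
#!/bin/sh
# Sketch only: flag filesystems that are close to the kubelet's default
# hard-eviction thresholds (nodefs.available<10%, imagefs.available<15%).
# low_space is a hypothetical helper; the 15 default is an assumption.

# Reads `df -P` output on stdin and prints each mount point whose free
# space is below the given percentage (default 15).
low_space() {
  awk -v min="${1:-15}" 'NR > 1 {
    used = $5; sub(/%/, "", used)        # Capacity column, e.g. "93%"
    if (100 - used < min) print $6, (100 - used) "% free"
  }'
}

# On each node (paths are assumptions, adjust to your runtime's data dir):
#   df -P / /var/lib/docker /var/lib/kubelet | low_space 15
#
# From a workstation, to see which nodes report the condition:
#   kubectl describe nodes | grep -i pressure
```

Anything this prints is a filesystem the kubelet is likely to start evicting pods over, which would match the Evicted exporter-node pods in the listing above.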