2019/03/22 20:07:58 [ERROR] AppController p-tjjnb/monitoring-operator [helm-controller] failed with : failed to install app monitoring-operator. Error: validation failed: unable to recognize "": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
I restarted Rancher and took some stats while Helm tried to install Prometheus.
Eventually, the GUI becomes unstable and won’t load the cluster page. [https://localhost/c/c-2cmqm]
I’m new to Rancher, but if you could direct me to some better logs, I can provide them.
Docker Stats: CPU usage sometimes spikes above 300%.
Captured via while true; do sudo docker stats -a --no-stream >> stats.txt; done
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
694760e998f6 rancher 213.22% 1.309GiB / 1.952GiB 67.07% 159MB / 160MB 219MB / 2.06GB 80
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
694760e998f6 rancher 186.48% 1.301GiB / 1.952GiB 66.68% 159MB / 160MB 219MB / 2.06GB 90
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
694760e998f6 rancher 155.89% 1.308GiB / 1.952GiB 67.04% 159MB / 160MB 219MB / 2.06GB 81
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
694760e998f6 rancher 158.79% 1.303GiB / 1.952GiB 66.75% 159MB / 160MB 219MB / 2.06GB 69
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
694760e998f6 rancher 259.55% 1.296GiB / 1.952GiB 66.40% 159MB / 160MB 219MB / 2.06GB 86
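For anyone wanting to boil a capture like the one above down to a few numbers, here is a rough sketch. It assumes the default `docker stats` column layout shown in the samples, so that when a row is split on whitespace, field 3 is CPU % and field 7 is MEM %. The function name `summarize_stats` is made up for illustration.

```shell
#!/bin/sh
# Summarize a docker stats capture: sample count, peak CPU %, peak MEM %.
# Assumption: default `docker stats` columns, so whitespace-split
# field 3 is CPU % and field 7 is MEM %.
summarize_stats() {
    awk '
        $1 == "CONTAINER" { next }   # skip the repeated header rows
        NF < 7 { next }              # skip blank or truncated rows
        {
            cpu = $3; sub(/%/, "", cpu)
            mem = $7; sub(/%/, "", mem)
            if (cpu + 0 > max_cpu) max_cpu = cpu + 0
            if (mem + 0 > max_mem) max_mem = mem + 0
            n++
        }
        END {
            printf "samples=%d peak_cpu=%.2f%% peak_mem=%.2f%%\n", n, max_cpu, max_mem
        }
    ' "$1"
}
```

It would be called as `summarize_stats stats.txt` on the file written by the capture loop. If several containers are in the capture, the peaks are across all of them, not per container.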
Screenshot of GUI
Errors started appearing roughly 5-10 minutes after Monitoring was enabled.
It looks like Helm continuously retries the install, judging by the new folders appearing in /tmp every second or so.
https://localhost/g/clusters > Try to click on my cluster
You’re not going to be able to run Rancher and monitoring on a node with 2GB of RAM. Kubernetes (inside the rancher container) and Prometheus etc. all burn a lot of resources, and you’re probably just hitting the kernel killing random processes because it is out of memory.
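One way to check the out-of-memory theory is to look for OOM-killer lines in the host's kernel log. A minimal sketch, assuming the usual Linux kernel wording ("Out of memory: Kill process ..." and "... invoked oom-killer"); the function name `count_oom_events` is made up for illustration.

```shell
#!/bin/sh
# Count kernel log lines that look like OOM-killer activity.
# Reads log text on stdin. Assumption: the kernel uses its usual
# "Out of memory" / "oom-killer" phrasing.
count_oom_events() {
    # grep -c prints the number of matching lines (0 when there are none);
    # "|| true" keeps the function from failing when nothing matches.
    grep -ciE 'out of memory|oom-killer' || true
}
```

On the Docker host this would be run as `dmesg | count_oom_events` (possibly with sudo). Separately, `sudo docker inspect --format '{{.State.OOMKilled}}' rancher` shows whether Docker recorded the container itself as OOM-killed, though a process killed inside the container may not set that flag.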
Is Kubernetes/Prometheus running inside the rancher container?
I was under the impression that the Prometheus operator was being deployed to the cluster in AWS that was provisioned by Rancher.
See below for a sample of the Docker stats.
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
694760e998f6 rancher 165.33% 1.357GiB / 1.952GiB 69.55% 36.9MB / 11.3MB 296MB / 146MB 73