Completely disabling monitoring and alarming

Hi, I got fed up with the limitations of the built-in monitoring and wanted to install Prometheus/Alertmanager on my own. However, I realized there is no easy way to remove all the components from Rancher. Even if I disable monitoring, disable all alerts, and delete the monitoring-operator and cluster-alerting apps, Rancher eventually creates both apps again. How can I tell Rancher not to create them?

You should not need to delete the apps. Just disabling the monitoring feature at the cluster level should remove all of the apps.

I have the same issue. I deleted the app and the cattle-prometheus namespace, but after a while the Prometheus operator gets installed again and again. Is there any solution?

Have you found a solution for this?

I scaled the deployment to 0 so that it won’t interfere with the new version of the Prometheus operator. The deployment is still there, though, and I haven’t found any other solution yet.
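
For anyone wanting to try the same stopgap, here is a minimal Python sketch that only builds the `kubectl scale` command and prints it; it does not run anything. The deployment name `prometheus-operator-monitoring-operator` and the namespace `cattle-prometheus` are assumptions taken from names mentioned elsewhere in this thread, so check what your cluster actually uses first.

```python
import shlex
from typing import List


def scale_to_zero_command(deployment: str,
                          namespace: str = "cattle-prometheus") -> List[str]:
    """Build (but do not run) a kubectl command that scales a deployment to 0.

    The default namespace is an assumption based on this thread.
    """
    return [
        "kubectl", "--namespace", namespace,
        "scale", "deployment", deployment,
        "--replicas=0",
    ]


# Assumed deployment name; verify it in your own cluster before running.
cmd = scale_to_zero_command("prometheus-operator-monitoring-operator")
print(shlex.join(cmd))
```

Printing the command first makes it easy to review before pasting it into a shell. As noted above, this only parks the old operator; the deployment itself stays in place.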

Same problem here. This should not be so difficult to remove.

Have any of you found a solution?

In order to stop it from recreating, you need to delete all of the alerts and notifiers configured in the system, in addition to disabling monitoring. If there are any alerts or notifiers configured, Rancher will redeploy the monitoring operator and cluster alerting apps.
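
The rule described above can be written down as a small checklist function. This is only a sketch of the behaviour as reported in this post, not Rancher's actual logic; the `ClusterState` class and its field names are made up for illustration, and you would have to fill in the counts yourself from the UI or the API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterState:
    """Hypothetical snapshot of the legacy (v1) monitoring settings."""
    monitoring_enabled: bool
    alert_count: int
    notifier_count: int


def redeploy_triggers(state: ClusterState) -> List[str]:
    """List the reasons why, per this thread, the v1 apps would come back."""
    reasons = []
    if state.monitoring_enabled:
        reasons.append("monitoring is still enabled")
    if state.alert_count > 0:
        reasons.append(f"{state.alert_count} alert(s) still configured")
    if state.notifier_count > 0:
        reasons.append(f"{state.notifier_count} notifier(s) still configured")
    return reasons


# Monitoring disabled, but one notifier left over: still a trigger.
leftover = ClusterState(monitoring_enabled=False, alert_count=0, notifier_count=1)
print(redeploy_triggers(leftover))
```

An empty list means none of the triggers described here remain, so the monitoring-operator and cluster-alerting apps should stay deleted.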

This is also relevant when you’re upgrading to Rancher 2.5 because in that case the old alerting/monitoring stuff is deprecated and you need to disable it completely. If you don’t delete all the alerts and notifiers, the old stuff will keep coming back and breaking the new stuff. Very frustrating.

This is absolutely driving me nuts.

I’ve upgraded from Rancher 2.4 (or was it 2.3…) to 2.5.2, and I now just cannot get rid of cluster-alerting and monitoring-operator apps in the System project.
This prevents me from using the improved monitoring that’s part of Rancher 2.5 and later.

Could very well be that something in the cluster is still monitored by the older tools, but if that’s the case how do I find out what?
Anyone managed to solve this?

Solved it.

Using the API explorer in Cluster Manager, I eventually realised that some of the new 2.5 features were using the monitoring operator too. After disabling that (once again through the API explorer UI), I could delete the monitoring/alerting apps from 2.4 without them being recreated.
Gave it an hour to settle just to be sure, then added the new monitoring feature of 2.5, and it worked.

For me, I could not get rid of my last notifier for a while until I realized that simply deleting all of the alerts using it was not enough. All of the alert groups had to be deleted as well. At least that was the case for one of my clusters. Maybe I had a custom group added to that one.

Same here with v2.5.1. It’s almost like a virus you can’t remove (prometheus-operator-monitoring-operator): it keeps coming back no matter which method I try.

(Deleted all alerts but they got enabled again somehow)

Had the same issue, but with the help of https://github.com/bashofmann/rancher-monitoring-v1-to-v2/blob/master/check_monitoring_disabled.py I found out that the default namespace had monitoring enabled.

What I did:

  • disabled the monitoring app
  • deleted the cattle-prometheus namespace
  • as soon as the cattle-prometheus namespace reappeared, check_monitoring_disabled.py reported the setting that was causing monitoring v1 to restart

So far (some days now), monitoring v1 has stopped reappearing.

Thank you, removing all notifiers / alerts solved the issue for us!

So tedious… had to do ALL of this to stop this process.

Step 1

  1. Delete ALL alerts in ALL (three) projects and at CLUSTER LEVEL (=Global level): go to each project and open Tools → Alerts
  2. Delete ALL alert groups in ALL (three) projects and at CLUSTER LEVEL (=Global level), also under Tools → Alerts
  3. In each of these views, select all entries and press Delete

Step 2

  1. From the Cluster Manager, go to CLUSTER LEVEL (=Global level) and then Tools → Notifiers
  2. Now select all notifiers and press Delete

Step 3

  1. From the Cluster Manager, go to CLUSTER LEVEL (=Global level) and then Tools → Monitoring
  2. Press Disable

Step 4

  1. From the Cluster Manager, go to Project System and then Apps
  2. Go to Monitoring, open the three-dot menu (⋮) and press Delete

Step 5

  1. From the Cluster Manager, go to Project Default and then Tools → Monitoring
  2. Press Disable
  3. From the Cluster Manager, go to Project Default and then Apps
  4. Go to Monitoring, open the three-dot menu (⋮) and press Delete

Step 6

  1. Manually delete all Longhorn Volumes that are left behind in Workload → Volumes

Step 7

  1. Go to CLUSTER LEVEL (=Global level) and delete ALL namespaces whose names match cattle-prometheus*
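
If you would rather do Step 7 from the command line, the sketch below filters the output of `kubectl get namespaces -o name` for names starting with `cattle-prometheus` and builds the matching `kubectl delete namespace` commands without running them. The sample output is invented for illustration (the `p-abc12` suffix is hypothetical), and it assumes the leftover namespaces share that prefix, as the pattern above suggests.

```python
from typing import List

PREFIX = "cattle-prometheus"


def monitoring_namespaces(kubectl_output: str, prefix: str = PREFIX) -> List[str]:
    """Extract namespace names starting with `prefix`.

    Expects the text printed by `kubectl get namespaces -o name`,
    i.e. one `namespace/<name>` entry per line.
    """
    names = []
    for line in kubectl_output.splitlines():
        name = line.strip().split("/", 1)[-1]  # drop the "namespace/" part
        if name.startswith(prefix):
            names.append(name)
    return names


def delete_commands(namespaces: List[str]) -> List[List[str]]:
    """Build (but do not run) one kubectl delete command per namespace."""
    return [["kubectl", "delete", "namespace", ns] for ns in namespaces]


# Invented sample output; the project suffix is hypothetical.
sample = """namespace/cattle-prometheus
namespace/cattle-prometheus-p-abc12
namespace/cattle-system
namespace/default
"""

for cmd in delete_commands(monitoring_namespaces(sample)):
    print(" ".join(cmd))
```

Review the printed commands before running any of them, since deleting a namespace removes everything inside it.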