Rancher 2.5.1 cannot install new monitoring

I had v2.4.7 running with monitoring installed. I upgraded to v2.5.1 and disabled monitoring in Cluster Manager (it is also no longer available there), but I still cannot install the new monitoring. It keeps telling me this:

Monitoring is currently deployed from Cluster Manager. If you are migrating from an older version of Rancher with monitoring enabled, please disable monitoring in Cluster Manager before attempting to install the new Rancher Monitoring chart in Cluster Explorer.

I have found that under Apps, the app ‘monitoring-operator’ is still installed. If I remove it, I can actually install the new monitoring, but only for a little while, because the ‘monitoring-operator’ app then reinstalls itself.
If I quickly install the new monitoring, before it re-installs, it brings my whole cluster down somehow.

I'm getting the impression that 2.5 is half-baked. Rancher Labs, what do you have to say about these issues? I have one RKE cluster I can’t fully manage because the “System” project says “Cluster not ready” when I try to click on it, and Monitoring seems to be deprecated, but I saw the same issue as the OP here, etc…

How well tested is 2.5? Helm charts list it now as “stable”.

I’ve seen other glitches: look at Alerting in Cluster Manager and you get a generic Rancher error that the bundle “failed to load”; reload the whole Rancher UI, go back, and it works the second time. I've seen that with Logging too, several times.

@m-jepson did you uninstall both Monitoring V1 (Project + Cluster) and Alerting / Notifiers V1 from the cluster? We have a note about disabling and removing all existing custom alerts, notifiers and monitoring installations for the whole cluster and in all projects in our migration docs.

The error you posted seems to indicate that one of these is still turned on. You could use the check_monitoring_disabled.py script in https://github.com/bashofmann/rancher-monitoring-v1-to-v2 to help audit why the Monitoring V1 operator might still be installed on your cluster.

re:

If I quickly install the new monitoring, before it re-installs, it brings my whole cluster down somehow.

This is because both Monitoring / Alerting V1 and Monitoring / Alerting V2 are deploying a Prometheus Operator pod, both of which attempt to manage the CRDs within the cluster. Having multiple operators in your cluster is not something upstream Prometheus Operator currently supports.

Alerts and Monitoring have to be disabled. It also took deleting all alerts and notifiers.