Alert manager and system-library-rancher-monitoring-0.0.2 not found

Hi

I’ve been running Rancher 2.2.3 for some time and have now gotten around to setting up notifiers and assigning them to alerts. The cluster monitoring feature with Prometheus is NOT enabled.

However, we are now seeing this alert on the two clusters where I have enabled notifiers:

Failed to ensure catalog "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2": failed to find catalog by ID "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2": catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found

I would like input on what is happening within Rancher and what I can do to resolve this. It is worth noting that our setup does not have internet access; we are running an air-gapped environment.

//Marcus

Here is a log snippet as well:

2019/06/13 11:07:52 [ERROR] ClusterAlertGroupController c-4twb9/node-alert [cluster-alert-group-deployer] failed with : deploy alertmanager failed, failed to find catalog by ID "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2", catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/06/13 11:07:52 [ERROR] ClusterAlertGroupController c-4twb9/etcd-alert [cluster-alert-group-deployer] failed with : deploy alertmanager failed, failed to find catalog by ID "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2", catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/06/13 11:07:52 [ERROR] ProjectController c-4twb9/p-hcpph [system-image-upgrade-controller] failed with : get template system-library-rancher-logging failed: catalogTemplate.management.cattle.io "cattle-global-data/system-library-rancher-logging" not found
2019/06/13 11:07:52 [ERROR] ProjectController c-4twb9/p-f7w2j [system-image-upgrade-controller] failed with : get template system-library-rancher-logging failed: catalogTemplate.management.cattle.io "cattle-global-data/system-library-rancher-logging" not found
2019/06/13 11:07:52 [ERROR] ClusterAlertRuleController c-4twb9/scheduler-system-service [cluster-alert-rule-deployer] failed with : deploy alertmanager failed, failed to find catalog by ID "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2", catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/06/13 11:07:52 [ERROR] ClusterAlertRuleController c-4twb9/node-disk-running-full [cluster-alert-rule-deployer] failed with : deploy alertmanager failed, failed to find catalog by ID "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2", catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/06/13 11:07:52 [ERROR] ClusterAlertRuleController c-4twb9/migrate-clusteralert-controllermanager [cluster-alert-rule-deployer] failed with : deploy alertmanager failed, failed to find catalog by ID "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2", catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/06/13 11:07:52 [ERROR] ClusterAlertRuleController c-4twb9/deployment-event-alert [cluster-alert-rule-deployer] failed with : deploy alertmanager failed, failed to find catalog by ID "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2", catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/06/13 11:07:52 [ERROR] ClusterAlertRuleController c-4twb9/high-cpu-load [cluster-alert-rule-deployer] failed with : deploy alertmanager failed, failed to find catalog by ID "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2", catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/06/13 11:07:52 [ERROR] ProjectController c-jbmwt/p-pq8zh [system-image-upgrade-controller] failed with : get template system-library-rancher-logging failed: catalogTemplate.management.cattle.io "cattle-global-data/system-library-rancher-logging" not found
2019/06/13 11:07:52 [ERROR] ProjectController c-jbmwt/p-kwkgs [system-image-upgrade-controller] failed with : get template system-library-rancher-logging failed: catalogTemplate.management.cattle.io "cattle-global-data/system-library-rancher-logging" not found
2019/06/13 11:07:52 [ERROR] ProjectController c-jbmwt/p-vhrjp [system-image-upgrade-controller] failed with : get template system-library-rancher-logging failed: catalogTemplate.management.cattle.io "cattle-global-data/system-library-rancher-logging" not found
2019/06/13 11:09:55 [ERROR] CatalogController library [catalog] failed with : Timeout in HTTP GET to [https://git.rancher.io/charts/index.yaml], did not respond in 30s
2019/06/13 11:09:55 [ERROR] CatalogController system-library [catalog] failed with : Timeout in HTTP GET to [https://git.rancher.io/system-charts/index.yaml], did not respond in 30s
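Since the snippet above is quite repetitive, here is a rough Python sketch that groups such lines by controller and by the resource reported as "not found". It is only a reading aid: the regular expressions are my assumption about the line shape, based purely on the lines pasted here, and the function name `summarize` is made up for this example.

```python
import re
from collections import Counter

# Assumed line shape, taken from the log lines pasted above:
# "<date> <time> [ERROR] <Controller> <object> [<handler>] failed with : <message>"
LINE_RE = re.compile(
    r'^\S+ \S+ \[ERROR\] (?P<controller>\S+) (?P<object>\S+) '
    r'\[(?P<handler>[^\]]+)\] failed with : (?P<message>.*)$'
)

# The missing resource is the double-quoted name directly before a trailing "not found".
MISSING_RE = re.compile(r'"([^"]+)" not found\s*$')


def summarize(lines):
    """Count error lines per (controller, missing resource) pair.

    Lines that do not match the assumed shape are skipped. Errors without a
    trailing '"..." not found' (e.g. the catalog timeouts) are counted
    under the placeholder "(other)".
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        missing = MISSING_RE.search(match.group("message"))
        resource = missing.group(1) if missing else "(other)"
        counts[(match.group("controller"), resource)] += 1
    return counts
```

Feeding it the saved log, for example `summarize(open("rancher.log"))` where `rancher.log` is a hypothetical file name, gives one counter entry per controller and missing resource. For the snippet above that comes down to three groups: the alert controllers missing system-library-rancher-monitoring-0.0.2, the project controller missing cattle-global-data/system-library-rancher-logging, and the two catalog timeouts.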

I removed all alerts from each of the two clusters and then restarted Docker and the Rancher container. Now I’m only seeing this in the log:

2019/06/13 11:40:53 [ERROR] ProjectController c-4twb9/p-f7w2j [system-image-upgrade-controller] failed with : get template system-library-rancher-logging failed: catalogTemplate.management.cattle.io "cattle-global-data/system-library-rancher-logging" not found
2019/06/13 11:40:53 [ERROR] ProjectController c-jbmwt/p-pq8zh [system-image-upgrade-controller] failed with : get template system-library-rancher-logging failed: catalogTemplate.management.cattle.io "cattle-global-data/system-library-rancher-logging" not found

But the original message from the first post is still visible from the main cluster page, at /g/clusters
:frowning: