I’ve been running Rancher 2.2.3 for some time and have now started setting up notifiers and assigning them to alerts. The cluster monitoring feature with Prometheus is NOT enabled.
However, we are now seeing this alert on the two clusters where I have enabled notifiers:
Failed to ensure catalog “catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2”: failed to find catalog by ID “catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2”: catalogtemplateversions.management.cattle.io “system-library-rancher-monitoring-0.0.2” not found
I would like input on what is happening inside Rancher and what I can do to resolve this. It is worth noting that our setup has no internet access; we are running an air-gapped environment.
2019/06/13 11:07:52 [ERROR] ClusterAlertGroupController c-4twb9/node-alert [cluster-alert-group-deployer] failed with : deploy alertmanager failed, failed to find catalog by ID "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2", catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/06/13 11:07:52 [ERROR] ClusterAlertGroupController c-4twb9/etcd-alert [cluster-alert-group-deployer] failed with : deploy alertmanager failed, failed to find catalog by ID "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2", catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/06/13 11:07:52 [ERROR] ProjectController c-4twb9/p-hcpph [system-image-upgrade-controller] failed with : get template system-library-rancher-logging failed: catalogTemplate.management.cattle.io "cattle-global-data/system-library-rancher-logging" not found
2019/06/13 11:07:52 [ERROR] ProjectController c-4twb9/p-f7w2j [system-image-upgrade-controller] failed with : get template system-library-rancher-logging failed: catalogTemplate.management.cattle.io "cattle-global-data/system-library-rancher-logging" not found
2019/06/13 11:07:52 [ERROR] ClusterAlertRuleController c-4twb9/scheduler-system-service [cluster-alert-rule-deployer] failed with : deploy alertmanager failed, failed to find catalog by ID "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2", catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/06/13 11:07:52 [ERROR] ClusterAlertRuleController c-4twb9/node-disk-running-full [cluster-alert-rule-deployer] failed with : deploy alertmanager failed, failed to find catalog by ID "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2", catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/06/13 11:07:52 [ERROR] ClusterAlertRuleController c-4twb9/migrate-clusteralert-controllermanager [cluster-alert-rule-deployer] failed with : deploy alertmanager failed, failed to find catalog by ID "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2", catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/06/13 11:07:52 [ERROR] ClusterAlertRuleController c-4twb9/deployment-event-alert [cluster-alert-rule-deployer] failed with : deploy alertmanager failed, failed to find catalog by ID "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2", catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/06/13 11:07:52 [ERROR] ClusterAlertRuleController c-4twb9/high-cpu-load [cluster-alert-rule-deployer] failed with : deploy alertmanager failed, failed to find catalog by ID "catalog://?catalog=system-library&template=rancher-monitoring&version=0.0.2", catalogtemplateversions.management.cattle.io "system-library-rancher-monitoring-0.0.2" not found
2019/06/13 11:07:52 [ERROR] ProjectController c-jbmwt/p-pq8zh [system-image-upgrade-controller] failed with : get template system-library-rancher-logging failed: catalogTemplate.management.cattle.io "cattle-global-data/system-library-rancher-logging" not found
2019/06/13 11:07:52 [ERROR] ProjectController c-jbmwt/p-kwkgs [system-image-upgrade-controller] failed with : get template system-library-rancher-logging failed: catalogTemplate.management.cattle.io "cattle-global-data/system-library-rancher-logging" not found
2019/06/13 11:07:52 [ERROR] ProjectController c-jbmwt/p-vhrjp [system-image-upgrade-controller] failed with : get template system-library-rancher-logging failed: catalogTemplate.management.cattle.io "cattle-global-data/system-library-rancher-logging" not found
2019/06/13 11:09:55 [ERROR] CatalogController library [catalog] failed with : Timeout in HTTP GET to [https://git.rancher.io/charts/index.yaml], did not respond in 30s
2019/06/13 11:09:55 [ERROR] CatalogController system-library [catalog] failed with : Timeout in HTTP GET to [https://git.rancher.io/system-charts/index.yaml], did not respond in 30s
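The log above repeats a small number of distinct failures many times. A minimal Python sketch for grouping such lines by controller and failure type is shown here. The line layout is inferred only from the entries pasted in this thread, and the category names (`catalog-unreachable`, `missing:...`, `other`) are made up for illustration; they are not Rancher terminology.

```python
import re
from collections import Counter

# Layout inferred from the log lines in this thread:
# <date> <time> [ERROR] <Controller> <key> [<handler>] failed with : <message>
LINE_RE = re.compile(
    r'^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[ERROR\] '
    r'(?P<controller>\S+) (?P<key>\S+) \[(?P<handler>[^\]]+)\] '
    r'failed with : (?P<message>.*)$'
)

# Captures the quoted resource name directly in front of "not found".
NOT_FOUND_RE = re.compile(r'"([^"]+)" not found')


def classify(message):
    """Map an error message to a coarse, illustrative category."""
    if "Timeout in HTTP GET" in message:
        return "catalog-unreachable"
    match = NOT_FOUND_RE.search(message)
    if match:
        return "missing:" + match.group(1)
    return "other"


def summarize(lines):
    """Count [ERROR] lines per (controller, category) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not an [ERROR] line in the expected layout
        counts[(match.group("controller"), classify(match.group("message")))] += 1
    return counts
```

Feeding the saved log lines to `summarize()` and printing `most_common()` on the result collapses the output above into three groups: the missing `rancher-monitoring` template version, the missing `rancher-logging` template, and the catalog timeouts.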
I removed all alerts from each of the two clusters and then restarted Docker and the Rancher container. Now I’m only seeing this in the log:
2019/06/13 11:40:53 [ERROR] ProjectController c-4twb9/p-f7w2j [system-image-upgrade-controller] failed with : get template system-library-rancher-logging failed: catalogTemplate.management.cattle.io "cattle-global-data/system-library-rancher-logging" not found
2019/06/13 11:40:53 [ERROR] ProjectController c-jbmwt/p-pq8zh [system-image-upgrade-controller] failed with : get template system-library-rancher-logging failed: catalogTemplate.management.cattle.io "cattle-global-data/system-library-rancher-logging" not found
But the original message from the first post is still visible on the main cluster page, at /g/clusters.
We have an air-gapped environment, so this was the result of Rancher not being able to reach the repositories for the Helm charts it needs.
The call was made even though the monitoring feature was not enabled; I guess it comes from some kind of background job.
Once internet access was set up through a proxy, the error went away.
We then also realized that provisioning clusters on vSphere does not work with proxy settings, but that is another topic. Rancher now has direct internet access.
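For anyone hitting the same catalog timeouts, a rough way to confirm whether the two index URLs from the log are reachable is a small Python check like the one below. It is only a sketch: the 30-second timeout mirrors the value in the log message, `check_url` and `check_all` are names invented here, and the script uses whatever proxy the standard environment variables (for example `https_proxy`) give to the process running it, which is not necessarily what the Rancher container itself sees.

```python
import socket
import urllib.error
import urllib.request

# The two index URLs taken from the CatalogController errors in the log.
CATALOG_URLS = [
    "https://git.rancher.io/charts/index.yaml",
    "https://git.rancher.io/system-charts/index.yaml",
]


def check_url(url, timeout=30, opener=urllib.request.urlopen):
    """Return (reachable, detail) for one URL.

    `opener` is injectable so the function can be exercised without
    network access; by default urllib honours proxy environment variables.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return True, "HTTP %s" % resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return False, "HTTP %s" % exc.code
    except (urllib.error.URLError, socket.timeout, OSError) as exc:
        # DNS failure, refused connection, proxy problem, or timeout.
        return False, "unreachable: %s" % exc


def check_all(urls=CATALOG_URLS, **kwargs):
    """Check every URL and return a dict of url -> (reachable, detail)."""
    return {url: check_url(url, **kwargs) for url in urls}
```

Calling `check_all()` on the Rancher host and printing the result shows per URL whether the index could be fetched. If both come back unreachable, the missing system-library templates in the log are the expected follow-on error.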