Rancher flooding logs with errors

We are currently using Rancher version 2.6.9. While debugging some OIDC errors, we ran into trouble with our error logs: they are being flooded by error messages whose source we are unable to identify.

You may find the logs of the last 5 minutes in the gist.

We could identify several types of errors:

error syncing 'p-n7k7r/creator-project-owner': handler mgmt-auth-prtb-controller: clusters.management.cattle.io "c-fvg4w" not found, requeuing

error syncing 'p-pqzmm/creator-project-owner': handler auth-prov-v2-prtb: failed to update fleet-local/r-cluster-local-view-p-pqzmm-creator-project-owner-nk3rmcfzaj rbac.authorization.k8s.io/v1, Kind=RoleBinding for auth-prov-v2-prtb-rolebinding p-pqzmm/creator-project-owner: RoleBinding.rbac.authorization.k8s.io "r-cluster-local-view-p-pqzmm-creator-project-owner-nk3rmcfzaj" is invalid: [metadata.ownerReferences.apiVersion: Invalid value: "": version must not be empty, metadata.ownerReferences.kind: Invalid value: "": kind must not be empty, metadata.ownerReferences.name: Invalid value: "": name must not be empty], requeuing

error syncing 'grb-ftw5p': handler grb-cluster-sync: Index with name by-cluster does not exist, requeuing

error syncing 'c-p6msc/p-jq749': handler system-image-upgrade-controller: upgrade cluster c-p6msc system service alerting failed: template system-library-rancher-monitoring incompatible with rancher version or cluster's [c-p6msc] kubernetes version, requeuing
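To see which handlers account for most of the noise, the log lines can be grouped by handler name. This is only a rough sketch: `count_errors_by_handler` is a made-up helper name, and the usage line assumes a standard Helm install where the Rancher pods run in `cattle-system` with the label `app=rancher`.

```shell
# Count "error syncing" log lines per handler, most frequent first.
# Reads log text on stdin and prints "<count> <handler>" per line.
count_errors_by_handler() {
    sed -n 's/.*error syncing .*: handler \([^:]*\):.*/\1/p' \
        | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# Usage (assumed namespace and label; --tail=-1 is needed because
# kubectl only shows the last 10 lines per pod when a selector is used):
#   kubectl -n cattle-system logs -l app=rancher --since=5m --tail=-1 | count_errors_by_handler
```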

From our understanding there could be some jobs running in the background referencing already deleted objects. Any suggestions how we could clean this up?

Kind regards



We have the same issue. Any suggestions will be appreciated.

I have something vaguely similar spamming my logs with 2.6.11, prtb-related:

2023/03/28 18:35:21 [ERROR] error syncing 'p-89qlv/prtb-z88r4': handler mgmt-auth-prtb-controller: cannot determine project and cluster from p-89qlv, requeuing
2023/03/28 18:35:21 [ERROR] error syncing 'p-89qlv/prtb-l4lfp': handler mgmt-auth-prtb-controller: cannot determine project and cluster from p-89qlv, requeuing
2023/03/28 18:35:21 [ERROR] error syncing 'p-89qlv/prtb-2rqq7': handler mgmt-auth-prtb-controller: cannot determine project and cluster from p-89qlv, requeuing
2023/03/28 18:35:21 [ERROR] error syncing 'p-89qlv/prtb-plm5c': handler mgmt-auth-prtb-controller: cannot determine project and cluster from p-89qlv, requeuing

FWIW, it looks like these PRTBs are from a project that was deleted, but the project namespace (in the Rancher-hosting RKE cluster) is still in a "Terminating" state. I guess these are remnants of an old bug. I am going to try deleting the remaining resources in that namespace.
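For anyone wanting to check for the same situation, something like the following should list the stuck namespaces in the cluster hosting Rancher. It is a sketch that filters the default table output of kubectl; `terminating_namespaces` is just an illustrative name.

```shell
# Print the names of namespaces whose STATUS column is "Terminating".
# Expects the default table output of `kubectl get namespaces` on stdin.
terminating_namespaces() {
    awk 'NR > 1 && $2 == "Terminating" { print $1 }'
}

# Usage:
#   kubectl get namespaces | terminating_namespaces
# Then look for leftover PRTBs in each namespace it prints:
#   kubectl -n <namespace> get projectroletemplatebindings.v3.management.cattle.io
```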

Well, if anyone else encounters the specific errors I found: it turns out someone deleted some projects in the cluster, and the project namespaces (IN THE CLUSTER HOSTING RANCHER) were hanging around. The mgmt-auth-prtb-controller finalizer couldn't complete because the "projectName:" field in the PRTB object had the project ID, but not the cluster ID.

Editing the YAML for the PRTB and prepending c-(clusterID): to the project ID in the projectName: field cleared the backlog. All those project namespaces have now terminated successfully, and the errors have stopped spamming the logs of Rancher's pods.


apiVersion: management.cattle.io/v3
kind: ProjectRoleTemplateBinding
metadata:
  annotations:
    field.cattle.io/creatorId: user-hkvlr
  creationTimestamp: "2020-01-30T15:58:43Z"
  deletionGracePeriodSeconds: 0
  deletionTimestamp: "2020-03-16T13:51:30Z"
  finalizers:
  - controller.cattle.io/mgmt-auth-prtb-controller
  generation: 3
  labels:
    cattle.io/creator: norman
  name: prtb-z8vjh
  namespace: p-r7hr6
  resourceVersion: "50750630"
  uid: a18bd8e2-f936-468a-ab63-e4d04e757472
projectName: p-r7hr6
roleTemplateName: project-owner
userName: u-x9pbj
userPrincipalName: local://u-x9pbj

Changing that "projectName" field to:

projectName: c-c4hlm:p-r7hr6

did the trick.

Quick for/awk/sed script to rip through a terminating namespace's PRTBs to fix:

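# K8SNS must be set to the terminating project namespace, e.g. K8SNS=p-r7hr6
for prtb in $(kubectl -n "$K8SNS" get projectroletemplatebindings.v3.management.cattle.io --no-headers | awk '/^prtb-/ {print $1}'); do
    # Dump the binding, prepend the cluster ID to projectName, and re-apply it
    kubectl -n "$K8SNS" get projectroletemplatebindings.v3.management.cattle.io "$prtb" -o yaml > tmp.yml
    sed -i 's/^projectName: \(.*\)$/projectName: c-c4hlm:\1/' tmp.yml
    kubectl -n "$K8SNS" apply -f tmp.yml
done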

Replace the "c-c4hlm" with the correct cluster ID as needed.
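
One caveat with the sed expression: running it a second time against an already-fixed PRTB would prepend the cluster ID again. A possible guard, sketched here with a hypothetical helper name (`prefix_project_name`), is to rewrite the line only when the value does not yet contain a colon:

```shell
# Prepend a cluster ID to the top-level projectName value, but only if
# the value has no ':' yet (i.e. it is not already "<cluster>:<project>").
# Reads YAML on stdin, writes the result to stdout.
# $1 = cluster ID, e.g. c-c4hlm
prefix_project_name() {
    sed "/^projectName: [^:]*\$/s/^projectName: \(.*\)\$/projectName: $1:\1/"
}

# Usage inside the loop, in place of the sed -i line:
#   prefix_project_name c-c4hlm < tmp.yml > tmp-fixed.yml
```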