Rancher CrashLoopBackOff daily issue

Hi, we are facing problems with Rancher 2.4.5.
The Rancher application gets stuck in CrashLoopBackOff after running for a few hours, and only becomes healthy again after hours of downtime.

This behaviour started 3 weeks ago, with no previous occurrence in almost 400 days of cluster operation.

I would appreciate any tips or a way to fix this.

Thanks all.

We’ve got this log from Rancher (cattle-system):

2021/01/05 17:45:19 [INFO] Rancher version v2.4.5 (c642f1caa) is starting
2021/01/05 17:45:19 [INFO] Rancher arguments {ACMEDomains:[] AddLocal:auto Embedded:false HTTPListenPort:80 HTTPSListenPort:443 K8sMode:auto Debug:false Trace:false NoCACerts:true AuditLogPath:/var/log/auditlog/rancher-api-audit.log AuditLogMaxage:10 AuditLogMaxsize:100 AuditLogMaxbackup:10 AuditLevel:0 Features:}
2021/01/05 17:45:19 [INFO] Listening on /tmp/log.sock
I0105 17:45:19.704113       8 http.go:116] HTTP2 has been explicitly disabled
2021/01/05 17:45:19 [INFO] Starting API controllers
2021/01/05 17:45:20 [INFO] Running in clustered mode with ID 10.42.2.99, monitoring endpoint cattle-system/rancher
2021/01/05 17:45:20 [INFO] Starting API controllers
I0105 17:45:21.214921       8 http.go:116] HTTP2 has been explicitly disabled
I0105 17:45:21.215542       8 http.go:116] HTTP2 has been explicitly disabled
I0105 17:45:21.216121       8 http.go:116] HTTP2 has been explicitly disabled
I0105 17:45:21.218586       8 http.go:116] HTTP2 has been explicitly disabled
2021/01/05 17:45:21 [INFO] Starting cluster controllers for local
2021/01/05 17:45:21 [INFO] Starting rbac.authorization.k8s.io/v1, Kind=ClusterRole controller
2021/01/05 17:45:21 [INFO] Starting rbac.authorization.k8s.io/v1, Kind=Role controller
2021/01/05 17:45:21 [INFO] Starting cluster agent for local [owner=false]
2021/01/05 17:45:21 [INFO] Starting rbac.authorization.k8s.io/v1, Kind=ClusterRole controller
2021/01/05 17:45:21 [INFO] Starting rbac.authorization.k8s.io/v1, Kind=Role controller
I0105 17:45:22.509559       8 leaderelection.go:242] attempting to acquire leader lease  kube-system/cattle-controllers...
2021/01/05 17:45:22 [INFO] Refreshing all schemas
2021/01/05 17:45:22 [INFO] Starting apiregistration.k8s.io/v1, Kind=APIService controller
2021/01/05 17:45:22 [INFO] Starting apiextensions.k8s.io/v1beta1, Kind=CustomResourceDefinition controller
2021/01/05 17:45:22 [INFO] Refreshing all schemas
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind Binding
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind ComponentStatus
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind ConfigMap
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind Endpoints
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind Event
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind LimitRange
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind Namespace
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind Node
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind PersistentVolumeClaim
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind PersistentVolume
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind Pod
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind PodTemplate
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind ReplicationController
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind ResourceQuota
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind Secret
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind ServiceAccount
2021/01/05 17:45:22 [INFO] APIVersion /v1 Kind Service
2021/01/05 17:45:22 [INFO] APIVersion apiregistration.k8s.io/v1 Kind APIService
2021/01/05 17:45:22 [INFO] APIVersion apiregistration.k8s.io/v1beta1 Kind APIService
2021/01/05 17:45:22 [INFO] APIVersion extensions/v1beta1 Kind DaemonSet
2021/01/05 17:45:22 [INFO] APIVersion extensions/v1beta1 Kind Deployment
2021/01/05 17:45:22 [INFO] APIVersion extensions/v1beta1 Kind Ingress
2021/01/05 17:45:22 [INFO] APIVersion extensions/v1beta1 Kind NetworkPolicy
2021/01/05 17:45:22 [INFO] APIVersion extensions/v1beta1 Kind PodSecurityPolicy
2021/01/05 17:45:22 [INFO] APIVersion extensions/v1beta1 Kind ReplicaSet
2021/01/05 17:45:22 [INFO] APIVersion extensions/v1beta1 Kind ReplicationControllerDummy
2021/01/05 17:45:22 [INFO] APIVersion apps/v1 Kind ControllerRevision
2021/01/05 17:45:22 [INFO] APIVersion apps/v1 Kind DaemonSet
2021/01/05 17:45:22 [INFO] APIVersion apps/v1 Kind Deployment
2021/01/05 17:45:22 [INFO] APIVersion apps/v1 Kind ReplicaSet
2021/01/05 17:45:22 [INFO] APIVersion apps/v1 Kind StatefulSet
2021/01/05 17:45:22 [INFO] APIVersion apps/v1beta2 Kind ControllerRevision
2021/01/05 17:45:22 [INFO] APIVersion apps/v1beta2 Kind DaemonSet
2021/01/05 17:45:22 [INFO] APIVersion apps/v1beta2 Kind Deployment
2021/01/05 17:45:22 [INFO] APIVersion apps/v1beta2 Kind ReplicaSet
2021/01/05 17:45:22 [INFO] APIVersion apps/v1beta2 Kind StatefulSet
2021/01/05 17:45:22 [INFO] APIVersion apps/v1beta1 Kind ControllerRevision
2021/01/05 17:45:22 [INFO] APIVersion apps/v1beta1 Kind Deployment
2021/01/05 17:45:22 [INFO] APIVersion apps/v1beta1 Kind StatefulSet
2021/01/05 17:45:22 [INFO] APIVersion events.k8s.io/v1beta1 Kind Event
2021/01/05 17:45:22 [INFO] APIVersion authentication.k8s.io/v1 Kind TokenReview
2021/01/05 17:45:22 [INFO] APIVersion authentication.k8s.io/v1beta1 Kind TokenReview
2021/01/05 17:45:22 [INFO] APIVersion authorization.k8s.io/v1 Kind LocalSubjectAccessReview
2021/01/05 17:45:22 [INFO] APIVersion authorization.k8s.io/v1 Kind SelfSubjectAccessReview
2021/01/05 17:45:22 [INFO] APIVersion authorization.k8s.io/v1 Kind SelfSubjectRulesReview
2021/01/05 17:45:22 [INFO] APIVersion authorization.k8s.io/v1 Kind SubjectAccessReview
2021/01/05 17:45:22 [INFO] APIVersion authorization.k8s.io/v1beta1 Kind LocalSubjectAccessReview
2021/01/05 17:45:22 [INFO] APIVersion authorization.k8s.io/v1beta1 Kind SelfSubjectAccessReview
2021/01/05 17:45:22 [INFO] APIVersion authorization.k8s.io/v1beta1 Kind SelfSubjectRulesReview
2021/01/05 17:45:22 [INFO] APIVersion authorization.k8s.io/v1beta1 Kind SubjectAccessReview
2021/01/05 17:45:22 [INFO] APIVersion autoscaling/v1 Kind HorizontalPodAutoscaler
2021/01/05 17:45:22 [INFO] APIVersion autoscaling/v2beta1 Kind HorizontalPodAutoscaler
2021/01/05 17:45:22 [INFO] APIVersion autoscaling/v2beta2 Kind HorizontalPodAutoscaler
2021/01/05 17:45:22 [INFO] APIVersion batch/v1 Kind Job
2021/01/05 17:45:22 [INFO] APIVersion batch/v1beta1 Kind CronJob
2021/01/05 17:45:22 [INFO] APIVersion certificates.k8s.io/v1beta1 Kind CertificateSigningRequest
2021/01/05 17:45:22 [INFO] APIVersion networking.k8s.io/v1 Kind NetworkPolicy
2021/01/05 17:45:22 [INFO] APIVersion networking.k8s.io/v1beta1 Kind Ingress
2021/01/05 17:45:22 [INFO] APIVersion policy/v1beta1 Kind PodDisruptionBudget
2021/01/05 17:45:22 [INFO] APIVersion policy/v1beta1 Kind PodSecurityPolicy
2021/01/05 17:45:22 [INFO] APIVersion rbac.authorization.k8s.io/v1 Kind ClusterRoleBinding
2021/01/05 17:45:22 [INFO] APIVersion rbac.authorization.k8s.io/v1 Kind ClusterRole
2021/01/05 17:45:22 [INFO] APIVersion rbac.authorization.k8s.io/v1 Kind RoleBinding
2021/01/05 17:45:22 [INFO] APIVersion rbac.authorization.k8s.io/v1 Kind Role
2021/01/05 17:45:22 [INFO] APIVersion rbac.authorization.k8s.io/v1beta1 Kind ClusterRoleBinding
2021/01/05 17:45:22 [INFO] APIVersion rbac.authorization.k8s.io/v1beta1 Kind ClusterRole
2021/01/05 17:45:22 [INFO] APIVersion rbac.authorization.k8s.io/v1beta1 Kind RoleBinding
2021/01/05 17:45:22 [INFO] APIVersion rbac.authorization.k8s.io/v1beta1 Kind Role
2021/01/05 17:45:22 [INFO] APIVersion storage.k8s.io/v1 Kind StorageClass
2021/01/05 17:45:22 [INFO] APIVersion storage.k8s.io/v1 Kind VolumeAttachment
2021/01/05 17:45:22 [INFO] APIVersion storage.k8s.io/v1beta1 Kind CSIDriver
2021/01/05 17:45:22 [INFO] APIVersion storage.k8s.io/v1beta1 Kind CSINode
2021/01/05 17:45:22 [INFO] APIVersion storage.k8s.io/v1beta1 Kind StorageClass
2021/01/05 17:45:22 [INFO] APIVersion storage.k8s.io/v1beta1 Kind VolumeAttachment
2021/01/05 17:45:22 [INFO] APIVersion admissionregistration.k8s.io/v1beta1 Kind MutatingWebhookConfiguration
2021/01/05 17:45:22 [INFO] APIVersion admissionregistration.k8s.io/v1beta1 Kind ValidatingWebhookConfiguration
2021/01/05 17:45:22 [INFO] APIVersion apiextensions.k8s.io/v1beta1 Kind CustomResourceDefinition
2021/01/05 17:45:22 [INFO] APIVersion scheduling.k8s.io/v1 Kind PriorityClass
2021/01/05 17:45:22 [INFO] APIVersion scheduling.k8s.io/v1beta1 Kind PriorityClass
2021/01/05 17:45:22 [INFO] APIVersion coordination.k8s.io/v1 Kind Lease
2021/01/05 17:45:22 [INFO] APIVersion coordination.k8s.io/v1beta1 Kind Lease
2021/01/05 17:45:22 [INFO] APIVersion node.k8s.io/v1beta1 Kind RuntimeClass
2021/01/05 17:45:22 [INFO] APIVersion crd.projectcalico.org/v1 Kind FelixConfiguration
2021/01/05 17:45:22 [INFO] APIVersion crd.projectcalico.org/v1 Kind BlockAffinity
2021/01/05 17:45:22 [INFO] APIVersion crd.projectcalico.org/v1 Kind GlobalNetworkPolicy
2021/01/05 17:45:22 [INFO] APIVersion crd.projectcalico.org/v1 Kind IPAMHandle
2021/01/05 17:45:22 [INFO] APIVersion crd.projectcalico.org/v1 Kind GlobalNetworkSet
2021/01/05 17:45:22 [INFO] APIVersion crd.projectcalico.org/v1 Kind HostEndpoint
2021/01/05 17:45:22 [INFO] APIVersion crd.projectcalico.org/v1 Kind IPPool
2021/01/05 17:45:22 [INFO] APIVersion crd.projectcalico.org/v1 Kind NetworkSet
2021/01/05 17:45:22 [INFO] APIVersion crd.projectcalico.org/v1 Kind IPAMConfig
2021/01/05 17:45:22 [INFO] APIVersion crd.projectcalico.org/v1 Kind IPAMBlock
2021/01/05 17:45:22 [INFO] APIVersion crd.projectcalico.org/v1 Kind BGPConfiguration
2021/01/05 17:45:22 [INFO] APIVersion crd.projectcalico.org/v1 Kind NetworkPolicy
2021/01/05 17:45:22 [INFO] APIVersion crd.projectcalico.org/v1 Kind ClusterInformation
2021/01/05 17:45:22 [INFO] APIVersion crd.projectcalico.org/v1 Kind BGPPeer
2021/01/05 17:45:22 [INFO] APIVersion monitoring.coreos.com/v1 Kind PrometheusRule
2021/01/05 17:45:22 [INFO] APIVersion monitoring.coreos.com/v1 Kind PodMonitor
2021/01/05 17:45:22 [INFO] APIVersion monitoring.coreos.com/v1 Kind ThanosRuler
2021/01/05 17:45:22 [INFO] APIVersion monitoring.coreos.com/v1 Kind Alertmanager
2021/01/05 17:45:22 [INFO] APIVersion monitoring.coreos.com/v1 Kind ServiceMonitor
2021/01/05 17:45:22 [INFO] APIVersion monitoring.coreos.com/v1 Kind Prometheus
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind AuthConfig
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind Notifier
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ClusterMonitorGraph
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind GlobalDnsProvider
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind Catalog
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ProjectLogging
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind CisConfig
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind NodeDriver
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind Setting
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ClusterCatalog
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind DynamicSchema
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind Group
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ProjectMonitorGraph
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind Preference
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind NodePool
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind MultiClusterAppRevision
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind KontainerDriver
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ProjectNetworkPolicy
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind TemplateVersion
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind CisBenchmarkVersion
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind MonitorMetric
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind NodeTemplate
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind GlobalDns
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind RoleTemplate
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ClusterRegistrationToken
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ClusterTemplate
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ProjectAlert
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind GlobalRole
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind EtcdBackup
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ClusterAlert
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind Project
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind PodSecurityPolicyTemplateProjectBinding
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind Cluster
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind CatalogTemplateVersion
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind Template
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ComposeConfig
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ProjectRoleTemplateBinding
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind CatalogTemplate
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind PodSecurityPolicyTemplate
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind MultiClusterApp
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind GroupMember
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ClusterScan
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind Feature
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ProjectAlertRule
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind RkeK8sServiceOption
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ClusterLogging
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ClusterAlertGroup
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind User
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind GlobalRoleBinding
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind TemplateContent
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ListenConfig
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind Node
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ClusterTemplateRevision
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind RkeK8sSystemImage
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind RkeAddon
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ClusterRoleTemplateBinding
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind Token
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ProjectAlertGroup
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ProjectCatalog
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind UserAttribute
2021/01/05 17:45:22 [INFO] APIVersion management.cattle.io/v3 Kind ClusterAlertRule
2021/01/05 17:45:22 [INFO] APIVersion project.cattle.io/v3 Kind PipelineExecution
2021/01/05 17:45:22 [INFO] APIVersion project.cattle.io/v3 Kind AppRevision
2021/01/05 17:45:22 [INFO] APIVersion project.cattle.io/v3 Kind SourceCodeCredential
2021/01/05 17:45:22 [INFO] APIVersion project.cattle.io/v3 Kind PipelineSetting
2021/01/05 17:45:22 [INFO] APIVersion project.cattle.io/v3 Kind App
2021/01/05 17:45:22 [INFO] APIVersion project.cattle.io/v3 Kind Pipeline
2021/01/05 17:45:22 [INFO] APIVersion project.cattle.io/v3 Kind SourceCodeRepository
2021/01/05 17:45:22 [INFO] APIVersion project.cattle.io/v3 Kind SourceCodeProviderConfig
2021/01/05 17:45:22 [INFO] APIVersion metrics.k8s.io/v1beta1 Kind NodeMetrics
2021/01/05 17:45:22 [INFO] APIVersion metrics.k8s.io/v1beta1 Kind PodMetrics
2021/01/05 17:45:23 [INFO] Watching metadata for project.cattle.io/v3, Kind=Pipeline
2021/01/05 17:45:23 [INFO] Watching metadata for monitoring.coreos.com/v1, Kind=ServiceMonitor
2021/01/05 17:45:23 [INFO] Watching metadata for networking.k8s.io/v1, Kind=NetworkPolicy
2021/01/05 17:45:23 [INFO] Watching metadata for crd.projectcalico.org/v1, Kind=BGPConfiguration
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ProjectRoleTemplateBinding
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=Template
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ProjectAlert
2021/01/05 17:45:23 [INFO] Watching metadata for apps/v1, Kind=ReplicaSet
2021/01/05 17:45:23 [INFO] Watching metadata for /v1, Kind=Event
2021/01/05 17:45:23 [INFO] Watching metadata for rbac.authorization.k8s.io/v1, Kind=Role
2021/01/05 17:45:23 [INFO] Watching metadata for crd.projectcalico.org/v1, Kind=IPAMConfig
2021/01/05 17:45:23 [INFO] Watching metadata for storage.k8s.io/v1beta1, Kind=CSIDriver
2021/01/05 17:45:23 [INFO] Watching metadata for policy/v1beta1, Kind=PodDisruptionBudget
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=CatalogTemplateVersion
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=GroupMember
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ProjectAlertRule
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ProjectNetworkPolicy
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=Notifier
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=RkeK8sServiceOption
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=Project
2021/01/05 17:45:23 [INFO] Watching metadata for /v1, Kind=ResourceQuota
2021/01/05 17:45:23 [INFO] Watching metadata for /v1, Kind=LimitRange
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=GlobalDns
2021/01/05 17:45:23 [INFO] Watching metadata for extensions/v1beta1, Kind=DaemonSet
2021/01/05 17:45:23 [INFO] Watching metadata for /v1, Kind=ReplicationController
2021/01/05 17:45:23 [INFO] Watching metadata for /v1, Kind=PersistentVolumeClaim
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=Feature
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ClusterRegistrationToken
2021/01/05 17:45:23 [INFO] Watching metadata for extensions/v1beta1, Kind=PodSecurityPolicy
2021/01/05 17:45:23 [INFO] Watching metadata for project.cattle.io/v3, Kind=PipelineSetting
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=NodeTemplate
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ClusterAlertGroup
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=Node
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=NodePool
2021/01/05 17:45:23 [INFO] Watching metadata for admissionregistration.k8s.io/v1beta1, Kind=MutatingWebhookConfiguration
2021/01/05 17:45:23 [INFO] Watching metadata for batch/v1, Kind=Job
2021/01/05 17:45:23 [INFO] Watching metadata for /v1, Kind=PersistentVolume
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=DynamicSchema
2021/01/05 17:45:23 [INFO] Watching metadata for crd.projectcalico.org/v1, Kind=IPAMBlock
2021/01/05 17:45:23 [INFO] Watching metadata for rbac.authorization.k8s.io/v1, Kind=ClusterRoleBinding
2021/01/05 17:45:23 [INFO] Watching metadata for apps/v1, Kind=Deployment
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=RoleTemplate
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=Cluster
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=AuthConfig
2021/01/05 17:45:23 [INFO] Watching metadata for /v1, Kind=ServiceAccount
2021/01/05 17:45:23 [INFO] Watching metadata for autoscaling/v1, Kind=HorizontalPodAutoscaler
2021/01/05 17:45:23 [INFO] Watching metadata for storage.k8s.io/v1, Kind=StorageClass
2021/01/05 17:45:23 [INFO] Watching metadata for admissionregistration.k8s.io/v1beta1, Kind=ValidatingWebhookConfiguration
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ProjectLogging
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ProjectAlertGroup
2021/01/05 17:45:23 [INFO] Watching metadata for extensions/v1beta1, Kind=NetworkPolicy
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=CatalogTemplate
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=RkeK8sSystemImage
2021/01/05 17:45:23 [INFO] Watching metadata for monitoring.coreos.com/v1, Kind=Prometheus
2021/01/05 17:45:23 [INFO] Watching metadata for events.k8s.io/v1beta1, Kind=Event
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ClusterTemplateRevision
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ClusterScan
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=MonitorMetric
2021/01/05 17:45:23 [INFO] Watching metadata for scheduling.k8s.io/v1, Kind=PriorityClass
2021/01/05 17:45:23 [INFO] Watching metadata for crd.projectcalico.org/v1, Kind=NetworkSet
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=UserAttribute
2021/01/05 17:45:23 [INFO] Watching metadata for /v1, Kind=ConfigMap
2021/01/05 17:45:23 [INFO] Watching metadata for project.cattle.io/v3, Kind=SourceCodeProviderConfig
2021/01/05 17:45:23 [INFO] Watching metadata for /v1, Kind=Service
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ClusterAlertRule
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=MultiClusterAppRevision
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ListenConfig
2021/01/05 17:45:23 [INFO] Watching metadata for apps/v1, Kind=StatefulSet
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ClusterTemplate
2021/01/05 17:45:23 [INFO] Watching metadata for networking.k8s.io/v1beta1, Kind=Ingress
2021/01/05 17:45:23 [INFO] Watching metadata for /v1, Kind=Node
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ClusterMonitorGraph
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=TemplateContent
2021/01/05 17:45:23 [INFO] Watching metadata for storage.k8s.io/v1beta1, Kind=CSINode
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=Group
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=PodSecurityPolicyTemplateProjectBinding
2021/01/05 17:45:23 [INFO] Watching metadata for apps/v1, Kind=ControllerRevision
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=KontainerDriver
2021/01/05 17:45:23 [INFO] Watching metadata for monitoring.coreos.com/v1, Kind=PrometheusRule
2021/01/05 17:45:23 [INFO] Watching metadata for node.k8s.io/v1beta1, Kind=RuntimeClass
2021/01/05 17:45:23 [INFO] Watching metadata for storage.k8s.io/v1, Kind=VolumeAttachment
2021/01/05 17:45:23 [INFO] Watching metadata for apiregistration.k8s.io/v1, Kind=APIService
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ComposeConfig
2021/01/05 17:45:23 [INFO] Watching metadata for crd.projectcalico.org/v1, Kind=GlobalNetworkSet
2021/01/05 17:45:23 [INFO] Watching metadata for /v1, Kind=PodTemplate
2021/01/05 17:45:23 [INFO] Watching metadata for crd.projectcalico.org/v1, Kind=BlockAffinity
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ClusterRoleTemplateBinding
2021/01/05 17:45:23 [INFO] Watching metadata for policy/v1beta1, Kind=PodSecurityPolicy
2021/01/05 17:45:23 [INFO] Watching metadata for monitoring.coreos.com/v1, Kind=PodMonitor
2021/01/05 17:45:23 [INFO] Watching metadata for crd.projectcalico.org/v1, Kind=FelixConfiguration
2021/01/05 17:45:23 [INFO] Watching metadata for project.cattle.io/v3, Kind=PipelineExecution
2021/01/05 17:45:23 [INFO] Watching metadata for crd.projectcalico.org/v1, Kind=BGPPeer
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ProjectCatalog
2021/01/05 17:45:23 [INFO] Watching metadata for monitoring.coreos.com/v1, Kind=ThanosRuler
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=RkeAddon
2021/01/05 17:45:23 [INFO] Watching metadata for extensions/v1beta1, Kind=ReplicaSet
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=PodSecurityPolicyTemplate
2021/01/05 17:45:23 [INFO] Watching metadata for /v1, Kind=Secret
2021/01/05 17:45:23 [INFO] Watching metadata for rbac.authorization.k8s.io/v1, Kind=RoleBinding
2021/01/05 17:45:23 [INFO] Watching metadata for /v1, Kind=Endpoints
2021/01/05 17:45:23 [INFO] Watching metadata for project.cattle.io/v3, Kind=SourceCodeRepository
2021/01/05 17:45:23 [INFO] Watching metadata for crd.projectcalico.org/v1, Kind=ClusterInformation
2021/01/05 17:45:23 [INFO] Watching metadata for crd.projectcalico.org/v1, Kind=GlobalNetworkPolicy
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=TemplateVersion
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=Setting
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=Preference
2021/01/05 17:45:23 [INFO] Watching metadata for crd.projectcalico.org/v1, Kind=HostEndpoint
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=Catalog
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=EtcdBackup
2021/01/05 17:45:23 [INFO] Watching metadata for apiextensions.k8s.io/v1beta1, Kind=CustomResourceDefinition
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ClusterLogging
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=Token
2021/01/05 17:45:23 [INFO] Watching metadata for certificates.k8s.io/v1beta1, Kind=CertificateSigningRequest
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=User
2021/01/05 17:45:23 [INFO] Watching metadata for rbac.authorization.k8s.io/v1, Kind=ClusterRole
2021/01/05 17:45:23 [INFO] Watching metadata for crd.projectcalico.org/v1, Kind=IPAMHandle
2021/01/05 17:45:23 [INFO] Watching metadata for project.cattle.io/v3, Kind=App
2021/01/05 17:45:23 [INFO] Watching metadata for crd.projectcalico.org/v1, Kind=IPPool
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=CisConfig
2021/01/05 17:45:23 [INFO] Watching metadata for coordination.k8s.io/v1, Kind=Lease
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=NodeDriver
2021/01/05 17:45:23 [INFO] Watching metadata for /v1, Kind=Pod
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ClusterAlert
2021/01/05 17:45:23 [INFO] Watching metadata for batch/v1beta1, Kind=CronJob
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=MultiClusterApp
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ProjectMonitorGraph
2021/01/05 17:45:23 [INFO] Watching metadata for apps/v1, Kind=DaemonSet
2021/01/05 17:45:23 [INFO] Watching metadata for crd.projectcalico.org/v1, Kind=NetworkPolicy
2021/01/05 17:45:23 [INFO] Watching metadata for project.cattle.io/v3, Kind=AppRevision
2021/01/05 17:45:23 [INFO] Watching metadata for /v1, Kind=Namespace
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=GlobalDnsProvider
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=ClusterCatalog
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=GlobalRole
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=CisBenchmarkVersion
2021/01/05 17:45:23 [INFO] Watching metadata for management.cattle.io/v3, Kind=GlobalRoleBinding
2021/01/05 17:45:23 [INFO] Watching metadata for project.cattle.io/v3, Kind=SourceCodeCredential
2021/01/05 17:45:23 [INFO] Watching metadata for monitoring.coreos.com/v1, Kind=Alertmanager
2021/01/05 17:46:54 [INFO] Stopping cluster agent for c-kclzj
2021/01/05 17:46:56 [INFO] Stopping cluster agent for c-bks4b
2021/01/05 17:46:58 [INFO] Stopping cluster agent for c-b6pgw
2021/01/05 17:46:59 [INFO] Stopping cluster agent for c-slvlr
2021/01/05 17:47:40 [INFO] Shutting down CatalogTemplateVersionController controller
2021/01/05 17:47:40 [INFO] Shutting down rbac.authorization.k8s.io/v1, Kind=ClusterRole workers
2021/01/05 17:47:40 [INFO] Shutting down UserAttributeController controller
2021/01/05 17:47:40 [INFO] Shutting down NodeTemplateController controller
2021/01/05 17:47:40 [INFO] Shutting down GroupMemberController controller
2021/01/05 17:47:40 [INFO] Shutting down SettingController controller
2021/01/05 17:47:40 [INFO] Shutting down SecretController controller
2021/01/05 17:47:40 [INFO] Shutting down RKEK8sSystemImageController controller
2021/01/05 17:47:40 [INFO] Shutting down SecretController controller
2021/01/05 17:47:40 [INFO] Shutting down MultiClusterAppRevisionController controller
2021/01/05 17:47:40 [INFO] Shutting down ClusterTemplateRevisionController controller
2021/01/05 17:47:40 [INFO] Shutting down NodeDriverController controller
2021/01/05 17:47:40 [INFO] Shutting down GlobalRoleBindingController controller
2021/01/05 17:47:40 [INFO] Shutting down UserController controller
2021/01/05 17:47:40 [INFO] Shutting down ProjectCatalogController controller
2021/01/05 17:47:40 [INFO] Shutting down KontainerDriverController controller
2021/01/05 17:47:40 [INFO] Shutting down ClusterCatalogController controller
2021/01/05 17:47:40 [INFO] Shutting down TokenController controller
2021/01/05 17:47:40 [INFO] Shutting down ClusterController controller
2021/01/05 17:47:40 [INFO] Shutting down MultiClusterAppController controller
I0105 17:47:40.354779       8 trace.go:116] Trace[446554467]: "Reflector ListAndWatch" name:github.com/rancher/steve/pkg/clustercache/controller.go:164 (started: 2021-01-05 17:45:23.339854626 +0000 UTC m=+3.808563428) (total time: 2m17.014851119s):
Trace[446554467]: [2m17.014851119s] [2m17.014851119s] END
2021/01/05 17:47:40 [INFO] Shutting down EndpointsController controller
2021/01/05 17:47:40 [INFO] Shutting down RKEAddonController controller
2021/01/05 17:47:40 [INFO] Shutting down CatalogController controller
2021/01/05 17:47:40 [INFO] Shutting down GroupController controller
2021/01/05 17:47:40 [INFO] Shutting down CisConfigController controller
2021/01/05 17:47:40 [INFO] Shutting down PodSecurityPolicyTemplateController controller
2021/01/05 17:47:40 [INFO] Shutting down ProjectRoleTemplateBindingController controller
2021/01/05 17:47:40 [INFO] Shutting down ProjectLoggingController controller
2021/01/05 17:47:40 [INFO] Shutting down CatalogTemplateController controller
2021/01/05 17:47:40 [INFO] Shutting down GlobalDNSController controller
2021/01/05 17:47:40 [INFO] Shutting down ClusterRegistrationTokenController controller
2021/01/05 17:47:40 [INFO] Shutting down RKEK8sServiceOptionController controller
2021/01/05 17:47:40 [INFO] Shutting down CisConfigController controller
2021/01/05 17:47:40 [INFO] Shutting down CisBenchmarkVersionController controller
2021/01/05 17:47:40 [INFO] Shutting down PipelineController controller
2021/01/05 17:47:40 [INFO] Shutting down CisBenchmarkVersionController controller
2021/01/05 17:47:40 [INFO] Shutting down AppController controller
2021/01/05 17:47:40 [INFO] Shutting down SourceCodeCredentialController controller
2021/01/05 17:47:40 [INFO] Shutting down RoleTemplateController controller
2021/01/05 17:47:40 [INFO] Shutting down NodeController controller
2021/01/05 17:47:40 [INFO] Shutting down SecretController controller
2021/01/05 17:47:40 [INFO] Shutting down AuthConfigController controller
2021/01/05 17:47:40 [INFO] Shutting down NamespaceController controller
2021/01/05 17:47:40 [INFO] Shutting down DynamicSchemaController controller
2021/01/05 17:47:40 [INFO] Shutting down apiregistration.k8s.io/v1, Kind=APIService workers
2021/01/05 17:47:40 [INFO] Shutting down ClusterRoleBindingController controller
2021/01/05 17:47:40 [INFO] Shutting down PipelineExecutionController controller
2021/01/05 17:47:40 [INFO] Shutting down ClusterRoleTemplateBindingController controller
2021/01/05 17:47:40 [INFO] Shutting down rbac.authorization.k8s.io/v1, Kind=Role workers
2021/01/05 17:47:40 [INFO] Shutting down ClusterRoleController controller
2021/01/05 17:47:40 [INFO] Shutting down apiextensions.k8s.io/v1beta1, Kind=CustomResourceDefinition workers
2021/01/05 17:47:40 [FATAL] leaderelection lost for cattle-controllers
2021/01/05 17:47:40 [INFO] Shutting down NodePoolController controller
2021/01/05 17:47:40 [INFO] Shutting down ProjectController controller
2021/01/05 17:47:40 [INFO] Shutting down rbac.authorization.k8s.io/v1, Kind=Role workers
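
To see how long each Rancher run survives before it exits, a small script can pull the start line and the first FATAL line out of a log like the one above. This is only a minimal sketch: the function name `summarize_run` is made up for illustration, and it assumes the `YYYY/MM/DD HH:MM:SS [LEVEL] message` layout seen in this log. The sample is trimmed to three lines copied from this thread.

```python
import re
from datetime import datetime

# Matches Rancher-style lines such as "2021/01/05 17:47:40 [FATAL] message".
LINE_RE = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(\w+)\] (.*)$")


def summarize_run(log_text):
    """Return (start, fatal, message) for one Rancher container run.

    start is the timestamp of the "is starting" line, fatal is the
    timestamp of the first FATAL line, message is that line's text.
    Any value that is not found stays None.
    """
    start = fatal = message = None
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # klog-style lines (I0105 ...) and trace lines are skipped
        ts = datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
        level, text = m.group(2), m.group(3)
        if start is None and "is starting" in text:
            start = ts
        if level == "FATAL" and fatal is None:
            fatal, message = ts, text
    return start, fatal, message


# Three lines copied from the log in this thread.
sample = """\
2021/01/05 17:45:19 [INFO] Rancher version v2.4.5 (c642f1caa) is starting
2021/01/05 17:47:40 [INFO] Shutting down ClusterRoleController controller
2021/01/05 17:47:40 [FATAL] leaderelection lost for cattle-controllers
"""

start, fatal, message = summarize_run(sample)
uptime = fatal - start
print(f"{message!r} after {uptime}")
```

Running it over the output of several restarts would show whether the container always dies with the same FATAL message and after roughly the same uptime.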

Please share more information on the setup:

  • On what cluster is Rancher running (RKE/K3S/other, what k8s version)?
  • Does it happen every day around the same time, or at different/random times?
  • What do the k8s components log at the time it happens (especially etcd/kube-apiserver/kubelet)?
  • What are the specs of the machines of the cluster Rancher is running on, and how many resources are there in the cluster (apps/pods/secrets/configmaps etc.)?

Hi there! Thanks for the quick reply. Here it goes:

  • On what cluster is Rancher running (RKE/K3S/other, what k8s version)
    k8s 1.15 - RKE latest

  • Does it happen every day around the same time, or at different/random times?
    Every day, at random times

  • What do the k8s components log at the time it happens (especially etcd/kube-apiserver/kubelet)
    Etcd seems normal, as usual.
    Kubelet logs look like this:
    E0105 21:28:06.392920 2625 pod_workers.go:190] Error syncing pod 00a5ac0d-9308-4e64-99dd-61685f2a0fd4 ("cattle-cluster-agent-666cbff5bc-49pxn_cattle-system(00a5ac0d-9308-4e64-99dd-61685f2a0fd4)"), skipping: failed to "StartContainer" for "cluster-register" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=cluster-register pod=cattle-cluster-agent-666cbff5bc-49pxn_cattle-system(00a5ac0d-9308-4e64-99dd-61685f2a0fd4)"
    E0105 21:28:09.393076 2625 pod_workers.go:190] Error syncing pod 9da4512f-5b88-4ddc-9db5-9a5f3b3cdcc7 ("cattle-node-agent-qzkpw_cattle-system(9da4512f-5b88-4ddc-9db5-9a5f3b3cdcc7)"), skipping: failed to "StartContainer" for "agent" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=agent pod=cattle-node-agent-qzkpw_cattle-system(9da4512f-5b88-4ddc-9db5-9a5f3b3cdcc7)"
    W0105 21:28:13.223144 2625 setters.go:144] replacing cloudprovider-reported hostname of ip-xx-xx-xx-xx.system.local with overridden hostname of xx.xx.xx.xx
    E0105 21:28:16.392972 2625 pod_workers.go:190] Error syncing pod d0770b34-90de-4fdc-9376-b337e193a7ee ("rancher-8548b55b9f-frf4l_cattle-system(d0770b34-90de-4fdc-9376-b337e193a7ee)"), skipping: failed to "StartContainer" for "rancher" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=rancher pod=rancher-8548b55b9f-frf4l_cattle-system(d0770b34-90de-4fdc-9376-b337e193a7ee)"
    E0105 21:28:19.393018 2625 pod_workers.go:190] Error syncing pod 00a5ac0d-9308-4e64-99dd-61685f2a0fd4 ("cattle-cluster-agent-666cbff5bc-49pxn_cattle-system(00a5ac0d-9308-4e64-99dd-61685f2a0fd4)"), skipping: failed to "StartContainer" for "cluster-register" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=cluster-register pod=cattle-cluster-agent-666cbff5bc-49pxn_cattle-system(00a5ac0d-9308-4e64-99dd-61685f2a0fd4)"
    W0105 21:28:23.233435 2625 setters.go:144] replacing cloudprovider-reported hostname of ip-xx-xx-xx-xx.system.local with overridden hostname of xx.xx.xx.xx
    E0105 21:28:24.392980 2625 pod_workers.go:190] Error syncing pod 9da4512f-5b88-4ddc-9db5-9a5f3b3cdcc7 ("cattle-node-agent-qzkpw_cattle-system(9da4512f-5b88-4ddc-9db5-9a5f3b3cdcc7)"), skipping: failed to "StartContainer" for "agent" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=agent pod=cattle-node-agent-qzkpw_cattle-system(9da4512f-5b88-4ddc-9db5-9a5f3b3cdcc7)"
    E0105 21:28:27.393047 2625 pod_workers.go:190] Error syncing pod d0770b34-90de-4fdc-9376-b337e193a7ee ("rancher-8548b55b9f-frf4l_cattle-system(d0770b34-90de-4fdc-9376-b337e193a7ee)"), skipping: failed to "StartContainer" for "rancher" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=rancher pod=rancher-8548b55b9f-frf4l_cattle-system(d0770b34-90de-4fdc-9376-b337e193a7ee)"

kube-apiserver logs look like this:
I0105 21:24:34.922590 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1-metrics-k8s-io
I0105 21:25:34.925029 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1-metrics-k8s-io
I0105 21:26:34.927044 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1-metrics-k8s-io
E0105 21:27:03.830186 1 writers.go:163] apiserver was unable to write a JSON response: write tcp xx.xx.xx.xx:6443->xx.xx.xx.xx:4435: write: broken pipe
E0105 21:27:03.830219 1 status.go:71] apiserver received an error that is not an metav1.Status: &net.OpError{Op:"write", Net:"tcp", Source:(*net.TCPAddr)(0xc0e9965d70), Addr:(*net.TCPAddr)(0xc0e9965da0), Err:(*os.SyscallError)(0xc142ef0800)}
I0105 21:27:34.929454 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1-metrics-k8s-io
I0105 21:28:34.932080 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1-metrics-k8s-io
I0105 21:29:34.934364 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1-metrics-k8s-io
I0105 21:30:34.936928 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1-metrics-k8s-io
I0105 21:31:34.939334 1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1-metrics-k8s-io

  • These logs were captured exactly while the issue was happening. Results may vary depending on the requests/attempts made to get it working.

  • What are the specs of the machines of the cluster Rancher is running on and how many resources are there in the cluster (apps/pods/secrets/configmaps etc)
    On that cluster (by the way, this is an intermediate cluster used to access the k8s API, which means the server entry in the kubeconfig is this cluster's endpoint) we have 3 nodes with 16 vCPU and 64 GB RAM, running only essential services:
    k8s_rancher_rancher-8548b55b9f-frf4l_cattle-system_d0770b34-90de-4fdc-9376-b337e193a7ee_62
    k8s_grafana-proxy_grafana-cluster-monitoring-75c8bd976d-mwt7m_cattle-prometheus_c8e373f9-e70b-4773-9340-3bbf3ee23bf2_12
    k8s_grafana_grafana-cluster-monitoring-75c8bd976d-mwt7m_cattle-prometheus_c8e373f9-e70b-4773-9340-3bbf3ee23bf2_11
    k8s_default-http-backend_default-http-backend-5bcc9fd598-hw4xw_ingress-nginx_050ef89f-e1da-478e-b1d9-d2cb3e69e0a8_24
    k8s_coredns_coredns-799dffd9c4-tqj95_kube-system_e55322cf-91cf-426a-91cc-9bf0f0dda64e_74
    k8s_POD_coredns-799dffd9c4-tqj95_kube-system_e55322cf-91cf-426a-91cc-9bf0f0dda64e_31446
    k8s_tiller_tiller-deploy-54f7455d59-68ttz_kube-system_137c0368-740f-4d77-947c-ade404aaf6b0_12
    k8s_POD_cattle-cluster-agent-666cbff5bc-49pxn_cattle-system_00a5ac0d-9308-4e64-99dd-61685f2a0fd4_31029
    k8s_POD_tiller-deploy-54f7455d59-68ttz_kube-system_137c0368-740f-4d77-947c-ade404aaf6b0_31628
    k8s_POD_rancher-8548b55b9f-frf4l_cattle-system_d0770b34-90de-4fdc-9376-b337e193a7ee_2
    k8s_POD_grafana-cluster-monitoring-75c8bd976d-mwt7m_cattle-prometheus_c8e373f9-e70b-4773-9340-3bbf3ee23bf2_32154
    k8s_POD_default-http-backend-5bcc9fd598-hw4xw_ingress-nginx_050ef89f-e1da-478e-b1d9-d2cb3e69e0a8_17856
    k8s_kube-flannel_canal-hdvfd_kube-system_8759c4ed-0372-4295-8a3a-4204d4ce5bb1_651
    k8s_calico-node_canal-hdvfd_kube-system_8759c4ed-0372-4295-8a3a-4204d4ce5bb1_1056
    k8s_exporter-node_exporter-node-cluster-monitoring-hj4rl_cattle-prometheus_731f9b61-8eb0-4d9c-8b6b-f039a548352e_25
    k8s_POD_exporter-node-cluster-monitoring-hj4rl_cattle-prometheus_731f9b61-8eb0-4d9c-8b6b-f039a548352e_14
    k8s_POD_cattle-node-agent-qzkpw_cattle-system_9da4512f-5b88-4ddc-9db5-9a5f3b3cdcc7_14
    k8s_nginx-ingress-controller_nginx-ingress-controller-9h5ds_ingress-nginx_11da51d0-f071-4933-8083-a06d0e6dac81_1295
    k8s_POD_canal-hdvfd_kube-system_8759c4ed-0372-4295-8a3a-4204d4ce5bb1_17
    k8s_POD_nginx-ingress-controller-9h5ds_ingress-nginx_11da51d0-f071-4933-8083-a06d0e6dac81_19
    etcd-rolling-snapshots
    kube-proxy
    kubelet
    kube-scheduler
    kube-controller-manager
    kube-apiserver
    etcd
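
The container names in that listing already carry some restart information. As a rough sketch, assuming the usual kubelet naming for Docker containers (`k8s_<container>_<pod>_<namespace>_<pod-uid>_<attempt>`, where the last field is the attempt counter), the names can be split and ranked by that counter. The `ContainerName` type and `parse_name` helper are made up for illustration, and the sample uses a few names copied from the listing.

```python
from typing import NamedTuple, Optional


class ContainerName(NamedTuple):
    container: str
    pod: str
    namespace: str
    uid: str
    attempt: int


def parse_name(name: str) -> Optional[ContainerName]:
    """Split a kubelet-managed Docker container name into its fields.

    Returns None for names that do not follow the assumed
    k8s_<container>_<pod>_<namespace>_<uid>_<attempt> layout,
    such as the RKE system containers (kubelet, etcd, ...).
    """
    parts = name.strip().split("_")
    if len(parts) != 6 or parts[0] != "k8s":
        return None
    _, container, pod, namespace, uid, attempt = parts
    if not attempt.isdigit():
        return None
    return ContainerName(container, pod, namespace, uid, int(attempt))


# A few names copied from the listing in this thread.
names = [
    "k8s_rancher_rancher-8548b55b9f-frf4l_cattle-system_d0770b34-90de-4fdc-9376-b337e193a7ee_62",
    "k8s_POD_coredns-799dffd9c4-tqj95_kube-system_e55322cf-91cf-426a-91cc-9bf0f0dda64e_31446",
    "k8s_POD_rancher-8548b55b9f-frf4l_cattle-system_d0770b34-90de-4fdc-9376-b337e193a7ee_2",
    "k8s_nginx-ingress-controller_nginx-ingress-controller-9h5ds_ingress-nginx_11da51d0-f071-4933-8083-a06d0e6dac81_1295",
    "kubelet",
]

parsed = [c for c in map(parse_name, names) if c is not None]
ranked = sorted(parsed, key=lambda c: c.attempt, reverse=True)
for c in ranked:
    print(f"{c.attempt:>6}  {c.namespace}/{c.pod}  ({c.container})")
```

If that naming assumption holds, the very high counters on some `POD` (sandbox) containers might be worth a closer look alongside the Rancher container itself.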

Can you check the memory usage of the Rancher and cattle-cluster-agent pods? How many resources are in the cluster? And do you have any info on what happened 3 weeks ago that could have caused the issue (maybe onboarding new clusters/projects, which bumped the resource count)?