Kubectl get componentstatus fails for scheduler and controller-manager

Good afternoon,

I have imported some clusters and the Rancher Dashboard shows the following alerts:

Alert: Component controller-manager is unhealthy.

Alert: Component scheduler is unhealthy.

How can I fix this error?

The dashboard alerts come from kubectl get componentstatuses; if a component is reported as not healthy, the dashboard shows an alert. See https://github.com/rancher/rancher/issues/11496 for possible causes. Please also tell us which cluster(s) you are importing (and which tool was used to build them) and which Rancher version you are using.

My k8s clusters are imported clusters.

My Rancher version is rancher/rancher:v2.3.1

ALL CLUSTERS HAVE THE SAME ERRORS
+++++++++++++++++++++++++++++++++++

To be more exact, I sometimes have these errors and sometimes I don’t have them.

All the clusters that you can see are identical and all have the same problem:

[Screenshot: clusters_imported]

JMONTERO CLUSTER:
++++++++++++++++++++++
For example:

In the dashboard of the jmontero cluster, you can see that Controller Manager and Scheduler are unhealthy:

Nevertheless, the Controller Manager and Scheduler pods of the jmontero cluster are always running fine, as you can see in the following screenshot:

I don't understand.

THIS SMALL ERROR
+++++++++++++++++++++

However, I can manage the cluster well even if the Dashboard shows these errors.

I can scale pods up and down, delete pods, and take any action on my workloads.

The only thing is that it hurts the eye and for that simple and silly reason my bosses don’t let me put Rancher into production.

My boss tells me that this causes confusion for the operators and administrators of the k8s clusters.

Hi again,

As you can see, the Controller Manager and Scheduler components in the jmontero cluster are green now:

[Screenshot: sas]

But later they will appear in red.

What can I do to solve this behavior?

Running kubectl get componentstatuses -o yaml while the alerts are showing will help determine the underlying cause of the failed check. Logs from the kube-apiserver might help as well. Please also share which tool was used to create the clusters that you imported. Judging by the restart count of the component pods, they are restarted often. If this is not expected, you can check the events on the Deployment to see why they were killed or stopped.
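Since the alerts come and go, it may help to capture the failing checks at the moment they happen. The following is only a minimal sketch, assuming kubectl is on the PATH and already points at the affected cluster; the function names are made up for illustration. It reads the same resource as JSON and keeps only the components whose Healthy condition is not "True".

```python
import json
import subprocess


def unhealthy_components(status_list):
    """Return (name, message) pairs for components whose Healthy condition is not "True".

    status_list is the parsed JSON of `kubectl get componentstatuses -o json`.
    """
    failing = []
    for item in status_list.get("items", []):
        name = item.get("metadata", {}).get("name", "<unknown>")
        for cond in item.get("conditions", []):
            if cond.get("type") == "Healthy" and cond.get("status") != "True":
                failing.append((name, cond.get("message", "")))
    return failing


def fetch_component_statuses():
    """Run kubectl and parse its JSON output (assumes kubectl is configured)."""
    result = subprocess.run(
        ["kubectl", "get", "componentstatuses", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling unhealthy_components(fetch_component_statuses()) every minute or so and logging the result with a timestamp should show whether the failures line up with the component restarts.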


I have a similar issue with an imported OVH managed kubernetes cluster. Etcd, Controller Manager and Scheduler are unhealthy although the cluster is operational. Maybe I’m missing something or it is a matter of cluster configuration on OVH side.

kubectl get componentstatuses -o yaml
apiVersion: v1
items:
- apiVersion: v1
  conditions:
  - message: 'Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect:
      connection refused'
    status: "False"
    type: Healthy
  kind: ComponentStatus
  metadata:
    creationTimestamp: null
    name: controller-manager
    selfLink: /api/v1/componentstatuses/controller-manager
- apiVersion: v1
  conditions:
  - message: 'Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect:
      connection refused'
    status: "False"
    type: Healthy
  kind: ComponentStatus
  metadata:
    creationTimestamp: null
    name: scheduler
    selfLink: /api/v1/componentstatuses/scheduler
- apiVersion: v1
  conditions:
  - message: 'Get https://coastguard:23790/health: stream error: stream ID 1; INTERNAL_ERROR'
    status: "False"
    type: Healthy
  kind: ComponentStatus
  metadata:
    creationTimestamp: null
    name: etcd-0
    selfLink: /api/v1/componentstatuses/etcd-0
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
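The messages in this output show the health check being made against 127.0.0.1 on ports 10252 and 10251, so "connection refused" only says that nothing answered on those loopback ports from where the check ran. On a self-managed cluster where you can log in to the control plane node, a rough sketch like the one below could repeat the same request by hand. The URLs are copied from the output above; the helper names are hypothetical, and on a managed offering you normally cannot run this at all.

```python
import urllib.error
import urllib.request

# Endpoints copied from the componentstatuses messages above (assumption:
# this script runs on the same host that performs the health check).
HEALTH_ENDPOINTS = {
    "controller-manager": "http://127.0.0.1:10252/healthz",
    "scheduler": "http://127.0.0.1:10251/healthz",
}

# Ignore any proxy settings from the environment; the targets are loopback.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url, timeout=2.0):
    """Return (healthy, detail) for a single HTTP health endpoint."""
    try:
        with _OPENER.open(url, timeout=timeout) as resp:
            body = resp.read().decode(errors="replace").strip()
            return resp.status == 200, body
    except (urllib.error.URLError, OSError) as exc:
        # Covers connection refused, timeouts and non-2xx HTTP responses.
        return False, str(exc)


def probe_all(endpoints=HEALTH_ENDPOINTS):
    """Probe every endpoint and return {component: (healthy, detail)}."""
    return {name: probe(url) for name, url in endpoints.items()}
```

If probe_all() reports both components as unreachable while their pods are running, the components are probably not listening on those loopback ports of that host, which would match the output above.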

I am not yet very familiar with debugging this, so any hints are welcome.

Thanks.


I have the exact same issue with an OVH managed Kubernetes cluster.

How can I check the component pod status? I don’t see any matching namespace. Is it kube-system?

Regarding OVH, this is expected due to their configuration (see https://gitter.im/ovh/kubernetes?at=5d1f0431bf25f013e7c67c28).

@balzaczyy The resource is not namespaced, so the command shown above should give you the output (see https://kubernetes.io/docs/reference/kubectl/overview/#resource-types).

Hello,
Thanks. I can confirm that OVH replied that this is due to their configuration.

Is there any way for Rancher to suppress these alerts on managed clusters? (I think it is the same with other managed k8s providers.)

Thanks


Hi,
It’s still a common issue with OVH Managed Kubernetes. Unfortunately, disabling or deleting the alerts for the local cluster in Tools > Alerts has no effect.


I’ll try with integrated Rancher Monitoring (Prometheus).