Rancher 2.7 Cluster agent is not connected

History:

  • Running Rancher 2.6 with 5 AWS clusters

  • November 2022, AWS complains that Kubernetes needs to be upgraded from 1.21

  • November 2022, Upgraded Rancher to 2.7 to allow for the Kubernetes upgrade

  • December 2022, Successfully upgraded clusters to version 1.22

  • December 2022, two clusters fail; I end up deleting them as I cannot get access

  • December 2022, cannot create a new cluster on AWS (fails on timeout, clusters seem fine in the AWS UI)

  • December 2022, successfully upgraded two clusters to 1.23; one fails to upgrade to 1.24

  • January 2023, clusters are now (after the Christmas vacation) not connected

Any guides on how to reconnect the clusters?
Any guides on how to stop Rancher from trying to downgrade from 1.24 to 1.23?


Tried setting up a new Rancher 2.7 installation:

  • Imported an existing EKS cluster
  • Created a new EKS cluster with default values

Nothing in Rancher 2.7 seems to work with AWS EKS?
Should I downgrade Rancher? Will it help?

Any help is welcome…

I have tried creating new EKS clusters with Rancher 2.7 (latest), 2.6.9 and 2.7-head - all with same result.

The underlying issue is “Cluster agent is not connected”

I guess that the issue is within Amazon EKS for nodes above version 1.21…

How do I make the cluster agent connect? How do I find the reason?

Hey there,

It’s probably due to the version of k8s on EKS you’re using.
According to the docs, Rancher 2.7 supports k8s on EKS from version 1.23 onwards. If you're running 1.22, it's not officially supported.
Rancher 2.6.9 is certified from k8s 1.20 to k8s 1.23 on EKS, so that would be a better place to start until you've updated your clusters to at least k8s 1.23.
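If you're unsure which version a given EKS cluster is actually on, something like this should confirm it (the cluster name and region are placeholders, and it assumes the AWS CLI is configured):

# Show the Kubernetes version of an EKS cluster (my-cluster / eu-west-1 are placeholders)
aws eks describe-cluster --name my-cluster --region eu-west-1 --query 'cluster.version' --output text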

How are you creating the clusters? Through Rancher, or in EKS and importing them into Rancher?

Do you have any additional logging available, perhaps from the agents on the clusters themselves? The agent logs may provide some additional insight (for example, authentication errors).

You should be able to check the logs with kubectl as long as you have access to the kubeconfig, or perhaps via the EKS console.
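For example, a quick look at the agent pods and their logs on the downstream cluster would be a good start (this sketch assumes the default cattle-system namespace and the standard agent labels):

# Check that the Rancher agent pods are running on the downstream cluster
kubectl -n cattle-system get pods

# Tail the cluster agent logs and look for connection or authentication errors
kubectl -n cattle-system logs -l app=cattle-cluster-agent --tail=100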

Thanks

All our EKS clusters are created with Rancher.

Manually updated all existing clusters to v1.24, but the existing clusters still had connection issues, and newly created clusters ran into connection issues as well.

Upgraded Rancher to 2.7-head and had success: new agents were deployed and are able to connect, and I was able to create new clusters.

I have created an EKS cluster through Rancher 2.7.

deployed kubernetesVersion: '1.25'

Still having the same issue. Is there any SSH access or something that I was supposed to set up?

Any luck with this issue?

I face the same issue when importing an AKS cluster (v1.24.10) on Rancher 2.7.3.

I am also running into similar issues when trying to start up a vSphere cluster (v1.25.9) on Rancher 2.7.3. It's possible I'm doing something wrong, but the symptoms are the same.

I'm having the same issues! Luckily we are just running staging stuff on Rancher. It's not production stable if you ask me. Did anyone figure out a potential solution for this?

Dear All,

I was stuck in the same situation and, after trying a lot of things, finally resolved the problem by using the same kubectl/Kubernetes version on both ends. Rancher was running with Kubernetes version 1.26.4 and I was using version 1.27.3. After downgrading to 1.26.4, the agent connected with no errors.
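If it helps to compare the two, kubectl reports both its own client version and the API server's version; a client more than one minor version ahead of the server is outside the supported skew (a general note, not something from the post above):

# Shows the kubectl client version and the Kubernetes API server version
kubectl version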

Edit the aws-auth ConfigMap in EKS and add the IAM user with EKS access to make this work, like below:
apiVersion: v1
data:
  mapRoles: |
    - groups:
        - system:bootstrappers
        - system:nodes
      rolearn: arn:aws:iam::xxxxxxxx:role/eksctl-mycluster-nodegroup-eksdem-NodeInstanceRole-KP5A9ZLNY7CC
      username: system:node:{{EC2PrivateDNSName}}
  mapUsers: |
    - userarn: arn:aws:iam::051542606790:user/eks
      username: eks
      groups:
        - system:masters
kind: ConfigMap
metadata:
  creationTimestamp: "2023-07-20T03:34:45Z"
  name: aws-auth
  namespace: kube-system
  resourceVersion: "916206"
  uid: 9b3d0a03-093f-4a65-b174-d866d8fab748
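If you need to make the same change, editing the ConfigMap in place is the simplest route; eksctl can also manage the mapping (the cluster name, region, and ARN below are placeholders):

# Edit the aws-auth ConfigMap on the EKS cluster directly
kubectl -n kube-system edit configmap aws-auth

# Or add the mapping with eksctl (placeholder cluster/region/ARN values)
eksctl create iamidentitymapping --cluster my-cluster --region eu-west-1 \
  --arn arn:aws:iam::111111111111:user/eks --username eks --group system:masters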

Have you tried something like below from the local management cluster?

  1. kubectl patch clusters.management.cattle.io <CLUSTERID> -p '{"status":{"agentImage":"dummy"}}' --type merge
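If it helps, the <CLUSTERID> for that command can be looked up on the Rancher management (local) cluster; the object names are the c-xxxxx style IDs:

# Run against the Rancher management (local) cluster kubeconfig
kubectl get clusters.management.cattle.io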