Here is the scenario:
- We have backups enabled for Rancher 2.6.8.
- We created a new EKS 1.23 cluster to host Rancher 2.7.1.
- We followed the instructions in "Migrating Rancher to a New Cluster" (Rancher Manager documentation).
- Everything seems to be fine, except that the cattle-cluster-agent pods on the downstream clusters keep getting recreated every few seconds, and the agent image alternates between 2.7.1 and 2.6.8.
- So somewhere something is still hanging on to 2.6.8
- Please note that we didn’t manually remove the downstream clusters from the original 2.6.8 Rancher server.
Please advise what needs to be done here.
Also, in the scenario above, all our downstream EKS clusters are on version 1.23.
This is what we are seeing:
Pulled | Pod cattle-cluster-agent-f6d465b9d-sg62s | Container image "rancher/rancher-agent:v2.6.8" already present on machine | Thu, Apr 13 2023 3:37:33 pm
Created | Pod cattle-cluster-agent-f6d465b9d-sg62s | Created container cluster-register | Thu, Apr 13 2023 3:37:33 pm
Started | Pod cattle-cluster-agent-f6d465b9d-sg62s | Started container cluster-register | Thu, Apr 13 2023 3:37:33 pm
Scheduled | Pod cattle-cluster-agent-f6d465b9d-sg62s | Successfully assigned cattle-system/cattle-cluster-agent-f6d465b9d-sg62s to ip-10-98-1-107.ec2.internal | Thu, Apr 13 2023 3:37:32 pm
SuccessfulCreate | ReplicaSet cattle-cluster-agent-f6d465b9d | (combined from similar events): Created pod: cattle-cluster-agent-f6d465b9d-sg62s | Thu, Apr 13 2023 3:37:32 pm
Killing | Pod cattle-cluster-agent-f6d465b9d-dzqph | Stopping container cluster-register
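One way to confirm the alternation from events like the ones above is to tally the agent image tags that appear in the event messages. The following is only a rough sketch: the function names are made up for illustration, the messages are assumed to have been copied out of the events view as plain strings, and the 2.7.1 line in the sample is hypothetical (only the 2.6.8 event was pasted above).

```python
import re
from collections import Counter

# Matches the image reference in "Container image ... already present" messages.
# Accepts straight or curly quotes, since pasted logs may contain either.
IMAGE_RE = re.compile(r'Container image ["“]([^"”]+)["”]')


def tally_agent_images(event_messages):
    """Count how often each container image appears in the event messages."""
    counts = Counter()
    for message in event_messages:
        match = IMAGE_RE.search(message)
        if match:
            counts[match.group(1)] += 1
    return counts


def is_flapping(counts):
    """Return True if more than one distinct image was observed."""
    return len(counts) > 1


sample = [
    'Container image "rancher/rancher-agent:v2.6.8" already present on machine',
    # Hypothetical line, added only to illustrate the alternation described above.
    'Container image "rancher/rancher-agent:v2.7.1" already present on machine',
    "Created container cluster-register",
]
counts = tally_agent_images(sample)
print(dict(counts), is_flapping(counts))
```

If more than one image shows up for the same deployment within a short window, something is repeatedly rewriting the agent deployment.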
Is the new cluster reachable under the old name? And was the old Rancher shut down before the new one was started?
Actually, the reason we saw this was that the old cluster still had Rancher agents running on it, meaning the downstream clusters were still effectively being managed by the old Rancher instance as well.
As soon as we got rid of the old Rancher instance, the problem went away.
That explains why we kept seeing 2.6.8 cluster agents come and go on the downstream clusters.
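For anyone hitting the same symptom, a quick check is to look at which image and which Rancher server the agent deployment on a downstream cluster currently points to. The sketch below is an assumption-laden illustration, not an official procedure: it assumes `kubectl` is on the PATH with access to the downstream cluster, that the deployment is named `cattle-cluster-agent` in the `cattle-system` namespace (as the events above suggest), and that the server URL is exposed through a `CATTLE_SERVER` environment variable on the agent container. The helper names are hypothetical.

```python
import json
import subprocess


def agent_settings(deployment):
    """Extract image and CATTLE_SERVER per container from a parsed Deployment object."""
    settings = []
    for container in deployment["spec"]["template"]["spec"]["containers"]:
        # Build a name -> value map of the container's plain environment variables.
        env = {e["name"]: e.get("value") for e in container.get("env", [])}
        settings.append({
            "container": container["name"],
            "image": container["image"],
            "server": env.get("CATTLE_SERVER"),  # None if the variable is absent
        })
    return settings


def fetch_agent_deployment(context=None):
    """Read the cattle-cluster-agent Deployment as JSON via kubectl."""
    cmd = ["kubectl"]
    if context:
        # Optional kubeconfig context for the downstream cluster.
        cmd += ["--context", context]
    cmd += ["-n", "cattle-system", "get", "deployment",
            "cattle-cluster-agent", "-o", "json"]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return json.loads(result.stdout)
```

Calling `agent_settings(fetch_agent_deployment())` a few times, a few seconds apart, should show whether the image or server value keeps changing. If it does, two Rancher instances are probably both still trying to manage the cluster, which matches what we found here.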