Shadow
February 28, 2024, 4:53am
1
Hi everyone, I am using Kubernetes version 1.28.6. I want to join my cluster to Rancher, but I get the issue below. My Rancher is the latest version.
Please help me check. Thank you!
tpapp
February 28, 2024, 9:47am
2
Hi,
I can only suggest a workaround, not a solution.
Please check whether there is data in the secrets named like “custom--machine-plan” in the fleet-default namespace.
If there is no data, try registering another node with all roles (--etcd --controlplane --worker).
This will generate another secret, hopefully containing the data.
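As a rough sketch of that check (not an official Rancher tool), the Python below lists the secrets in the fleet-default namespace through kubectl and reports any whose name ends in -machine-plan and whose `data` field is empty or missing. The function names are hypothetical, and treating an empty `data` map as "no data" is my assumption about what to look for.

```python
import json
import subprocess


def empty_plan_secrets(secret_list: dict) -> list[str]:
    """Return names of *-machine-plan secrets that carry no data.

    `secret_list` is the parsed output of `kubectl get secrets -o json`.
    """
    empty = []
    for item in secret_list.get("items", []):
        name = item.get("metadata", {}).get("name", "")
        # A missing, null, or empty `data` map all count as "no data" here.
        if name.endswith("-machine-plan") and not item.get("data"):
            empty.append(name)
    return empty


def fetch_secrets(namespace: str = "fleet-default") -> dict:
    """Assumes kubectl is on PATH and points at the Rancher (local) cluster."""
    result = subprocess.run(
        ["kubectl", "-n", namespace, "get", "secrets", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Calling `empty_plan_secrets(fetch_secrets())` against the Rancher cluster should return the plan secrets that were never populated; an empty list means every plan secret has some data.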
Similar issue here:
opened 02:48PM - 09 Jan 24 UTC
Labels: [zube]: To Triage, kind/bug, area/fleet, team/fleet
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Current Behavior
When trying to register a new node with a new downstream RKE2 cluster in Rancher 2.7.9 (also 2.7.5), we see the node's plan Secret is never populated, so the `rancher-system-agent` endlessly polls for a plan.
If we re-deploy the `fleet-agent` Deployment prior to creating the new downstream cluster definition in Rancher we can occasionally register nodes.
We have to re-deploy `fleet-agent` each time we need to create a new cluster, though this does not consistently work around the issue.
* re-deploy the `fleet-agent` Deployment on the Rancher cluster (`k -n cattle-fleet-local-system rollout restart deployment fleet-agent`)
* create new downstream cluster definition
* register node(s) to cluster
If the registration fails or we need to re-create the cluster, we wipe the nodes, delete the cluster from Rancher, and repeat the steps above.
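The restart step of that loop could be scripted roughly as follows. This is a sketch under assumptions: `k` in the command is taken to be a shell alias for `kubectl`, the `rollout status` wait and its timeout are additions that are not part of the original steps, and the function names are hypothetical.

```python
import subprocess


def restart_commands(namespace: str = "cattle-fleet-local-system",
                     deployment: str = "fleet-agent",
                     timeout: str = "120s") -> list[list[str]]:
    """Build the kubectl invocations for restarting the agent and waiting for it."""
    base = ["kubectl", "-n", namespace, "rollout"]
    return [
        # The restart command from the steps above, with `k` expanded to kubectl.
        base + ["restart", "deployment", deployment],
        # Assumption: wait for the rollout to finish before creating the cluster.
        base + ["status", "deployment", deployment, f"--timeout={timeout}"],
    ]


def restart_fleet_agent() -> None:
    """Run the commands against the Rancher cluster; raises if kubectl fails."""
    for argv in restart_commands():
        subprocess.run(argv, check=True)
```

Creating the downstream cluster definition and registering the node(s) would still follow as separate steps after `restart_fleet_agent()` returns.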
From the `fleet-controller` logs when creating the downstream cluster named "test":
```
2024-01-09T14:14:27.714641430Z time="2024-01-09T14:14:27Z" level=info msg="While calculating status.ResourceKey, error running helm template for bundle mcc-test-managed-system-upgrade-controller with target options from : chart requires kubeVersion: >= 1.23.0-0 which is incompatible with Kubernetes v1.20.0"
```
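To make the mismatch in that log line concrete: the chart constraint is `>= 1.23.0-0`, but the version reported by the template step is v1.20.0, while the Environment section below lists v1.26.11+rke2r1. The sketch below pulls both versions out of such a line and compares them. It only compares the numeric parts and ignores pre-release suffixes, so it is not a full semver implementation, and the names are hypothetical.

```python
import re

# Matches the message format seen in the fleet-controller log line above.
LOG_RE = re.compile(
    r"chart requires kubeVersion: >= (?P<required>[\d.]+)\S* "
    r"which is incompatible with Kubernetes v(?P<seen>[\d.]+)"
)


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.23.0' into (1, 23, 0); numeric parts only."""
    return tuple(int(part) for part in text.strip(".").split("."))


def kube_version_mismatch(line: str):
    """Return (required, seen, satisfied) or None if the line does not match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    required = parse_version(match.group("required"))
    seen = parse_version(match.group("seen"))
    return required, seen, seen >= required
```

With this comparison, the version in the log fails the constraint, whereas `parse_version("1.26.11") >= (1, 23, 0)` holds, which suggests the template step is not evaluating the chart against the downstream cluster's actual version.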
The workaround of restarting the `fleet-agent` is not consistent; sometimes repeated manual loops of create cluster, register, and delete cluster work.
Registration of nodes to k3s clusters _**appears**_ to work, though I've not tested that as much.
### Expected Behavior
We can register nodes to newly created downstream clusters.
### Steps To Reproduce
* create new rke2 cluster
* run registration command on cluster bootstrap node
### Environment
```markdown
- Architecture: x86_64
- Fleet Version: 1.7.1 and 1.8.1
- Cluster:
  - Provider: rke2
  - Options:
  - Kubernetes Version: v1.26.11+rke2r1
```
### Logs
Logs from `fleet-agent` after a restart followed by a failed node registration:
```
I0109 14:34:16.884697 1 leaderelection.go:248] attempting to acquire leader lease cattle-fleet-local-system/fleet-agent-lock...
2024-01-09T14:34:20.761215643Z I0109 14:34:20.760567 1 leaderelection.go:258] successfully acquired lease cattle-fleet-local-system/fleet-agent-lock
2024-01-09T14:34:21.514842587Z time="2024-01-09T14:34:21Z" level=info msg="Starting /v1, Kind=ServiceAccount controller"
2024-01-09T14:34:21.515239711Z time="2024-01-09T14:34:21Z" level=info msg="Starting /v1, Kind=Secret controller"
2024-01-09T14:34:21.515651076Z time="2024-01-09T14:34:21Z" level=info msg="Starting /v1, Kind=Node controller"
2024-01-09T14:34:21.515921289Z time="2024-01-09T14:34:21Z" level=info msg="Starting /v1, Kind=ConfigMap controller"
2024-01-09T14:34:22.245467409Z E0109 14:34:22.245355 1 memcache.go:206] couldn't get resource list for management.cattle.io/v3:
time="2024-01-09T14:34:22Z" level=info msg="Starting fleet.cattle.io/v1alpha1, Kind=BundleDeployment controller"
time="2024-01-09T14:34:22Z" level=info msg="Deploying bundle cluster-fleet-local-local-1a3d67d0a899/fleet-agent-local"
time="2024-01-09T14:34:22Z" level=info msg="getting history for release fleet-agent-local"
time="2024-01-09T14:34:22Z" level=info msg="Deploying bundle cluster-fleet-local-local-1a3d67d0a899/fleet-agent-local"
time="2024-01-09T14:34:23Z" level=info msg="Deleting orphan bundle ID rke2, release kube-system/rke2-canal"
time="2024-01-09T14:34:24Z" level=info msg="Deploying bundle cluster-fleet-local-local-1a3d67d0a899/fleet-agent-local"
time="2024-01-09T14:34:25Z" level=info msg="Deploying bundle cluster-fleet-local-local-1a3d67d0a899/fleet-agent-local"
```
Logs from `fleet-agent` after a restart, create new cluster and successful registration:
```
I0109 14:37:40.958163 1 leaderelection.go:248] attempting to acquire leader lease cattle-fleet-local-system/fleet-agent-lock...
2024-01-09T14:37:44.767848536Z I0109 14:37:44.767654 1 leaderelection.go:258] successfully acquired lease cattle-fleet-local-system/fleet-agent-lock
2024-01-09T14:37:45.799901278Z time="2024-01-09T14:37:45Z" level=info msg="Starting /v1, Kind=ConfigMap controller"
2024-01-09T14:37:45.799938559Z time="2024-01-09T14:37:45Z" level=info msg="Starting /v1, Kind=Secret controller"
2024-01-09T14:37:45.799944609Z time="2024-01-09T14:37:45Z" level=info msg="Starting /v1, Kind=Node controller"
2024-01-09T14:37:45.799949489Z time="2024-01-09T14:37:45Z" level=info msg="Starting /v1, Kind=ServiceAccount controller"
E0109 14:37:45.966607 1 memcache.go:206] couldn't get resource list for management.cattle.io/v3:
2024-01-09T14:37:45.991817525Z time="2024-01-09T14:37:45Z" level=info msg="Starting fleet.cattle.io/v1alpha1, Kind=BundleDeployment controller"
2024-01-09T14:37:45.992046547Z time="2024-01-09T14:37:45Z" level=info msg="Deploying bundle cluster-fleet-local-local-1a3d67d0a899/fleet-agent-local"
2024-01-09T14:37:46.002690980Z time="2024-01-09T14:37:46Z" level=info msg="getting history for release fleet-agent-local"
2024-01-09T14:37:46.255440243Z time="2024-01-09T14:37:46Z" level=info msg="Deploying bundle cluster-fleet-local-local-1a3d67d0a899/fleet-agent-local"
2024-01-09T14:37:47.041131051Z time="2024-01-09T14:37:47Z" level=info msg="Deleting orphan bundle ID rke2, release kube-system/rke2-canal"
2024-01-09T14:37:48.276516222Z time="2024-01-09T14:37:48Z" level=info msg="Deploying bundle cluster-fleet-local-local-1a3d67d0a899/fleet-agent-local"
2024-01-09T14:37:48.527326573Z time="2024-01-09T14:37:48Z" level=info msg="Deploying bundle cluster-fleet-local-local-1a3d67d0a899/fleet-agent-local"
```
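By my reading, the two `fleet-agent` excerpts above carry the same `msg=` entries, only with the core controllers starting in a different order. A small sketch for checking that on longer logs: it extracts the `msg="..."` payloads, ignores timestamps and the klog-style lines, and reports messages that occur more often in one log than in the other. The function names are hypothetical.

```python
import re
from collections import Counter

# Captures the payload of logrus-style msg="..." fields.
MSG_RE = re.compile(r'msg="([^"]*)"')


def message_counts(log_text: str) -> Counter:
    """Count each msg payload, regardless of timestamp or order."""
    return Counter(MSG_RE.findall(log_text))


def diff_messages(failed_log: str, ok_log: str) -> dict:
    """Messages that appear more often in one log than in the other."""
    failed, ok = message_counts(failed_log), message_counts(ok_log)
    return {"only_failed": failed - ok, "only_ok": ok - failed}
```

If both entries of the result come back empty, the agent logs alone do not distinguish the failed registration from the successful one, and the `fleet-controller` logs are the more likely place to look.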
### Anything else?
Ref https://github.com/rancher/rancher/issues/43901 specifically https://github.com/rancher/rancher/issues/43901#issuecomment-1881021356