Hello everyone,
I have Rancher v2.8.5 and 4 Oracle Linux Server 9.4 nodes, and I would like to create a Kubernetes cluster with 1 master and 3 workers.
Options selected in the Rancher UI:
- Kubernetes: v1.28.15+rke2r1
- Cloud Provider: Default - RKE2 Embedded
- System Services: CoreDNS, Metrics Server
At the step of executing the Registration Command on the nodes:
- etcd, Control Plane: for the master node
- firewall rules to open ports:
rule family="ipv4" source address="each worker IP" port port="10250" protocol="tcp" accept
rule family="ipv4" source address="each worker IP" port port="6443" protocol="tcp" accept
rule family="ipv4" source address="each worker IP" port port="9345" protocol="tcp" accept
- rancher-system-agent.service is running
- suspicious logs:
level=info msg="[Applyinator] Command sh [-c rke2 etcd-snapshot list --etcd-s3=false 2>/dev/null] finished with err: <nil> and exit code: 0"
level=info msg="[K8s] updated plan secret fleet-default/custom-45efbb5ab4e0-machine-plan with feedback"
- deploy/cattle-cluster-agent is running
- last error log:
error syncing 'rancher-charts': handler helm-clusterrepo-download: update failure: git -C /var/lib/rancher-data/local-catalogs/v2/rancher-charts/4b40...a721 fetch origin -- release-v2.8 error: exit status 128, detail: fatal: unable to access 'https://git.rancher.io/charts/': The requested URL returned error: 502\n, requeuing
- pod/rancher-webhook-fd7… is Pending with Event message:
0/1 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..
- Worker only: for the worker nodes
- firewall rules to open ports:
rule family="ipv4" source address="master IP" port port="9345" protocol="tcp" accept
rule family="ipv4" source address="master IP" port port="10250" protocol="tcp" accept
- rancher-system-agent.service is running
- logs:
systemd[1]: Started Rancher System Agent.
rancher-system-agent[3358]: time="2024-12-18T08:55:34+01:00" level=info msg="Rancher System Agent version v0.3.6 (41c07d0) is starting"
rancher-system-agent[3358]: time="2024-12-18T08:55:34+01:00" level=info msg="Using directory /var/lib/rancher/agent/work for work"
rancher-system-agent[3358]: time="2024-12-18T08:55:34+01:00" level=info msg="Starting remote watch of plans"
rancher-system-agent[3358]: time="2024-12-18T08:55:34+01:00" level=info msg="Starting /v1, Kind=Secret controller"
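For reference, here is a minimal sketch of how the rich rules listed above can be generated for each node. The ports come from the rules in this post; the IP addresses and the function name `rich_rules` are placeholders for illustration only, and the printed commands assume the rules go into the default firewalld zone.

```python
# Ports taken from the rules listed in this post.
MASTER_PORTS = (10250, 6443, 9345)  # opened on the master for each worker IP
WORKER_PORTS = (9345, 10250)        # opened on each worker for the master IP


def rich_rules(source_ips, ports):
    """Build one firewalld rich rule string per (source IP, port) pair."""
    return [
        f'rule family="ipv4" source address="{ip}" '
        f'port port="{port}" protocol="tcp" accept'
        for ip in source_ips
        for port in ports
    ]


if __name__ == "__main__":
    # Hypothetical addresses; replace with the real node IPs.
    workers = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]
    master = ["192.0.2.10"]

    print("# run on the master node")
    for rule in rich_rules(workers, MASTER_PORTS):
        print(f"firewall-cmd --permanent --add-rich-rule='{rule}'")

    print("# run on each worker node")
    for rule in rich_rules(master, WORKER_PORTS):
        print(f"firewall-cmd --permanent --add-rich-rule='{rule}'")
```

The script only prints the commands so they can be reviewed first; permanent rules take effect after `firewall-cmd --reload`.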
Rancher UI
- All machines have the state “Waiting for Node Ref”
- Provisioning Log:
[INFO ] waiting for at least one control plane, etcd, and worker node to be registered
[INFO ] non-ready bootstrap machine(s) custom-45efbb5ab4e0 and join url to be available on bootstrap node
How can I troubleshoot this problem? Is there anything I have missed?
Thank you in advance!