I am trying to deploy Rancher stable using this on a CentOS 7 host:
All was going well up until the end, when I got:
error: deployment "rancher" exceeded its progress deadline
Status is stuck at this point:
kubectl -n cattle-system get deploy rancher
NAME      READY   UP-TO-DATE   AVAILABLE   AGE
rancher   1/3     3            1           35m
How do I analyse what went wrong?
I have run "journalctl -u k3s", but nothing stands out in the logs.
I would list the pods in cattle-system with kubectl -n cattle-system get pods, then check the logs of any pod that is not ready with kubectl logs -n cattle-system rancher-xxxxx, and see if those say why the rollout timed out. The cause is probably not visible in the journald logs.
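The checks above can be collected into a small helper. This is only a sketch: the function name triage_rancher is made up for illustration, and the app=rancher label selector is an assumption based on the labels the Rancher Helm chart normally applies, so adjust it if your pods are labelled differently.

```shell
#!/bin/sh
# Triage sketch for a stalled "rancher" rollout.
# Assumption: the Rancher pods carry the label app=rancher.
NS=cattle-system

triage_rancher() {
    # Pod status, and which node each replica landed on
    kubectl -n "$NS" get pods -l app=rancher -o wide
    # Deployment conditions and replica counts
    kubectl -n "$NS" describe deployment rancher
    # Recent events: scheduling failures, image pulls, failed probes
    kubectl -n "$NS" get events --sort-by=.lastTimestamp
    # Last log lines from the rancher pods
    kubectl -n "$NS" logs -l app=rancher --tail=50
}
```

Source the file and run triage_rancher. For a pod that never reaches Running, kubectl -n cattle-system describe pod rancher-xxxxx is usually more informative than its logs, since a container that has not started has nothing to log.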
That being said, I had similar problems when the disks were too small and when the machines had too few CPUs. Everything worked, but it took so long to finish that the deployment was marked as failed.
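To rule out an undersized host, a quick look at CPU count and free disk space is enough to start with. A minimal sketch, assuming a Linux host with coreutils; check_host is a made-up name, and the default path /var/lib/rancher is an assumption (it is where k3s keeps its data unless configured otherwise).

```shell
#!/bin/sh
# Host resource check sketch; read-only.
# Assumption: k3s data lives under /var/lib/rancher (the default location).
check_host() {
    dir="${1:-/var/lib/rancher}"
    # Number of CPUs available to this host
    echo "CPUs: $(nproc)"
    # Free space on the filesystem that holds the given directory
    echo "Disk for $dir:"
    df -h "$dir"
}
```

Run check_host on each node, or pass another directory such as check_host /var/lib. If a metrics server is running in the cluster, kubectl top nodes gives the same picture from the Kubernetes side.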
I’m having trouble with an RKE2 Rancher deploy on CentOS 7. I can leave SELinux enforcing, but I appear to need to disable firewalld; otherwise only the host that cert-manager-webhook is running on can reach it, and the other two control plane nodes can’t. You might be hitting the same thing, although with RKE2 I was getting a different error (if I had happened to start with the host running cert-manager-webhook, it might have gone differently).
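Before disabling firewalld outright, it may help to see what it currently allows on each node. A read-only sketch; check_firewalld is a made-up name, and the commands only report state, they change nothing.

```shell
#!/bin/sh
# Firewall inspection sketch for a CentOS 7 node; read-only.
check_firewalld() {
    # Prints "active" or "inactive" for the firewalld service
    systemctl is-active firewalld
    # Zones, services and ports currently allowed
    # (needs root, and reports an error if firewalld is not running)
    firewall-cmd --list-all
}
```

Comparing the output between the node running cert-manager-webhook and the other two control plane nodes should show whether the inter-node traffic is being filtered.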
So as it happened, I left it for the weekend and it was finally up and running on Monday. So just super slow.