I’ve just about given up trying to troubleshoot something that shouldn’t be an issue in the first place. I have two Poweredge servers with fresh Ubuntu 18.04 installs. Firewalls are completely disabled. I followed the instructions to install to the letter, and got as far as here, with everything before it working fine:
kubectl -n cattle-sys rollout status deploy/rancher
The deployment times out, the three containers try to run but they are repeatedly terminated and restarted with a connection refused error to an internal IP in the cluster. I am using Docker CE as the container engine. I have looked EVERYWHERE for possible causes. ???
rancher-556d94d669-bkbp4 0/1 Error 19 15h
rancher-556d94d669-s9vpz 0/1 Unknown 20 15h
rancher-556d94d669-5gmmb 0/1 Unknown 19 15h
rancher-7f9585b56d-s8lqj 0/1 CrashLoopBackOff 5 8m16s
rancher-7f9585b56d-76v6b 0/1 Running 6 8m16s
rancher-7f9585b56d-wk9l5 0/1 Running 4 8m16s
Several of the failed container says this:
Warning FailedMount 8m41s (x71 over 152m) kubelet, zork-poweredge-1950 Unable to attach or mount volumes: unmounted volumes=[rancher-token-dtt4v], unattached volumes=[rancher-token-dtt4v]: timed out waiting for the condition
Warning FailedMount 3m50s (x82 over 154m) kubelet, zork-poweredge-1950 MountVolume.SetUp failed for volume "rancher-token-dtt4v" : secret "rancher-token-dtt4v" not found
The only container left that is running says this:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler
Successfully assigned cattle-sys/rancher-7f9585b56d-76v6b to zork-poweredge-r410
Normal Killing 10m kubelet, zork-poweredge-r410 Container rancher failed liveness probe, will be restarted
Warning Unhealthy 9m20s (x4 over 11m) kubelet, zork-poweredge-r410 Liveness probe failed: Get
http://10.42.0.56:80/healthz: dial tcp 10.42.0.56:80: connect: connection refused
Normal Started 8m59s (x3 over 12m) kubelet, zork-poweredge-r410 Started container rancher
Warning Unhealthy 8m52s (x8 over 12m) kubelet, zork-poweredge-r410 Readiness probe failed: Get
http://10.42.0.56:80/healthz: dial tcp 10.42.0.56:80: connect: connection refused
Normal Pulled 8m24s (x4 over 12m) kubelet, zork-poweredge-r410 Container image "rancher/rancher:v2.4.8" already present on machine
Normal Created 8m24s (x4 over 12m) kubelet, zork-poweredge-r410 Created container rancher
Warning BackOff 2m38s (x20 over 8m37s) kubelet, zork-poweredge-r410 Back-off restarting failed container
Help!