Hi Rancher team,
I installed Rancher v2.6.0 via Helm and then uninstalled it. Afterward I tried to install v2.5.9 but ran into problems, so I cleaned up the Rancher deployment using system-tools with this script: https://raw.githubusercontent.com/kingsd041/some_script/master/remove-rancher-ha/remove_r_ha.sh
After that I ran the helm install again, but the Rancher server pods never become ready and keep restarting. Any pointers on how to troubleshoot this? Thanks.
hugo@hugok-mlt % kubectl get pods -n cattle-system
NAME                       READY   STATUS    RESTARTS   AGE
rancher-786f49f5dc-dx9hw   0/1     Running   5          12m
rancher-786f49f5dc-fnvlh   0/1     Running   1          5m15s
rancher-786f49f5dc-vzcvq   0/1     Running   5          12m
The /healthz liveness and readiness probes always fail. Here is the kubectl describe output for one of the pods:
Name: rancher-786f49f5dc-vzcvq
Namespace: cattle-system
Priority: 0
Node: k8s-node-02/100.65.16.9
Start Time: Fri, 15 Oct 2021 11:46:23 +0800
Labels: app=rancher
pod-template-hash=786f49f5dc
release=rancher
Annotations: cni.projectcalico.org/podIP: 10.233.74.102/32
cni.projectcalico.org/podIPs: 10.233.74.102/32
Status: Running
IP: 10.233.74.102
IPs:
IP: 10.233.74.102
Controlled By: ReplicaSet/rancher-786f49f5dc
Containers:
rancher:
Container ID: docker://f78bf829e40038cdb62aa82cfa05dbb6c52166c5120a67f4bbb1aa79bdc9ee90
Image: docker.test.com/rancher/rancher:v2.5.9
Image ID: docker-pullable://docker.test.com/rancher/rancher@sha256:10e938f788e725d1d2ed7bc909bae8c7a83b756c520fb2596bf559e44e13587d
Port: 80/TCP
Host Port: 0/TCP
Args:
--no-cacerts
--http-listen-port=80
--https-listen-port=443
--add-local=true
State: Running
Started: Fri, 15 Oct 2021 11:58:38 +0800
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Fri, 15 Oct 2021 11:56:08 +0800
Finished: Fri, 15 Oct 2021 11:58:37 +0800
Ready: False
Restart Count: 5
Liveness: http-get http://:80/healthz delay=60s timeout=1s period=30s #success=1 #failure=3
Readiness: http-get http://:80/healthz delay=5s timeout=1s period=30s #success=1 #failure=3
Environment:
CATTLE_NAMESPACE: cattle-system
CATTLE_PEER_SERVICE: rancher
CATTLE_BOOTSTRAP_PASSWORD: <set to the key 'bootstrapPassword' in secret 'bootstrap-secret'> Optional: false
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from rancher-token-jmsph (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
rancher-token-jmsph:
Type: Secret (a volume populated by a Secret)
SecretName: rancher-token-jmsph
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: cattle.io/os=linux:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 13m default-scheduler Successfully assigned cattle-system/rancher-786f49f5dc-vzcvq to k8s-node-02
Normal Pulled 9m14s (x3 over 13m) kubelet Container image "docker.addpchina.com/rancher/rancher:v2.5.9" already present on machine
Warning Unhealthy 9m14s (x6 over 12m) kubelet Liveness probe failed: Get "http://10.233.74.102:80/healthz": dial tcp 10.233.74.102:80: connect: connection refused
Normal Killing 9m14s (x2 over 11m) kubelet Container rancher failed liveness probe, will be restarted
Normal Created 9m13s (x3 over 13m) kubelet Created container rancher
Normal Started 9m13s (x3 over 13m) kubelet Started container rancher
Warning Unhealthy 3m42s (x18 over 13m) kubelet Readiness probe failed: Get "http://10.233.74.102:80/healthz": dial tcp 10.233.74.102:80: connect: connection refused
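As a rough sanity check that the restarts come from the liveness probe rather than from Rancher crashing on its own, here is a small sketch that compares the probe settings with the lifetime of the last terminated container. The numbers are copied from the describe output above. The formula for the earliest kill time is my own reading of how the probe fields work (first probe after the initial delay, then one per period), so treat it as an assumption. The `to_seconds` helper is just something I wrote for this post.

```shell
#!/bin/sh
# Values copied from the "Liveness:" line: delay=60s period=30s #failure=3
initial_delay=60
period=30
failure_threshold=3

# Hypothetical helper: convert HH:MM:SS to seconds since midnight.
# One leading zero is stripped from each field so "08" is not parsed as octal.
to_seconds() {
  h=${1%%:*}; rest=${1#*:}; m=${rest%%:*}; s=${rest#*:}
  echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# Assumed model: the third consecutive failure can be recorded no earlier than
# initial delay + (failure threshold - 1) * period after the container starts.
earliest_kill=$(( initial_delay + (failure_threshold - 1) * period ))

# "Started" and "Finished" of the Last State: Terminated entry above.
observed=$(( $(to_seconds 11:58:37) - $(to_seconds 11:56:08) ))

echo "earliest liveness kill: ${earliest_kill}s after start"   # 120s
echo "observed container lifetime: ${observed}s"               # 149s
```

The container lived 149s, a little longer than the 120s lower bound. My guess is that the extra time is the shutdown after the kill signal, which would fit the "failed liveness probe, will be restarted" events. So it looks like Rancher never opens port 80 within the probe window, rather than exiting by itself.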
The pod logs show Rancher waiting for initial data to be populated:
2021/10/15 04:02:23 [INFO] APIVersion metrics.k8s.io/v1beta1 Kind NodeMetrics
2021/10/15 04:02:23 [INFO] APIVersion metrics.k8s.io/v1beta1 Kind PodMetrics
W1015 04:02:24.010406 7 warnings.go:80] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
2021/10/15 04:02:25 [INFO] Waiting for initial data to be populated
2021/10/15 04:02:27 [INFO] Waiting for initial data to be populated