Cattle-system status CrashLoopBackOff

Hi,

I’m new to Rancher, and after a couple of installation attempts my cattle-system pods are always in CrashLoopBackOff status. It is a fresh install and I’m trying to understand where my mistake is.

I have 4 nodes and 1 server dedicated to the Rancher RKE installation. Rancher is behind a reverse proxy (a rough sketch of the proxy config is after the node list below).

X.X.X.171 Rancher RKE Installation
X.X.X.151 Node 1
X.X.X.152 Node 2
X.X.X.153 Node 3
X.X.X.154 Node 4
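
The reverse proxy in front of the cluster is an external nginx. I can’t paste the real config, but a minimal sketch of the relevant part looks like this (the hostname, upstream addresses and certificate paths are placeholders, not my actual file; the Upgrade/Connection headers are there because the Rancher agents and the UI shell use WebSockets):

map $http_upgrade $connection_upgrade {
    default Upgrade;
    ''      close;
}

upstream rancher {
    # Placeholder: the ingress controllers running on the cluster nodes
    server 10.100.1.151;
    server 10.100.1.152;
    server 10.100.1.153;
    server 10.100.1.154;
}

server {
    listen 443 ssl;
    server_name rancher.qa.xxxxxx.xxxxxx.com;

    # Placeholder certificate paths
    ssl_certificate     /etc/nginx/ssl/rancher.crt;
    ssl_certificate_key /etc/nginx/ssl/rancher.key;

    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        # WebSocket support (agent connections, kubectl shell in the UI)
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_pass http://rancher;
        proxy_read_timeout 900s;
    }
}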
$ kubectl get nodes
NAME           STATUS   ROLES                      AGE   VERSION
10.100.1.151   Ready    controlplane,etcd,worker   89m   v1.13.5
10.100.1.152   Ready    controlplane,etcd,worker   89m   v1.13.5
10.100.1.153   Ready    controlplane,etcd,worker   89m   v1.13.5
10.100.1.154   Ready    controlplane,etcd,worker   89m   v1.13.5
$ kubectl get pods --all-namespaces
NAMESPACE       NAME                                      READY   STATUS             RESTARTS   AGE
cattle-system   cattle-cluster-agent-85cb79bbb5-xg6m8     0/1     CrashLoopBackOff   17         62m
cattle-system   cattle-node-agent-2hn7z                   0/1     CrashLoopBackOff   17         62m
cattle-system   cattle-node-agent-rd9br                   0/1     CrashLoopBackOff   17         62m
cattle-system   cattle-node-agent-rrjfc                   0/1     CrashLoopBackOff   17         62m
cattle-system   cattle-node-agent-zj9d2                   0/1     CrashLoopBackOff   16         62m
cattle-system   rancherqa-79d474dd8c-6qzks                1/1     Running            1          65m
cattle-system   rancherqa-79d474dd8c-ss92f                1/1     Running            0          65m
cattle-system   rancherqa-79d474dd8c-w4mhv                1/1     Running            1          65m
ingress-nginx   default-http-backend-7f8fbb85db-jwzsg     1/1     Running            0          71m
ingress-nginx   nginx-ingress-controller-24lbm            1/1     Running            0          71m
ingress-nginx   nginx-ingress-controller-dvsnt            1/1     Running            0          71m
ingress-nginx   nginx-ingress-controller-lq74b            1/1     Running            0          71m
ingress-nginx   nginx-ingress-controller-wl2br            1/1     Running            0          71m
kube-system     canal-2x87x                               2/2     Running            0          71m
kube-system     canal-b7xkr                               2/2     Running            0          71m
kube-system     canal-bj2ft                               2/2     Running            0          71m
kube-system     canal-lkszr                               2/2     Running            0          71m
kube-system     cert-manager-6464494858-b698d             1/1     Running            0          66m
kube-system     kube-dns-5fd74c7488-5x9z8                 3/3     Running            0          71m
kube-system     kube-dns-autoscaler-c89df977f-w527m       1/1     Running            0          71m
kube-system     metrics-server-7fbd549b78-8bq44           1/1     Running            0          71m
kube-system     rke-ingress-controller-deploy-job-nn2n8   0/1     Completed          0          71m
kube-system     rke-kubedns-addon-deploy-job-94f6c        0/1     Completed          0          71m
kube-system     rke-metrics-addon-deploy-job-pr7jr        0/1     Completed          0          71m
kube-system     rke-network-plugin-deploy-job-6mwps       0/1     Completed          0          71m
kube-system     tiller-deploy-7b489d95c4-clhl2            1/1     Running            0          67m
$ kubectl describe pod cattle-node-agent-zj9d2 -n cattle-system
Name:               cattle-node-agent-zj9d2
Namespace:          cattle-system
Priority:           0
PriorityClassName:  <none>
Node:               10.100.1.152/10.100.1.152
Start Time:         Tue, 11 Jun 2019 14:34:24 -0400
Labels:             app=cattle-agent
                    controller-revision-hash=6844c7dbd8
                    pod-template-generation=1
Annotations:        <none>
Status:             Running
IP:                 10.100.1.152
Controlled By:      DaemonSet/cattle-node-agent
Containers:
  agent:
    Container ID:   docker://0f69b035e0f116d1f3f225077d8cad8fd8e80a45c9495b4ea6b46301f839ea0f
    Image:          rancher/rancher-agent:v2.2.4
    Image ID:       docker-pullable://rancher/rancher-agent@sha256:a895cb47ae81a641db64a3f727fe371cc6f2be7e8c98ee03f6f6a911b9d572ab
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 11 Jun 2019 15:52:43 -0400
      Finished:     Tue, 11 Jun 2019 15:52:43 -0400
    Ready:          False
    Restart Count:  20
    Environment:
      CATTLE_NODE_NAME:       (v1:spec.nodeName)
      CATTLE_SERVER:         https://rancher.qa.backinternal.XXXXXX.com
      CATTLE_CA_CHECKSUM:    ecd6bc7cfc5084b8474ea517bc284e42182cb0f14294aac7cc4e7754f46acdfa
      CATTLE_CLUSTER:        false
      CATTLE_K8S_MANAGED:    true
      CATTLE_AGENT_CONNECT:  true
    Mounts:
      /cattle-credentials from cattle-credentials (ro)
      /etc/kubernetes from k8s-ssl (rw)
      /run from run (rw)
      /var/run from var-run (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from cattle-token-2cpwv (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  k8s-ssl:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes
    HostPathType:  DirectoryOrCreate
  var-run:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run
    HostPathType:  DirectoryOrCreate
  run:
    Type:          HostPath (bare host directory volume)
    Path:          /run
    HostPathType:  DirectoryOrCreate
  cattle-credentials:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cattle-credentials-71a5ecc
    Optional:    false
  cattle-token-2cpwv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cattle-token-2cpwv
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node-role.kubernetes.io/controlplane=true:NoSchedule
                 node-role.kubernetes.io/etcd=true:NoExecute
                 node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/network-unavailable:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type     Reason   Age                   From                   Message
  ----     ------   ----                  ----                   -------
  Warning  BackOff  107s (x373 over 81m)  kubelet, 10.100.1.152  Back-off restarting failed container

I have another problem with Launch kubectl on my cluster. When I launch it I get:
Closed Code: 1006

If you need more information, I can send it.

I have read a lot on Google, but I can’t find the solution to this problem.

Thanks for your answers.

What does docker logs say for the node agent?

I found this line in the logs:

kubectl logs --follow pod/cattle-node-agent-2hn7z -n cattle-system

INFO: Environment: CATTLE_ADDRESS=10.100.1.153 CATTLE_AGENT_CONNECT=true CATTLE_CA_CHECKSUM=ecd6bc7cfc5084b8474ea517bc284e42182cb0f14294aac7cc4e7754f46acdfa CATTLE_CLUSTER=false CATTLE_INTERNAL_ADDRESS= CATTLE_K8S_MANAGED=true CATTLE_NODE_NAME=10.100.1.153 CATTLE_SERVER=https://rancher.qa.xxxxxx.xxxxxx.com
INFO: Using resolv.conf: nameserver 10.100.1.20 nameserver 10.100.1.21 search xxxxx.com
ERROR: https://rancher.qa.xxxxxx.xxxxxx.com/ping is not accessible (The requested URL returned error: 403 Forbidden)

$ curl https://rancher.qa.xxxxxx.xxxxxx.com/ping

<html>
<head><title>403 Forbidden</title></head>
<body bgcolor="white">
<center><h1>403 Forbidden</h1></center>
<hr><center>nginx</center>
</body>
</html>

Hi,

I found the problem. It is not in Rancher, it is in the reverse proxy: the 10.0.0.0/8 range was not in the nginx whitelist. I just modified the whitelist and restarted, and everything works correctly (a minimal sketch of the change is just below).
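
For reference, the change was in the nginx vhost that proxies to Rancher: its allow/deny list did not include the cluster’s 10.x addresses, so the agents’ requests to /ping got the 403. A minimal sketch of what it looks like now (the ranges are examples, adjust to your own networks):

location / {
    # The nodes/agents reach the proxy from the 10.x network;
    # without this allow line, /ping returns 403 Forbidden to them.
    allow 10.0.0.0/8;
    # allow your admin/office range here as well
    deny  all;

    # ... existing proxy_pass / proxy_set_header lines unchanged ...
}

After editing, reloading nginx (nginx -s reload) or restarting it should be enough; the agents reconnect on their next back-off retry.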

$ kubectl get pod --all-namespaces

NAMESPACE       NAME                                      READY   STATUS      RESTARTS   AGE
cattle-system   cattle-cluster-agent-85cb79bbb5-xg6m8     1/1     Running     1658       5d21h
cattle-system   cattle-node-agent-2hn7z                   1/1     Running     1656       5d21h
cattle-system   cattle-node-agent-rd9br                   1/1     Running     1657       5d21h
cattle-system   cattle-node-agent-rrjfc                   1/1     Running     1659       5d21h
cattle-system   cattle-node-agent-zj9d2                   1/1     Running     1658       5d21h
cattle-system   rancherqa-79d474dd8c-6qzks                1/1     Running     2          5d21h
cattle-system   rancherqa-79d474dd8c-ss92f                1/1     Running     1          5d21h
cattle-system   rancherqa-79d474dd8c-w4mhv                1/1     Running     2          5d21h
ingress-nginx   default-http-backend-7f8fbb85db-jwzsg     1/1     Running     1          5d21h
ingress-nginx   nginx-ingress-controller-24lbm            1/1     Running     1          5d21h
ingress-nginx   nginx-ingress-controller-dvsnt            1/1     Running     1          5d21h
ingress-nginx   nginx-ingress-controller-lq74b            1/1     Running     1          5d21h
ingress-nginx   nginx-ingress-controller-wl2br            1/1     Running     1          5d21h
kube-system     canal-2x87x                               2/2     Running     2          5d21h
kube-system     canal-b7xkr                               2/2     Running     2          5d21h
kube-system     canal-bj2ft                               2/2     Running     2          5d21h
kube-system     canal-lkszr                               2/2     Running     2          5d21h
kube-system     cert-manager-6464494858-b698d             1/1     Running     1          5d21h
kube-system     kube-dns-5fd74c7488-5x9z8                 3/3     Running     3          5d21h
kube-system     kube-dns-autoscaler-c89df977f-w527m       1/1     Running     1          5d21h
kube-system     metrics-server-7fbd549b78-8bq44           1/1     Running     1          5d21h
kube-system     rke-ingress-controller-deploy-job-nn2n8   0/1     Completed   0          5d21h
kube-system     rke-kubedns-addon-deploy-job-94f6c        0/1     Completed   0          5d21h
kube-system     rke-metrics-addon-deploy-job-pr7jr        0/1     Completed   0          5d21h
kube-system     rke-network-plugin-deploy-job-6mwps       0/1     Completed   0          5d21h
kube-system     tiller-deploy-7b489d95c4-clhl2            1/1     Running     1          5d21h

Thanks for your help :smiley:

Hi [mbourgeois], I’m having the same issue now after upgrading Kubernetes to the latest version.
May I know how you whitelisted the IP?

Thanks!

Did you use an external nginx?
Can you tell me more about the nginx config that solves this problem?

Hello!
How did you whitelist the IP? I have the same issue; please share how you solved it.