Hi all,
I’ve set up a Rancher cluster according to the High Availability (HA) Install guide. My rancher-cluster.yml looks like this (actual FQDNs substituted with example.com):
nodes:
  - address: rancher1.example.com
    internal_address: 10.0.0.181
    user: rancher
    role: [controlplane,etcd,worker]
    ssh_key_path: ~/.ssh/id_ed25519
  - address: rancher2.example.com
    internal_address: 10.0.0.182
    user: rancher
    role: [controlplane,etcd,worker]
    ssh_key_path: ~/.ssh/id_ed25519
  - address: rancher3.example.com
    internal_address: 10.0.0.183
    user: rancher
    role: [controlplane,etcd,worker]
    ssh_key_path: ~/.ssh/id_ed25519

services:
  etcd:
    snapshot: true
    creation: 6h
The rke up --config ./rancher-cluster.yml command finishes successfully, and kubectl get nodes reports the following (again, FQDNs substituted):
$ kubectl get nodes
NAME                   STATUS    ROLES                      AGE       VERSION
rancher1.example.com   Ready     controlplane,etcd,worker   1h        v1.11.6
rancher2.example.com   Ready     controlplane,etcd,worker   1h        v1.11.6
rancher3.example.com   Ready     controlplane,etcd,worker   1h        v1.11.6
However, kubectl get pods --all-namespaces reports that the nginx-ingress-controller pods are in CrashLoopBackOff:
$ kubectl get pods --all-namespaces
NAMESPACE       NAME                                      READY     STATUS             RESTARTS   AGE
ingress-nginx   default-http-backend-797c5bc547-hbnnt     1/1       Running            0          1h
ingress-nginx   nginx-ingress-controller-7n5kn            0/1       CrashLoopBackOff   20         1h
ingress-nginx   nginx-ingress-controller-jpzg7            0/1       CrashLoopBackOff   20         1h
ingress-nginx   nginx-ingress-controller-wxtp2            0/1       CrashLoopBackOff   10         28m
kube-system     canal-4fhwf                               3/3       Running            0          1h
kube-system     canal-8mgbp                               3/3       Running            0          1h
kube-system     canal-97j6n                               3/3       Running            0          1h
kube-system     kube-dns-7588d5b5f5-62lvd                 3/3       Running            0          1h
kube-system     kube-dns-autoscaler-5db9bbb766-xdd4v      1/1       Running            0          1h
kube-system     metrics-server-97bc649d5-6pbfk            1/1       Running            0          1h
kube-system     rke-ingress-controller-deploy-job-tn7ml   0/1       Completed          0          1h
kube-system     rke-kubedns-addon-deploy-job-pt7r6        0/1       Completed          0          1h
kube-system     rke-metrics-addon-deploy-job-whsq7        0/1       Completed          0          1h
kube-system     rke-network-plugin-deploy-job-jnjsn       0/1       Completed          0          1h
Fetching the logs of one of these pods shows the following:
$ kubectl -n ingress-nginx logs nginx-ingress-controller-wxtp2
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: 0.16.2-rancher1
Build: d36d4cc
Repository: git@github.com:rancher/ingress-nginx.git
-------------------------------------------------------------------------------
F0201 12:08:23.061615 7 main.go:72] Port 80 is already in use. Please check the flag --http-port
Searches turn up reports of port 80 already being in use by a proxy container, but I haven’t found a confirmed way to change which port the ingress controller deployment binds to.
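The only lead I have is that RKE seems to accept extra arguments for the nginx ingress controller in the cluster YAML, so something along these lines (untested on my side, the port numbers below are just placeholders) might move it off 80/443, though I’m not sure it’s the right fix here:

ingress:
  provider: nginx
  extra_args:
    http-port: 8080
    https-port: 8443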
Extra host(s) information:
RAM: 16GB
CPU: 4
OS: Debian 9
Does anyone have a solution or pointer as to how to resolve this CrashLoopBackOff? Thanks in advance.
EDIT:
Extra information: searching GitHub leads to this report.
Resolving this issue:

1. Patch the DaemonSet so the controller runs as user 0 (root) instead of 33 (www-data). As far as I can tell, the controller cannot bind to the privileged ports 80/443 as www-data, and that failure is what gets reported as “Port 80 is already in use”:

kubectl patch ds nginx-ingress-controller -n ingress-nginx -p '{"spec":{"template":{"spec":{"containers":[{"name":"nginx-ingress-controller","securityContext":{"runAsUser":0}}]}}}}'

2. Delete all pods in the ingress-nginx namespace so they get respawned with the patched spec:

kubectl delete pods -n ingress-nginx --all
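After that, the new pods should come up cleanly. Something like the following (just a watch on the namespace, nothing specific to my setup) lets you confirm they reach Running instead of CrashLoopBackOff:

kubectl -n ingress-nginx get pods -w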