Rancher web UI is unreachable, fresh installation

Hi everyone,
I just created a new v1.21 cluster on Linode and deployed Rancher to it. The installation finished successfully, but I'm still waiting for the web UI — it's not coming up.

##################
~$ curl http://rancher.mydomain.com/
curl: (28) Failed to connect to rancher.mydomain.com port 80: Connection timed out
~$ curl https://rancher.mydomain.com/
curl: (28) Failed to connect to rancher.mydomain.com port 443: Connection timed out

~$ kubectl -n cattle-system describe ingress
Name:             rancher
Namespace:        cattle-system
Address:          212.71.236.215
Default backend:  default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
TLS:
  tls-rancher-ingress terminates rancher.mydomain.com
Rules:
  Host                     Path  Backends
  ----                     ----  --------
  rancher.mydomain.com  
                              rancher:80 (10.2.0.5:80,10.2.2.4:80,10.2.4.7:80)
Annotations:               cert-manager.io/issuer: rancher
                           cert-manager.io/issuer-kind: Issuer
                           field.cattle.io/publicEndpoints:
                             [{"addresses":["212.71.236.215"],"port":443,"protocol":"HTTPS","serviceName":"cattle-system:rancher","ingressName":"cattle-system:rancher"...
                           meta.helm.sh/release-name: rancher
                           meta.helm.sh/release-namespace: cattle-system
                           nginx.ingress.kubernetes.io/proxy-connect-timeout: 30
                           nginx.ingress.kubernetes.io/proxy-read-timeout: 1800
                           nginx.ingress.kubernetes.io/proxy-send-timeout: 1800
Events:
  Type    Reason             Age                From                      Message
  ----    ------             ----               ----                      -------
  Normal  CreateCertificate  58m                cert-manager              Successfully created Certificate "tls-rancher-ingress"
  Normal  Sync               52m (x3 over 58m)  nginx-ingress-controller  Scheduled for sync


~$ kubectl -n cattle-system describe certificate
Name:         tls-rancher-ingress
Namespace:    cattle-system
Labels:       app=rancher
              app.kubernetes.io/managed-by=Helm
              chart=rancher-2.6.3
              heritage=Helm
              release=rancher
Annotations:  <none>
API Version:  cert-manager.io/v1
Kind:         Certificate
Metadata:
  Creation Timestamp:  2021-12-23T11:23:06Z
  Generation:          1
  Managed Fields:
    API Version:  cert-manager.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .:
          f:app:
          f:app.kubernetes.io/managed-by:
          f:chart:
          f:heritage:
          f:release:
        f:ownerReferences:
          .:
          k:{"uid":"53768dfd-3e54-454c-9f2f-c9706c39d393"}:
            .:
            f:apiVersion:
            f:blockOwnerDeletion:
            f:controller:
            f:kind:
            f:name:
            f:uid:
      f:spec:
        .:
        f:dnsNames:
        f:issuerRef:
          .:
          f:group:
          f:kind:
          f:name:
        f:secretName:
        f:usages:
      f:status:
        .:
        f:conditions:
        f:notAfter:
        f:notBefore:
        f:renewalTime:
        f:revision:
    Manager:    controller
    Operation:  Update
    Time:       2021-12-23T12:19:33Z
  Owner References:
    API Version:           networking.k8s.io/v1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  Ingress
    Name:                  rancher
    UID:                   53768dfd-3e54-454c-9f2f-c9706c39d393
  Resource Version:        17170
  UID:                     4af1f119-fde7-42c5-9179-6cf7a5185a1b
Spec:
  Dns Names:
    rancher.mydomain.com
  Issuer Ref:
    Group:      cert-manager.io
    Kind:       Issuer
    Name:       rancher
  Secret Name:  tls-rancher-ingress
  Usages:
    digital signature
    key encipherment
Status:
  Conditions:
    Last Transition Time:  2021-12-23T12:19:33Z
    Message:               Certificate is up to date and has not expired
    Observed Generation:   1
    Reason:                Ready
    Status:                True
    Type:                  Ready
  Not After:               2022-03-23T11:19:31Z
  Not Before:              2021-12-23T11:19:32Z
  Renewal Time:            2022-02-21T11:19:31Z
  Revision:                1
Events:
  Type    Reason     Age   From          Message
  ----    ------     ----  ----          -------
  Normal  Issuing    56m   cert-manager  Issuing certificate as Secret does not exist
  Normal  Generated  56m   cert-manager  Stored new private key in temporary Secret resource "tls-rancher-ingress-cmjk8"
  Normal  Requested  56m   cert-manager  Created new CertificateRequest resource "tls-rancher-ingress-q8phh"
  Normal  Issuing    28s   cert-manager  The certificate has been successfully issued

##############################

I would appreciate any assistance with this;
if any further information is needed, I'll provide it.

I still can't understand how the web UI could acquire the certificate while I can't reach it, haha

The install generated the cert, so that’s why the object is in Kubernetes.

My first question is how you installed Rancher and which version. The one-command Docker deploy is different from installing a Kubernetes cluster and deploying with Helm, which is different again from the rancherd deploy through systemd (that's apparently deprecated and no longer available for the latest release).

Next, how many ingress controllers do you have installed, and which nodes are they running on? (kubectl get pods -A -o wide | grep ingress likely tells you this.) The default for RKE2 is nginx, but I think K3s uses Traefik by default. I'm not sure about RKE, and with vanilla Kubernetes you need to install one yourself. If you have no ingress controllers, that is definitely one of your problems.

My final question, for my first suspicion: if you run nslookup or ping on rancher.mydomain.com, what IP do you get back? Is it one of (or all of) the IPs your ingress controllers are running on, or does it point to a load balancer that forwards 80 & 443 to all the IPs your ingress controllers are running on? If the answer is no, that is definitely one of your problems.
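Not something from this thread — just a rough sketch of that check, assuming `dig` and `kubectl` are available. The `ip_in_list` helper and the `RUN_LIVE` guard are my own names; the live part only runs when `RUN_LIVE` is set, since it needs a real cluster and resolver.

```shell
#!/bin/sh
# Sketch: does the IP that DNS returns for the Rancher hostname match
# one of the nodes the ingress controller runs on (or the LB in front
# of them)? The hostname is the placeholder from this thread.

# Succeed when IP $1 appears in the whitespace-separated list $2.
ip_in_list() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

if [ -n "${RUN_LIVE:-}" ]; then
  dns_ip=$(dig +short rancher.mydomain.com | head -n 1)
  # EXTERNAL-IP is column 7 of `kubectl get nodes -o wide`.
  node_ips=$(kubectl get nodes -o wide --no-headers | awk '{print $7}' | tr '\n' ' ')
  if ip_in_list "$dns_ip" "$node_ips"; then
    echo "DNS answer $dns_ip is a node IP"
  else
    echo "DNS answer $dns_ip matches no node - check the A record or the LB target"
  fi
fi
```

If DNS points at a load balancer instead of the nodes directly, compare against the LB's IP rather than the node list.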

That's the main guess I have from what you've said so far. I haven't looked into how the one-line Docker install handles ingress controllers and Docker port forwarding.

I followed the guide “Rancher Docs: Install/Upgrade Rancher on a Kubernetes Cluster”:
first the recommended ingress controller, then cert-manager, then the Helm repo, all of which completed successfully. The installed version is 2.6.3.

~$ kubectl get pods -A -o wide | grep ingress
ingress-nginx               ingress-nginx-controller-85dbddf586-fct2l   1/1     Running   0          5h37m   10.2.4.3          lke47310-75565-61c455c91da4   <none>           <none>

I'm using a cloud-managed cluster from linode.com; I think it's essentially vanilla Kubernetes.

~$ nslookup rancher.mydomain.com
Server:		127.0.0.53
Address:	127.0.0.53#53

Non-authoritative answer:
Name:	rancher.mydomain.com
Address: 212.71.236.203
** server can't find rancher.mydomain.com: SERVFAIL

~$ ping rancher.mydomain.com
PING rancher.mydomain.com (212.71.236.203) 56(84) bytes of data.
^C
--- rancher.mydomain.com ping statistics ---
8 packets transmitted, 0 received, 100% packet loss, time 7147ms

The IP returned by nslookup is not the load balancer IP or any node IP, “weird”; I will investigate the DNS provider.
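For what it's worth, one way to chase that SERVFAIL is to query the zone's authoritative nameservers directly, bypassing the local stub resolver (127.0.0.53 is systemd-resolved). A rough sketch, assuming `dig` is installed; `mydomain.com` is the thread's placeholder and the `RUN_LIVE` guard is my own, so nothing runs without a real resolver:

```shell
#!/bin/sh
# Sketch: compare the local resolver's answer with what the zone's own
# nameservers say. A stale or misconfigured A record at the provider
# would show up as a mismatch here.

if [ -n "${RUN_LIVE:-}" ]; then
  # What the local stub resolver returns:
  dig +short rancher.mydomain.com A

  # What each authoritative nameserver for the zone returns:
  for ns in $(dig +short NS mydomain.com); do
    echo "answer from $ns:"
    dig +short @"$ns" rancher.mydomain.com A
  done
fi
```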

My ingress controller is pointed at by the load balancer, which forwards 80 and 443 to the nodes in the cluster.

So, for this to work, you'd need network connectivity to 212.71.236.203 on port 80 and/or 443, and that system would need network connectivity to 10.2.4.3 on port 80 and/or 443 and have to forward those two ports.
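A quick way to test the first hop of that chain from a workstation is a bare TCP connect. This is just a sketch: the `port_open` helper is my own (it uses bash's `/dev/tcp` plus a 3-second `timeout`, so no extra tools are needed), the `RUN_LIVE` guard is an assumption, and the IP is the one DNS handed back in this thread.

```shell
#!/bin/sh
# Sketch of the connectivity test described above.

# Succeed when a TCP connection to host $1 port $2 opens within 3s.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

if [ -n "${RUN_LIVE:-}" ]; then
  for port in 80 443; do
    if port_open 212.71.236.203 "$port"; then
      echo "port $port reachable"
    else
      echo "port $port NOT reachable"
    fi
  done
fi
```

If both ports time out here, the problem is in front of the cluster (DNS or LB), not in the ingress itself.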

I'm not using a cloud provider and am on all local VMs, so my setup's a little different. I'll describe it with some variables and some of my output. I've got three nodes running Rancher in HA on RKE2; call them rancher1-rancher3. I have a DNS entry with three A records pointing to those three nodes, and the ingress controller runs on all three. I also deployed a downstream RKE2 cluster with three control-plane nodes and five workers (control1-control3 & worker1-worker5); there the ingress runs on the five worker nodes. I have an HAProxy load balancer, call it lb, which forwards 80 & 443 to the five worker nodes in round robin, and a wildcard DNS record *.mydomain.blah that points to the IP of lb.

Below, I'll write control1 for the FQDN and $control1_IP for the IP of that node.

For the ingress of rancher itself, I have:

$ kubectl describe ingress -A
Name:             rancher
Namespace:        cattle-system
Address:          $rancher1_IP,$rancher2_IP,$rancher3_IP
Default backend:  default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
TLS:
  tls-rancher-ingress terminates rancher-ui
Rules:
  Host                        Path  Backends
  ----                        ----  --------
  rancher-ui  
                                 rancher:80 (10.42.0.15:80,10.42.1.6:80,10.42.2.5:80)
Annotations:                  cert-manager.io/issuer: rancher
                              cert-manager.io/issuer-kind: Issuer
                              field.cattle.io/publicEndpoints:
                                [{"addresses":["$rancher1_IP","$rancher2_IP","$rancher3_IP"],"port":443,"protocol":"HTTPS","serviceName":"cattle-system:rancher","ingressN...
                              meta.helm.sh/release-name: rancher
                              meta.helm.sh/release-namespace: cattle-system
                              nginx.ingress.kubernetes.io/proxy-connect-timeout: 30
                              nginx.ingress.kubernetes.io/proxy-read-timeout: 1800
                              nginx.ingress.kubernetes.io/proxy-send-timeout: 1800
Events:                       <none>

The 10.42.0.0/16 IPs are the IPs of the three pods running the rancher container (kubectl get pods --namespace cattle-system -o wide to see them). From my workstation, which has network connectivity to rancher1-rancher3 on ports 80 & 443 and a DNS entry for rancher-ui, I can now toss https://rancher-ui in a browser and get to my Rancher UI.
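That "IPs match up all the way through" idea can be checked mechanically. A sketch, with a made-up `same_ips` helper and `RUN_LIVE` guard; the `app=rancher` label is my assumption about the chart's pod labels, and the namespace/Service names are the ones from this thread:

```shell
#!/bin/sh
# Sketch: the pod IPs listed behind the ingress should equal the
# Service's endpoint IPs, which should equal the pods' own IPs.

# Succeed when two whitespace-separated IP lists hold the same
# addresses, ignoring order and duplicates.
same_ips() {
  a=$(printf '%s\n' $1 | sort -u)
  b=$(printf '%s\n' $2 | sort -u)
  [ "$a" = "$b" ]
}

if [ -n "${RUN_LIVE:-}" ]; then
  ep_ips=$(kubectl -n cattle-system get endpoints rancher \
    -o jsonpath='{.subsets[*].addresses[*].ip}')
  pod_ips=$(kubectl -n cattle-system get pods -l app=rancher \
    -o jsonpath='{.items[*].status.podIP}')
  if same_ips "$ep_ips" "$pod_ips"; then
    echo "endpoints match pod IPs"
  else
    echo "mismatch: endpoints [$ep_ips] vs pods [$pod_ips]"
  fi
fi
```

An empty endpoints list here would mean the Service selector isn't matching any pods, which would also explain an unresponsive ingress.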

After hooking up the downstream cluster, I installed the default Monitoring package in the downstream cluster from the Rancher UI and then created an ingress for hostname monitoring-grafana.mydomain.blah in the Rancher UI pointing to the Grafana instance from monitoring. Below is its ingress:

$ kubectl describe ingress -A
Name:             monitoring-grafana-ingress
Namespace:        cattle-monitoring-system
Address:          $worker1_IP,$worker2_IP,$worker3_IP,$worker4_IP,$worker5_IP
Default backend:  default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
Rules:
  Host                                     Path  Backends
  ----                                     ----  --------
  monitoring-grafana.mydomain.blah  
                                           /   rancher-monitoring-grafana:80 (10.42.5.5:8080)
Annotations:                               field.cattle.io/description: Ingress for monitoring Grafana instance
                                           field.cattle.io/publicEndpoints:
                                             [{"addresses":["$worker1_IP","$worker2_IP","$worker3_IP","$worker4_IP","$worker5_IP"],"port":80,"protocol":"HTTP","serviceName":"cattle-mo...
Events:                                    <none>

With that, on a host that can contact lb over the network, I can pop https://monitoring-grafana.mydomain.blah in a browser and it takes me to Grafana with the monitoring for the cluster. Also, the IP 10.42.5.5 is the IP of the pod running Grafana (which I found with kubectl get pods -A -o wide | grep -i grafana).

So this is where things look weird to me. In a cloud environment, from your workstation you'd need to reach the cloud load balancer, which should be set up to point at your ingress controllers (which for me are $rancher1_IP-$rancher3_IP in the first example and $worker1_IP-$worker5_IP in the second), and then the IPs should match up all the way through (with the load balancer being a bit of an odd man out, but discoverable via your cloud's control tools, I'd assume). Some of this may make sense given you're using a load balancer that Kubernetes knows about, but the publicEndpoints IP doesn't match either, so I'm a bit lost.

Do those two working examples with IP tracing help? I'm not sure how to help you get things straightened out, but giving you an idea of how things trace through end to end is at least within my abilities (though not using a Kubernetes LoadBalancer does make for some differences).