Correct way to install Rancher 2.6 on k8s using external LB?

My question is basically in the subject, as I’ve been struggling with this and making slow progress for over a week now.

I have an existing, functional, clean k8s cluster, on-prem, that I would like to install Rancher 2.x onto. This cluster is made up of 3 k8s nodes with the taints removed so the control plane can coexist with the workload, along with an external load balancer in front of port 6443, as is normal for HA control planes to talk to each other through.

The load balancer is set up to forward ports 80 and 443 to the k8s nodes as well. This is highly desirable for us vs. using something like metallb, and I believe my issues stem from this requirement.

I’m running into an issue and would appreciate some guidance rather than sharing everything I’ve tried that hasn’t worked. The long and short of it is that after Rancher is installed, the cluster nodes are not accepting connections on 80 and 443 as intended. My suspicion is that Rancher is using the “ClusterIP” service type for its ingress when installed, while I’m installing the nginx ingress with the “NodePort” service type.
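For what it’s worth, a couple of quick checks should show whether those two actually disagree (assuming Rancher went into its default cattle-system namespace):

        # what service type did the ingress controller actually get?
        kubectl get svc -A | grep -Ei 'ingress|nginx'

        # what ingress class / backend is Rancher's ingress expecting?
        kubectl get ingressclass
        kubectl get ingress -n cattle-system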

Any assistance in getting this working as outlined, using k8s (not k3s, rke, etc.) without an in-cluster LoadBalancer service like metallb is greatly appreciated!

Infrastructure notes:
on-prem vmware cluster / ubuntu 18-LTS hosts / k8s 1.23.6 / Rancher 2.6.4 via Helm.

If you have control over your DNS, the simplest thing I found for three control plane nodes is to have four hostnames for them. Start out with ctrl1, ctrl2, & ctrl3 as the three node IPs, then add rancher1 with an A record pointing at the IP of ctrl1. Use rancher1 as the hostname to access the cluster while you install to ctrl1 & ctrl2, then add another A record to rancher1 for the IP of ctrl2 and install to ctrl3, and when done add a final A record to rancher1 pointing to the IP of ctrl3 (alternately you can just use ctrl1 for the install and add the other two records after it’s done).

Then you’re using DNS’ native round robin functionality to rotate through which node you’re pointed at for your API server but you don’t have to mess with a load balancer config or additional IP space for a Kubernetes LoadBalancer object.
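As a quick sanity check (example.com and the addresses here are placeholders), dig should hand back every A record you’ve added for the shared name once all three are in place:

        dig +short rancher1.example.com
        # 10.0.0.11
        # 10.0.0.12
        # 10.0.0.13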

I did that with vanilla Kubernetes and it worked fine. Did the same thing with RKE2 and it worked fine.

Thanks for the suggestion @wcoateRR but I’m not interested in having DNS round robin; we need actual load balancing (I have an automated deploy & config of haproxy on another host) because we frequently bring hosts in and out of service, and adding/removing address records from DNS just isn’t practical or reliable given the differing DNS caching strategies of the various hosts and applications involved.

In any case it’s not the load balancing (or lack thereof) that is the issue. The rancher nodes are not accepting connections on 80 or 443, as I said – it doesn’t matter if I’m trying to connect directly via DNS, or if it’s the load balancer that’s trying.

At this point I believe the documentation that Rancher can run on “any kubernetes cluster” just needs additional work and so I’ve given up for the time being and have gone back to using a k3s cluster. This is suboptimal for our needs, but it’ll work for now.

There are many ways to accomplish what you’re trying to do. You might want to share some of your troubleshooting results. kubectl get svc would be a good start, along with the values you used to install Rancher.
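Concretely, something like the following would be useful to see (assuming a Helm release named rancher in the default cattle-system namespace):

        kubectl get svc -A
        kubectl get ingress -A
        helm get values rancher -n cattle-system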

John, I’m going to have to come back to this some time in the future to answer those questions. I don’t think it’s apparent from the OP, but I created this post almost a week ago. It took about five days for the post to be approved, and in the interim I decided I’d already spent too long trying to figure out what was going wrong. I opened an issue regarding the documentation on the Rancher2 docs github and moved on, resorting to going back to using k3s to host Rancher for the time being. As I didn’t know when or even if the post was going to be approved, I had to move on.

If you or anyone can point me to a working example install of Rancher on a standalone, clean k8s (kubeadm) cluster, using whatever ingress and load balancer they like (e.g. metallb in the cluster or an external LB like haproxy), I can take some time to try it, see if it works, and see if I can adapt it to our needs. But once I decided to fall back to k3s, I didn’t keep the rancher install or nginx-ingress configs that I’d been trying.

I can say there wasn’t really anything “special” about the initial attempt though. I have a series of scripts that work perfectly for spinning up a new k8s cluster with kubeadm, alongside an external haproxy load balancer for the control plane, a metallb install for the “public” address of the cluster, and an nginx-ingress (note: not ingress-nginx) controller within the cluster. Rancher was installed into one of these clusters using the instructions here: Rancher Docs: Install/Upgrade Rancher on a Kubernetes Cluster
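For reference, the Rancher part of those scripts followed the doc’s Helm steps pretty much verbatim; roughly this, with rancher.example.com standing in for our real hostname:

        # cert-manager first, per the Rancher docs
        helm repo add jetstack https://charts.jetstack.io
        helm repo update
        helm install cert-manager jetstack/cert-manager \
          --namespace cert-manager --create-namespace \
          --set installCRDs=true

        # then Rancher itself from the stable chart repo
        helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
        kubectl create namespace cattle-system
        helm install rancher rancher-stable/rancher \
          --namespace cattle-system \
          --set hostname=rancher.example.com \
          --set replicas=3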

The result was a successful install that wasn’t reachable on port 80 or 443 of the k8s nodes – it’s not a name/load-balancer issue, as simply telnetting to the node addresses (or the metallb address) on those ports resulted in connection refused. The connection between Rancher’s ingress and nginx-ingress just wasn’t being made for some reason. As I mentioned in my documentation issue, I now think that may have been due to an ingressClass mismatch, but I’m not sure.
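If anyone wants to test that theory on a similar setup, the class on the generated ingress can be inspected and, if it doesn’t match what the controller is watching, patched in place (rancher / cattle-system are the chart defaults; the patch assumes the ingress uses spec.ingressClassName rather than the older annotation):

        kubectl get ingressclass
        kubectl get ingress rancher -n cattle-system -o jsonpath='{.spec.ingressClassName}{"\n"}'

        # if it comes back empty or wrong, point it at the class the controller watches
        kubectl patch ingress rancher -n cattle-system --type merge \
          -p '{"spec":{"ingressClassName":"nginx"}}'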

The usual nginx-ingress helm command I use for installing is as follows; it’s the only thing I can think of that would need to be adjusted.

        helm install nginx-ingress nginx-stable/nginx-ingress \
          --namespace=ingress \
          --create-namespace \
          --set controller.enableCustomResources=false \
          --set controller.appprotect.enable=false \
          --set controller.appprotectdos.enable=false \
          --skip-crds \
          --set controller.kind=daemonset \
          --set controller.service.type=LoadBalancer \
          --set rbac.create=true \
          --set controller.service.httpsPort.enable=false \
          --set controller.service.loadBalancerIP={{ tfconfig['metallb-addr'] }} \
          --set controller.ingressClass=nginx
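
For the no-metallb setup I was originally after, my current thinking is to drop the LoadBalancer service entirely and run the controller daemonset on the host network, so the nodes themselves answer on 80/443 and the external haproxy can target them directly. I haven’t verified that controller.hostNetwork behaves that way in this particular chart, so treat it as a sketch:

        helm install nginx-ingress nginx-stable/nginx-ingress \
          --namespace=ingress \
          --create-namespace \
          --set controller.kind=daemonset \
          --set controller.hostNetwork=true \
          --set controller.ingressClass=nginx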