Ingress and Failover

I am trying to wrap my head around how to use ingress as it is set up in Rancher and how to accomplish failover in the event of a node going down. But first, a quick bit on how I am set up. I run Rancher on-premises from a Docker container on its own dedicated host on vSphere. I used Rancher to spin up a k8s cluster on that same vSphere environment, and that is all functioning fine. For testing I have 1 etcd/controller node and 2 worker nodes. As a test workload, I fired up a distributed (4 replicas) Minio deployment, which has been running fine with 2 replicas on each worker node.

On to the part I don’t understand how to set up for / properly accommodate. I set up an L7 ingress “Load Balancer” inside the Rancher GUI that is hostname-based, pointing to the Minio workload and the port the container listens on, and even attached a wildcard certificate that we have from GoDaddy. On our Active Directory-based DNS, I set up an A record with the hostname I specified in the ingress, pointing to one of the nodes (worker1). I can successfully pull up the Minio web interface over SSL when I use this hostname in my browser, so at this point everything seems to be working: I have a functioning workload inside my k8s cluster and a way for a client to reach it from outside the cluster.
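For reference, what Rancher generates under the hood is roughly an Ingress like the sketch below; the hostname, service name, port, and secret name are placeholders for my setup, and the exact API version will depend on your Kubernetes version.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: minio
  namespace: default
spec:
  tls:
  - hosts:
    - minio.example.com              # hostname used in the ingress rule
    secretName: wildcard-cert        # the GoDaddy wildcard cert uploaded as a TLS secret
  rules:
  - host: minio.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: minio              # the Minio workload's service name (placeholder)
            port:
              number: 9000           # Minio's default port; adjust to your container port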

So what happens if the node that my AD DNS record points to goes down? The hostname becomes unreachable, and there is no way for anyone to reach the other node still hosting the workload (2 pods remain there) without manually changing DNS records in Active Directory. I feel like there is still some piece I need in front of the ingress so that traffic for a given hostname fails over to another node, without my having to manually repoint the A record.

I have looked at metallb a bit (and even set it up, though I have no idea how to make it work with the ingress as it is set up in Rancher, since metallb doesn’t operate on ingresses directly). I’m not sure if metallb is even the right tool to accomplish what I want. Is there something else I need in front of the ingress to get failover working in the event of a failed node?

There are four basic approaches:

  • Keep a DNS record up to date with the IPs of healthy nodes. https://github.com/kubernetes-incubator/external-dns

  • Put a static IP (or set of IPs) in DNS and use BGP or ARP broadcasting to move the IP(s) to a healthy node when one fails. This is what MetalLB does (a minimal layer 2 config sketch follows this list).

  • Advertise the same IP from several machines/locations with Anycast… this is how a lot of real public services like Cloudflare work, but is not something you can practically do on your own.

  • Use a public cloud provider’s load balancer service, which does one of those for you.
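For the second option, a minimal MetalLB layer 2 config is just a ConfigMap with a pool of spare IPs from your LAN. A rough sketch follows; the namespace and name match the stock MetalLB manifests, the address range is an example you would replace with unused addresses on your network, and note that newer MetalLB releases have moved this config to CRDs.

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system           # namespace created by the standard MetalLB install
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2                # ARP-based failover of the assigned IPs
      addresses:
      - 192.168.1.240-192.168.1.250   # replace with spare IPs on your network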


Maybe I need to poke around a bit more at metallb. I had installed it, but was struggling to figure out how to make it work properly with the cluster I spun up from within Rancher. I’ve tried to dig around for documentation on how to do so, and each guide seems to be missing a few pieces, such as https://metallb.universe.tf/tutorial/layer2/ or https://kubernetes.github.io/ingress-nginx/deploy/baremetal/
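From what I can piece together so far, the glue between MetalLB and the ingress seems to be a Service of type LoadBalancer sitting in front of the ingress controller pods, so that MetalLB hands the controller a VIP. Something like the sketch below, though the namespace and pod labels are my guesses at how Rancher/RKE deploys nginx-ingress:

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-lb
  namespace: ingress-nginx            # guess: where Rancher/RKE puts the nginx ingress controller
spec:
  type: LoadBalancer                  # MetalLB assigns an IP from its pool to this service
  loadBalancerIP: 192.168.1.240       # optional; must fall inside the MetalLB address pool
  selector:
    app: ingress-nginx                # guess: label on the ingress controller pods
  ports:
  - name: http
    port: 80
    targetPort: 80
  - name: https
    port: 443
    targetPort: 443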

The way we have it working is to set up a pair of VMs outside of the Kubernetes cluster running Keepalived and HAProxy. Keepalived makes sure that a VIP (or several VIPs) is only active on one host at a time, and it handles failover to the second host if that frontend host goes down.

The HAProxy config has a frontend that binds to the VIP, and the backend points to each host in the Kubernetes cluster. Your DNS entries for all of your services are CNAME entries pointing to the VIP. When one of the compute nodes goes down, the HAProxy health checks see it is not there, and take it out of the backend pool so all requests go to the remaining host(s) until the host comes back up.

It’s not a self-contained solution inside of Kubernetes, but thanks to the way the Ingress works, you can have all of your services mapped to a single VIP and the Ingress routes to the proper service based on the hostname of the request. Once you have HAProxy/Keepalived set up, you never have to touch it unless you add another cluster or another Kubernetes host.


I definitely don’t have to have it all self-contained in Kubernetes. I’ll dig around a bit and investigate implementing something along these lines, as it sounds like it will do what I’m trying to get done. Are there any primers you can recommend from when you set this up? If not, I’ll get to it with my Google-fu and dig around to find what I can.

The biggest gotcha I ran into was that on the secondary host, HAProxy kept complaining that it couldn’t bind to the VIP, since the IP was held by the other host. You have to set net.ipv4.ip_nonlocal_bind = 1 in sysctl so that it won’t complain.
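Something like this persists the setting across reboots (the file name under /etc/sysctl.d is arbitrary):

# on the standby HAProxy host (or both, to be safe)
echo 'net.ipv4.ip_nonlocal_bind = 1' | sudo tee /etc/sysctl.d/99-nonlocal-bind.conf
sudo sysctl --system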

The keepalived configuration is very simple, and there are a lot of examples of how to set it up.
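As a starting point, a bare-bones keepalived.conf for the primary host looks something like this; the interface name, VIP, and password are placeholders you would change for your environment.

# /etc/keepalived/keepalived.conf on the primary host
vrrp_instance VI_1 {
    state MASTER                 # set to BACKUP on the second host
    interface eth0               # interface that carries the VIP
    virtual_router_id 51         # must match on both hosts
    priority 150                 # lower this (e.g. 100) on the second host
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass changeme       # must match on both hosts, max 8 characters
    }
    virtual_ipaddress {
        192.168.1.200/24         # the VIP that HAProxy binds to
    }
}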
For HAProxy, here’s a simple frontend/backend template that I use.

frontend <cluster_name>_in_443                  # rename to cluster
  bind <VIP_address>:443                        # change to the DNS-resolvable IP for the cluster
  bind <VIP_address>:80                         # change to the DNS-resolvable IP for the cluster
  acl is_websocket hdr(Upgrade) -i WebSocket
  acl https_port dst_port 443
  acl http_port dst_port 80
  mode tcp
  use_backend <cluster_name>_out_443 if https_port                # rename to cluster
  use_backend <cluster_name>_out_80 if http_port                  # rename to cluster

backend <cluster_name>_out_443                                    # rename to cluster
  server server1 <host_ip_1>:443 check                            # change to host 1
  server server2 <host_ip_2>:443 check                            # change to host 2
  server server3 <host_ip_3>:443 check                            # change to host 3

backend <cluster_name>_out_80                                     # rename to cluster
  server server1 <host_ip_1>:80 check                             # change to host 1
  server server2 <host_ip_2>:80 check                             # change to host 2
  server server3 <host_ip_3>:80 check                             # change to host 3

I have HAProxy and Keepalived operational outside the cluster on a couple of hosts, and I’m pleased to say it seems to be working well and doing what I need. I still have a lot of testing to do with it, but from a rudimentary perspective it is accomplishing what I need it to. I appreciate the information and resources you provided.