Use RKE to create downstream HA user cluster for Rancher

Hello,

I just installed an HA Rancher Server cluster using RKE (3 nodes) and now need to create a downstream user cluster. As far as I know I have to do that by logging in as Admin in the Rancher console, clicking “Add Cluster”, and selecting “Create a new Kubernetes cluster from existing nodes” (I have my own Ubuntu VMs).

Instead of doing that I was wondering if I can also install a downstream user cluster for Rancher by using RKE and a cluster.yml file? I was looking for that in the Rancher official documentation but did not find anything.

If this is possible is there maybe a sample RKE cluster.yml file available somewhere?

I want my downstream user cluster to be HA too so I will have 7 nodes (3 etcd, 2 controlplane and 2 worker) so I think it would be easier in terms of automation to use RKE.

Thank you very much for your help.

Regards,
J.


Does anyone have an idea if this is even possible? Unfortunately this is not documented in the official Rancher docs and I would be thankful if someone could give me a hint here.

Creating an HA downstream cluster comes down to adding nodes with the correct roles to the cluster. Automating this means calling the API to create the cluster (basically what is sent when you create it in the UI), and then generating the command to run on each node (with the correct role). Our Vagrant quickstart has scripts for this (https://github.com/rancher/quickstart/tree/master/vagrant), and the other providers have Terraform examples.
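As a rough illustration of the "command per node with the correct role" step, here is a minimal Python sketch. It assumes the registration command has been copied from Rancher without any role flags and only appends the `--etcd`, `--controlplane` and `--worker` flags that the agent command uses. The host names and the `BASE_COMMAND` placeholder are hypothetical, not values from this thread.

```python
# Role flags appended to the node registration command.
ROLE_FLAGS = {
    "etcd": "--etcd",
    "controlplane": "--controlplane",
    "worker": "--worker",
}


def node_command(base_command, roles):
    """Return the base registration command with one flag per role appended."""
    unknown = [role for role in roles if role not in ROLE_FLAGS]
    if unknown:
        raise ValueError(f"unknown roles: {unknown}")
    flags = [ROLE_FLAGS[role] for role in roles]
    return " ".join([base_command.strip(), *flags])


# Hypothetical placeholder: replace with the command Rancher shows for your cluster.
BASE_COMMAND = "<registration command copied from Rancher>"

# Hypothetical host names for a 3 etcd / 2 controlplane / 2 worker layout.
PLAN = {
    "etcd-1": ["etcd"],
    "etcd-2": ["etcd"],
    "etcd-3": ["etcd"],
    "cp-1": ["controlplane"],
    "cp-2": ["controlplane"],
    "worker-1": ["worker"],
    "worker-2": ["worker"],
}

for host, roles in PLAN.items():
    print(f"{host}: {node_command(BASE_COMMAND, roles)}")
```

Each printed line would then be run on the matching node, for example over SSH from a provisioning script.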

If you want to do it using RKE, you can create the cluster with RKE, add it to Rancher as an imported cluster, and then run the kubectl command that Rancher gives you with kubeconfig pointing to your cluster. You will lose the ability to fully manage this cluster in Rancher, as imported clusters don’t get the same management options as clusters created in Rancher. How to automate importing clusters is shown in https://forums.rancher.com/t/programmatically-import-cluster-no-web-ui/
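For the RKE route, the sketch below generates a minimal `cluster.yml` containing only a `nodes` section (address, SSH user, role list) for the 3 etcd / 2 controlplane / 2 worker layout from the question. The IP addresses and the `ubuntu` SSH user are assumptions for illustration, and a real file would likely need more settings.

```python
def render_cluster_yml(nodes, ssh_user="ubuntu"):
    """Render a minimal RKE cluster.yml 'nodes' section as YAML text."""
    lines = ["nodes:"]
    for address, roles in nodes:
        lines.append(f"  - address: {address}")
        lines.append(f"    user: {ssh_user}")
        lines.append(f"    role: [{', '.join(roles)}]")
    return "\n".join(lines) + "\n"


# Hypothetical addresses for the seven Ubuntu VMs.
NODES = [
    ("10.0.0.11", ["etcd"]),
    ("10.0.0.12", ["etcd"]),
    ("10.0.0.13", ["etcd"]),
    ("10.0.0.14", ["controlplane"]),
    ("10.0.0.15", ["controlplane"]),
    ("10.0.0.16", ["worker"]),
    ("10.0.0.17", ["worker"]),
]

print(render_cluster_yml(NODES))
```

Saving the output as `cluster.yml` and running `rke up --config cluster.yml` should bring the cluster up. RKE writes a `kube_config_cluster.yml` next to it, which is the kubeconfig to point kubectl at when running the import command from Rancher.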

Thank you for pointing me to these scripts. To test, I would first like to create an HA downstream cluster manually. For that, do I simply use the Rancher web interface, click on “Add Cluster”, then select “Create a new Kubernetes cluster: Existing Nodes”? On the second page, “Add Cluster - Custom : Cluster Options : Customize Node Run Command”, I should:

  • copy/paste 3 times the command for the etcd node role on 3 different nodes
  • copy/paste 2 times the command for the control plane node role on 2 different nodes
  • copy/paste at least 2 times the command for the worker node role on at least 2 different nodes

Is this correct?

Is there anything else I should be aware of or do afterwards?

Yes, it is also described on https://rancher.com/docs/rancher/v2.x/en/cluster-provisioning/production/
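To double-check a role layout before running the commands, a small script like the following could help. It is only a sketch of the usual guidance as I understand it (at least three etcd nodes and an odd number of them for quorum, at least two controlplane nodes, at least two worker nodes); the exact thresholds are assumptions, so verify them against the page linked above. The node names are hypothetical.

```python
from collections import Counter


def check_plan(plan):
    """Return a list of problems found in a {node: [roles]} layout."""
    counts = Counter(role for roles in plan.values() for role in roles)
    problems = []
    if counts["etcd"] < 3:
        problems.append("fewer than three etcd nodes")
    elif counts["etcd"] % 2 == 0:
        problems.append("even number of etcd nodes")
    if counts["controlplane"] < 2:
        problems.append("fewer than two controlplane nodes")
    if counts["worker"] < 2:
        problems.append("fewer than two worker nodes")
    return problems


# Hypothetical node names for the layout discussed in this thread.
PLAN = {
    "etcd-1": ["etcd"],
    "etcd-2": ["etcd"],
    "etcd-3": ["etcd"],
    "cp-1": ["controlplane"],
    "cp-2": ["controlplane"],
    "worker-1": ["worker"],
    "worker-2": ["worker"],
}

print(check_plan(PLAN) or "layout looks fine")
```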

Thank you @superseb, I am reading the documentation but still have a question which is not answered by the official docs:

When I have one or more downstream user clusters, does the traffic (e.g. web traffic to web services like a NodeJS container) always go through my Rancher management server cluster? Btw, I have an nginx load balancer in front which balances the traffic between the 3 nodes of my Rancher management server cluster, as documented.

Or do I need to set up a new and separate nginx load balancer for each downstream user cluster?

This should all be covered by https://rancher.com/docs/rancher/v2.x/en/overview/architecture/#communicating-with-downstream-user-clusters and https://rancher.com/docs/rancher/v2.x/en/cluster-admin/cluster-access/ace/.

Downstream clusters are independent clusters with their own ingress controller, so traffic to the ingresses you create should go to the nodes running the ingress controller on the downstream cluster.

So if I understand correctly, I will need a new nginx server as a layer 4 load balancer that forwards the traffic to the nodes of the downstream user cluster.

If this is correct, to which node type should I forward all web traffic from my new nginx layer 4 load balancer? Would that be to the control plane nodes?

Unfortunately the documentation links you sent me do not explain anything about the front-end layer 4 web load balancer for a downstream user cluster.

I agree it’s not explained clearly in a single place in the documentation; I can look into making this clearer.

In short, this explains that the NGINX ingress gets deployed to schedulable nodes: https://rancher.com/docs/rke/latest/en/config-options/add-ons/ingress-controllers/. The roles (and taints) are explained on https://rancher.com/docs/rke/latest/en/config-options/nodes/#kubernetes-roles

Creating a load balancer pointing to the worker nodes (that have the nginx ingress controller on them), and having the individual DNS entries/DNS service/DNS wildcard pointing to the load balancer should be enough. There are also other options described on https://rancher.com/docs/rancher/v2.x/en/k8s-in-rancher/load-balancers-and-ingress/
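As an illustration of such a load balancer, the sketch below renders an nginx `stream` block that forwards ports 80 and 443 at layer 4 to the worker nodes. The worker IP addresses and upstream names are hypothetical, and the snippet only produces the `stream` fragment, which would need to sit at the top level of `nginx.conf` (not inside `http`) on a build that includes the stream module.

```python
def render_nginx_stream(worker_ips, ports=(80, 443)):
    """Render an nginx 'stream' block that balances TCP traffic to the workers."""
    lines = ["stream {"]
    for port in ports:
        lines.append(f"    upstream workers_{port} {{")
        for ip in worker_ips:
            lines.append(f"        server {ip}:{port};")
        lines.append("    }")
        lines.append("    server {")
        lines.append(f"        listen {port};")
        lines.append(f"        proxy_pass workers_{port};")
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines) + "\n"


# Hypothetical worker node addresses of the downstream cluster.
WORKERS = ["10.0.0.16", "10.0.0.17"]

print(render_nginx_stream(WORKERS))
```

Because the proxying is plain TCP, TLS is still terminated by the ingress controller on the worker nodes rather than on the load balancer.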

Thank you for your answer.

So I see there are basically two ways of doing it:

  1. using an external and dedicated nginx layer 4 load balancer on a separate server to forward all traffic to all worker nodes of the downstream user cluster (DNS points to the IP of the external load balancer)
  2. using the ingress of Rancher itself which is located on all worker nodes by default (DNS points to the IP addresses of ALL worker nodes)

If this is correct, what does Rancher recommend for web services, option 1) or option 2)?

And are Let’s Encrypt SSL certificates supported out-of-the-box with both options?

Hi, glad to know you set up an HA Rancher server. This is my next step. Can you share any links you have? Thanks

@yys2000 unfortunately I don’t have any documentation for that purpose because it is simply not documented properly by Rancher and @superseb never answered my two questions above :frowning:

It looks like setting up Rancher in HA with a downstream cluster in HA is quite a pain because of the lack of clear documentation. So I never got to do that, because there is no complete guide from Rancher on how to achieve it :frowning:

On https://rancher.com/docs/rancher/v2.x/en/installation/, we describe that a dedicated cluster is recommended for running Rancher. This means not running any other workloads on that cluster (or ingresses to those workloads).

The other option is fine, all you need is another layer that redirects to existing components (ingress controller). This is also covered in https://rancher.com/learning-paths/building-a-highly-available-kubernetes-cluster/.

The ability to use Let’s Encrypt certificates is usually handled by cert-manager (https://cert-manager.io/docs/installation/kubernetes/) and the existing nginx ingress controller. See https://cert-manager.io/docs/tutorials/acme/ingress/ for an example.
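To make the cert-manager part a bit more concrete, here is a sketch that builds an Ingress manifest carrying the `cert-manager.io/cluster-issuer` annotation. It assumes cert-manager is already installed, that a ClusterIssuer with the given name exists, that the cluster supports the `networking.k8s.io/v1` Ingress API, and that the ingress class is called `nginx`. The host, service and issuer names are hypothetical.

```python
import json


def ingress_manifest(name, host, service, port, issuer):
    """Build an Ingress that asks cert-manager for a certificate for 'host'."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": name,
            # cert-manager watches this annotation and requests the certificate.
            "annotations": {"cert-manager.io/cluster-issuer": issuer},
        },
        "spec": {
            "ingressClassName": "nginx",  # assumed ingress class name
            "tls": [{"hosts": [host], "secretName": f"{name}-tls"}],
            "rules": [
                {
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service,
                                        "port": {"number": port},
                                    }
                                },
                            }
                        ]
                    },
                }
            ],
        },
    }


# Hypothetical names; the ClusterIssuer must already exist in the cluster.
manifest = ingress_manifest("web", "web.example.com", "web", 80, "letsencrypt-prod")
print(json.dumps(manifest, indent=2))
```

The JSON output can be saved to a file and applied with `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.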