Authorized cluster endpoint setup?

Hello All,

I’m relatively new to Kubernetes and even newer to Rancher, so bear with me if I ask any newbie questions. If this has been asked before, I’d be grateful for a link to the previous discussion.

I’m running a kubeadm-built bare-metal cluster on premises that I’ve imported into Rancher and would like to replace with a Rancher-built cluster. I’ve built another cluster using Rancher, but my users need direct access via kubectl and I’d rather not bottleneck that connection through my single Rancher VM. I’ve upgraded to Rancher 2.2.2, so Authorized Cluster Endpoint connections are an option, but I’m drawing a blank on setting up the certificate(s) for direct kubectl access to the cluster. My sense is that this was done by kubeadm on my original cluster, and I’m about to start nosing into Kelsey Hightower’s “Kubernetes the Hard Way” to see if I can pull out those parts. But in the meantime, I thought I’d also ask here.

Is there a document anywhere on setting up Authorized Cluster Endpoint access that starts with checking that “enabled” radio button and ends with kubectl access from a workstation directly to the cluster?

Hope to hear from you,

Randy Rue


OK, I’ve done some flailing but am not making much progress.

I followed the steps here and generated a cert and some keys, and pasted the cert into the Rancher interface. But when I put those into my kubectl config file, I get errors that the certificate is valid for the short name of my primary master node but not for the FQDN I put in the Rancher interface when I enabled “authorized cluster endpoint.” If I change the cluster entry in my config file to the short name, I get “Unable to connect to the server: x509: certificate signed by unknown authority”.
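(For my own debugging I’ve been checking which names the API server’s cert actually covers with something like the following; the hostname is a placeholder for one of my controller nodes:)

openssl s_client -connect <controller-node>:6443 </dev/null 2>/dev/null \
  | openssl x509 -noout -text \
  | grep -A1 "Subject Alternative Name"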

I’m not even as far as trying to get a valid user crt/key pair, but would be happy to get as far as a valid server cert.

Is nobody else using this feature? Seems like if they were there would have to be some reference online to setting it up. Or is it so simple it doesn’t need documenting and I’m missing something obvious?

Hope to hear from you.

OK, progress is being made.

I’ll continue to post my progress as it develops. Best case, if I keep top-posting this thread, eventually someone with knowledge will reply. Worst case, by the time I figure all this out myself I might save some other newbie some time.

There’s no need to paste in or upload a new cert to reach the cluster directly. I removed the entries for the FQDN and the certificate, and using the generated kubectl config entries I can reach the cluster from my workstation, if I set my context to point to one of the three controller nodes.

But now I still have a single point of failure if I want my users to be able to submit Jobs directly.

For my kubeadm-built cluster I have a load balancer configured for the controller nodes, an F5 BigIP. I’ve set up the same thing for this new cluster, but if I put the FQDN for that front end in the Rancher UI under the Authorized Cluster Endpoint settings, I still get the cert error I mentioned earlier: “certificate is valid for {list of the short names for the controllers, etc.}, not {fqdn}”.

So I do need to generate a cert for the ACE settings? One for my front-end fqdn? And if so, will I also need to generate new user certs to go with it?

Hope to hear from you…

You can just click Enable and hit Save (this is the default for new clusters). A kubeconfig entry can only point to one endpoint, so it continues to proxy through the Rancher server by default, but another context is added to the generated kubeconfig files that points directly at the control plane node(s) (one at a time). You can manually switch to that context if the server container is down, and this is generally good enough to have a way in for emergencies.
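For example (the cluster and context names below are just illustrative, not what Rancher will generate for you verbatim), switching to the direct context looks like:

kubectl config get-contexts                      # lists the proxied context plus the direct per-node one(s)
kubectl config use-context my-cluster-node1      # illustrative name; points straight at a control plane node
kubectl get nodes                                # now goes to that node, not through the Rancher server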

To (reliably) point directly at the cluster all the time, you need a load balancer (e.g. in AWS) that targets only the healthy control plane nodes, and then a domain name pointing at it; that’s the FQDN field.

Kubectl (mostly) only does TLS, and you’re now pointing it at some hostname you picked that the cluster knows nothing about, so you need the balancer to do TLS termination. If the cert you use is issued by a CA that’s in the client’s cacerts list, then you’re done. If not, we need the certificate (and/or CA chain) so that we can put it into the generated kubeconfig files to make kubectl trust it. That’s the Certificates field.
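In kubeconfig terms, the direct entry ends up looking roughly like this (the names and values below are placeholders):

clusters:
- name: my-cluster-direct
  cluster:
    server: https://k8s.example.com:6443
    certificate-authority-data: <base64 of the cert/CA chain, from the Certificates field, if the CA isn’t already trusted by the client>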

Hi Vincent, thanks for your reply.

We have an F5 BigIP that’s currently load-balancing controller access for our kubeadm-built cluster, passing traffic on port 6443 and letting the nodes handle TLS.

If I change the FQDN entry in the Rancher UI to the FQDN of a similar Virtual Server on our bigIP, traffic is being directed to the controller node but the connection fails because kubectl is getting the cert for only the node it’s connecting to:

Unable to connect to the server: x509: certificate is valid for km-alpha-m01, localhost, kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.cluster.local, not kubernetes-alpha.pc.scharp.org

(No real harm in showing you real host and domain names here, BTW; this is all on a local network and DNS zone…)

Interesting that if I change the FQDN entry in the Rancher UI to a short name, the error at least includes all three controller nodes:

Unable to connect to the server: x509: certificate is valid for km-alpha-m01, km-alpha-m03, km-alpha-m02, localhost, kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.cluster.local, not kubernetes-alpha

I suspect what I need is a cert to load into the Rancher UI that includes all of the above names as well as kubernetes-alpha.pc.scharp.org? How can I go about that? Can I extract a ca.crt file from the certificate-authority-data in the UI’s kubeconfig file and use that to generate that cert? What other information do I need, and where can I extract it from Rancher and/or the cluster?
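(I assume something like this would at least get me the ca.crt out of the kubeconfig, since certificate-authority-data is just base64-encoded PEM; the paths are mine:)

grep 'certificate-authority-data' ~/.kube/config | awk '{print $2}' | base64 -d > ca.crt
# or
kubectl config view --raw -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' | base64 -d > ca.crt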

If this is more of an SSL question than a Rancher question, I’d be grateful for any guidance…

Randy

I’ve been sidetracked for a week and am getting back to this.

I found the cacert content under Global Security in the web UI; I’m guessing that’s comparable to the contents of the ca.crt file that would be generated if I did this from scratch, i.e. the Hard Way or via kubeadm. But to use that to generate a cert, don’t I also need ca.key?
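(For my own notes: as I understand it, signing a new server cert needs both ca.crt and ca.key, roughly along these lines with openssl; the hostname is mine and I haven’t been able to run this without the key:)

openssl genrsa -out server.key 2048
openssl req -new -key server.key -subj "/CN=kubernetes-alpha.pc.scharp.org" -out server.csr
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -days 365 -extfile san.cnf -out server.crt
# where san.cnf contains: subjectAltName=DNS:kubernetes-alpha.pc.scharp.org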

Dealing with the same problem here. Did you manage to solve this issue?

Sorry, no. I ended up using our F5 BigIP as a load balancer for the endpoint, with our wildcard cert configured there, and the F5 seems to like the self-signed certs offered by the individual nodes.

The remaining issue is that our devs are using Java modules to reach the K8s API, and while they can submit jobs successfully, other processes that monitor the work can’t stay connected via the load balancer. The most likely scenario is that we’ll use the load-balanced endpoint for submitting jobs and a direct connection to a controller node for monitoring (and live with the risk of a single point of failure).


I know this is an old topic, but it didn’t seem to get a solid answer. I recently struggled with this, so it’s still valid. Here is the option I found.

Go in and edit your cluster, then click “Edit as YAML”.

In the YAML, there is a section called rancher_kubernetes_engine_config, and in that section there is an authentication block.
Add the additional “sans” to the certificate by adding entries like I have below.

rancher_kubernetes_engine_config:
  addon_job_timeout: 30
  authentication:
    sans:
      - 10.10.60.105
      - k8s.mydomain.com

Save the YAML and let it regenerate your certs. That 10.10.60.105 is the VIP on my F5 load balancer. The domain is the FQDN I have pointed at the VIP. I then have my F5 set up to do a Layer 4 performance pass-through, and the VIP is set up for 6443.

The health check is doing a simple TCP check on 6443.

All our tools, clients, etc. now point at my FQDN. Since I have all Rancher nodes acting as masters, if the API on a node goes down (and with it port 6443), the health check fails and the F5 stops directing traffic to it until it comes back up.
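As a quick sanity check after the certs regenerate, pointing kubectl at the FQDN while reusing the existing kubeconfig’s CA and user credentials should now pass TLS validation (my FQDN below; adjust for yours):

kubectl --server=https://k8s.mydomain.com:6443 get nodes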

Again, sorry to revive an old topic, but I am hoping this helps someone else, as it was a pain for me to find.


Thank you sir! Adding the “sans” fields solved my issues with authenticating our cluster to GitLab.


I had a scenario very similar to the one you described (and I solved it in much the same way). The only thing I am missing, especially from an end-user perspective (my end users are not always experienced in administering Kubernetes), is the possibility of keeping the default CA in the kubeconfig that is generated by the Rancher UI.

The current behavior (from my observation) is that when you use the fqdn parameter in local_cluster_auth_endpoint, you also need to provide a CA. If you don’t, the generated kubeconfig file will not contain a value for certificate-authority-data (which is probably needed, since clients will, by default, not trust the cluster’s kube-ca). As a result you get the following error: Unable to connect to the server: x509: certificate signed by unknown authority.
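For reference, the relevant bit of the cluster YAML in my case looks roughly like this (the values are placeholders, and ca_certs is the PEM of the CA you want embedded into the generated kubeconfig):

local_cluster_auth_endpoint:
  enabled: true
  fqdn: "k8s.mydomain.com"
  ca_certs: |-
    -----BEGIN CERTIFICATE-----
    <CA certificate PEM>
    -----END CERTIFICATE-----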

To overcome this issue for the described scenario, I made a feature request: