Authorized Cluster Endpoint - Unable to access from outside of the control-plane node

I am trying to use the Authorized Cluster Endpoint feature on a downstream cluster v1.28.9+rke2r1 managed by Rancher v2.8.4.
I followed the steps described here: Authorized Cluster Endpoint Support for RKE2 and K3s Clusters

(on each control-plane node)

  • check the content of the /var/lib/rancher/rke2/kube-api-authn-webhook.yaml file
  • create the /etc/rancher/rke2/config.yaml file and set its content as per the procedure (see the sketch after this list)
  • restart the rke2-server service
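
For reference, per my reading of the linked procedure, steps two and three boil down to roughly the following (the config.yaml content is taken from that procedure; adjust if your setup differs):

# /etc/rancher/rke2/config.yaml
kube-apiserver-arg:
  - authentication-token-webhook-config-file=/var/lib/rancher/rke2/kube-api-authn-webhook.yaml

# then restart the service
systemctl restart rke2-server.service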

(in rancher)

  • edit the configuration of the downstream cluster
  • edit the Networking tab and enable the “Authorized Endpoint”, setting the FQDN to that of my L7 load balancer (SSL termination is done on the load balancer)
  • Save and let the cluster converge

Unfortunately, the load balancer (HAProxy) sees the downstream nodes as “down”. I can confirm this from my workstation with the nc command:

nc -vz control_plane_node_ip 6440
nc: connectx to control_plane_ip port 6440 (tcp) failed: Connection refused

However, the same test works well when connected via ssh on the control-plane node:

ssh root@control_plane_node_ip nc -vz 127.0.0.1 6440
Connection to 127.0.0.1 6440 port [tcp/*] succeeded!
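
A check like the following on the node should show whether the port is bound to loopback only (i.e. whether the local address column reads 127.0.0.1:6440):

ssh root@control_plane_node_ip ss -tlnp | grep 6440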

How can I change the configuration of the endpoint to accept connections from my load balancer and not only from localhost?
Do you have other ideas?

It seems you are doing things correctly so far, but the kube-api-auth service is only used by the kube-apiserver to authenticate requests locally (which is why it only listens on localhost), so you do not have to expose it with a load balancer.

The FQDN/load balancer is supposed to point to the control-plane nodes' IPs. Given that ACE is enabled, you should be able to generate a kubeconfig for that cluster, and it should contain a context that you use to connect to the cluster through the ACE.
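
To see which contexts the generated kubeconfig contains, something like this should work (the file name is just whatever you downloaded):

kubectl --kubeconfig <downloaded-kubeconfig>.yaml config get-contexts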

This section explains how to use the context you download.

For the control-plane port, are you using 6443?

Hi Aaron,

thank you for looking into it.

As you stated, the kubeconfig is indeed generated with 2 contexts:

  • one pointing to https://upstream_fqdn/k8s/clusters/id
  • one pointing to https://load_balancer_fqdn (trimmed sketch after this list)
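
Trimmed down, the cluster entries look roughly like this (names shortened, certificate data omitted):

clusters:
- name: ic-caas-test
  cluster:
    server: https://upstream_fqdn/k8s/clusters/id    # via Rancher
- name: ic-caas-test-fqdn
  cluster:
    server: https://load_balancer_fqdn               # direct, through the ACE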

In order to simplify debugging, I configured the load balancer as follows:

  • frontend port 80 → downstream control-plane node 1 on port 80
  • frontend port 443 → downstream control-plane node 1 on port 443 using ssl
  • frontend port 6443 (ssl) → downstream control-plane node 1 on port 6443 using ssl

(if you are familiar with HAProxy, I copied the configuration below)

Querying the nodes through Rancher works fine (as before):

kubectl --kubeconfig ic-caas-test.yaml --context ic-caas-test get nodes
NAME           STATUS   ROLES                       AGE   VERSION
iccluster027   Ready    worker                      9d    v1.28.9+rke2r1
iccluster031   Ready    worker                      9d    v1.28.9+rke2r1
iccluster041   Ready    worker                      9d    v1.28.9+rke2r1
icvm0179       Ready    control-plane,etcd,master   9d    v1.28.9+rke2r1
icvm0180       Ready    control-plane,etcd,master   9d    v1.28.9+rke2r1
icvm0182       Ready    control-plane,etcd,master   9d    v1.28.9+rke2r1

But querying the nodes directly on the downstream server fails:

kubectl --kubeconfig ic-caas-test.yaml --context ic-caas-test-fqdn get nodes
E0613 09:42:39.706497   50811 memcache.go:265] couldn't get current server API group list: the server is currently unable to handle the request
E0613 09:42:39.720455   50811 memcache.go:265] couldn't get current server API group list: the server is currently unable to handle the request
E0613 09:42:39.733128   50811 memcache.go:265] couldn't get current server API group list: the server is currently unable to handle the request
E0613 09:42:39.745679   50811 memcache.go:265] couldn't get current server API group list: the server is currently unable to handle the request
E0613 09:42:39.757090   50811 memcache.go:265] couldn't get current server API group list: the server is currently unable to handle the request
Error from server (ServiceUnavailable): the server is currently unable to handle the request

Thanks in advance for your insights.

Emmanuel


HAProxy configuration:

frontend k8s_api_frontend
    mode http
    bind :6443 ssl crt /etc/ssl/private/ic-caas-test.pem
    option httplog
    option forwardfor
    default_backend k8s_api_backend

backend k8s_api_backend
    mode http
    option forwardfor
    server icvm0179 icvm0179.xaas.epfl.ch:6443 check ssl verify none

frontend https_frontend
    mode http
    bind :443 ssl crt /etc/ssl/private/ic-caas-test.pem
    option httplog
    option forwardfor
    default_backend https_backend

backend https_backend
    mode http
    option forwardfor
    server icvm0179 icvm0179.xaas.epfl.ch:443 check ssl verify none

frontend http_frontend
    mode http
    bind :80
    option httplog
    option forwardfor
    default_backend http_backend

backend http_backend
    mode http
    option forwardfor
    server icvm0179 icvm0179.xaas.epfl.ch:80 check

Hi Matt,

Thanks for taking the time to look into this.
I’m not entirely sure I understand your question.

Yes, I believe the control plane is using port 6443.

  • both the upstream cluster and the downstream cluster have been installed without customizations
  • port 6443 is open on both clusters (nc -vz <ip> 6443 succeeds on all IPs)

Thanks in advance for your insights.

Emmanuel

The URL in the kubeconfig needs to resolve to the apiserver, including the port number, so if port 443 on your HAProxy FQDN maps to port 443 on the control plane, it will not work. You need the load balancer to point to port 6443 on the control-plane nodes.

I admittedly can’t read the config enough to tell if you are already doing that. But your explanation seems to indicate that HAProxy is pointing at port 443 on the control plane node.
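
Not something from your config, but for reference, a plain TCP passthrough for the apiserver port (reusing your names) would look roughly like this in HAProxy; the kube-apiserver then terminates TLS itself:

frontend k8s_api_frontend
    mode tcp
    bind :6443
    default_backend k8s_api_backend

backend k8s_api_backend
    mode tcp
    server icvm0179 icvm0179.xaas.epfl.ch:6443 check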

You can quickly check whether this is the issue by changing the URL in the kubeconfig from https://load_balancer_fqdn to https://load_balancer_fqdn:6443.
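
For example, assuming the cluster entry in your kubeconfig is named after the context (purely illustrative, adjust the names to what is actually in the file):

kubectl config set-cluster ic-caas-test-fqdn \
    --server=https://load_balancer_fqdn:6443 \
    --kubeconfig=ic-caas-test.yaml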

I made the change you suggested to the kubeconfig (adding port 6443 to the server parameter).

So now, my kubeconfig points to https://load-balancer:6443, which, in turn, points to port 6443 of my control-plane node.

I’m getting the following error message:

error: You must be logged in to the server (the server has asked for the client to provide credentials)

Okay, from my (limited) experience, this may relate to [BUG] Enabling ACE after cluster provisioning results in unusable kubeconfig contexts · Issue #41832 · rancher/rancher · GitHub, where the clusterauthtoken CRDs are not present on the downstream cluster, and the fix may be to restart the upstream Rancher cluster.

But when I was trying to fix this, I was getting that error regardless of whether the clusterauthtokens existed on the downstream cluster. I was never able to resolve it, but in that issue and some related ones, people say restarting the upstream Rancher pods might fix it.

You can check whether the clusterauthtokens exist on the downstream cluster with Rancher or kubectl. If they do already exist, this issue might be out of my wheelhouse, and hopefully someone else can shed some light on fixing it.
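
For example, using your working Rancher context (resource name from memory, so adjust as needed):

kubectl --kubeconfig ic-caas-test.yaml --context ic-caas-test get clusterauthtokens -A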

The only way I got ACE working was enabling ACE when creating the cluster in Rancher and using an FQDN. So it’s possible that recreating the cluster from scratch in Rancher might fix this.

You can also check the apiserver logs to see if they have any additional information about why authentication is failing.
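
On RKE2 the apiserver runs as a static pod, so something like this against one of your control-plane nodes should show them, if I remember the pod naming (kube-apiserver-<node name>) correctly:

kubectl --kubeconfig ic-caas-test.yaml --context ic-caas-test -n kube-system logs kube-apiserver-icvm0179 --tail=100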

I hope you can get this resolved.

Aaron

Hi @JanssenAaron ,

thank you for your time and consideration. I’ll post here if I manage to move forward with this.

cheers,

Emmanuel

Just a quick note that the following did solve the issue:

  1. making sure that the load balancer is configured before the installation of the downstream cluster
  2. reinstalling the downstream cluster from scratch with the ACE option enabled

Thanks again for your help!


Another side note about the generated kubeconfig file:
The server points to https://load_balancer_fqdn. You should edit the kubeconfig to make sure that the server points to https://load_balancer_fqdn:6443
