Kubectl exec fails with 403 forbidden

I am having a persistent issue where I cannot kubectl exec into a pod using any of the following:

  • The UI based terminal in rancher
  • The rancher cli from a local machine
  • kubectl using a Rancher-generated kubeconfig from a local machine

I have a single-node k3s cluster with Rancher deployed. From this I am creating a 3-node RKE2 cluster hosted on vSphere.

Client Version: v1.29.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.12+rke2r1

I am using the Default Admin (admin) account. To the rke2 cluster I have deployed a single pod with one container (wbitt/multitool) to the default namespace which is part of the Default project.

I can exec straight into the running container via the Execute Shell option on the pod itself without issue. This leads me to believe that this is not a permissions problem, even ignoring the fact that I am Default Admin.

When I try to exec with any other method it fails, for example:

kubectl -n default exec -it multitool -- sh
Error from server:

Cranking up the verbosity reveals a more detailed error:

[nix-shell:~/code/kubernetes/clusters/tools]$ rancher kubectl -v=7 -n default exec -it multitool -- sh
I0410 15:08:03.235139   25175 loader.go:395] Config loaded from file:  /run/user/1000/rancher-3936582872
I0410 15:08:03.238706   25175 round_trippers.go:463] GET https://company.ie/k8s/clusters/c-m-5wrsnz9q/api/v1/namespaces/default/pods/multitool
I0410 15:08:03.238733   25175 round_trippers.go:469] Request Headers:
I0410 15:08:03.238740   25175 round_trippers.go:473]     Accept: application/json, */*
I0410 15:08:03.238759   25175 round_trippers.go:473]     User-Agent: kubectl/v1.29.3 (linux/amd64) kubernetes/6813625
I0410 15:08:03.238782   25175 round_trippers.go:473]     Authorization: Bearer <masked>
I0410 15:08:03.349554   25175 round_trippers.go:574] Response Status: 200 OK in 110 milliseconds
I0410 15:08:03.351282   25175 podcmd.go:88] Defaulting container name to container-0
I0410 15:08:03.351850   25175 round_trippers.go:463] POST https://company.ie/k8s/clusters/c-m-5wrsnz9q/api/v1/namespaces/default/pods/multitool/exec?command=sh&container=container-0&stdin=true&stdout=true&tty=true
I0410 15:08:03.351874   25175 round_trippers.go:469] Request Headers:
I0410 15:08:03.351883   25175 round_trippers.go:473]     X-Stream-Protocol-Version: v5.channel.k8s.io
I0410 15:08:03.351888   25175 round_trippers.go:473]     X-Stream-Protocol-Version: v4.channel.k8s.io
I0410 15:08:03.351891   25175 round_trippers.go:473]     X-Stream-Protocol-Version: v3.channel.k8s.io
I0410 15:08:03.351894   25175 round_trippers.go:473]     X-Stream-Protocol-Version: v2.channel.k8s.io
I0410 15:08:03.351897   25175 round_trippers.go:473]     X-Stream-Protocol-Version: channel.k8s.io
I0410 15:08:03.351899   25175 round_trippers.go:473]     User-Agent: kubectl/v1.29.3 (linux/amd64) kubernetes/6813625
I0410 15:08:03.351902   25175 round_trippers.go:473]     Authorization: Bearer <masked>
I0410 15:08:03.435878   25175 round_trippers.go:574] Response Status: 403 Forbidden in 83 milliseconds
I0410 15:08:03.436434   25175 helpers.go:246] server response object: [{
  "metadata": {}
Error from server:
exit status 1

Response Status: 403 Forbidden is obviously the issue here, but it is not clear exactly what is forbidden.
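One way to narrow down what is forbidden is to ask the API server, through the same kubeconfig, whether the current identity may use the exec subresource. The sketch below is only a diagnostic under assumptions: check_exec_rbac and the KUBECTL override are hypothetical names introduced here, not part of kubectl or Rancher. It checks both create (the verb that corresponds to the POST seen in the verbose log) and get (the verb that corresponds to an exec requested with a GET).

```shell
# Ask whether the current identity may use the pods/exec subresource.
# KUBECTL is a hypothetical override so the same check can be run as
# `rancher kubectl` instead of plain `kubectl`.
check_exec_rbac() {
  ns="${1:-default}"
  for verb in create get; do
    # Label each answer so the two verbs are easy to tell apart.
    printf '%s pods/exec in %s: ' "$verb" "$ns"
    ${KUBECTL:-kubectl} auth can-i "$verb" pods --subresource=exec -n "$ns"
  done
}

# Usage (assumes the Rancher-generated kubeconfig is the active one):
#   check_exec_rbac default
#   KUBECTL="rancher kubectl" check_exec_rbac default
```

If both checks answer yes, RBAC on the downstream cluster is probably not what produces the 403, and the proxy path in front of the cluster becomes the more likely place to look.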

Similar issues are present on the forum, this post for example, but the answer makes no sense.

Most of the posts point to a websockets issue. I have followed the steps specified in this KB article and received a stream of JSON. To me that rules out a websocket issue.

The cluster hosting Rancher has the Cilium CNI installed and is using the Gateway API. To be 100% sure this is not a websocket issue, I deployed a websocat pod and tested it from my local command line, which works perfectly.

Clarifications and updates

This actually does work. There is just a horrible delay on the command.

When I said the following:

I meant this UI functionality

I have two clusters.

  • A manually created k3s cluster that Rancher is deployed to
  • An RKE2 cluster on vSphere created by Rancher

Downloading a kubeconfig for either cluster from Rancher and attempting kubectl exec fails. In both cases, SSHing onto the nodes and using the kubeconfigs /etc/rancher/k3s/k3s.yaml and /etc/rancher/rke2/rke2.yaml works as expected.
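To make the difference between the two kinds of kubeconfig visible, the API endpoint each one points at can be printed. This is a small sketch; kubeconfig_server and the KUBECTL override are hypothetical names used only for illustration. The Rancher-generated file is expected to point at the Rancher URL seen in the verbose log (https://company.ie/k8s/clusters/...), while the on-node files typically point straight at the local API server.

```shell
# Print the API server URL that a given kubeconfig file targets.
# KUBECTL is a hypothetical override for the kubectl binary.
kubeconfig_server() {
  ${KUBECTL:-kubectl} config view --kubeconfig "$1" --minify \
    -o jsonpath='{.clusters[0].cluster.server}'
}

# Usage:
#   kubeconfig_server /etc/rancher/rke2/rke2.yaml    # run on the node
#   kubeconfig_server ./rancher-downloaded.yaml      # hypothetical local path
```

If exec works against the direct endpoint and fails only against the Rancher URL, the failure is introduced somewhere on the path through Rancher rather than in the cluster itself.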

I have done further testing with websocat. This command

websocat \
  "wss://company.ie/k8s/clusters/c-m5wrsnz9q/api/v1/namespaces/default/pods/multitool/exec?command=hostname&container=container-0&stdin=true&stdout=true&tty=true" \
  --header "Authorization: Bearer token-m4v5s:xxxxxxxxx"

performs an exec and returns the hostname. This is more confusing, as it is further proof that this is not a websocket issue.
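One caveat with this test: a WebSocket handshake is always an HTTP GET, whereas the verbose kubectl log shows a POST carrying X-Stream-Protocol-Version headers, which is how kubectl v1.29 asks for a SPDY upgrade by default. So websocat does not send quite the same request as kubectl. A rough way to reproduce the kubectl-style request by hand is sketched below. exec_url and probe_spdy_exec are hypothetical helper names, TOKEN is assumed to hold a Rancher API token, and the cluster ID must be filled in by the reader.

```shell
# Build the exec URL in the same shape as the one in the verbose log.
# Arguments: scheme host cluster-id namespace pod container command
exec_url() {
  printf '%s://%s/k8s/clusters/%s/api/v1/namespaces/%s/pods/%s/exec?command=%s&container=%s&stdin=true&stdout=true&tty=true' \
    "$1" "$2" "$3" "$4" "$5" "$7" "$6"
}

# Send a POST that requests a SPDY upgrade, similar to what kubectl sent.
# Prints the response status line and headers; gives up after 10 seconds
# because a successful upgrade keeps the connection open.
probe_spdy_exec() {
  curl --http1.1 -sS -i --max-time 10 -X POST \
    -H "Authorization: Bearer ${TOKEN}" \
    -H "Connection: Upgrade" \
    -H "Upgrade: SPDY/3.1" \
    -H "X-Stream-Protocol-Version: v4.channel.k8s.io" \
    "$1"
}

# Usage (replace <cluster-id> with the real ID):
#   TOKEN="token-xxxxx:xxxxxxxxx"
#   probe_spdy_exec "$(exec_url https company.ie <cluster-id> default multitool container-0 hostname)"
```

A 101 Switching Protocols response would suggest the SPDY upgrade passes through the ingress path, while a 403 here would reproduce the kubectl failure without kubectl and point at how POST or SPDY upgrade requests are handled in front of Rancher.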

Sounds a bit related to "[BUG] Rancher can no longer provision harvester machines after restart" (Issue #44912, rancher/rancher on GitHub) and "[BUG] The Kubeconfig from `Support -> Download Kubeconfig`, for use with Terraform eventually goes bad / stops working, spitting back `unauthorized`" (Issue #5669, harvester/harvester on GitHub).