I have a few problems/questions

I am new to Rancher. I am currently testing a few things and have a few questions and problems which I simply cannot explain… :neutral_face:

  1. Can I define a default namespace for a user in Rancher, so that when they open a shell and enter a kubectl command, the defined namespace is used?

  2. I have installed Rancher on Kubernetes. If more than 50% of the masters in my cluster go offline, I can no longer reach Rancher (HTTP Error 500: Internal Server Error). I know this has something to do with the etcd database, but is there a way to work around it so that at least the website stays accessible and I can still read out some information?

  3. Why does Rancher create a namespace for each user by default? What is the point of this, and can I work around it? I don’t want to end up one day with thousands of namespaces that are not used at all…

  4. If I have a kubectl shell open, it disconnects after a certain period of time. Is it possible to bypass this or make it last longer, and why does it happen?

  1. No clue on defaults, but the user can run a command like kubectl config set-context --current --namespace=my-namespace to make all subsequent commands assume a namespace other than default (see the sketch after this list).
  2. Etcd won’t function without quorum, so you need more than 50% of the expected etcd nodes up for etcd to work (with n nodes, quorum is floor(n/2) + 1, so 3 nodes tolerate losing 1 and 5 nodes tolerate losing 2). If you use k3s with an external database in place of etcd, that database becomes a single point of failure, but you could then lose any number of control plane nodes, so it’s really just passing the buck; technically, though, it would do what you asked. Alternatively, you could go from 3 control plane nodes to 5 to be able to lose 2 and still function (I’m not sure what the practical maximum is, but keep in mind that with too many control plane nodes they end up spending more time keeping each other in sync than running things).
  3. Just how it works, I guess? From what I saw it only does this in the local cluster and not in the downstream clusters, and since you should be running workloads in downstream clusters rather than the local one, it shouldn’t get in the way.
  4. I’m not certain, but my guess would be that the OS in the shell container has some kind of logout-after-X-minutes-of-inactivity rule set up. You could potentially change the container or its launch parameters using the instructions at How can I change the "rancher/shell" image? - #5 by Blopo, or you could try running a command like watch -n 60 ls -l /dev/null to see whether that counts as activity and keeps the session alive, then just Ctrl+C it when you want the shell back.
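For point 1, a minimal sketch of how that context change could look once inside the shell (assuming a namespace called my-namespace already exists and the kubeconfig file is writable):

# point the current kubectl context at a specific namespace
kubectl config set-context --current --namespace=my-namespace

# check which namespace the current context now uses
kubectl config view --minify --output 'jsonpath={..namespace}'

# from here on, commands without -n use that namespace
kubectl get pods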
  1. I know this command, but unfortunately it does not work because of missing write permissions in the container/VM…

  2. As I wrote, I know about the quorum requirement, but I thought there was a way to open the GUI without quorum. Okay, all right then.

  3. Okay, so does that mean this “function” can’t simply be switched off either?

  4. Okay, I’ll try that right away.

Thank you so far for all your help!

Even with watch -n 60 ls -l /dev/null running, the session still disconnects. So now I’m wondering whether that is some kind of problem or whether it is just the automatic logout after X minutes.

There is no timeout in the Rancher shell or its connection on the client or server side. This usually comes from a load balancer, SSL proxy, firewall, or other device in the middle timing out the connection.


All right, then it has something to do with the haproxy that sits in front of my Rancher installation. The funny thing is, when I run:

for((i=1;i<=99999999;i+=2)); do echo "test"; done

then the session remains open.
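If haproxy really is the culprit, a quick way to see what it allows might be to look at its timeout settings; the browser kubectl shell runs over a WebSocket connection, so directives like timeout client and timeout tunnel are usually the ones that matter. A small check, with the config path being an assumption about your setup:

# see which timeouts the haproxy in front of Rancher currently sets
# (config path is an assumption, adjust to your installation)
grep -n 'timeout' /etc/haproxy/haproxy.cfg
# look for lines such as "timeout client 50s" or "timeout tunnel 1h" in the
# defaults/frontend sections; raising them should let the shell stay open longer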

It’s a design choice whether losing quorum means allowing reads but no writes, or refusing reads as well because the data can no longer be trusted to be accurate; I guess etcd took the second approach. Since etcd holds all the data for Kubernetes, I wouldn’t expect much that depends on the Kubernetes API to work without it. If you had a separate container hosting a web app behind an ingress that didn’t interact with the Kubernetes API at all, I could see that still working while etcd has no quorum, but the Rancher UI does a lot of querying against Kubernetes, so it doesn’t surprise me that it has problems.

If you’re doing a shell loop, there’s also a while loop, so you could do a while true sort of thing that doesn’t eventually finish if you want it to survive a weekend or something. If it’s something on the network side causing the disconnect, I could see how a loop producing output would generate traffic on the browser side and keep the connection alive where watch wouldn’t.
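Something like this could serve as the keep-busy loop (the 60-second interval is just an example); Ctrl+C gets the prompt back when you need it:

# print a line of output every 60 seconds so the connection keeps seeing traffic
while true; do echo "keepalive $(date)"; sleep 60; done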
