Rancher Kubernetes Dashboard Not Working

I am new to Rancher and containers in general. While setting up a Kubernetes cluster using Rancher, I'm facing a problem.

rancher/server: 1.6.6

Single node Rancher server + External MySQL + 3 agent nodes

Infrastructure Stack versions:
healthcheck: v0.3.1
ipsec: net:v0.11.5
network-services: metadata:v0.9.2 / network-manager:v0.7.7
scheduler: k8s:v1.7.2-rancher5
kubernetes (if applicable): kubernetes-agent:v0.6.3

# docker info
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 1
Server Version: 17.03.1-ce
Storage Driver: overlay
Backing Filesystem: extfs
Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 4.9.34-rancher
Operating System: RancherOS v1.0.3
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.798 GiB
Name: ch7radod1
ID: IUNS:4WT2:Y3TV:2RI4:FZQO:4HYD:YSNN:6DPT:HMQ6:S2SI:OPGH:TX4Y
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Http Proxy: http://proxy.ch.abc.net:8080
Https Proxy: http://proxy.ch.abc.net:8080
No Proxy: localhost,.xyz.net,abc.net
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false


Environment Template: Kubernetes

Accessing the UI URL http://10.216.30.10/r/projects/1a6633/kubernetes-dashboard:9090/# shows “Service unavailable”.

If I use the CLI section from the UI, I get the following:

> kubectl get nodes
NAME              STATUS    AGE       VERSION
ch7radod3       Ready     1d        v1.7.2
ch7radod4       Ready     5d        v1.7.2
ch7radod1       Ready     1d        v1.7.2

> kubectl get pods --all-namespaces
NAMESPACE     NAME                                   READY     STATUS              RESTARTS   AGE
kube-system   heapster-4285517626-4njc2              0/1       ContainerCreating   0          5d
kube-system   kube-dns-3942128195-ft56n              0/3       ContainerCreating   0          19d
kube-system   kube-dns-646531078-z5lzs               0/3       ContainerCreating   0          5d
kube-system   kubernetes-dashboard-716739405-lpj38   0/1       ContainerCreating   0          5d
kube-system   monitoring-grafana-3552275057-qn0zf    0/1       ContainerCreating   0          5d
kube-system   monitoring-influxdb-4110454889-79pvk   0/1       ContainerCreating   0          5d
kube-system   tiller-deploy-737598192-f9gcl          0/1       ContainerCreating   0          5d
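
For reference, pods stuck in ContainerCreating usually record the reason in their events, which kubectl describe can show (a sketch using one of the pod names above; the Events section at the bottom normally reveals image pull or network errors):

> kubectl -n kube-system describe pod kubernetes-dashboard-716739405-lpj38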

The setup uses a private registry (Artifactory).

Kubernetes currently only supports Docker version 1.12.6; from the info you provided, it looks like you are running version 17.03.1-ce.
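
On RancherOS you can list the available engines and switch with ros (a sketch; the exact engine names come from the list command):

> sudo ros engine list
> sudo ros engine switch docker-1.12.6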

Thanks for noticing the version. I was aware of this compatibility issue, but it slipped my mind while I was fixing other issues. In the meantime, I also put NGINX in front of the server for HTTPS. Now, when I try to change the Docker version to 1.12.6, I get the following message:

[rancher@ch1 ~]$ sudo ros engine switch docker-1.12.6
> ERRO[0031] Failed to load https://raw.githubusercontent.com/rancher/os-services/v1.0.3/index.yml: Get https://raw.githubusercontent.com/rancher/os-services/v1.0.3/index.yml: Proxy Authentication Required
> FATA[0031] docker-1.12.6 is not a valid engine

I thought maybe it was due to NGINX, so I stopped the NGINX container, but I am still getting the above error. I had tried the same command on this Rancher server earlier and it used to work fine. It also works fine on the agent nodes, although they already have 1.12.6 configured.

The person who did the setup earlier had used his own credentials for the proxy, and once things started working, he removed the proxy settings from the config of all the instances. Since the agent machines were not restarted after that (but the server instance was), things kept working there, and I was banging my head over why it wasn't working on the server instance.
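
For anyone hitting the same thing: on RancherOS, proxy settings can be persisted in the system config instead of a user's shell environment. A sketch, assuming the proxy host from the docker info output above and that the keys live under rancher.network (verify with sudo ros config export on your version):

> sudo ros config set rancher.network.http_proxy http://proxy.ch.abc.net:8080
> sudo ros config set rancher.network.https_proxy http://proxy.ch.abc.net:8080
> sudo ros config set rancher.network.no_proxy "localhost,.xyz.net,abc.net"

A reboot (or restarting the affected system services) is needed for the settings to take effect.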

Now that that mystery is solved, I am back to my original issue: even after changing the Docker version to 1.12.6 on the server (the agent nodes are already on 1.12.6), the Kubernetes dashboard still shows “Service unavailable”.

This link (http://rancher.com/docs/rancher/v1.6/en/kubernetes/private-registry/) mentions copying the exact versions of the images for Helm, Dashboard, etc. I have not performed any of the steps mentioned there. Is that why the dashboard is not working? What exactly needs to be copied, and where? As mentioned, I am using a private registry (Artifactory).
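
If I understand that page correctly, the idea is to mirror the exact addon images into the private registry under the same paths and tags. Something like the following sketch, where artifactory.example.com and the v1.6.3 tag are placeholders I made up; the real tag has to match what the Kubernetes stack expects:

docker pull gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.3
docker tag gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.3 artifactory.example.com/google_containers/kubernetes-dashboard-amd64:v1.6.3
docker push artifactory.example.com/google_containers/kubernetes-dashboard-amd64:v1.6.3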

> kubectl -n kube-system get po
NAME                                 READY STATUS            RESTARTS AGE
heapster-4285517626-4njc2            1/1   Running           0        12d
kube-dns-2588877561-26993            0/3   ImagePullBackOff  0        5h
kube-dns-646531078-z5lzs             0/3   ContainerCreating 0        12d
kubernetes-dashboard-716739405-zq3s9 0/1   CrashLoopBackOff  67       5h
monitoring-grafana-3552275057-qn0zf  1/1   Running           0        12d
monitoring-influxdb-4110454889-79pvk 1/1   Running           0        12d
tiller-deploy-737598192-f9gcl        0/1   CrashLoopBackOff  72       12d

Can you post the logs for those containers that aren’t running?
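
Something like this should fetch them (a sketch using the pod names from your listing; --previous shows the output of the last crashed container):

> kubectl -n kube-system logs kubernetes-dashboard-716739405-zq3s9 --previous
> kubectl -n kube-system logs tiller-deploy-737598192-f9gcl --previous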

> kubectl get pods -a -o wide --all-namespaces
NAMESPACE     NAME                                   READY  STATUS              RESTARTS   AGE  IP                  NODE
kube-system   heapster-4285517626-4njc2              1/1    Running             0          12d  10.42.224.157       radod4
kube-system   kube-dns-2588877561-26993              0/3    ImagePullBackOff    0          5h   <none>              radod1
kube-system   kube-dns-646531078-z5lzs               0/3    ContainerCreating   0          12d  <none>              radod4
kube-system   kubernetes-dashboard-716739405-zq3s9   0/1    Error               70         5h   10.42.218.11        radod1
kube-system   monitoring-grafana-3552275057-qn0zf    1/1    Running             0          12d  10.42.202.44        radod4
kube-system   monitoring-influxdb-4110454889-79pvk   1/1    Running             0          12d  10.42.111.171       radod4
kube-system   tiller-deploy-737598192-f9gcl          0/1    CrashLoopBackOff    76         12d  10.42.213.24        radod4

Then I went to the host where the pod was running and tried the following commands:

[rancher@radod1 ~]$
[rancher@radod1 ~]$ docker ps -a | grep dash
282334b0ed38  gcr.io/google_containers/kubernetes-dashboard-amd64@sha256:b537ce8988510607e95b8d40ac9824523b1f9029e6f9f90e9fccc663c355cf5d  "/dashboard --insecur"   About a minute ago   Exited (1) 55 seconds ago   k8s_kubernetes-dashboard_kubernetes-dashboard-716739405-zq3s9_kube-system_7b0afda7-8271-11e7-ae86-021bfe69c163_72
99836d7824fd  gcr.io/google_containers/pause-amd64:3.0                                                                                     "/pause"                 5 hours ago          Up 5 hours                  k8s_POD_kubernetes-dashboard-716739405-zq3s9_kube-system_7b0afda7-8271-11e7-ae86-021bfe69c163_1
[rancher@radod1 ~]$
[rancher@radod1 ~]$
[rancher@radod1 ~]$ docker logs 282334b0ed38
Using HTTP port: 8443
Creating API server client for https://10.43.0.1:443
Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to a server that does not exist. Reason: the server has asked for the client to provide credentials
Refer to the troubleshooting guide for more information: https://github.com/kubernetes/dashboard/blob/master/docs/user-guide/troubleshooting.md
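
Since the error says the apiserver asked for credentials, the service account token the dashboard mounts seemed worth checking. A sketch (the exact secret names will differ in each setup):

> kubectl -n kube-system get sa
> kubectl -n kube-system get secrets | grep token

If the token secret is missing or stale, the dashboard cannot authenticate to the apiserver.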

After I got the above error, I searched online again and tried a few things. Finally, this link helped. After I executed the following commands on all agent nodes, the Kubernetes dashboard finally started working!

docker volume rm etcd        # remove the stale etcd data volume
rm -rf /var/etcd/backups/*   # clear the old etcd backups (run as root)

I have this problem too. I think the contents of the tokens may be invalid.

View all secrets: kubectl get secrets -n kube-system
Delete the Rancher secret: kubectl delete secret io-rancher-system-XXXXXX -n kube-system
The dashboard uses the io-rancher-system secret; it will automatically be recreated.
Delete the DNS secret: kubectl delete secret kube-dns-token-XXXXXX -n kube-system
The DNS pods use that secret.

In my environment, this works!

If the secrets are not automatically recreated, you can delete the service account and re-create it:
kubectl get sa -n kube-system
kubectl delete sa xxxxx -n kube-system
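
A minimal sketch of the re-create step (xxxxx stands for whichever account name the get command showed; Rancher may also recreate it on its own):

kubectl create sa xxxxx -n kube-system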

Thanks for sharing your findings, peterchen82! This may help someone.