I am new to Rancher and containers in general. While setting up a Kubernetes cluster using Rancher, I’m facing a problem.
rancher/server: 1.6.6
Single node Rancher server + External MySQL + 3 agent nodes
Infrastructure Stack versions:
healthcheck: v0.3.1
ipsec: net:v0.11.5
network-services: metadata:v0.9.2 / network-manager:v0.7.7
scheduler: k8s:v1.7.2-rancher5
kubernetes (if applicable): kubernetes-agent:v0.6.3
# docker info
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 1
Server Version: 17.03.1-ce
Storage Driver: overlay
Backing Filesystem: extfs
Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 4.9.34-rancher
Operating System: RancherOS v1.0.3
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.798 GiB
Name: ch7radod1
ID: IUNS:4WT2:Y3TV:2RI4:FZQO:4HYD:YSNN:6DPT:HMQ6:S2SI:OPGH:TX4Y
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Http Proxy: http://proxy.ch.abc.net:8080
Https Proxy: http://proxy.ch.abc.net:8080
No Proxy: localhost,.xyz.net,abc.net
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Environment Template: Kubernetes
Accessing the UI URL http://10.216.30.10/r/projects/1a6633/kubernetes-dashboard:9090/# shows “Service unavailable”.
If I use the CLI section from the UI, I get the following:
> kubectl get nodes
NAME STATUS AGE VERSION
ch7radod3 Ready 1d v1.7.2
ch7radod4 Ready 5d v1.7.2
ch7radod1 Ready 1d v1.7.2
> kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system heapster-4285517626-4njc2 0/1 ContainerCreating 0 5d
kube-system kube-dns-3942128195-ft56n 0/3 ContainerCreating 0 19d
kube-system kube-dns-646531078-z5lzs 0/3 ContainerCreating 0 5d
kube-system kubernetes-dashboard-716739405-lpj38 0/1 ContainerCreating 0 5d
kube-system monitoring-grafana-3552275057-qn0zf 0/1 ContainerCreating 0 5d
kube-system monitoring-influxdb-4110454889-79pvk 0/1 ContainerCreating 0 5d
kube-system tiller-deploy-737598192-f9gcl 0/1 ContainerCreating 0 5d
The setup uses a private registry (Artifactory).
Kubernetes currently supports only Docker version 1.12.6. From the info you provided, it looks like you are running version 17.03.1-ce.
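A check along these lines could be run on each host to confirm the engine version. It is only a sketch: it assumes the "Server Version:" line format shown in the docker info output above, and docker_version_ok is a hypothetical helper name, not part of Docker or Rancher.

```shell
# Hypothetical helper: reads `docker info` output on stdin and succeeds only
# if the reported Server Version is exactly 1.12.6.
docker_version_ok() {
  ver=$(awk -F': *' '!seen && /^ *Server Version:/ { print $2; seen = 1 }')
  case "$ver" in
    1.12.6) echo "ok: $ver" ;;
    *) echo "unsupported: $ver"; return 1 ;;
  esac
}

# Usage on a host (not executed here):
#   docker info 2>/dev/null | docker_version_ok
```

The function reads from stdin rather than calling docker itself, so it can also be fed saved output from another host.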
Thanks for noticing the version. I was aware of this compatibility issue, but it slipped my mind while I was trying to fix other issues. In the meantime, I put NGINX in front for HTTPS. Now when I try to change the Docker version to 1.12.6, I get the following message:
[rancher@ch1 ~]$ sudo ros engine switch docker-1.12.6
> ERRO[0031] Failed to load https://raw.githubusercontent.com/rancher/os-services/v1.0.3/index.yml: Get https://raw.githubusercontent.com/rancher/os-services/v1.0.3/index.yml: Proxy Authentication Required
> FATA[0031] docker-1.12.6 is not a valid engine
I thought maybe it was due to NGINX, so I stopped the NGINX container, but I am still getting the above error. Earlier I had tried the same command on this Rancher server and it worked fine. It works fine on the agent nodes, although they already have 1.12.6 configured.
The person who did the setup earlier had used his own credentials for the proxy and, once it started working, removed the proxy settings from the config of all the instances. Since the agent machines were not restarted after that (but the server instance was), things kept working there, and I was banging my head over why it was not working on the server instance.
Now that this mystery is cleared up, I am back to my original issue: even after changing the Docker version to 1.12.6 on the server (the agent nodes are already on 1.12.6), the Kubernetes dashboard shows Service Unavailable.
This page (http://rancher.com/docs/rancher/v1.6/en/kubernetes/private-registry/) mentions copying the exact versions of the images for Helm, Dashboard, etc. I have not performed any of the steps mentioned there. Is that the reason the dashboard is not working? What exactly needs to be copied, and where? By the way, I am using a private registry (Artifactory).
> kubectl -n kube-system get po
NAME READY STATUS RESTARTS AGE
heapster-4285517626-4njc2 1/1 Running 0 12d
kube-dns-2588877561-26993 0/3 ImagePullBackOff 0 5h
kube-dns-646531078-z5lzs 0/3 ContainerCreating 0 12d
kubernetes-dashboard-716739405-zq3s9 0/1 CrashLoopBackOff 67 5h
monitoring-grafana-3552275057-qn0zf 1/1 Running 0 12d
monitoring-influxdb-4110454889-79pvk 1/1 Running 0 12d
tiller-deploy-737598192-f9gcl 0/1 CrashLoopBackOff 72 12d
Can you post the logs for those containers that aren’t running?
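One way to collect that information is sketched below. It assumes the column layout of the "kubectl -n kube-system get po" output shown above; diagnose_cmds is a hypothetical helper name. It only prints the kubectl commands to run, so nothing is executed against the cluster.

```shell
# Hypothetical helper: reads `kubectl -n kube-system get po` output on stdin
# and prints, for each pod that is not fully ready, the commands that usually
# show why. `describe pod` lists events (useful for ContainerCreating and
# ImagePullBackOff); `logs --previous` shows output of the last crashed run.
diagnose_cmds() {
  awk 'NR > 1 {
    split($2, ready, "/")
    if (ready[1] != ready[2]) {
      print "kubectl -n kube-system describe pod " $1
      if ($3 ~ /CrashLoopBackOff|Error/)
        print "kubectl -n kube-system logs --previous " $1
    }
  }'
}

# Usage (not executed here):
#   kubectl -n kube-system get po | diagnose_cmds
```

For pods with more than one container, such as kube-dns, kubectl logs also needs a -c option naming the container.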
> kubectl get pods -a -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system heapster-4285517626-4njc2 1/1 Running 0 12d 10.42.224.157 radod4
kube-system kube-dns-2588877561-26993 0/3 ImagePullBackOff 0 5h <none> radod1
kube-system kube-dns-646531078-z5lzs 0/3 ContainerCreating 0 12d <none> radod4
kube-system kubernetes-dashboard-716739405-zq3s9 0/1 Error 70 5h 10.42.218.11 radod1
kube-system monitoring-grafana-3552275057-qn0zf 1/1 Running 0 12d 10.42.202.44 radod4
kube-system monitoring-influxdb-4110454889-79pvk 1/1 Running 0 12d 10.42.111.171 radod4
kube-system tiller-deploy-737598192-f9gcl 0/1 CrashLoopBackOff 76 12d 10.42.213.24 radod4
Then I went to the host where the container was running and ran the following commands:
[rancher@radod1 ~]$
[rancher@radod1 ~]$ docker ps -a | grep dash
282334b0ed38 gcr.io/google_containers/kubernetes-dashboard-amd64@sha256:b537ce8988510607e95b8d40ac9824523b1f9029e6f9f90e9fccc663c355cf5d "/dashboard --insecur" About a minute ago Exited (1) 55 seconds ago k8s_kubernetes-dashboard_kubernetes-dashboard-716739405-zq3s9_kube-system_7b0afda7-8271-11e7-ae86-021bfe69c163_72
99836d7824fd gcr.io/google_containers/pause-amd64:3.0 "/pause" 5 hours ago Up 5 hours k8s_POD_kubernetes-dashboard-716739405-zq3s9_kube-system_7b0afda7-8271-11e7-ae86-021bfe69c163_1
[rancher@radod1 ~]$
[rancher@radod1 ~]$
[rancher@radod1 ~]$ docker logs 282334b0ed38
Using HTTP port: 8443
Creating API server client for https://10.43.0.1:443
Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to a server that does not exist. Reason: the server has asked for the client to provide credentials
Refer to the troubleshooting guide for more information: https://github.com/kubernetes/dashboard/blob/master/docs/user-guide/troubleshooting.md
After I got the above error, I searched online again and tried a few things. Finally, this link helped. After I executed the following commands on all agent nodes, the Kubernetes dashboard finally started working!
docker volume rm etcd
rm -rf /var/etcd/backups/*
I have this problem, too. I think the contents of the tokens may be invalid.
View all secrets: kubectl get secrets -n kube-system
Delete the Rancher secret: kubectl delete secret io-rancher-system-XXXXXX -n kube-system
The dashboard uses the io-rancher-system secret; it will be recreated automatically.
Delete the DNS secret: kubectl delete secret kube-dns-token-XXXXXX -n kube-system
The DNS pod uses this secret.
This worked in my environment!
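The steps above could be scripted roughly as follows. This is a sketch, assuming the default table output of "kubectl get secrets" and the two name prefixes mentioned in this post; stale_secret_names is a hypothetical helper name.

```shell
# Hypothetical helper: reads `kubectl get secrets -n kube-system` output on
# stdin and prints the names of secrets with the two prefixes from this post.
stale_secret_names() {
  awk 'NR > 1 && ($1 ~ /^io-rancher-system-/ || $1 ~ /^kube-dns-token-/) { print $1 }'
}

# Usage (not executed here): review the names first, then delete them.
#   kubectl get secrets -n kube-system | stale_secret_names
#   kubectl get secrets -n kube-system | stale_secret_names |
#     while read -r name; do kubectl delete secret "$name" -n kube-system; done
```

Printing the names before deleting keeps the destructive step separate, so the list can be checked by hand.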
If the secrets are not recreated automatically, you can delete the service account and recreate it.
kubectl get sa -n kube-system
kubectl delete sa xxxxx -n kube-system
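A minimal sketch of the delete-and-recreate step, assuming the service account lives in kube-system. recreate_sa and the KUBECTL variable are hypothetical names introduced here; KUBECTL can be set to echo for a dry run.

```shell
# KUBECTL defaults to the real client; set KUBECTL=echo to only print commands.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: delete a service account in kube-system, then create
# it again so that a fresh token secret is generated for it.
recreate_sa() {
  [ -n "$1" ] || { echo "usage: recreate_sa NAME" >&2; return 2; }
  $KUBECTL delete serviceaccount "$1" -n kube-system &&
    $KUBECTL create serviceaccount "$1" -n kube-system
}

# Usage (not executed here), with the name taken from `kubectl get sa -n kube-system`:
#   recreate_sa <service-account-name>
```

Pods that mounted the old token would presumably need to be restarted before they pick up the new one.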
Thanks for sharing your findings peterchen82! This may help someone.