Bad certs in single node docker installation

Austin-Earl · March 26, 2020, 5:34pm

Hey, so our testing Rancher instance just burned up… -_- I'm trying to figure out what happened.
Here are some details:
It's a single-node Rancher installation through Docker. We are using the self-signed certs that Rancher and Kubernetes create, and the server is on 172.16.0.230.
We get these errors when I run docker logs on the Rancher container:

E0326 17:24:15.341812 6 reflector.go:134] k8s.io/client-go/informers/factory.go:127: Failed to list *v1.StorageClass: Get https://localhost:6443/apis/storage.k8s.io/v1/storageclasses?limit=500&resourceVersion=0: x509: certificate has expired or is not yet valid
2020-03-26 17:24:15.341983 I | http: TLS handshake error from 127.0.0.1:60508: remote error: tls: bad certificate
E0326 17:24:15.342834 6 reflector.go:134] k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:178: Failed to list *v1.Pod: Get https://localhost:6443/api/v1/pods?fieldSelector=status.phase!%3DFailed%2Cstatus.phase!%3DSucceeded&limit=500&resourceVersion=0: x509: certificate has expired or is not yet valid
2020-03-26 17:24:15.342887 I | http: TLS handshake error from 127.0.0.1:60510: remote error: tls: bad certificate
2020-03-26 17:24:15.343833 I | http: TLS handshake error from 127.0.0.1:60512: remote error: tls: bad certificate
E0326 17:24:15.343885 6 reflector.go:134] k8s.io/client-go/informers/factory.go:127: Failed to list *v1.PersistentVolume: Get https://localhost:6443/api/v1/persistentvolumes?limit=500&resourceVersion=0: x509: certificate has expired or is not yet valid
E0326 17:24:15.350151 6 reflector.go:134] k8s.io/client-go/informers/factory.go:127: Failed to list *v1.PersistentVolumeClaim: Get https://localhost:6443/api/v1/persistentvolumeclaims?limit=500&resourceVersion=0: x509: certificate has expired or is not yet valid
2020-03-26 17:24:15.350252 I | http: TLS handshake error from 127.0.0.1:60514: remote error: tls: bad certificate
2020-03-26 17:24:15.356600 I | http: TLS handshake error from 127.0.0.1:60516: remote error: tls: bad certificate
E0326 17:24:15.358639 6 reflector.go:134] k8s.io/client-go/informers/factory.go:127: Failed to list *v1.Service: Get https://localhost:6443/api/v1/services?limit=500&resourceVersion=0: x509: certificate has expired or is not yet valid
E0326 17:24:15.365366 6 reflector.go:134] k8s.io/client-go/informers/factory.go:127: Failed to list *v1.StatefulSet: Get https://localhost:6443/apis/apps/v1/statefulsets?limit=500&resourceVersion=0: x509: certificate has expired or is not yet valid
2020-03-26 17:24:15.367842 I | http: TLS handshake error from 127.0.0.1:60518: remote error: tls: bad certificate
2020-03-26 17:24:15.368033 I | http: TLS handshake error from 127.0.0.1:60520: remote error: tls: bad certificate
E0326 17:24:15.368080 6 reflector.go:134] k8s.io/client-go/informers/factory.go:127: Failed to list *v1.ReplicaSet: Get https://localhost:6443/apis/apps/v1/replicasets?limit=500&resourceVersion=0: x509: certificate has expired or is not yet valid
2020-03-26 17:24:15.374517 I | http: TLS handshake error from 127.0.0.1:60522: remote error: tls: bad certificate
E0326 17:24:15.374564 6 reflector.go:134] k8s.io/client-go/informers/factory.go:127: Failed to list *v1beta1.PodDisruptionBudget: Get https://localhost:6443/apis/policy/v1beta1/poddisruptionbudgets?limit=500&resourceVersion=0: x509: certificate has expired or is not yet valid
2020-03-26 17:24:15.376002 I | http: TLS handshake error from 127.0.0.1:60524: remote error: tls: bad certificate
E0326 17:24:15.376042 6 reflector.go:134] k8s.io/client-go/informers/factory.go:127: Failed to list *v1.ReplicationController: Get https://localhost:6443/api/v1/replicationcontrollers?limit=500&resourceVersion=0: x509: certificate has expired or is not yet valid
2020-03-26 17:24:15.376960 I | http: TLS handshake error from 127.0.0.1:60526: remote error: tls: bad certificate
E0326 17:24:15.377020 6 reflector.go:134] k8s.io/client-go/informers/factory.go:127: Failed to list *v1.Node: Get https://localhost:6443/api/v1/nodes?limit=500&resourceVersion=0: x509: certificate has expired or is not yet valid

When looking at the Rancher agents I get this:

ERROR: https://172.16.0.230:7443/ping is not accessible (Failed to connect to 172.16.0.230 port 7443: Connection refused)
INFO: Arguments: --server https://172.16.0.230:7443 --token REDACTED --ca-checksum 6a2f0c412cdd4499b5ace7ac81407d616a197b7b1a6ee0ad5e8412140d8ccc62 --no-register --only-write-certs
INFO: Environment: CATTLE_ADDRESS=172.16.0.230 CATTLE_AGENT_CONNECT=true CATTLE_INTERNAL_ADDRESS= CATTLE_NODE_NAME=BigThink CATTLE_SERVER=https://172.16.0.230:7443 CATTLE_TOKEN=REDACTED CATTLE_WRITE_CERT_ONLY=true
INFO: Using resolv.conf: nameserver 172.16.0.1
ERROR: https://172.16.0.230:7443/ping is not accessible (Failed to connect to 172.16.0.230 port 7443: Connection refused)

Any ideas? I don't want to have to reinstall everything.
Note: Yes, I know a single-node installation is a bad thing. I am currently setting up a multi-node HA Rancher cluster on one of our public servers. This was just to get us familiar with the system, but we currently rely on it for testing and dev, so I need to get it back up so others can continue to work on their projects (or until I configure everything to use the public servers).

Austin-Earl · April 20, 2020, 6:29pm

For future reference, I found the reason. When I installed Rancher from the Docker image I used the latest tag. Rancher updated and thus broke. Make sure you specify the version if you do a single-node install!
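
A minimal sketch of what pinning the version could look like. The helper name rancher_image and the version v2.3.5 are illustrative assumptions, not something from this thread; substitute the release you actually run. The helper only builds the image reference and refuses floating tags, so an unattended restart cannot pull a newer Rancher.

```shell
#!/bin/sh
# Hypothetical helper: build a pinned image reference and refuse floating tags.
rancher_image() {
  tag="$1"
  case "$tag" in
    ""|latest|stable)
      # Floating tags can move to a new release without warning.
      echo "refusing floating tag: '${tag}'" >&2
      return 1
      ;;
  esac
  printf 'rancher/rancher:%s\n' "$tag"
}

# Example use (v2.3.5 is only an illustrative version):
#   image=$(rancher_image v2.3.5) && \
#   docker run -d --restart=unless-stopped -p 80:80 -p 443:443 "$image"
```

With a pinned tag, upgrades happen only when you change the version yourself, which gives you a chance to back up /var/lib/rancher first.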

Contadmin · May 26, 2020, 7:15am

Any way to fix this without losing all data?

mhoskiso · June 2, 2020, 5:12pm

Hi Contadmin,

I also had a cert issue. I'm not sure if it's the same as OP's, but I resolved it by installing certbot on the node and launching Rancher with the --no-cacerts option and the certificates mounted from the node. I had no data loss when taking cert issuance out of Rancher. You may need to relaunch Rancher at some point for it to pick up the new certificate.

docker run -d --restart=unless-stopped \
  -p 80:80 -p 443:443 \
  -v /root/rancher:/var/lib/rancher \
  -v /etc/letsencrypt/live/removed/fullchain.pem:/etc/rancher/ssl/cert.pem \
  -v /etc/letsencrypt/live/removed/privkey.pem:/etc/rancher/ssl/key.pem \
  rancher/rancher:latest \
  --no-cacerts
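
To judge when a renewal and relaunch is due, one option is to check the mounted certificate's remaining lifetime on the node. This is a rough sketch, not part of the original setup: the helper name cert_valid_for, the CERT placeholder path, and the 30-day threshold are all assumptions. It relies on openssl x509 -checkend, which exits non-zero if the certificate expires within the given number of seconds.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the PEM certificate at $1
# is still valid $2 seconds from now.
cert_valid_for() {
  openssl x509 -noout -checkend "$2" -in "$1" >/dev/null 2>&1
}

# CERT is a placeholder; point it at the file mounted as /etc/rancher/ssl/cert.pem.
CERT="${CERT:-/path/to/fullchain.pem}"
THIRTY_DAYS=$((30 * 24 * 60 * 60))

if [ -f "$CERT" ]; then
  if cert_valid_for "$CERT" "$THIRTY_DAYS"; then
    echo "certificate is good for at least 30 more days"
  else
    echo "certificate expires within 30 days (or could not be read); renew and relaunch Rancher"
  fi
fi
```

Run from cron on the node, a check like this gives some warning before the kind of x509 errors shown at the top of the thread appear.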
