etcd with 3 nodes: 1 lost, 1 can't start, and certificates keep changing

I’m running Rancher 2.5.15.
Something went wrong during certificate rotation for the etcd component (3 nodes): one node was lost (bad luck, its HDD crashed that day) and one replica can’t restart:

2023-05-14 21:26:37.231839 I | etcdmain: etcd Version: 3.4.15
2023-05-14 21:26:37.231877 I | etcdmain: Git SHA: aa7126864
2023-05-14 21:26:37.231882 I | etcdmain: Go Version: go1.12.17
2023-05-14 21:26:37.231884 I | etcdmain: Go OS/Arch: linux/amd64
2023-05-14 21:26:37.231887 I | etcdmain: setting maximum number of CPUs to 8, total number of available CPUs is 8
[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2023-05-14 21:26:37.253777 C | etcdmain: cannot listen on TLS for [::]:2380: KeyFile and CertFile are not presented

I see that /etc/kubernetes/ssl exists and contains the correct CA and CA key. I see PEM files and keys for every node, BUT the etcd PEM changes on every restart.

On the last etcd node still “alive”, I see the etcd PEM change regularly as well.
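
To confirm which files actually change, the certificates can be hashed before and after a restart and the two snapshots compared. This is only a minimal sketch: it assumes the certificates are the *.pem files directly under /etc/kubernetes/ssl, and the function names are mine, for illustration only.

```python
import hashlib
from pathlib import Path


def fingerprint_pems(directory):
    """Return {file name: SHA-256 hex digest} for every *.pem file in directory."""
    return {
        path.name: hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(Path(directory).glob("*.pem"))
    }


def changed_files(before, after):
    """Return the sorted names whose digest differs, or that were added or removed."""
    names = set(before) | set(after)
    return sorted(name for name in names if before.get(name) != after.get(name))


if __name__ == "__main__":
    # Directory taken from this post; the user running this must be able to read it.
    snapshot = fingerprint_pems("/etc/kubernetes/ssl")
    for name, digest in snapshot.items():
        print(digest, name)
```

Running it once before and once after an etcd restart, then passing both snapshots to changed_files, should list exactly the files that were regenerated.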

I guess this etcd PEM rotation is abnormal. How can I stop it and get etcd to start correctly?

Thank you.
