After two weeks of normal operation, my Kubernetes cluster stopped working, and etcd_1 is complaining:
8/13/2016 3:57:07 PM 2016-08-13 12:57:07.185065 I | raft: d884709ae7327d50 is starting a new election at term 284679
8/13/2016 3:57:07 PM 2016-08-13 12:57:07.185145 I | raft: d884709ae7327d50 became candidate at term 284680
8/13/2016 3:57:07 PM 2016-08-13 12:57:07.185163 I | raft: d884709ae7327d50 received vote from d884709ae7327d50 at term 284680
8/13/2016 3:57:07 PM 2016-08-13 12:57:07.185180 I | raft: d884709ae7327d50 [logterm: 570, index: 2213890] sent vote request to 80724d2907b90ef1 at term 284680
8/13/2016 3:57:08 PM 2016-08-13 12:57:08.785109 I | raft: d884709ae7327d50 is starting a new election at term 284680
8/13/2016 3:57:08 PM 2016-08-13 12:57:08.785186 I | raft: d884709ae7327d50 became candidate at term 284681
8/13/2016 3:57:08 PM 2016-08-13 12:57:08.785195 I | raft: d884709ae7327d50 received vote from d884709ae7327d50 at term 284681
8/13/2016 3:57:08 PM 2016-08-13 12:57:08.785206 I | raft: d884709ae7327d50 [logterm: 570, index: 2213890] sent vote request to 80724d2907b90ef1 at term 284681
8/13/2016 3:57:08 PM 2016-08-13 12:57:08.916894 E | etcdserver: publish error: etcdserver: request timed out
8/13/2016 3:57:08 PM 2016-08-13 12:57:08.942930 E | etcdhttp: got unexpected response error (etcdserver: request timed out) [merged 2 repeated lines in 2s]
8/13/2016 3:57:10 PM 2016-08-13 12:57:10.085109 I | raft: d884709ae7327d50 is starting a new election at term 284681
8/13/2016 3:57:10 PM 2016-08-13 12:57:10.085151 I | raft: d884709ae7327d50 became candidate at term 284682
8/13/2016 3:57:10 PM 2016-08-13 12:57:10.085159 I | raft: d884709ae7327d50 received vote from d884709ae7327d50 at term 284682
8/13/2016 3:57:10 PM 2016-08-13 12:57:10.085318 I | raft: d884709ae7327d50 [logterm: 570, index: 2213890] sent vote request to 80724d2907b90ef1 at term 284682
8/13/2016 3:57:10 PM 2016-08-13 12:57:10.942894 E | etcdhttp: got unexpected response error (etcdserver: request timed out) [merged 2 repeated lines in 2s]
8/13/2016 3:57:11 PM 2016-08-13 12:57:11.485102 I | raft: d884709ae7327d50 is starting a new election at term 284682
8/13/2016 3:57:11 PM 2016-08-13 12:57:11.485157 I | raft: d884709ae7327d50 became candidate at term 284683
etcd is not available, and kubectl also shows:
client: etcd cluster is unavailable or misconfigured
How can I recover etcd? It doesn't appear to be self-healing.