Etcd unhealthy - hot restart

On the Rancher 2 cluster page I've noticed that etcd has been going unhealthy pretty often lately.
I'm also trying to add a new host with rke, but it always stops with this message:

FATA[0006] [network] Host [31.18.12.45] is not able to connect to the following ports: [10.16.132.51:2379, 10.16.132.51:2380]. Please check network policies and firewall rules
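
For anyone who wants to reproduce the connectivity check, something along these lines run from the new host should show whether those etcd ports are reachable (assuming nc is installed on the node; the IP is the etcd node from the error above):

nc -zv 10.16.132.51 2379
nc -zv 10.16.132.51 2380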

Investigating further, this is the log of the etcd pod that is not responding; it keeps crash-looping with this panic:

2019-08-06 11:13:20.116695 I | etcdmain: etcd Version: 3.2.18
2019-08-06 11:13:20.116845 I | etcdmain: Git SHA: eddf599c6
2019-08-06 11:13:20.116858 I | etcdmain: Go Version: go1.8.7
2019-08-06 11:13:20.116880 I | etcdmain: Go OS/Arch: linux/amd64
2019-08-06 11:13:20.116894 I | etcdmain: setting maximum number of CPUs to 8, total number of available CPUs is 8
2019-08-06 11:13:20.117202 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2019-08-06 11:13:20.117248 I | embed: peerTLS: cert = /etc/kubernetes/ssl/kube-etcd-10-6-32-251.pem, key = /etc/kubernetes/ssl/kube-etcd-10-16-132-51-key.pem, ca = , trusted-ca = /etc/kubernetes/ssl/kube-ca.pem, client-cert-auth = true
2019-08-06 11:13:20.120955 I | embed: listening for peers on https://10.16.132.51:2380
2019-08-06 11:13:20.121424 I | embed: listening for client requests on 10.16.132.51:2379
2019-08-06 11:13:20.501338 I | etcdserver: recovered store from snapshot at index 261657170
panic: page 20651 already freed

goroutine 1 [running]:

github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*freelist).free(0xc4202336e0, 0x3ace022, 0x7f0f448ed000)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/freelist.go:143 +0x3d0
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*node).spill(0xc42015fb90, 0x10, 0x10)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/node.go:363 +0x1e0
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*Bucket).spill(0xc420294018, 0x1de59f32, 0x1507160)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/bucket.go:568 +0x17b
github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt.(*Tx).Commit(0xc420294000, 0x1de45992, 0x1507160)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/bbolt/tx.go:160 +0x11f
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend.(*batchTx).commit(0xc4200176e0, 0x0)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend/batch_tx.go:179 +0x82
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend.(*batchTxBuffered).unsafeCommit(0xc4200176e0, 0xfb4600)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend/batch_tx.go:251 +0x49
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend.(*batchTxBuffered).commit(0xc4200176e0, 0xfb1600)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend/batch_tx.go:239 +0x80
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend.(*batchTxBuffered).Commit(0xc4200176e0)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend/batch_tx.go:226 +0x66
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend.(*backend).ForceCommit(0xc42029f500)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend/backend.go:165 +0x2f
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc.NewStore(0x14be760, 0xc42029f500, 0x14bfdc0, 0x15279d0, 0x14abc80, 0xc420165bc8, 0x7a65a)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/kvstore.go:127 +0x3ca
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc.newWatchableStore(0x14be760, 0xc42029f500, 0x14bfdc0, 0x15279d0, 0x14abc80, 0xc420165bc8, 0x15279d0)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/watchable_store.go:75 +0x81
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc.New(0x14be760, 0xc42029f500, 0x14bfdc0, 0x15279d0, 0x14abc80, 0xc420165bc8, 0x1, 0xc4201f0090)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/watchable_store.go:70 +0x5d
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.recoverSnapshotBackend(0xc420298180, 0x14be760, 0xc42029f500, 0xc4202f2a00, 0x5274, 0x5500, 0xc4200185c0, 0x5, 0x8, 0x0, ...)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/backend.go:74 +0xe0
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver.NewServer(0xc420298180, 0x0, 0x0, 0x0)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/server.go:378 +0x2d85
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/embed.StartEtcd(0xc42027a000, 0xc420286000, 0x0, 0x0)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/embed/etcd.go:157 +0x782
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain.startEtcd(0xc42027a000, 0x6, 0xf72773, 0x6, 0x1)
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain/etcd.go:186 +0x58
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain.startEtcdOrProxyV2()
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain/etcd.go:103 +0x15ba
github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain.Main()
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdmain/main.go:39 +0x61
main.main()
/tmp/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/etcd/main.go:28 +0x20
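
For reference, checking the remaining members from one of the other etcd nodes can be done with something along these lines (assuming the RKE-managed container there is simply named etcd; the placeholder IP and cert file names need to be replaced with that node's own, following the same pattern as the paths in the log above):

docker exec etcd sh -c 'ETCDCTL_API=3 etcdctl \
  --endpoints=https://<healthy-node-ip>:2379 \
  --cacert=/etc/kubernetes/ssl/kube-ca.pem \
  --cert=/etc/kubernetes/ssl/kube-etcd-<healthy-node-ip-with-dashes>.pem \
  --key=/etc/kubernetes/ssl/kube-etcd-<healthy-node-ip-with-dashes>-key.pem \
  member list'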

I would like to "reset" this container, in case there is some dirty state or config left behind.
Simply restarting the container doesn't solve the issue.
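
To be precise, the "restart" above was nothing more than a plain container restart on the affected node (the RKE-created etcd container should just be named etcd, unless I'm mistaken):

docker restart etcd

It comes back up and immediately falls into the same panic loop.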

Can you please tell me how to properly restart the etcd container?
The cluster is running in a production environment and I would like to avoid any downtime, so any "hot" fix would be very much appreciated.

Many thanks