I’m trying to restore an RKE cluster from an etcd snapshot, and I’m running into some trouble.
I’ve followed the instructions in the docs -
Unfortunately, after step 5 (bringing up the cluster) and rebooting the node, all the pods in my cluster seem to be stuck in the Pending state.
I’ve also noticed that the calico-node pod is stuck in CrashLoopBackOff (I’m guessing all the other pods depend on the overlay network working properly, which would explain why they are pending). After reading the logs, I discovered that calico-node gets an Unauthorized response when it tries to access its datastore (Kubernetes).
My guess is that the service accounts were not restored correctly, although as far as I can tell there shouldn’t be any problems in that area.
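To check the service-account theory, one thing that could be compared is the token calico-node presents against the service account that exists after the restore. Below is a minimal Python sketch that decodes the payload of a service-account token so its claims can be read. The function name `decode_jwt_payload` is my own, and the sketch assumes the token is a standard three-part JWT, as Kubernetes service-account tokens are. It does not verify the signature; it only shows what the token claims.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT without verifying its signature.

    Hypothetical helper for inspection only: it shows what the token
    claims, not whether the API server would accept it.
    """
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a three-part JWT (header.payload.signature)")
    payload = parts[1]
    # JWT segments are base64url-encoded with the padding stripped; restore it.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

Inside a pod the token is mounted at /var/run/secrets/kubernetes.io/serviceaccount/token. If the `kubernetes.io/serviceaccount/service-account.uid` claim in the decoded payload does not match the UID of the calico-node service account in the restored cluster, that would be consistent with the Unauthorized responses, though a matching UID would not rule out a signing-key mismatch.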
RKE version - 0.1.14
hyperkube version - 1.11.5
Calico version - 3.1.3
Rancher version (deployed inside the cluster) - 2.1.4
Does anyone have an idea that might help?