Steps to reproduce (least amount of steps as possible):
In Rancher UI, select Restore Snapshot
Result:
- Some or all of the 3 etcd nodes are in the UNAVAILABLE state with the error “Run-time network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized”. This does not happen after every snapshot restore.
- The etcd nodes are in the AVAILABLE state, but none of the calico-node pods are running, so the application is not running either.
Other details that may be helpful:
When the calico-node pods are not Running, redeploying the calico-node workload seems to solve it. However, the pod(s) can still go in and out of the Running state afterwards. Ultimately, I have to deploy the calico-node workload and recreate the calico-node daemonset.
Sometimes the cattle-cluster-agent, coredns, coredns-autoscaler, and metrics-server workloads show Active status but their pod(s) are not running. I’m not sure whether they are affected by calico-node not working.
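To record which pods are actually down after a restore (as opposed to what the workload status shows), something like the following could be used. It is only a minimal sketch: it assumes `kubectl` is on the PATH and configured for the affected cluster, that the affected pods live in the `kube-system` namespace, and the helper names `not_ready_pods` and `fetch_pods` are hypothetical.

```python
import json
import subprocess


def not_ready_pods(pod_list):
    """Return (name, phase) for every pod that is not Running with all containers ready.

    `pod_list` is the parsed output of `kubectl get pods -o json`.
    """
    result = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        status = pod.get("status", {})
        phase = status.get("phase", "Unknown")
        containers = status.get("containerStatuses", [])
        # A pod with no reported container statuses is treated as not ready.
        all_ready = bool(containers) and all(c.get("ready", False) for c in containers)
        if phase != "Running" or not all_ready:
            result.append((name, phase))
    return result


def fetch_pods(namespace="kube-system"):
    """Fetch the pod list via kubectl (assumption: kubectl is installed and configured)."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


# Example use against a live cluster:
#   for name, phase in not_ready_pods(fetch_pods()):
#       print(name, phase)
```

Note that pods which completed normally (phase `Succeeded`) are also listed, so the output may need filtering by name, for example for `calico-node`.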
Environment information
- Rancher version (rancher/rancher / rancher/server image tag or shown bottom left in the UI): v2.2.6
- Installation option (single install/HA): HA AirGap
Cluster information
- Cluster type (Hosted/Infrastructure Provider/Custom/Imported): Import (initializing local rancher cluster)
- Machine type (cloud/VM/metal) and specifications (CPU/memory): t2.large AWS ec2-instance
- Kubernetes version (use kubectl version): v1.14.3
- Docker version (use docker version): 1.13.1