Problems creating a large number of worker nodes

I have set the node count for an instance type to 30. I have AWS permissions to create more than 100 instances of this type. Around 20% of the instances created give one of two errors:

“Unavailable: Runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized”


“Unavailable: [container runtime status check may not have completed yet, runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized, missing node capacity for resources: ephemeral-storage]”

I am unsure whether this is a Kubernetes issue, a Docker container issue, or a Rancher issue, and I am unsure how to go about debugging it.

It takes a little while for all the nodes to connect and for the CNI plugin to initialize. I’ve had clusters on DigitalOcean take 5 minutes and sometimes 20. It’s something about the underlying network layer on the cloud provider. I’ve seen this behavior on EC2 quite a bit as well.
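
If you want to see whether the stragglers are just slow or actually stuck, one option is to list the nodes whose Ready condition is not yet "True", along with the message the kubelet reports. Below is a minimal sketch in Python. It assumes `kubectl` is installed and pointed at the affected cluster. The helper names `not_ready_nodes` and `fetch_nodes` are made up for illustration and are not part of Rancher or Kubernetes.

```python
import json
import subprocess


def not_ready_nodes(node_list):
    """Return (name, message) pairs for nodes whose Ready condition is not "True".

    node_list is the parsed JSON from `kubectl get nodes -o json`.
    """
    pending = []
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        conditions = node.get("status", {}).get("conditions", [])
        # Each node reports a list of conditions; the one of type "Ready"
        # carries the kubelet's message (e.g. "cni config uninitialized").
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        if ready is None:
            pending.append((name, "no Ready condition reported yet"))
        elif ready.get("status") != "True":
            pending.append((name, ready.get("message", "")))
    return pending


def fetch_nodes():
    """Fetch the node list from the cluster (assumes kubectl is on PATH)."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `not_ready_nodes(fetch_nodes())` every minute or so should show the list shrinking if the nodes are only slow to initialize. Nodes that still report the CNI message well past the 20 minute mark are probably worth a closer look.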