Getting random timeouts

We’ve hit a perplexing issue running Rancher 2.0.

When we run a test suite called kuberang, we get random timeouts like this:

./kuberang-linux-amd64-1.1.3-2

Kubectl configured on this node [OK]
Delete existing deployments if they exist [OK]
Nginx service does not already exist [OK]
BusyBox service does not already exist [OK]
Nginx service does not already exist [OK]
Issued BusyBox start request [OK]
Issued Nginx start request [OK]
Issued expose Nginx service request [OK]
Both deployments completed successfully within timeout [OK]
Grab nginx pod ip addresses [OK]
Grab nginx service ip address [OK]
Grab BusyBox pod name [OK]
Accessed Nginx service at 172.31.61.91 from BusyBox [OK]
Accessed Nginx service via DNS kuberang-nginx-1541091260863910044 from BusyBox [OK]
Accessed Nginx pod at 172.31.144.2 from BusyBox [OK]
Accessed Nginx pod at 172.31.136.5 from BusyBox [OK]
Accessed Nginx pod at 172.31.151.2 from BusyBox [OK]
Accessed Nginx pod at 172.31.136.3 from BusyBox [OK]

Accessed Nginx pod at 172.31.142.2 from BusyBox [ERROR]
-------- OUTPUT --------
wget: can’t connect to remote host (172.31.142.2): Connection timed out
command terminated with exit code 1

Accessed Nginx pod at 172.31.146.3 from BusyBox [OK]
Accessed Nginx pod at 172.31.149.2 from BusyBox [OK]
Accessed Nginx pod at 172.31.150.2 from BusyBox [OK]
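
To be concrete, the failing check can be reproduced by hand with something like the following (the pod IP is the one that timed out above; the throwaway pod name and the 5-second timeout are just illustrative):

# Run the same wget kuberang does, from a throwaway BusyBox pod
kubectl run bb-test --image=busybox --restart=Never --rm -it -- wget -T 5 -qO- http://172.31.142.2

# For comparison, hit the same pod IP directly from one of the nodes
curl --max-time 5 http://172.31.142.2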

What’s odd is that these exact same nodes were previously a standalone Kubernetes cluster built with kubeadm, and kuberang ran like a champ there.

The nodes that trigger the timeout are random, and so far we’ve found absolutely nothing network-related.
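
In case it helps, this is roughly the kind of check we’ve been running to see whether the failing pod IPs always land on the same node (nothing conclusive so far):

# Map the nginx pod IPs to the nodes they run on and compare with the IPs that time out
kubectl get pods -o wide | grep nginx

# Cross-check node addresses and status
kubectl get nodes -o wide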

We’ve tried the Rancher deployment with both Calico and Canal, and with Docker 17.03, 17.05, and 17.06.
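
On the CNI side we’ve mostly just been eyeballing the canal/calico pods, roughly like this (the pod name below is a placeholder for whatever is running in kube-system):

# Make sure the canal/calico pods are Running on every node
kubectl -n kube-system get pods -o wide | grep -E 'canal|calico'

# Tail the logs of the CNI pod on a suspect node
kubectl -n kube-system logs <canal-pod-on-suspect-node> --all-containers --tail=50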

We can’t figure out this issue.

Anyone have any ideas?

Unlike the rest of the images, the Rancher busybox image is not under any project. Some custom Docker registries require all images to be under a project. Can we run Rancher without the busybox image?
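
If not, the fallback we’re considering is mirroring busybox into a project path our registry accepts, something like the below (registry.example.com and the library project are placeholders), assuming Rancher can then be pointed at the renamed image:

docker pull busybox
docker tag busybox registry.example.com/library/busybox:latest
docker push registry.example.com/library/busybox:latest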