Rancher 2.0.8 high IO

Hi all,

I have recently been building a single node Rancher 2.0 K8s cluster for some dev testing.
Running:

  • CentOS 7.5.1804
  • Docker 17.03.2-ce
  • Overlay storage driver
  • Swap disabled, dnsmasq disabled, appropriate network driver changes made, etc. (quick sanity checks below)
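
For anyone wanting to double-check the same setup, the active storage driver and swap state can be confirmed with standard commands (output will obviously differ per host):

  docker info --format '{{.Driver}}'   # prints the active storage driver, e.g. "overlay"
  swapon -s                            # no entries listed means swap is off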

Basically, when the Rancher 2 container runs, the server falls over with ~1000 MB/s of disk reads.
I have tried multiple versions of Docker and different storage drivers, and deleted and rebuilt the node, but every time, shortly after I add a custom cluster on the same box, everything falls over.

If I stop the Rancher 2 container, usage goes back to normal and K8s operates without issue. It's as if Rancher is in a continuous loop against the cluster for some reason.
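
For reference, this is roughly how I stop the container and watch the I/O settle (the ancestor filter assumes the default rancher/rancher image name):

  # find and stop the Rancher server container (ID will differ)
  docker ps --filter ancestor=rancher/rancher --format '{{.ID}}  {{.Image}}'
  docker stop <container-id>

  # -o shows only threads actually doing I/O, -a accumulates totals over time
  iotop -o -a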

Any ideas or assistance would be appreciated.

The highest disk users in terms of MB/s are always the Rancher processes:
rancher --http-listen-port=80 --https-listen-port=443 --audit-log-path=/va~el=1 --audit-log-maxage=20 --audit-log-maxbackup=20 --audit-log-maxsize=100

iotop output:

Total DISK READ : 1043.02 M/s | Total DISK WRITE : 0.00 B/s
Actual DISK READ: 1057.03 M/s | Actual DISK WRITE: 4.11 K/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
18933 be/4 root 4.08 M/s 0.00 B/s 0.00 % 99.99 % iptables -w 5 -N KUBE-EXTERNAL-SERVICES -t filter
18934 be/4 root 14.68 M/s 0.00 B/s 0.00 % 99.99 % iptables -t nat -C POSTROUTING ! -s 10.42.0.0/16 -d 10.42.0.0/24 -j RETURN --wait
16731 be/4 root 68.54 K/s 0.00 B/s 0.00 % 99.99 % dnsmasq-nanny -v=2 -logtostderr -configDir=/etc/k8s/dns/dnsmasq-nanny -res~3 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053
10945 be/4 root 15.16 M/s 0.00 B/s 0.00 % 99.99 % etcd --peer-client-cert-auth --client-cert-auth --listen-client-urls=https~tate=new --peer-key-file=/etc/kubernetes/ssl/kube-etcd-60-234-73-10-key.pem
15092 be/4 root 74.03 K/s 0.00 B/s 0.00 % 99.78 % dnsmasq-nanny -v=2 -logtostderr -configDir=/etc/k8s/dns/dnsmasq-nanny -res~3 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053
18663 be/4 root 41.05 M/s 0.00 B/s 0.00 % 99.59 % rancher --http-listen-port=80 --https-listen-port=443 --audit-log-path=/va~el=1 --audit-log-maxage=20 --audit-log-maxbackup=20 --audit-log-maxsize=100
10971 be/4 root 20.57 M/s 0.00 B/s 0.00 % 99.53 % etcd --peer-client-cert-auth --client-cert-auth --listen-client-urls=https~tate=new --peer-key-file=/etc/kubernetes/ssl/kube-etcd-60-234-73-10-key.pem
11779 be/4 root 35.87 M/s 0.00 B/s 0.00 % 98.60 % kubelet --cni-conf-dir=/etc/cni/net.d --allow-privileged=true --pod-infra-~_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305 --v=2 --cgroup-driver=cgroupfs
18661 be/4 root 34.80 M/s 0.00 B/s 0.00 % 98.59 % rancher --http-listen-port=80 --https-listen-port=443 --audit-log-path=/va~el=1 --audit-log-maxage=20 --audit-log-maxbackup=20 --audit-log-maxsize=100
12907 be/4 root 35.13 M/s 0.00 B/s 0.00 % 98.52 % calico-felix
15696 be/4 root 22.04 M/s 0.00 B/s 0.00 % 98.05 % kubelet --cni-conf-dir=/etc/cni/net.d --allow-privileged=true --pod-infra-~_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305 --v=2 --cgroup-driver=cgroupfs
11181 be/4 root 20.52 M/s 0.00 B/s 0.00 % 97.93 % etcd --peer-client-cert-auth --client-cert-auth --listen-client-urls=https~tate=new --peer-key-file=/etc/kubernetes/ssl/kube-etcd-60-234-73-10-key.pem
15827 be/4 root 25.63 M/s 0.00 B/s 0.00 % 97.54 % kube-dns --domain=cluster.local. --dns-port=10053 --config-dir=/kube-dns-config --v=2
13212 be/4 root 31.69 M/s 0.00 B/s 0.00 % 96.73 % agent
16539 be/4 root 24.15 M/s 0.00 B/s 0.00 % 96.65 % calico-felix
15820 be/4 root 25.67 M/s 0.00 B/s 0.00 % 96.50 % kube-dns --domain=cluster.local. --dns-port=10053 --config-dir=/kube-dns-config --v=2
18664 be/4 root 36.73 M/s 0.00 B/s 0.00 % 96.48 % rancher --http-listen-port=80 --https-listen-port=443 --audit-log-path=/va~el=1 --audit-log-maxage=20 --audit-log-maxbackup=20 --audit-log-maxsize=100
14440 be/4 65534 41.88 M/s 0.00 B/s 0.00 % 96.40 % cluster-proportional-autoscaler --namespace=kube-system --configmap=kube-d~coresPerReplica":128,"nodesPerReplica":4,"min":1}} --logtostderr=true --v=2
13799 be/4 root 31.31 M/s 0.00 B/s 0.00 % 96.02 % agent
15823 be/4 root 21.54 M/s 0.00 B/s 0.00 % 95.02 % kube-dns --domain=cluster.local. --dns-port=10053 --config-dir=/kube-dns-config --v=2
18570 be/4 33 42.11 M/s 0.00 B/s 0.00 % 93.36 % nginx-ingress-controller --default-backend-service=ingress-nginx/default-h~ingress-nginx/udp-services --annotations-prefix=nginx.ingress.kubernetes.io
18680 be/4 root 39.47 M/s 0.00 B/s 0.00 % 92.19 % rancher --http-listen-port=80 --https-listen-port=443 --audit-log-path=/va~el=1 --audit-log-maxage=20 --audit-log-maxbackup=20 --audit-log-maxsize=100
11109 be/4 root 15.54 M/s 0.00 B/s 0.00 % 87.29 % kube-apiserver --requestheader-client-ca-file=/etc/kubernetes/ssl/kube-api~0-32767 --kubelet-client-certificate=/etc/kubernetes/ssl/kube-apiserver.pem
11111 be/4 root 17.25 M/s 0.00 B/s 0.00 % 79.18 % kube-apiserver --requestheader-client-ca-file=/etc/kubernetes/ssl/kube-api~0-32767 --kubelet-client-certificate=/etc/kubernetes/ssl/kube-apiserver.pem
11780 be/4 root 14.87 M/s 0.00 B/s 0.00 % 76.94 % kubelet --cni-conf-dir=/etc/cni/net.d --allow-privileged=true --pod-infra-~_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305 --v=2 --cgroup-driver=cgroupfs
15700 be/4 root 14.53 M/s 0.00 B/s 0.00 % 68.16 % kubelet --cni-conf-dir=/etc/cni/net.d --allow-privileged=true --pod-infra-~_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305 --v=2 --cgroup-driver=cgroupfs
16347 be/4 65534 16.55 M/s 0.00 B/s 0.00 % 65.74 % sidecar --v=2 --logtostderr --probe=kubedns,127.0.0.1:10053,kubernetes.def~l,5,A --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,A
11110 be/4 root 13.60 M/s 0.00 B/s 0.00 % 64.57 % kube-apiserver --requestheader-client-ca-file=/etc/kubernetes/ssl/kube-api~0-32767 --kubelet-client-certificate=/etc/kubernetes/ssl/kube-apiserver.pem
11312 be/4 root 15.64 M/s 0.00 B/s 0.00 % 61.66 % kube-controller-manager --allocate-node-cidrs=true --service-cluster-ip-ra~nager.yaml --leader-elect=true --v=2 --use-service-account-credentials=true
10967 be/4 root 10.54 M/s 0.00 B/s 0.00 % 61.44 % etcd --peer-client-cert-auth --client-cert-auth --listen-client-urls=https~tate=new --peer-key-file=/etc/kubernetes/ssl/kube-etcd-60-234-73-10-key.pem
11095 be/4 root 15.21 M/s 0.00 B/s 0.00 % 60.64 % kube-apiserver --requestheader-client-ca-file=/etc/kubernetes/ssl/kube-api~0-32767 --kubelet-client-certificate=/etc/kubernetes/ssl/kube-apiserver.pem
11314 be/4 root 14.63 M/s 0.00 B/s 0.00 % 60.64 % kube-controller-manager --allocate-node-cidrs=true --service-cluster-ip-ra~nager.yaml --leader-elect=true --v=2 --use-service-account-credentials=true
11307 be/4 root 10.56 M/s 0.00 B/s 0.00 % 57.28 % kube-controller-manager --allocate-node-cidrs=true --service-cluster-ip-ra~nager.yaml --leader-elect=true --v=2 --use-service-account-credentials=true
11099 be/4 root 13.87 M/s 0.00 B/s 0.00 % 55.49 % kube-apiserver --requestheader-client-ca-file=/etc/kubernetes/ssl/kube-api~0-32767 --kubelet-client-certificate=/etc/kubernetes/ssl/kube-apiserver.pem
11310 be/4 root 11.65 M/s 0.00 B/s 0.00 % 50.82 % kube-controller-manager --allocate-node-cidrs=true --service-cluster-ip-ra~nager.yaml --leader-elect=true --v=2 --use-service-account-credentials=true
10972 be/4 root 6.48 M/s 0.00 B/s 0.00 % 44.66 % etcd --peer-client-cert-auth --client-cert-auth --listen-client-urls=https~tate=new --peer-key-file=/etc/kubernetes/ssl/kube-etcd-60-234-73-10-key.pem
11090 be/4 root 13.66 M/s 0.00 B/s 0.00 % 44.25 % kube-apiserver --requestheader-client-ca-file=/etc/kubernetes/ssl/kube-api~0-32767 --kubelet-client-certificate=/etc/kubernetes/ssl/kube-apiserver.pem
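
To tie the hot PIDs above back to containers, the cgroup file is enough (using PID 18663, one of the Rancher threads from the table, as an example; under the cgroupfs driver the path contains the full container ID):

  cat /proc/18663/cgroup | grep ':/docker/'
  docker ps --no-trunc --format '{{.ID}}  {{.Names}}'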

I have changed to CentOS’s built-in release of Docker (1.13.1-72), using devicemapper with direct-lvm as the storage driver, and all the problems have gone away.
The node is running as expected without issue, so there must have been something strange going on with the required version of docker-ce on CentOS 7.
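
In case it helps anyone following the same path: with CentOS’s Docker package, direct-lvm is normally configured through docker-storage-setup rather than by hand. A minimal sketch, assuming a spare block device at /dev/sdb (device and volume-group names are placeholders, and /var/lib/docker must be wiped when changing storage drivers):

  # /etc/sysconfig/docker-storage-setup
  STORAGE_DRIVER=devicemapper
  DEVS=/dev/sdb
  VG=docker-vg

  # then, with Docker stopped and /var/lib/docker cleared:
  docker-storage-setup
  systemctl start docker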