I’ve got a set of 4 RancherOS nodes in a Rancher cluster. The containers are mostly web apps, so I use the HAProxy built into Rancher to direct traffic to the right containers. In front of the HAProxy nodes I have an Elastic Load Balancer, and in front of the ELB I have CloudFlare (mostly for the free SSL); in theory, a CDN that properly caches assets should mean better performance for most users. That just isn’t playing out in practice. Today I ran some tests with ApacheBench to see how much each additional layer drags down performance, and the drop-off is a bit excessive.
# from inside container on RancherOS server
ab -n 10000 -c 100 -H "Host: dreamscarred.com" http://localhost/license.txt
Finished 10000 requests
Server Software: nginx/1.4.6
Server Hostname: localhost
Server Port: 80
Document Path: /license.txt
Document Length: 19935 bytes
Concurrency Level: 100
Time taken for tests: 3.837 seconds
Complete requests: 10000
Failed requests: 0
Total transferred: 201800000 bytes
HTML transferred: 199350000 bytes
Requests per second: 2605.94 [#/sec] (mean)
Time per request: 38.374 [ms] (mean)
Time per request: 0.384 [ms] (mean, across all concurrent requests)
Transfer rate: 51355.35 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   3.0      0      32
Processing:     1   37  13.7     36     104
Waiting:        1   35  13.5     34     104
Total:          1   38  13.6     38     104
Percentage of the requests served within a certain time (ms)
50% 38
66% 41
75% 43
80% 45
90% 53
95% 65
98% 82
99% 87
100% 104 (longest request)
# going through haproxy from my laptop at home
ab -n 10000 -c 100 -H "Host: dreamscarred.com" https://wolfsbane.windsofstorm.net/license.txt
Finished 10000 requests
Server Software: nginx/1.4.6
Server Hostname: wolfsbane.windsofstorm.net
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES128-GCM-SHA256,2048,128
TLS Server Name: dreamscarred.com
Document Path: /license.txt
Document Length: 19935 bytes
Concurrency Level: 100
Time taken for tests: 83.196 seconds
Complete requests: 10000
Failed requests: 0
Total transferred: 201800000 bytes
HTML transferred: 199350000 bytes
Requests per second: 120.20 [#/sec] (mean)
Time per request: 831.960 [ms] (mean)
Time per request: 8.320 [ms] (mean, across all concurrent requests)
Transfer rate: 2368.75 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      102  582 612.9    321   11530
Processing:    54  245 148.5    237    4315
Waiting:       30  126 116.7    108    4131
Total:        201  827 643.5    577   11925
Percentage of the requests served within a certain time (ms)
50% 577
66% 683
75% 790
80% 1575
90% 1788
95% 1938
98% 2075
99% 2926
100% 11925 (longest request)
# going through the Elastic Load Balancer in front of HAProxy from my laptop at home
ab -n 10000 -c 100 -H "Host: dreamscarred.com" http://wos-elb-classic-1604117553.us-west-2.elb.amazonaws.com/license.txt
Finished 10000 requests
Server Software: nginx/1.4.6
Server Hostname: wos-elb-classic-1604117553.us-west-2.elb.amazonaws.com
Server Port: 80
Document Path: /license.txt
Document Length: 19935 bytes
Concurrency Level: 100
Time taken for tests: 38.867 seconds
Complete requests: 10000
Failed requests: 0
Total transferred: 201800000 bytes
HTML transferred: 199350000 bytes
Requests per second: 257.29 [#/sec] (mean)
Time per request: 388.672 [ms] (mean)
Time per request: 3.887 [ms] (mean, across all concurrent requests)
Transfer rate: 5070.36 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       34  260 536.0     43    3916
Processing:    75  122  74.3     97    1047
Waiting:       38   77  73.3     52     999
Total:        116  382 541.8    142    4174
Percentage of the requests served within a certain time (ms)
50% 142
66% 149
75% 297
80% 337
90% 1416
95% 1433
98% 1625
99% 2687
100% 4174 (longest request)
# going through CloudFlare from my laptop at home to simulate normal requests
ab -n 10000 -c 100 https://dreamscarred.com/license.txt
Finished 10000 requests
Server Software: cloudflare
Server Hostname: dreamscarred.com
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-ECDSA-CHACHA20-POLY1305,256,256
TLS Server Name: dreamscarred.com
Document Path: /license.txt
Document Length: 19935 bytes
Concurrency Level: 100
Time taken for tests: 138.384 seconds
Complete requests: 10000
Failed requests: 0
Total transferred: 204470000 bytes
HTML transferred: 199350000 bytes
Requests per second: 72.26 [#/sec] (mean)
Time per request: 1383.839 [ms] (mean)
Time per request: 13.838 [ms] (mean, across all concurrent requests)
Transfer rate: 1442.93 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      133  926 693.5    644    6029
Processing:    94  450 265.1    405    4988
Waiting:       55  195 143.4    165    4166
Total:        337 1375 788.5   1031    6272
Percentage of the requests served within a certain time (ms)
50% 1031
66% 1197
75% 1820
80% 2063
90% 2557
95% 2775
98% 3637
99% 3961
100% 6272 (longest request)
It looks like the nginx containers themselves are doing fine; RPS tanks as soon as HAProxy enters the picture (from 2605.94 [#/sec] down to 120.20 [#/sec]), and each additional layer on top reduces it further.
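Pulling the mean requests/sec from each run together makes the drop-off easy to see. A quick back-of-the-envelope sketch (the values are copied straight from the ab output above, so this is purely illustrative):

```python
# Mean requests/sec measured at each layer (copied from the ab runs above).
layers = {
    "nginx (localhost)":     2605.94,
    "HAProxy (HTTPS)":        120.20,
    "ELB -> HAProxy (HTTP)":  257.29,
    "CloudFlare (HTTPS)":      72.26,
}

baseline = layers["nginx (localhost)"]
for name, rps in layers.items():
    slowdown = baseline / rps
    print(f"{name:24s} {rps:8.2f} req/s  ({slowdown:5.1f}x vs. local nginx)")
```

Worth noting: the ELB test ran over plain HTTP while the direct HAProxy test ran over HTTPS, and the ELB run still came out roughly twice as fast, which suggests a good chunk of the HAProxy-layer slowdown is TLS handshake cost rather than proxying overhead.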
I don’t really want to give up CloudFlare and the bonuses it provides (especially security, given how frequently WordPress is targeted), but it seems to me the ELB and/or HAProxy setups could be optimized. Right now I’m running just one ELB, since they cost about $20 a month last I checked, and I don’t want to spend as much on ELBs as I do on EC2 instances.
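If the TLS handshake really is the main cost at the HAProxy layer, a few knobs in the custom haproxy.cfg that Rancher’s load balancer service lets you supply might help. This is just a sketch with guessed values to validate against your own ab runs, not a drop-in config:

```
global
    maxconn 4096
    # Cache TLS sessions so returning clients can resume a session
    # instead of doing a full handshake every time.
    tune.ssl.cachesize 20000
    tune.ssl.lifetime 300

defaults
    mode http
    # Keep client connections open between requests rather than
    # closing after each one.
    option http-keep-alive
    timeout http-keep-alive 10s
    timeout connect 5s
    timeout client  30s
    timeout server  30s
```

Re-running ab with the -k flag (HTTP keep-alive) against each layer would also show how much of the remaining gap is per-connection setup cost versus actual proxying.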