RDS under heavy load, services are slow to come up

Background:

Our infrastructure runs on AWS. We have 1 server (c4.4xlarge) and 50 nodes (m4.large). The database for cattle is hosted on an RDS instance (db.m3.xlarge). We are running about 1100 stacks, each with 1 service (httpd).

Our Rancher server is running v1.0.1 and is NOT in HA mode.

Problem:

  1. For the last 24 hours, our RDS has been under heavy load.

  2. Spinning up a new stack or service takes about 20 minutes; it usually takes 5 minutes. Services stay in the “Waiting for [instance:xxxxxxxx_1]. Instance status: Networking” state for a long time.

Here are some of the ERRORs we keep seeing in the logs:

2016-06-16 12:31:49,894 ERROR [444fa322-5139-40f6-8465-ab1b3b54b34c:2575718] [instance:30845] [instance.start->(InstanceStart)->instance.allocate->(InstanceAllocate)] [] [cutorService-18] [c.p.a.e.i.AllocatorEventListenerImpl] No allocator handled [EventVO [id=451ab5dc-0214-460e-9a67-033b5bc68967, name=instance.allocate, previousNames=null, replyTo=reply.8152449744309803993, resourceId=30845, resourceType=instance, publisher=null, transitioning=null, transitioningMessage=null, transitioningInternalMessage=null, previousIds=null, data={}, time=Thu Jun 16 12:31:49 UTC 2016, listenerKey=null, transitioningProgress=null]]

2016-06-16 12:31:49,895 ERROR [444fa322-5139-40f6-8465-ab1b3b54b34c:2575718] [instance:30845] [instance.start->(InstanceStart)->instance.allocate] [] [torService-1883] [c.p.e.p.i.DefaultProcessInstanceImpl] Unknown exception io.cattle.platform.eventing.exception.EventExecutionException: Scheduling failed: No candidates available at io.cattle.platform.eventing.exception.EventExecutionException.fromEvent(EventExecutionException.java:53) ~[cattle-framework-eventing-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.eventing.impl.AbstractEventService.callSync(AbstractEventService.java:258) ~[cattle-framework-eventing-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.process.common.handler.EventBasedProcessHandler.handle(EventBasedProcessHandler.java:109) ~[cattle-iaas-logic-common-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.process.instance.InstanceAllocate.handle(InstanceAllocate.java:48) ~[cattle-iaas-logic-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runHandler(DefaultProcessInstanceImpl.java:446) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$4.execute(DefaultProcessInstanceImpl.java:393) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$4.execute(DefaultProcessInstanceImpl.java:387) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.idempotent.Idempotent.execute(Idempotent.java:42) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runHandlers(DefaultProcessInstanceImpl.java:387) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runLogic(DefaultProcessInstanceImpl.java:493) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at 
io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runWithProcessLock(DefaultProcessInstanceImpl.java:320) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$2.doWithLockNoResult(DefaultProcessInstanceImpl.java:260) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:7) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:3) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl$3.doWithLock(AbstractLockManagerImpl.java:40) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:13) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:37) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.acquireLockAndRun(DefaultProcessInstanceImpl.java:257) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runDelegateLoop(DefaultProcessInstanceImpl.java:185) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.executeWithProcessInstanceLock(DefaultProcessInstanceImpl.java:158) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$1.doWithLock(DefaultProcessInstanceImpl.java:108) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$1.doWithLock(DefaultProcessInstanceImpl.java:105) 
[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl$3.doWithLock(AbstractLockManagerImpl.java:40) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:13) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:37) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.execute(DefaultProcessInstanceImpl.java:105) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.process.common.handler.AbstractObjectProcessLogic.execute(AbstractObjectProcessLogic.java:131) [cattle-iaas-logic-common-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.process.instance.InstanceStart.allocate(InstanceStart.java:217) [cattle-iaas-logic-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.process.instance.InstanceStart.handle(InstanceStart.java:75) [cattle-iaas-logic-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runHandler(DefaultProcessInstanceImpl.java:446) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$4.execute(DefaultProcessInstanceImpl.java:393) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$4.execute(DefaultProcessInstanceImpl.java:387) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.idempotent.Idempotent.execute(Idempotent.java:42) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runHandlers(DefaultProcessInstanceImpl.java:387) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at 
io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runLogic(DefaultProcessInstanceImpl.java:493) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]

2016-06-16 12:31:45,824 ERROR [:] [] [] [] [ServiceReplay-2] [i.c.p.e.e.i.ProcessEventListenerImpl] Unknown exception running process [volume.activate:2280417] on [28158] io.cattle.platform.eventing.exception.AgentRemovedException: Agent is removed at io.cattle.platform.agent.impl.WrappedEventService.call(WrappedEventService.java:93) ~[cattle-iaas-agent-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.agent.impl.EventCallProgressHelper.call(EventCallProgressHelper.java:57) ~[cattle-iaas-agent-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.agent.impl.RemoteAgentImpl.call(RemoteAgentImpl.java:99) ~[cattle-iaas-agent-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.agent.impl.RemoteAgentImpl.callSync(RemoteAgentImpl.java:72) ~[cattle-iaas-agent-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.agent.impl.RemoteAgentImpl.callSync(RemoteAgentImpl.java:135) ~[cattle-iaas-agent-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.process.common.handler.AgentBasedProcessHandler.callSync(AgentBasedProcessHandler.java:180) ~[cattle-iaas-logic-common-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.process.common.handler.AgentBasedProcessHandler.handleEvent(AgentBasedProcessHandler.java:166) ~[cattle-iaas-logic-common-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.process.common.handler.AgentBasedProcessHandler.handle(AgentBasedProcessHandler.java:104) ~[cattle-iaas-logic-common-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runHandler(DefaultProcessInstanceImpl.java:446) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$4.execute(DefaultProcessInstanceImpl.java:393) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$4.execute(DefaultProcessInstanceImpl.java:387) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.idempotent.Idempotent.execute(Idempotent.java:42) 
~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runHandlers(DefaultProcessInstanceImpl.java:387) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runLogic(DefaultProcessInstanceImpl.java:493) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runWithProcessLock(DefaultProcessInstanceImpl.java:320) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$2.doWithLockNoResult(DefaultProcessInstanceImpl.java:260) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:7) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:3) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl$3.doWithLock(AbstractLockManagerImpl.java:40) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:13) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:37) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.acquireLockAndRun(DefaultProcessInstanceImpl.java:257) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runDelegateLoop(DefaultProcessInstanceImpl.java:185) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at 
io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.executeWithProcessInstanceLock(DefaultProcessInstanceImpl.java:158) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$1.doWithLock(DefaultProcessInstanceImpl.java:108) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$1.doWithLock(DefaultProcessInstanceImpl.java:105) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl$3.doWithLock(AbstractLockManagerImpl.java:40) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:13) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:37) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.execute(DefaultProcessInstanceImpl.java:105) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.object.process.impl.DefaultObjectProcessManager.executeStandardProcess(DefaultObjectProcessManager.java:29) ~[cattle-framework-object-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.process.common.handler.AbstractObjectProcessLogic.createIgnoreCancel(AbstractObjectProcessLogic.java:89) ~[cattle-iaas-logic-common-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.process.common.handler.AbstractObjectProcessLogic.createThenActivate(AbstractObjectProcessLogic.java:83) ~[cattle-iaas-logic-common-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.process.volume.VolumeActivate.activatePool(VolumeActivate.java:48) ~[cattle-iaas-logic-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.process.volume.VolumeActivate.handle(VolumeActivate.java:37) 
~[cattle-iaas-logic-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runHandler(DefaultProcessInstanceImpl.java:446) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$4.execute(DefaultProcessInstanceImpl.java:393) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$4.execute(DefaultProcessInstanceImpl.java:387) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.idempotent.Idempotent.execute(Idempotent.java:42) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runHandlers(DefaultProcessInstanceImpl.java:387) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runLogic(DefaultProcessInstanceImpl.java:493) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runWithProcessLock(DefaultProcessInstanceImpl.java:320) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$2.doWithLockNoResult(DefaultProcessInstanceImpl.java:260) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:7) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:3) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl$3.doWithLock(AbstractLockManagerImpl.java:40) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:13) 
~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:37) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.acquireLockAndRun(DefaultProcessInstanceImpl.java:257) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runDelegateLoop(DefaultProcessInstanceImpl.java:185) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.executeWithProcessInstanceLock(DefaultProcessInstanceImpl.java:158) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$1.doWithLock(DefaultProcessInstanceImpl.java:108) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$1.doWithLock(DefaultProcessInstanceImpl.java:105) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl$3.doWithLock(AbstractLockManagerImpl.java:40) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:13) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:37) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.execute(DefaultProcessInstanceImpl.java:105) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at io.cattle.platform.engine.eventing.impl.ProcessEventListenerImpl.processExecute(ProcessEventListenerImpl.java:74) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at 
io.cattle.platform.engine.server.impl.ProcessInstanceParallelDispatcher$1.runInContext(ProcessInstanceParallelDispatcher.java:27) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na] at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na] at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na] at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:108) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na] at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na] at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_95] at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_95] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_95] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_95] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_95]

@msound Can you get me the output of docker info from one of your agent hosts? It would also help to know which processes are the highest CPU consumers on your agent hosts.

@aemneina Thanks for looking into this.

Here is docker info output from one of our hosts:

ubuntu@ip-10-0-13-169:~$ docker info
Containers: 10
 Running: 8
 Paused: 0
 Stopped: 2
Images: 7
Server Version: 1.11.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 92
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: host bridge null
Kernel Version: 3.13.0-85-generic
Operating System: Ubuntu 14.04.3 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.797 GiB
Name: ip-10-0-13-169
ID: ASOB:OZFZ:ZYE3:GP2F:MHQJ:TZ44:SJ35:UPVC:VRP2:HHRI:KUOS:777A
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support

And here are the top processes:

@msound Can I trouble you for a couple more things? How many connections per second are you seeing on your DB? Can I also get the output of netstat -an from your Rancher server and a loaded host, as well as the output of ps -ef from your Rancher server and host?
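One way to arrive at a connections-per-second figure, sketched here under the assumption that the RDS instance runs MySQL: read the cumulative `Connections` counter from `SHOW GLOBAL STATUS` twice and divide the difference by the time between the two reads. The helper name and the sample numbers are hypothetical and do not come from this deployment.

```python
def connections_per_second(first_count, second_count, interval_seconds):
    """Average new connections per second between two counter samples.

    first_count / second_count are two readings of a cumulative counter
    (assumed here to be MySQL's `Connections` status variable), taken
    interval_seconds apart.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if second_count < first_count:
        # A cumulative counter only goes backwards if the server restarted.
        raise ValueError("counter went backwards; was the database restarted?")
    return (second_count - first_count) / interval_seconds


if __name__ == "__main__":
    # Hypothetical readings taken 60 seconds apart.
    print(connections_per_second(1000, 1600, 60))  # 10.0
```

A high rate here would suggest connections are being opened and closed constantly rather than reused, which is a different situation from a large but steady number of open connections.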

Are you launching your server with more memory, e.g. by passing -e JAVA_OPTS="-Xmx4096m" to the docker run command that starts the Rancher server?

@aemneina We are not using any JAVA_OPTS parameter. Should we be doing so?

Also, our hosts are not heavily loaded. It's only our server that seems to be slow.

RDS connections:

Output of netstat -an on rancher server:

Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN     
tcp        0     36 10.0.13.115:22          69.244.191.147:43762    ESTABLISHED
tcp6       0      0 :::80                   :::*                    LISTEN     
tcp6       0      0 :::22                   :::*                    LISTEN     
udp        0      0 0.0.0.0:47366           0.0.0.0:*                          
udp        0      0 0.0.0.0:68              0.0.0.0:*                          
udp6       0      0 :::39989                :::*                               
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node   Path
unix  2      [ ACC ]     STREAM     LISTENING     1323007  /var/run/docker.sock
unix  2      [ ACC ]     STREAM     LISTENING     21798    /var/run/acpid.socket
unix  2      [ ACC ]     STREAM     LISTENING     1339478  /var/run/docker/libcontainerd/docker-containerd.sock
unix  2      [ ACC ]     STREAM     LISTENING     10053    @/com/ubuntu/upstart
unix  2      [ ACC ]     STREAM     LISTENING     10346    /var/run/dbus/system_bus_socket
unix  5      [ ]         DGRAM                    14485    /dev/log
unix  2      [ ACC ]     SEQPACKET  LISTENING     10150    /run/udev/control
unix  2      [ ACC ]     STREAM     LISTENING     1323924  /var/lib/docker/network/files/d876803faf26a870bbdb767b0b199119a39d3bba6bfb11ddbf8096b0c4722f08.sock
unix  2      [ ]         DGRAM                    118376   
unix  3      [ ]         STREAM     CONNECTED     1307197  
unix  3      [ ]         STREAM     CONNECTED     3289041  /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     22622    /var/run/dbus/system_bus_socket
unix  2      [ ]         DGRAM                    3298164  
unix  3      [ ]         STREAM     CONNECTED     10227    
unix  3      [ ]         STREAM     CONNECTED     3294741  
unix  3      [ ]         DGRAM                    14424    
unix  3      [ ]         STREAM     CONNECTED     10139    @/com/ubuntu/upstart
unix  3      [ ]         STREAM     CONNECTED     24400    
unix  3      [ ]         STREAM     CONNECTED     24379    @/com/ubuntu/upstart
unix  3      [ ]         STREAM     CONNECTED     1338614  /var/run/docker/libcontainerd/docker-containerd.sock
unix  3      [ ]         STREAM     CONNECTED     3294738  
unix  3      [ ]         STREAM     CONNECTED     3294739  
unix  3      [ ]         STREAM     CONNECTED     22621    
unix  3      [ ]         STREAM     CONNECTED     22620    
unix  3      [ ]         STREAM     CONNECTED     20584    
unix  2      [ ]         DGRAM                    20598    
unix  3      [ ]         STREAM     CONNECTED     11325    /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     18510    @/com/ubuntu/upstart
unix  3      [ ]         STREAM     CONNECTED     21732    
unix  3      [ ]         DGRAM                    14425    
unix  3      [ ]         STREAM     CONNECTED     21684
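For longer netstat -an dumps, tallying the TCP rows by state is quicker than reading them by eye. The following is a minimal sketch; the function name is made up for illustration, and the sample rows are copied from the output above.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP sockets per state in the text output of `netstat -an`."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows have six columns:
        # Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State.
        # UDP rows (no state), headers and UNIX socket rows are skipped.
        if len(fields) == 6 and fields[0] in ("tcp", "tcp6"):
            counts[fields[5]] += 1
    return counts


SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0     36 10.0.13.115:22          69.244.191.147:43762    ESTABLISHED
tcp6       0      0 :::80                   :::*                    LISTEN
tcp6       0      0 :::22                   :::*                    LISTEN
udp        0      0 0.0.0.0:68              0.0.0.0:*
"""

if __name__ == "__main__":
    for state, count in sorted(count_tcp_states(SAMPLE).items()):
        print(state, count)
```

On the rows above this reports one ESTABLISHED connection (the SSH session) and three LISTEN sockets.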

Here is the output of ps -ef on the Rancher server:

ps -ef
UID         PID   PPID  C STIME TTY          TIME CMD
root          1      0  0 Jun15 ?        00:00:03 /sbin/init
root          2      0  0 Jun15 ?        00:00:00 [kthreadd]
root          3      2  0 Jun15 ?        00:00:00 [ksoftirqd/0]
root          5      2  0 Jun15 ?        00:00:00 [kworker/0:0H]
root          7      2  0 Jun15 ?        00:02:19 [rcu_sched]
root          8      2  0 Jun15 ?        00:00:16 [rcuos/0]
root          9      2  0 Jun15 ?        00:00:15 [rcuos/1]
root         10      2  0 Jun15 ?        00:00:15 [rcuos/2]
root         11      2  0 Jun15 ?        00:00:15 [rcuos/3]
root         12      2  0 Jun15 ?        00:00:39 [rcuos/4]
root         13      2  0 Jun15 ?        00:00:23 [rcuos/5]
root         14      2  0 Jun15 ?        00:00:18 [rcuos/6]
root         15      2  0 Jun15 ?        00:00:17 [rcuos/7]
root         16      2  0 Jun15 ?        00:00:06 [rcuos/8]
root         17      2  0 Jun15 ?        00:00:06 [rcuos/9]
root         18      2  0 Jun15 ?        00:00:06 [rcuos/10]
root         19      2  0 Jun15 ?        00:00:07 [rcuos/11]
root         20      2  0 Jun15 ?        00:00:06 [rcuos/12]
root         21      2  0 Jun15 ?        00:00:05 [rcuos/13]
root         22      2  0 Jun15 ?        00:00:05 [rcuos/14]
root         23      2  0 Jun15 ?        00:00:06 [rcuos/15]
root         24      2  0 Jun15 ?        00:00:00 [rcuos/16]
root         25      2  0 Jun15 ?        00:00:00 [rcuos/17]
root         26      2  0 Jun15 ?        00:00:00 [rcuos/18]
root         27      2  0 Jun15 ?        00:00:00 [rcuos/19]
root         28      2  0 Jun15 ?        00:00:00 [rcuos/20]
root         29      2  0 Jun15 ?        00:00:00 [rcuos/21]
root         30      2  0 Jun15 ?        00:00:00 [rcuos/22]
root         31      2  0 Jun15 ?        00:00:00 [rcuos/23]
root         32      2  0 Jun15 ?        00:00:00 [rcuos/24]
root         33      2  0 Jun15 ?        00:00:00 [rcuos/25]
root         34      2  0 Jun15 ?        00:00:00 [rcuos/26]
root         35      2  0 Jun15 ?        00:00:00 [rcuos/27]
root         36      2  0 Jun15 ?        00:00:00 [rcuos/28]
root         37      2  0 Jun15 ?        00:00:00 [rcuos/29]
root         38      2  0 Jun15 ?        00:00:00 [rcuos/30]
root         39      2  0 Jun15 ?        00:00:00 [rcuos/31]
root         40      2  0 Jun15 ?        00:00:00 [rcuos/32]
root         41      2  0 Jun15 ?        00:00:00 [rcuos/33]
root         42      2  0 Jun15 ?        00:00:00 [rcuos/34]
root         43      2  0 Jun15 ?        00:00:00 [rcuos/35]
root         44      2  0 Jun15 ?        00:00:00 [rcuos/36]
root         45      2  0 Jun15 ?        00:00:00 [rcuos/37]
root         46      2  0 Jun15 ?        00:00:00 [rcuos/38]
root         47      2  0 Jun15 ?        00:00:00 [rcuos/39]
root         48      2  0 Jun15 ?        00:00:00 [rcuos/40]
root         49      2  0 Jun15 ?        00:00:00 [rcuos/41]
root         50      2  0 Jun15 ?        00:00:00 [rcuos/42]
root         51      2  0 Jun15 ?        00:00:00 [rcuos/43]
root         52      2  0 Jun15 ?        00:00:00 [rcuos/44]
root         53      2  0 Jun15 ?        00:00:00 [rcuos/45]
root         54      2  0 Jun15 ?        00:00:00 [rcuos/46]
root         55      2  0 Jun15 ?        00:00:00 [rcuos/47]
root         56      2  0 Jun15 ?        00:00:00 [rcuos/48]
root         57      2  0 Jun15 ?        00:00:00 [rcuos/49]
root         58      2  0 Jun15 ?        00:00:00 [rcuos/50]
root         59      2  0 Jun15 ?        00:00:00 [rcuos/51]
root         60      2  0 Jun15 ?        00:00:00 [rcuos/52]
root         61      2  0 Jun15 ?        00:00:00 [rcuos/53]
root         62      2  0 Jun15 ?        00:00:00 [rcuos/54]
root         63      2  0 Jun15 ?        00:00:00 [rcuos/55]
root         64      2  0 Jun15 ?        00:00:00 [rcuos/56]
root         65      2  0 Jun15 ?        00:00:00 [rcuos/57]
root         66      2  0 Jun15 ?        00:00:00 [rcuos/58]
root         67      2  0 Jun15 ?        00:00:00 [rcuos/59]
root         68      2  0 Jun15 ?        00:00:00 [rcuos/60]
root         69      2  0 Jun15 ?        00:00:00 [rcuos/61]
root         70      2  0 Jun15 ?        00:00:00 [rcuos/62]
root         71      2  0 Jun15 ?        00:00:00 [rcuos/63]
root         72      2  0 Jun15 ?        00:00:00 [rcuos/64]
root         73      2  0 Jun15 ?        00:00:00 [rcuos/65]
root         74      2  0 Jun15 ?        00:00:00 [rcuos/66]
root         75      2  0 Jun15 ?        00:00:00 [rcuos/67]
root         76      2  0 Jun15 ?        00:00:00 [rcuos/68]
root         77      2  0 Jun15 ?        00:00:00 [rcuos/69]
root         78      2  0 Jun15 ?        00:00:00 [rcuos/70]
root         79      2  0 Jun15 ?        00:00:00 [rcuos/71]
root         80      2  0 Jun15 ?        00:00:00 [rcuos/72]
root         81      2  0 Jun15 ?        00:00:00 [rcuos/73]
root         82      2  0 Jun15 ?        00:00:00 [rcuos/74]
root         83      2  0 Jun15 ?        00:00:00 [rcuos/75]
root         84      2  0 Jun15 ?        00:00:00 [rcuos/76]
root         85      2  0 Jun15 ?        00:00:00 [rcuos/77]
root         86      2  0 Jun15 ?        00:00:00 [rcuos/78]
root         87      2  0 Jun15 ?        00:00:00 [rcuos/79]
root         88      2  0 Jun15 ?        00:00:00 [rcuos/80]
root         89      2  0 Jun15 ?        00:00:00 [rcuos/81]
root         90      2  0 Jun15 ?        00:00:00 [rcuos/82]
root         91      2  0 Jun15 ?        00:00:00 [rcuos/83]
root         92      2  0 Jun15 ?        00:00:00 [rcuos/84]
root         93      2  0 Jun15 ?        00:00:00 [rcuos/85]
root         94      2  0 Jun15 ?        00:00:00 [rcuos/86]
root         95      2  0 Jun15 ?        00:00:00 [rcuos/87]
root         96      2  0 Jun15 ?        00:00:00 [rcuos/88]
root         97      2  0 Jun15 ?        00:00:00 [rcuos/89]
root         98      2  0 Jun15 ?        00:00:00 [rcuos/90]
root         99      2  0 Jun15 ?        00:00:00 [rcuos/91]
root        100      2  0 Jun15 ?        00:00:00 [rcuos/92]
root        101      2  0 Jun15 ?        00:00:00 [rcuos/93]
root        102      2  0 Jun15 ?        00:00:00 [rcuos/94]
root        103      2  0 Jun15 ?        00:00:00 [rcuos/95]
root        104      2  0 Jun15 ?        00:00:00 [rcuos/96]
root        105      2  0 Jun15 ?        00:00:00 [rcuos/97]
root        106      2  0 Jun15 ?        00:00:00 [rcuos/98]
root        107      2  0 Jun15 ?        00:00:00 [rcuos/99]
root        108      2  0 Jun15 ?        00:00:00 [rcuos/100]
root        109      2  0 Jun15 ?        00:00:00 [rcuos/101]
root        110      2  0 Jun15 ?        00:00:00 [rcuos/102]
root        111      2  0 Jun15 ?        00:00:00 [rcuos/103]
root        112      2  0 Jun15 ?        00:00:00 [rcuos/104]
root        113      2  0 Jun15 ?        00:00:00 [rcuos/105]
root        114      2  0 Jun15 ?        00:00:00 [rcuos/106]
root        115      2  0 Jun15 ?        00:00:00 [rcuos/107]
root        116      2  0 Jun15 ?        00:00:00 [rcuos/108]
root        117      2  0 Jun15 ?        00:00:00 [rcuos/109]
root        118      2  0 Jun15 ?        00:00:00 [rcuos/110]
root        119      2  0 Jun15 ?        00:00:00 [rcuos/111]
root        120      2  0 Jun15 ?        00:00:00 [rcuos/112]
root        121      2  0 Jun15 ?        00:00:00 [rcuos/113]
root        122      2  0 Jun15 ?        00:00:00 [rcuos/114]
root        123      2  0 Jun15 ?        00:00:00 [rcuos/115]
root        124      2  0 Jun15 ?        00:00:00 [rcuos/116]
root        125      2  0 Jun15 ?        00:00:00 [rcuos/117]
root        126      2  0 Jun15 ?        00:00:00 [rcuos/118]
root        127      2  0 Jun15 ?        00:00:00 [rcuos/119]
root        128      2  0 Jun15 ?        00:00:00 [rcuos/120]
root        129      2  0 Jun15 ?        00:00:00 [rcuos/121]
root        130      2  0 Jun15 ?        00:00:00 [rcuos/122]
root        131      2  0 Jun15 ?        00:00:00 [rcuos/123]
root        132      2  0 Jun15 ?        00:00:00 [rcuos/124]
root        133      2  0 Jun15 ?        00:00:00 [rcuos/125]
root        134      2  0 Jun15 ?        00:00:00 [rcuos/126]
root        135      2  0 Jun15 ?        00:00:00 [rcuos/127]
root        136      2  0 Jun15 ?        00:00:00 [rcu_bh]
root        137      2  0 Jun15 ?        00:00:00 [rcuob/0]
root        138      2  0 Jun15 ?        00:00:00 [rcuob/1]
root        139      2  0 Jun15 ?        00:00:00 [rcuob/2]
root        140      2  0 Jun15 ?        00:00:00 [rcuob/3]
root        141      2  0 Jun15 ?        00:00:00 [rcuob/4]
root        142      2  0 Jun15 ?        00:00:00 [rcuob/5]
root        143      2  0 Jun15 ?        00:00:00 [rcuob/6]
root        144      2  0 Jun15 ?        00:00:00 [rcuob/7]
root        145      2  0 Jun15 ?        00:00:00 [rcuob/8]
root        146      2  0 Jun15 ?        00:00:00 [rcuob/9]
root        147      2  0 Jun15 ?        00:00:00 [rcuob/10]
root        148      2  0 Jun15 ?        00:00:00 [rcuob/11]
root        149      2  0 Jun15 ?        00:00:00 [rcuob/12]
root        150      2  0 Jun15 ?        00:00:00 [rcuob/13]
root        151      2  0 Jun15 ?        00:00:00 [rcuob/14]
root        152      2  0 Jun15 ?        00:00:00 [rcuob/15]
root        153      2  0 Jun15 ?        00:00:00 [rcuob/16]
root        154      2  0 Jun15 ?        00:00:00 [rcuob/17]
root        155      2  0 Jun15 ?        00:00:00 [rcuob/18]
root        156      2  0 Jun15 ?        00:00:00 [rcuob/19]
root        157      2  0 Jun15 ?        00:00:00 [rcuob/20]
root        158      2  0 Jun15 ?        00:00:00 [rcuob/21]
root        159      2  0 Jun15 ?        00:00:00 [rcuob/22]
root        160      2  0 Jun15 ?        00:00:00 [rcuob/23]
root        161      2  0 Jun15 ?        00:00:00 [rcuob/24]
root        162      2  0 Jun15 ?        00:00:00 [rcuob/25]
root        163      2  0 Jun15 ?        00:00:00 [rcuob/26]
root        164      2  0 Jun15 ?        00:00:00 [rcuob/27]
root        165      2  0 Jun15 ?        00:00:00 [rcuob/28]
root        166      2  0 Jun15 ?        00:00:00 [rcuob/29]
root        167      2  0 Jun15 ?        00:00:00 [rcuob/30]
root        168      2  0 Jun15 ?        00:00:00 [rcuob/31]
root        169      2  0 Jun15 ?        00:00:00 [rcuob/32]
root        170      2  0 Jun15 ?        00:00:00 [rcuob/33]
root        171      2  0 Jun15 ?        00:00:00 [rcuob/34]
root        172      2  0 Jun15 ?        00:00:00 [rcuob/35]
root        173      2  0 Jun15 ?        00:00:00 [rcuob/36]
root        174      2  0 Jun15 ?        00:00:00 [rcuob/37]
root        175      2  0 Jun15 ?        00:00:00 [rcuob/38]
root        176      2  0 Jun15 ?        00:00:00 [rcuob/39]
root        177      2  0 Jun15 ?        00:00:00 [rcuob/40]
root        178      2  0 Jun15 ?        00:00:00 [rcuob/41]
root        179      2  0 Jun15 ?        00:00:00 [rcuob/42]
root        180      2  0 Jun15 ?        00:00:00 [rcuob/43]
root        181      2  0 Jun15 ?        00:00:00 [rcuob/44]
root        182      2  0 Jun15 ?        00:00:00 [rcuob/45]
root        183      2  0 Jun15 ?        00:00:00 [rcuob/46]
root        184      2  0 Jun15 ?        00:00:00 [rcuob/47]
root        185      2  0 Jun15 ?        00:00:00 [rcuob/48]
root        186      2  0 Jun15 ?        00:00:00 [rcuob/49]
root        187      2  0 Jun15 ?        00:00:00 [rcuob/50]
root        188      2  0 Jun15 ?        00:00:00 [rcuob/51]
root        189      2  0 Jun15 ?        00:00:00 [rcuob/52]
root        190      2  0 Jun15 ?        00:00:00 [rcuob/53]
root        191      2  0 Jun15 ?        00:00:00 [rcuob/54]
root        192      2  0 Jun15 ?        00:00:00 [rcuob/55]
root        193      2  0 Jun15 ?        00:00:00 [rcuob/56]
root        194      2  0 Jun15 ?        00:00:00 [rcuob/57]
root        195      2  0 Jun15 ?        00:00:00 [rcuob/58]
root        196      2  0 Jun15 ?        00:00:00 [rcuob/59]
root        197      2  0 Jun15 ?        00:00:00 [rcuob/60]
root        198      2  0 Jun15 ?        00:00:00 [rcuob/61]
root        199      2  0 Jun15 ?        00:00:00 [rcuob/62]
root        200      2  0 Jun15 ?        00:00:00 [rcuob/63]
root        201      2  0 Jun15 ?        00:00:00 [rcuob/64]
root        202      2  0 Jun15 ?        00:00:00 [rcuob/65]
root        203      2  0 Jun15 ?        00:00:00 [rcuob/66]
root        204      2  0 Jun15 ?        00:00:00 [rcuob/67]
root        205      2  0 Jun15 ?        00:00:00 [rcuob/68]
root        206      2  0 Jun15 ?        00:00:00 [rcuob/69]
root        207      2  0 Jun15 ?        00:00:00 [rcuob/70]
root        208      2  0 Jun15 ?        00:00:00 [rcuob/71]
root        209      2  0 Jun15 ?        00:00:00 [rcuob/72]
root        210      2  0 Jun15 ?        00:00:00 [rcuob/73]
root        211      2  0 Jun15 ?        00:00:00 [rcuob/74]
root        212      2  0 Jun15 ?        00:00:00 [rcuob/75]
root        213      2  0 Jun15 ?        00:00:00 [rcuob/76]
root        214      2  0 Jun15 ?        00:00:00 [rcuob/77]
root        215      2  0 Jun15 ?        00:00:00 [rcuob/78]
root        216      2  0 Jun15 ?        00:00:00 [rcuob/79]
root        217      2  0 Jun15 ?        00:00:00 [rcuob/80]
root        218      2  0 Jun15 ?        00:00:00 [rcuob/81]
root        219      2  0 Jun15 ?        00:00:00 [rcuob/82]
root        220      2  0 Jun15 ?        00:00:00 [rcuob/83]
root        221      2  0 Jun15 ?        00:00:00 [rcuob/84]
root        222      2  0 Jun15 ?        00:00:00 [rcuob/85]
root        223      2  0 Jun15 ?        00:00:00 [rcuob/86]
root        224      2  0 Jun15 ?        00:00:00 [rcuob/87]
root        225      2  0 Jun15 ?        00:00:00 [rcuob/88]
root        226      2  0 Jun15 ?        00:00:00 [rcuob/89]
root        227      2  0 Jun15 ?        00:00:00 [rcuob/90]
root        228      2  0 Jun15 ?        00:00:00 [rcuob/91]
root        229      2  0 Jun15 ?        00:00:00 [rcuob/92]
root        230      2  0 Jun15 ?        00:00:00 [rcuob/93]
root        231      2  0 Jun15 ?        00:00:00 [rcuob/94]
root        232      2  0 Jun15 ?        00:00:00 [rcuob/95]
root        233      2  0 Jun15 ?        00:00:00 [rcuob/96]
root        234      2  0 Jun15 ?        00:00:00 [rcuob/97]
root        235      2  0 Jun15 ?        00:00:00 [rcuob/98]
root        236      2  0 Jun15 ?        00:00:00 [rcuob/99]
root        237      2  0 Jun15 ?        00:00:00 [rcuob/100]
root        238      2  0 Jun15 ?        00:00:00 [rcuob/101]
root        239      2  0 Jun15 ?        00:00:00 [rcuob/102]
root        240      2  0 Jun15 ?        00:00:00 [rcuob/103]
root        241      2  0 Jun15 ?        00:00:00 [rcuob/104]
root        242      2  0 Jun15 ?        00:00:00 [rcuob/105]
root        243      2  0 Jun15 ?        00:00:00 [rcuob/106]
root        244      2  0 Jun15 ?        00:00:00 [rcuob/107]
root        245      2  0 Jun15 ?        00:00:00 [rcuob/108]
root        246      2  0 Jun15 ?        00:00:00 [rcuob/109]
root        247      2  0 Jun15 ?        00:00:00 [rcuob/110]
root        248      2  0 Jun15 ?        00:00:00 [rcuob/111]
root        249      2  0 Jun15 ?        00:00:00 [rcuob/112]
root        250      2  0 Jun15 ?        00:00:00 [rcuob/113]
root        251      2  0 Jun15 ?        00:00:00 [rcuob/114]
root        252      2  0 Jun15 ?        00:00:00 [rcuob/115]
root        253      2  0 Jun15 ?        00:00:00 [rcuob/116]
root        254      2  0 Jun15 ?        00:00:00 [rcuob/117]
root        255      2  0 Jun15 ?        00:00:00 [rcuob/118]
root        256      2  0 Jun15 ?        00:00:00 [rcuob/119]
root        257      2  0 Jun15 ?        00:00:00 [rcuob/120]
root        258      2  0 Jun15 ?        00:00:00 [rcuob/121]
root        259      2  0 Jun15 ?        00:00:00 [rcuob/122]
root        260      2  0 Jun15 ?        00:00:00 [rcuob/123]
root        261      2  0 Jun15 ?        00:00:00 [rcuob/124]
root        262      2  0 Jun15 ?        00:00:00 [rcuob/125]
root        263      2  0 Jun15 ?        00:00:00 [rcuob/126]
root        264      2  0 Jun15 ?        00:00:00 [rcuob/127]
root        265      2  0 Jun15 ?        00:00:00 [migration/0]
root        266      2  0 Jun15 ?        00:00:00 [watchdog/0]
root        267      2  0 Jun15 ?        00:00:00 [watchdog/1]
root        268      2  0 Jun15 ?        00:00:00 [migration/1]
root        269      2  0 Jun15 ?        00:00:00 [ksoftirqd/1]
root        271      2  0 Jun15 ?        00:00:00 [kworker/1:0H]
root        272      2  0 Jun15 ?        00:00:00 [watchdog/2]
root        273      2  0 Jun15 ?        00:00:00 [migration/2]
root        274      2  0 Jun15 ?        00:00:00 [ksoftirqd/2]
root        276      2  0 Jun15 ?        00:00:00 [kworker/2:0H]
root        277      2  0 Jun15 ?        00:00:00 [watchdog/3]
root        278      2  0 Jun15 ?        00:00:00 [migration/3]
root        279      2  0 Jun15 ?        00:00:00 [ksoftirqd/3]
root        281      2  0 Jun15 ?        00:00:00 [kworker/3:0H]
root        282      2  0 Jun15 ?        00:00:00 [watchdog/4]
root        283      2  0 Jun15 ?        00:00:04 [migration/4]
root        284      2  2 Jun15 ?        00:25:49 [ksoftirqd/4]
root        286      2  0 Jun15 ?        00:00:00 [kworker/4:0H]
root        287      2  0 Jun15 ?        00:00:00 [watchdog/5]
root        288      2  0 Jun15 ?        00:00:00 [migration/5]
root        289      2  0 Jun15 ?        00:00:01 [ksoftirqd/5]
root        291      2  0 Jun15 ?        00:00:00 [kworker/5:0H]
root        292      2  0 Jun15 ?        00:00:00 [watchdog/6]
root        293      2  0 Jun15 ?        00:00:00 [migration/6]
root        294      2  0 Jun15 ?        00:00:00 [ksoftirqd/6]
root        296      2  0 Jun15 ?        00:00:00 [kworker/6:0H]
root        297      2  0 Jun15 ?        00:00:00 [watchdog/7]
root        298      2  0 Jun15 ?        00:00:00 [migration/7]
root        299      2  0 Jun15 ?        00:00:00 [ksoftirqd/7]
root        301      2  0 Jun15 ?        00:00:00 [kworker/7:0H]
root        302      2  0 Jun15 ?        00:00:00 [watchdog/8]
root        303      2  0 Jun15 ?        00:00:00 [migration/8]
root        304      2  0 Jun15 ?        00:00:00 [ksoftirqd/8]
root        306      2  0 Jun15 ?        00:00:00 [kworker/8:0H]
root        307      2  0 Jun15 ?        00:00:00 [watchdog/9]
root        308      2  0 Jun15 ?        00:00:00 [migration/9]
root        309      2  0 Jun15 ?        00:00:00 [ksoftirqd/9]
root        311      2  0 Jun15 ?        00:00:00 [kworker/9:0H]
root        312      2  0 Jun15 ?        00:00:00 [watchdog/10]
root        313      2  0 Jun15 ?        00:00:00 [migration/10]
root        314      2  0 Jun15 ?        00:00:00 [ksoftirqd/10]
root        316      2  0 Jun15 ?        00:00:00 [kworker/10:0H]
root        317      2  0 Jun15 ?        00:00:00 [watchdog/11]
root        318      2  0 Jun15 ?        00:00:00 [migration/11]
root        319      2  0 Jun15 ?        00:00:00 [ksoftirqd/11]
root        321      2  0 Jun15 ?        00:00:00 [kworker/11:0H]
root        322      2  0 Jun15 ?        00:00:00 [watchdog/12]
root        323      2  0 Jun15 ?        00:00:00 [migration/12]
root        324      2  0 Jun15 ?        00:00:00 [ksoftirqd/12]
root        326      2  0 Jun15 ?        00:00:00 [kworker/12:0H]
root        327      2  0 Jun15 ?        00:00:00 [watchdog/13]
root        328      2  0 Jun15 ?        00:00:00 [migration/13]
root        329      2  0 Jun15 ?        00:00:00 [ksoftirqd/13]
root        331      2  0 Jun15 ?        00:00:00 [kworker/13:0H]
root        332      2  0 Jun15 ?        00:00:00 [watchdog/14]
root        333      2  0 Jun15 ?        00:00:00 [migration/14]
root        334      2  0 Jun15 ?        00:00:00 [ksoftirqd/14]
root        336      2  0 Jun15 ?        00:00:00 [kworker/14:0H]
root        337      2  0 Jun15 ?        00:00:00 [watchdog/15]
root        338      2  0 Jun15 ?        00:00:00 [migration/15]
root        339      2  0 Jun15 ?        00:00:00 [ksoftirqd/15]
root        341      2  0 Jun15 ?        00:00:00 [kworker/15:0H]
root        342      2  0 Jun15 ?        00:00:00 [khelper]
root        343      2  0 Jun15 ?        00:00:00 [kdevtmpfs]
root        344      2  0 Jun15 ?        00:00:00 [netns]
root        345      2  0 Jun15 ?        00:00:00 [xenwatch]
root        346      2  0 Jun15 ?        00:00:00 [xenbus]
root        348      2  0 Jun15 ?        00:00:00 [writeback]
root        349      2  0 Jun15 ?        00:00:00 [kintegrityd]
root        350      2  0 Jun15 ?        00:00:00 [bioset]
root        351      2  0 Jun15 ?        00:00:00 [kworker/u257:0]
root        352      2  0 Jun15 ?        00:00:00 [kblockd]
root        354      2  0 Jun15 ?        00:00:00 [ata_sff]
root        355      2  0 Jun15 ?        00:00:00 [khubd]
root        356      2  0 Jun15 ?        00:00:00 [md]
root        357      2  0 Jun15 ?        00:00:00 [devfreq_wq]
root        361      2  0 Jun15 ?        00:00:01 [kworker/5:1]
root        365      2  0 Jun15 ?        00:00:00 [kworker/9:1]
root        366      2  0 Jun15 ?        00:00:00 [kworker/10:1]
root        367      2  0 Jun15 ?        00:00:00 [kworker/11:1]
root        368      2  0 Jun15 ?        00:00:00 [kworker/12:1]
root        370      2  0 Jun15 ?        00:00:00 [kworker/14:1]
root        373      2  0 Jun15 ?        00:00:00 [khungtaskd]
root        374      2  0 Jun15 ?        00:00:00 [kswapd0]
root        375      2  0 Jun15 ?        00:00:00 [ksmd]
root        376      2  0 Jun15 ?        00:00:01 [khugepaged]
root        377      2  0 Jun15 ?        00:00:00 [fsnotify_mark]
root        378      2  0 Jun15 ?        00:00:00 [ecryptfs-kthrea]
root        379      2  0 Jun15 ?        00:00:00 [crypto]
root        391      2  0 Jun15 ?        00:00:00 [kthrotld]
root        393      2  0 Jun15 ?        00:00:00 [scsi_eh_0]
root        394      2  0 Jun15 ?        00:00:00 [scsi_eh_1]
root        415      2  0 Jun15 ?        00:00:00 [deferwq]
root        416      2  0 Jun15 ?        00:00:00 [charger_manager]
root        471      2  0 Jun15 ?        00:00:00 [kpsmoused]
root        492      2  0 Jun15 ?        00:00:00 [ttm_swap]
root        538      2  0 Jun15 ?        00:00:00 [jbd2/xvda1-8]
root        539      2  0 Jun15 ?        00:00:00 [ext4-rsv-conver]
root        727      2  0 Jun15 ?        00:00:01 [jbd2/xvdf-8]
root        728      2  0 Jun15 ?        00:00:00 [ext4-rsv-conver]
root        854      1  0 Jun15 ?        00:00:00 upstart-udev-bridge --daemon
root        859      1  0 Jun15 ?        00:00:00 /lib/systemd/systemd-udevd --daemon
root        999      1  0 Jun15 ?        00:00:00 upstart-socket-bridge --daemon
root       1057      1  0 Jun15 ?        00:00:00 dhclient -1 -v -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases eth0
root       1236      1  0 Jun15 ?        00:00:00 upstart-file-bridge --daemon
message+   1240      1  0 Jun15 ?        00:00:00 dbus-daemon --system --fork
root       1279      1  0 Jun15 ?        00:00:00 /lib/systemd/systemd-logind
syslog     1344      1  0 Jun15 ?        00:00:00 rsyslogd
root       1355      1  0 Jun15 tty4     00:00:00 /sbin/getty -8 38400 tty4
root       1358      1  0 Jun15 tty5     00:00:00 /sbin/getty -8 38400 tty5
root       1366      1  0 Jun15 tty2     00:00:00 /sbin/getty -8 38400 tty2
root       1367      1  0 Jun15 tty3     00:00:00 /sbin/getty -8 38400 tty3
root       1369      1  0 Jun15 tty6     00:00:00 /sbin/getty -8 38400 tty6
root       1419      1  0 Jun15 ?        00:00:00 /usr/sbin/sshd -D
root       1446      1  0 Jun15 ?        00:00:00 cron
daemon     1447      1  0 Jun15 ?        00:00:00 atd
root       1494      1  0 Jun15 ?        00:00:08 /usr/sbin/irqbalance
root       1501      1  0 Jun15 ?        00:00:00 acpid -c /etc/acpi/events -s /var/run/acpid.socket
root       1574      1  0 Jun15 tty1     00:00:00 /sbin/getty -8 38400 tty1
root       1575      1  0 Jun15 ttyS0    00:00:00 /sbin/getty -8 38400 ttyS0
root       3280      2  0 Jun15 ?        00:00:00 [kauditd]
root       4577      2  0 Jun15 ?        00:00:05 [kworker/4:1]
root      39829      2  0 Jun15 ?        00:00:00 [kworker/15:0]
root      40076      2  0 Jun15 ?        00:00:00 [kworker/2:1]
root      41419      2  0 Jun15 ?        00:00:00 [kworker/u257:1]
root      41842      2  0 Jun15 ?        00:00:00 [kworker/11:2]
root      66872      2  0 Jun15 ?        00:00:00 [kworker/1:2]
root      66902      2  0 Jun15 ?        00:00:00 [kworker/12:0]
root      77036      2  0 Jun15 ?        00:00:00 [kworker/1:1]
root      77384      2  0 Jun15 ?        00:00:00 [kworker/4:2]
root      77650      2  0 Jun15 ?        00:00:00 [kworker/7:2]
root      77856      2  0 Jun15 ?        00:00:00 [kworker/10:0]
root      78464      2  0 Jun15 ?        00:00:00 [kworker/15:1]
root      80687      2  0 Jun15 ?        00:00:00 [kworker/9:2]
root      82212      1  0 Jun15 ?        00:05:51 /usr/bin/docker daemon --raw-logs
root      82224  82212  0 Jun15 ?        00:00:02 docker-containerd -l /var/run/docker/libcontainerd/docker-containerd.sock --runtime docker-runc --start-timeout 2m
root      83857      2  0 00:57 ?        00:00:00 [kworker/13:1]
root      83879      2  0 00:57 ?        00:00:01 [kworker/0:1]
root      83894      2  0 00:57 ?        00:00:00 [kworker/7:0]
root      85744      2  0 01:29 ?        00:00:00 [kworker/2:0]
root      85755      2  0 01:29 ?        00:00:00 [kworker/8:1]
root      85783  82212  0 01:29 ?        00:00:00 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 80 -container-ip 172.17.0.2 -container-port 8080
root      85788  82224  0 01:29 ?        00:00:07 docker-containerd-shim 79bbb02185bc8e5d15063390f9b64d70f228cd3dd38a4d20bbd3f590f3e75c10 /var/run/docker/libcontainerd/79bbb02185bc8e5d15063390f9
root      85803  85788  0 01:29 ?        00:00:01 /usr/bin/s6-svscan /service
root      85820  85803  0 01:29 ?        00:00:00 s6-supervise cattle
root      85821  85803  0 01:29 ?        00:00:00 s6-supervise mysql
root      85822  85820 99 01:29 ?        23:48:28 java -Xms128m -Xmx8g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/cattle/logs -Dlogback.bootstrap.level=WARN -cp /usr/share/cattle/
root      85980  85822  2 01:29 ?        00:24:45 websocket-proxy
root      86001  85822  0 01:29 ?        00:00:11 rancher-catalog-service -catalogUrl library=https://github.com/rancher/rancher-catalog.git,community=https://github.com/rancher/community-catalo
root      86203      2  0 Jun15 ?        00:00:01 [kworker/6:2]
root      94445      2  0 Jun15 ?        00:00:00 [kworker/8:2]
root      94468      2  0 Jun15 ?        00:00:00 [kworker/13:2]
root      96319      2  0 Jun15 ?        00:00:00 [kworker/3:1]
root     113524      2  0 13:32 ?        00:00:00 [kworker/0:2]
root     113525      2  0 13:32 ?        00:00:00 [kworker/6:1]
root     113846  85822  0 13:35 ?        00:00:13 rancher-compose-executor
root     113851  85822  0 13:35 ?        00:01:30 go-machine-service
root     114219      2  0 13:39 ?        00:00:00 [kworker/3:0]
root     114222      2  0 13:39 ?        00:00:00 [kworker/14:0]
root     124683      2  0 18:18 ?        00:00:00 [kworker/u256:1]
root     125056      2  0 18:26 ?        00:00:00 [kworker/u256:2]
root     125240   1419  0 18:32 ?        00:00:00 sshd: ubuntu [priv] 
root     125261      2  0 18:32 ?        00:00:00 [kworker/5:2]
ubuntu   125337 125240  0 18:32 ?        00:00:00 sshd: ubuntu@pts/0  
ubuntu   125338 125337  0 18:32 pts/0    00:00:00 -bash
ubuntu   125535 125338  0 18:39 pts/0    00:00:00 ps -ef
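
A listing this long is easier to read once it is condensed to the processes that have used the most CPU. The following is a minimal Python sketch, assuming the standard `ps -ef` column layout shown above (UID, PID, PPID, C, STIME, TTY, TIME, CMD). The names `cpu_seconds` and `top_cpu` are illustrative and not part of any Rancher or procps tool.

```python
def cpu_seconds(time_field):
    """Convert a ps TIME value ([DD-]HH:MM:SS) into seconds."""
    days, _, rest = time_field.rpartition("-")
    hours, minutes, seconds = (int(x) for x in rest.split(":"))
    return (int(days) if days else 0) * 86400 + hours * 3600 + minutes * 60 + seconds


def top_cpu(ps_text, n=5):
    """Return the n rows with the most accumulated CPU time as (seconds, pid, cmd)."""
    rows = []
    for line in ps_text.splitlines():
        parts = line.split(None, 7)  # CMD may contain spaces, so split at most 7 times
        if len(parts) < 8 or parts[0] == "UID":
            continue  # skip the header and anything that is not a process row
        _uid, pid, _ppid, _c, _stime, _tty, time_field, cmd = parts
        rows.append((cpu_seconds(time_field), int(pid), cmd.strip()))
    return sorted(rows, reverse=True)[:n]


if __name__ == "__main__":
    # A few rows copied from the listing above (java command abbreviated).
    sample = """\
root        284      2  2 Jun15 ?        00:25:49 [ksoftirqd/4]
root      82212      1  0 Jun15 ?        00:05:51 /usr/bin/docker daemon --raw-logs
root      85822  85820 99 01:29 ?        23:48:28 java -Xms128m -Xmx8g
root      85980  85822  2 01:29 ?        00:24:45 websocket-proxy
"""
    for seconds, pid, cmd in top_cpu(sample, 3):
        print(seconds, pid, cmd)
```

Fed the server listing above, the java process running under `s6-supervise cattle` (23:48:28 of CPU time) should come out well ahead of the other rows shown here.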

Here is the output of netstat -an on one of our hosts:

$ netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN     
tcp        0      0 127.0.0.1:9344          0.0.0.0:*               LISTEN     
tcp        0      0 127.0.0.1:9344          127.0.0.1:44451         TIME_WAIT  
tcp        0      0 127.0.0.1:9344          127.0.0.1:44450         TIME_WAIT  
tcp        0      0 10.0.13.169:37979       52.36.30.192:80         ESTABLISHED
tcp        0      0 10.0.13.169:37976       52.36.30.192:80         ESTABLISHED
tcp        0      0 10.0.13.169:37969       52.36.30.192:80         ESTABLISHED
tcp        0      0 127.0.0.1:9344          127.0.0.1:44452         TIME_WAIT  
tcp        0      0 10.0.13.169:38228       52.36.30.192:80         ESTABLISHED
tcp        0      0 10.0.13.169:37990       52.36.30.192:80         ESTABLISHED
tcp        0      0 10.0.13.169:1020        10.0.13.178:49152       ESTABLISHED
tcp        0      0 10.0.13.169:1019        10.0.13.179:49152       ESTABLISHED
tcp        0      0 10.0.13.169:1023        10.0.13.178:24007       ESTABLISHED
tcp        0    216 10.0.13.169:22          69.244.191.147:36631    ESTABLISHED
tcp6       0      0 :::22                   :::*                    LISTEN     
udp        0      0 0.0.0.0:68              0.0.0.0:*                          
udp        0      0 0.0.0.0:29030           0.0.0.0:*                          
udp6       0      0 :::35700                :::*                               
udp6       0      0 :::4500                 :::*                               
udp6       0      0 :::500                  :::*                               
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node   Path
unix  2      [ ACC ]     STREAM     LISTENING     10554    /var/run/docker.sock
unix  2      [ ACC ]     STREAM     LISTENING     11550    /var/lib/docker/network/files/88cfd67ffce8057e78b004b52f22b0280b60516c0d007f8031e3eed176b66f0b.sock
unix  2      [ ACC ]     STREAM     LISTENING     10079    /var/run/acpid.socket
unix  2      [ ACC ]     STREAM     LISTENING     7726     @/com/ubuntu/upstart
unix  2      [ ACC ]     SEQPACKET  LISTENING     9404     /run/udev/control
unix  2      [ ACC ]     STREAM     LISTENING     11463    /var/run/docker/libcontainerd/docker-containerd.sock
unix  4      [ ]         DGRAM                    9177     /dev/log
unix  2      [ ACC ]     STREAM     LISTENING     9961     /var/run/dbus/system_bus_socket
unix  3      [ ]         DGRAM                    9438     
unix  3      [ ]         STREAM     CONNECTED     29060109 /var/run/docker.sock
unix  2      [ ]         DGRAM                    29026042 
unix  3      [ ]         STREAM     CONNECTED     28900109 /var/run/docker.sock
unix  3      [ ]         STREAM     CONNECTED     29059753 
unix  3      [ ]         STREAM     CONNECTED     28900246 
unix  3      [ ]         STREAM     CONNECTED     10687    
unix  3      [ ]         STREAM     CONNECTED     28900112 
unix  2      [ ]         DGRAM                    19231    
unix  3      [ ]         STREAM     CONNECTED     28900115 /var/run/docker.sock
unix  3      [ ]         STREAM     CONNECTED     28900146 
unix  3      [ ]         STREAM     CONNECTED     28900782 /var/run/docker.sock
unix  3      [ ]         STREAM     CONNECTED     28900703 
unix  3      [ ]         STREAM     CONNECTED     9984     @/com/ubuntu/upstart
unix  3      [ ]         STREAM     CONNECTED     28900720 /var/run/docker.sock
unix  3      [ ]         STREAM     CONNECTED     11466    /var/run/docker/libcontainerd/docker-containerd.sock
unix  3      [ ]         STREAM     CONNECTED     8952     /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     29027593 
unix  3      [ ]         STREAM     CONNECTED     28900118 
unix  3      [ ]         STREAM     CONNECTED     28900698 
unix  3      [ ]         STREAM     CONNECTED     8942     
unix  3      [ ]         STREAM     CONNECTED     29027592 
unix  3      [ ]         STREAM     CONNECTED     29026622 
unix  3      [ ]         STREAM     CONNECTED     28900704 /var/run/docker.sock
unix  3      [ ]         STREAM     CONNECTED     9953     
unix  3      [ ]         STREAM     CONNECTED     28901545 /var/run/docker.sock
unix  3      [ ]         STREAM     CONNECTED     9005     
unix  3      [ ]         STREAM     CONNECTED     9380     
unix  3      [ ]         STREAM     CONNECTED     9993     
unix  3      [ ]         STREAM     CONNECTED     8633     @/com/ubuntu/upstart
unix  3      [ ]         DGRAM                    9437     
unix  3      [ ]         STREAM     CONNECTED     29026623 /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     8943     
unix  3      [ ]         STREAM     CONNECTED     28901007 
unix  3      [ ]         STREAM     CONNECTED     9395     @/com/ubuntu/upstart
unix  3      [ ]         STREAM     CONNECTED     9006     /var/run/dbus/system_bus_socket
unix  3      [ ]         STREAM     CONNECTED     8630     
unix  3      [ ]         STREAM     CONNECTED     28900701 /var/run/docker.sock
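
To summarize the TCP part of this output instead of scanning it by eye, the connections can be tallied by state and by remote endpoint. The following is a minimal Python sketch, assuming the six-column Linux `netstat -an` layout shown above. `tcp_summary` is an illustrative name, not an existing tool.

```python
from collections import Counter


def tcp_summary(netstat_text):
    """Count TCP sockets per state, and ESTABLISHED connections per foreign address."""
    states, peers = Counter(), Counter()
    for line in netstat_text.splitlines():
        parts = line.split()
        # TCP rows have exactly six fields: proto, recv-q, send-q, local, foreign, state.
        if len(parts) != 6 or not parts[0].startswith("tcp"):
            continue
        _proto, _recvq, _sendq, _local, foreign, state = parts
        states[state] += 1
        if state == "ESTABLISHED":
            peers[foreign] += 1
    return states, peers


if __name__ == "__main__":
    # A few rows copied from the listing above.
    sample = """\
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:9344          127.0.0.1:44451         TIME_WAIT
tcp        0      0 10.0.13.169:37979       52.36.30.192:80         ESTABLISHED
tcp        0      0 10.0.13.169:37976       52.36.30.192:80         ESTABLISHED
udp        0      0 0.0.0.0:68              0.0.0.0:*
"""
    states, peers = tcp_summary(sample)
    print(dict(states))
    print(peers.most_common(3))
```

Run against the full listing above, this would report five ESTABLISHED connections to 52.36.30.192:80.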

And here is the output of ps -ef on the same host:

$ ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 Jun01 ?        00:00:03 /sbin/init
root         2     0  0 Jun01 ?        00:00:00 [kthreadd]
root         3     2  0 Jun01 ?        00:00:01 [ksoftirqd/0]
root         5     2  0 Jun01 ?        00:00:00 [kworker/0:0H]
root         6     2  0 Jun01 ?        00:00:14 [kworker/u30:0]
root         7     2  0 Jun01 ?        00:07:49 [rcu_sched]
root         8     2  0 Jun01 ?        00:09:41 [rcuos/0]
root         9     2  0 Jun01 ?        00:10:05 [rcuos/1]
root        10     2  0 Jun01 ?        00:00:00 [rcuos/2]
root        11     2  0 Jun01 ?        00:00:00 [rcuos/3]
root        12     2  0 Jun01 ?        00:00:00 [rcuos/4]
root        13     2  0 Jun01 ?        00:00:00 [rcuos/5]
root        14     2  0 Jun01 ?        00:00:00 [rcuos/6]
root        15     2  0 Jun01 ?        00:00:00 [rcuos/7]
root        16     2  0 Jun01 ?        00:00:00 [rcuos/8]
root        17     2  0 Jun01 ?        00:00:00 [rcuos/9]
root        18     2  0 Jun01 ?        00:00:00 [rcuos/10]
root        19     2  0 Jun01 ?        00:00:00 [rcuos/11]
root        20     2  0 Jun01 ?        00:00:00 [rcuos/12]
root        21     2  0 Jun01 ?        00:00:00 [rcuos/13]
root        22     2  0 Jun01 ?        00:00:00 [rcuos/14]
root        23     2  0 Jun01 ?        00:00:00 [rcu_bh]
root        24     2  0 Jun01 ?        00:00:00 [rcuob/0]
root        25     2  0 Jun01 ?        00:00:00 [rcuob/1]
root        26     2  0 Jun01 ?        00:00:00 [rcuob/2]
root        27     2  0 Jun01 ?        00:00:00 [rcuob/3]
root        28     2  0 Jun01 ?        00:00:00 [rcuob/4]
root        29     2  0 Jun01 ?        00:00:00 [rcuob/5]
root        30     2  0 Jun01 ?        00:00:00 [rcuob/6]
root        31     2  0 Jun01 ?        00:00:00 [rcuob/7]
root        32     2  0 Jun01 ?        00:00:00 [rcuob/8]
root        33     2  0 Jun01 ?        00:00:00 [rcuob/9]
root        34     2  0 Jun01 ?        00:00:00 [rcuob/10]
root        35     2  0 Jun01 ?        00:00:00 [rcuob/11]
root        36     2  0 Jun01 ?        00:00:00 [rcuob/12]
root        37     2  0 Jun01 ?        00:00:00 [rcuob/13]
root        38     2  0 Jun01 ?        00:00:00 [rcuob/14]
root        39     2  0 Jun01 ?        00:00:23 [migration/0]
root        40     2  0 Jun01 ?        00:00:03 [watchdog/0]
root        41     2  0 Jun01 ?        00:00:03 [watchdog/1]
root        42     2  0 Jun01 ?        00:00:21 [migration/1]
root        43     2  0 Jun01 ?        00:00:25 [ksoftirqd/1]
root        45     2  0 Jun01 ?        00:00:00 [kworker/1:0H]
root        46     2  0 Jun01 ?        00:00:00 [khelper]
root        47     2  0 Jun01 ?        00:00:00 [kdevtmpfs]
root        48     2  0 Jun01 ?        00:00:00 [netns]
root        49     2  0 Jun01 ?        00:00:00 [xenwatch]
root        50     2  0 Jun01 ?        00:00:00 [xenbus]
root        52     2  0 Jun01 ?        00:00:00 [writeback]
root        53     2  0 Jun01 ?        00:00:00 [kintegrityd]
root        54     2  0 Jun01 ?        00:00:00 [bioset]
root        55     2  0 Jun01 ?        00:00:00 [kworker/u31:0]
root        56     2  0 Jun01 ?        00:00:00 [kblockd]
root        58     2  0 Jun01 ?        00:00:00 [ata_sff]
root        59     2  0 Jun01 ?        00:00:00 [khubd]
root        60     2  0 Jun01 ?        00:00:00 [md]
root        61     2  0 Jun01 ?        00:00:00 [devfreq_wq]
root        63     2  0 Jun01 ?        00:00:00 [khungtaskd]
root        64     2  0 Jun01 ?        00:00:00 [kswapd0]
root        65     2  0 Jun01 ?        00:00:00 [vmstat]
root        66     2  0 Jun01 ?        00:00:00 [ksmd]
root        67     2  0 Jun01 ?        00:00:05 [khugepaged]
root        68     2  0 Jun01 ?        00:00:01 [fsnotify_mark]
root        69     2  0 Jun01 ?        00:00:00 [ecryptfs-kthrea]
root        70     2  0 Jun01 ?        00:00:00 [crypto]
root        82     2  0 Jun01 ?        00:00:00 [kthrotld]
root        83     2  0 Jun01 ?        00:00:00 [kworker/u30:1]
root        84     2  0 Jun01 ?        00:00:00 [scsi_eh_0]
root        85     2  0 Jun01 ?        00:00:00 [scsi_eh_1]
root       106     2  0 Jun01 ?        00:00:00 [deferwq]
root       107     2  0 Jun01 ?        00:00:00 [charger_manager]
root       150     2  0 Jun01 ?        00:00:00 [kpsmoused]
root       169     2  0 Jun01 ?        00:00:00 [ttm_swap]
root       212     2  0 Jun01 ?        00:00:22 [jbd2/xvda1-8]
root       213     2  0 Jun01 ?        00:00:00 [ext4-rsv-conver]
root       359 11031  0 01:31 ?        00:09:12 /var/lib/cattle/bin/rancher-net --log /var/log/rancher-net.log -f /var/lib/cattle/etc/cattle/ipsec/config.json -c /var/lib/cattle/etc/cattle/ipsec
root       472     1  0 Jun01 ?        00:00:00 upstart-udev-bridge --daemon
root       480     1  0 Jun01 ?        00:00:00 /lib/systemd/systemd-udevd --daemon
root       647     1  0 Jun01 ?        00:00:00 upstart-socket-bridge --daemon
root       687     1  0 Jun01 ?        00:00:00 dhclient -1 -v -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases eth0
root       906     1  0 Jun01 ?        00:25:46 /usr/bin/docker daemon --raw-logs
message+   920     1  0 Jun01 ?        00:00:00 dbus-daemon --system --fork
root       955     1  0 Jun01 ?        00:00:00 /lib/systemd/systemd-logind
root       999     1  0 Jun01 ?        00:00:00 upstart-file-bridge --daemon
root      1060     1  0 Jun01 tty4     00:00:00 /sbin/getty -8 38400 tty4
root      1065     1  0 Jun01 tty5     00:00:00 /sbin/getty -8 38400 tty5
root      1069     1  0 Jun01 tty2     00:00:00 /sbin/getty -8 38400 tty2
root      1070     1  0 Jun01 tty3     00:00:00 /sbin/getty -8 38400 tty3
root      1072     1  0 Jun01 tty6     00:00:00 /sbin/getty -8 38400 tty6
root      1107     1  0 Jun01 ?        00:00:03 /usr/sbin/sshd -D
root      1126     1  0 Jun01 ?        00:00:00 acpid -c /etc/acpi/events -s /var/run/acpid.socket
root      1148     1  0 Jun01 ?        00:00:00 cron
daemon    1149     1  0 Jun01 ?        00:00:00 atd
syslog    1155     1  0 Jun01 ?        00:00:01 rsyslogd
root      1234     1  0 Jun01 tty1     00:00:00 /sbin/getty -8 38400 tty1
root      1235     1  0 Jun01 ttyS0    00:00:00 /sbin/getty -8 38400 ttyS0
root      1477   906  0 Jun01 ?        00:01:00 docker-containerd -l /var/run/docker/libcontainerd/docker-containerd.sock --runtime docker-runc
root      1502     2  0 Jun01 ?        00:00:00 [kauditd]
root      1628     1  0 Jun01 ?        00:00:22 /usr/sbin/glusterfs --volfile-id=/synappfiles --volfile-server=gluster1.synduit.internal /data
root      5640     2  0 Jun07 ?        00:00:00 [kworker/u31:1]
root     11004   906  0 Jun06 ?        00:00:00 docker-proxy -proto udp -host-ip 0.0.0.0 -host-port 4500 -container-ip 172.17.0.2 -container-port 4500
root     11012   906  0 Jun06 ?        00:00:00 docker-proxy -proto udp -host-ip 0.0.0.0 -host-port 500 -container-ip 172.17.0.2 -container-port 500
root     11017  1477  0 Jun06 ?        00:00:00 docker-containerd-shim 5579f33ee75d1803022267b2fa37b10188858f50b131977049e51499b0f59a24 /var/run/docker/libcontainerd/5579f33ee75d1803022267b2fa37
root     11031 11017  0 Jun06 ?        00:00:00 init  
root     11169     2  0 Jun15 ?        00:00:00 [kworker/0:1]
root     12002 11031  0 Jun06 ?        00:02:32 /usr/bin/monit -Ic /etc/monit/monitrc
root     12093 11031  0 Jun06 ?        00:49:11 /usr/local/sbin/charon --debug-dmn 1 --debug-mgr 1 --debug-ike 1 --debug-chd 1 --debug-cfg 1 --debug-knl 1 --debug-net 1 --debug-asn 1 --debug-tnc
root     16962  1107  0 17:49 ?        00:00:00 sshd: ubuntu [priv] 
ubuntu   17049 16962  0 17:49 ?        00:00:00 sshd: ubuntu@pts/4  
ubuntu   17053 17049  0 17:49 pts/4    00:00:00 -bash
root     17811     2  0 Jun15 ?        00:00:01 [kworker/1:2]
root     17976     2  0 Jun15 ?        00:00:01 [kworker/0:0]
root     18088  1477  0 Jun15 ?        00:00:07 docker-containerd-shim 4361eb14a2ab924cc9c95be39c3f3b3896c95b2538d74f4296fbf9bdaeffaa7d /var/run/docker/libcontainerd/4361eb14a2ab924cc9c95be39c3f
root     18102 18088  0 Jun15 ?        00:00:00 /bin/bash /run.sh run
root     20422 18102  0 13:34 ?        00:01:44 python /var/lib/cattle/pyagent/main.py
root     20713 20422  0 13:34 ?        00:00:02 /bin/bash /var/lib/cattle/pyagent/cattle/process_watcher.sh
root     20718 20422  0 13:34 ?        00:00:21 host-api -cadvisor-url http://127.0.0.1:9344 -logtostderr=true -ip 0.0.0.0 -port 9345 -auth=true -host-uuid ad183536-e0c6-4e29-ab41-d11d8c077f28 -
root     20721 20422  0 13:34 ?        00:00:00 /bin/bash /var/lib/cattle/bin/cadvisor.sh cadvisor -logtostderr=true -listen_ip 127.0.0.1 -port 9344 -housekeeping_interval 1s -docker_root /var/l
root     20758 20721  0 13:34 ?        00:02:42 /var/log/rancher/.cadvisor -logtostderr=true -listen_ip 127.0.0.1 -port 9344 -housekeeping_interval 1s -docker_root /var/lib/docker
root     22242 11031  0 18:33 ?        00:00:00 /var/lib/cattle/bin/host-api -log /var/log/haproxy-monitor.log -haproxy-monitor -pid-file /var/run/haproxy-monitor.pid
root     22842  1477  0 Jun07 ?        00:00:04 docker-containerd-shim 9b0f091102e87406ff171b2d35cb25163fa33702ddecc9495d59927f7d04cc78 /var/run/docker/libcontainerd/9b0f091102e87406ff171b2d35cb
root     22858 22842  0 Jun07 pts/0    00:00:36 php bin/synsocial_consumer
landsca+ 22862 18088  0 18:38 ?        00:00:00 haproxy -p /var/run/haproxy.pid -f /etc/healthcheck/healthcheck.cfg -sf 28021
root     22879 20713  0 18:39 ?        00:00:00 sleep 2
ubuntu   22880 17053  0 18:39 pts/4    00:00:00 ps -ef
root     24179  1477  0 Jun07 ?        00:00:04 docker-containerd-shim c4c4d8c18695867a895a0f7880d3992b4f05b4eac1453767f850642ad2978229 /var/run/docker/libcontainerd/c4c4d8c18695867a895a0f7880d3
root     24197 24179  0 Jun07 pts/1    00:00:35 php bin/synsocial_consumer
root     24240  1477  0 Jun07 ?        00:00:04 docker-containerd-shim d70015c59fb5361fcfcef8214de61a02b15b71b8de6f32054abc2f1e5cc76c10 /var/run/docker/libcontainerd/d70015c59fb5361fcfcef8214de6
root     24258 24240  0 Jun07 pts/3    00:00:36 php bin/synsocial_consumer
root     24272  1477  0 Jun07 ?        00:00:04 docker-containerd-shim 1057867e558334366eb87fc90b544b5cdb1970135e10b805c068292267f45f37 /var/run/docker/libcontainerd/1057867e558334366eb87fc90b54
root     24311 24272  0 Jun07 pts/2    00:00:36 php bin/synsocial_consumer
root     24320  1477  0 Jun07 ?        00:00:04 docker-containerd-shim dc0521174f27c9018fce2476dc2c22133f9398a4e0e0f0411f2a3a6ea8867e38 /var/run/docker/libcontainerd/dc0521174f27c9018fce2476dc2c
root     24352 24320  0 Jun07 pts/5    00:00:37 php bin/synsocial_consumer
root     31486     2  0 Jun10 ?        00:00:03 [kworker/1:1]
root     31779  1477  0 Jun12 ?        00:00:02 docker-containerd-shim fb2a9032fcfebd5d3385c47ec14dd27af299e751fbff501093f0317a0c54360d /var/run/docker/libcontainerd/fb2a9032fcfebd5d3385c47ec14d
root     31804 31779  0 Jun12 pts/6    00:00:20 php bin/synsocial_consumer
root     32559 18088  0 01:31 ?        00:06:39 /var/lib/cattle/bin/rancher-metadata -log /var/log/rancher-metadata.log -answers /var/lib/cattle/etc/cattle/metadata/answers.yml -pid-file /var/ru
root     32649 18088  0 01:31 ?        00:00:16 /var/lib/cattle/bin/rancher-dns -log /var/log/rancher-dns.log -answers /var/lib/cattle/etc/cattle/dns/answers.json -pid-file /var/run/rancher-dns.

It looks like you’re hitting a scheduling issue due to the number of stacks within one environment. We’re working on a fix. What happens is that during each stack launch we update metadata across all hosts. One interim solution could be to add more RAM, which could help scheduling speed on the rancher server, but this would most likely be a short-lived improvement. A more robust interim solution would be to separate hosts across environments within your rancher deployment. That would limit the scope of the broadcast effect where metadata has to be updated across all hosts. We are looking at solving quite a few of these issues in the 1.2 release, but that’s due out ~Sept/Oct.
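
As a rough way to picture the broadcast effect described above, here is a minimal sketch. It assumes, as a simplification and not a measurement of Rancher's internals, that each stack launch triggers one metadata update per host in the same environment. The function name and the even split of hosts are hypothetical.

```python
def metadata_updates(stack_launches: int, hosts_in_environment: int) -> int:
    """Host updates under the assumed model: one update per host per launch."""
    return stack_launches * hosts_in_environment


# One environment holding all 50 hosts from the original post.
single = metadata_updates(stack_launches=1, hosts_in_environment=50)

# Hypothetical split: the same 50 hosts spread evenly over 5 environments.
split = metadata_updates(stack_launches=1, hosts_in_environment=50 // 5)

print(single, split)
```

Under this model the per-launch fan-out shrinks in proportion to the number of hosts left in each environment, which is why splitting hosts is expected to help more than adding RAM.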

@aemneina Sounds good. We will split into multiple environments. Any recommendation on how many stacks per environment is safe?

A guideline here would be to ensure you’re under 200 stacks / 750 containers per environment. I think that’s the upper limit, and if you stay below it, you should keep a responsive environment. Feel free to reach out if you hit any more issues beyond that.
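
To turn that guideline into a number, here is a small sketch that estimates how many environments a deployment would need. The limits come from the guideline above and are treated as inclusive caps. The function name is hypothetical, and the container count used in the example (one container per stack) is an assumption, since the original post only says each stack runs one httpd service.

```python
import math

MAX_STACKS_PER_ENV = 200      # guideline quoted above
MAX_CONTAINERS_PER_ENV = 750  # guideline quoted above


def environments_needed(stacks: int, containers: int) -> int:
    """Smallest environment count that keeps both totals within the guideline."""
    by_stacks = math.ceil(stacks / MAX_STACKS_PER_ENV)
    by_containers = math.ceil(containers / MAX_CONTAINERS_PER_ENV)
    return max(by_stacks, by_containers, 1)


# 1100 stacks as in the original post; 1100 containers is an assumption
# (one container per stack).
needed = environments_needed(stacks=1100, containers=1100)
print(needed)
```

With these inputs the stack limit is the binding one, so a setup like this would need at least six environments to stay within the guideline.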

@aemneina Thanks for all your help!

Do you have an update on this? We are still having the same issues w/ the metadata service. We could split into multiple environments, but it would be very inconvenient.

@veered you’re responding to a thread that’s over a year old. I also don’t have much context about your environment. Can you create a new issue and provide the relevant details there?

@aemneina I know it’s a year old, but I wanted the people that were having this issue to see my response. I suspect that they still are having issues, since we’ve been having this issue for more than a year.

In any case, I eventually found this issue https://github.com/rancher/rancher/issues/7751#issuecomment-313240070 and posted there. It’s precisely what I’m experiencing, so I’ll focus on the conversation there.

Briefly, when there are a lot of containers in a single environment, redeploying can peg the host CPUs at 100%, is really slow, and can sometimes cause host disconnects. This is because redeploying causes lots and lots of metadata updates, which need to be propagated to all metadata subscribers.

According to the GitHub issue I linked to, most of this time is spent encoding/decoding YAML files haha. So even if the only change was using JSON as the data format, things might be fine (since JSON encoding/decoding is like 20x faster than YAML encoding/decoding).

The reason the metadata is YAML is that YAML supports references. As JSON, the metadata would be hundreds of megabytes of redundant info, because the same information is available under many different paths.
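
To illustrate the redundancy point, here is a minimal sketch using only Python's standard `json` module. The container record and the `path_N` keys are made up for illustration and are not Rancher's actual metadata layout. JSON has no reference mechanism, so an object reachable under many paths is written out in full each time, whereas a YAML serializer that uses anchors (`&`) and aliases (`*`) can write it once and refer back to it.

```python
import json

# Hypothetical container record; the field names are illustrative only.
container = {"name": "httpd_1", "ip": "10.42.0.5", "labels": {"role": "web"}}


def metadata_tree(paths: int) -> dict:
    """Expose the same record object under `paths` different keys."""
    return {f"path_{i}": container for i in range(paths)}


# JSON writes the shared record once per path, so size grows with path count.
one = len(json.dumps(metadata_tree(1)))
hundred = len(json.dumps(metadata_tree(100)))
print(one, hundred)

# After a round trip the sharing is gone: each path holds its own copy.
loaded = json.loads(json.dumps(metadata_tree(2)))
print(loaded["path_0"] is loaded["path_1"])
```

The serialized size grows roughly linearly with the number of paths, which is the blow-up described above once the records and paths number in the thousands.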

Bummer… we have at least one host disconnect during every full deploy because of the heaviness of the metadata service.