I’m running Rancher (v1.1.2) on my server (Ubuntu 14.04 LTS), and I’m noticing countless java processes showing up, using an ever-increasing amount of memory. I asked a friend who also runs Rancher in a similar setup whether he has experienced this, but he says he doesn’t see it on his machine. As of writing this, there are 136 instances of
[code]java -Xms128m -Xmx2g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/cattle/logs -Dlogback.bootstrap.level=WARN -Xmx4096m -cp /usr/share/cattle/be21b2bf0c1a2d74b75c887ce9982c6e:/usr/share/cattle/be21b2bf0c1a2d74b75c887ce9982c6e/etc/cattle io.cattle.platform.launcher.Main[/code]
using 6882M of virtual memory.
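One thing I’m not sure about is whether these are really 136 separate JVMs or threads of a single one, since some process viewers (htop, for example) list each thread as its own row. A minimal sketch for telling the two apart, assuming a Linux /proc filesystem; the function name `count_matching` is made up for illustration:

```python
import os


def count_matching(marker, proc_root="/proc"):
    """Return (process_count, thread_count) for every process whose
    command line contains `marker`."""
    processes = 0
    threads = 0
    if not os.path.isdir(proc_root):
        return processes, threads
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are PIDs
        pid_dir = os.path.join(proc_root, entry)
        try:
            # cmdline holds the arguments separated by NUL bytes
            with open(os.path.join(pid_dir, "cmdline"), "rb") as fh:
                cmdline = fh.read().replace(b"\0", b" ").decode("utf-8", "replace")
            if marker not in cmdline:
                continue
            # each thread of the process has one directory under task/
            task_count = len(os.listdir(os.path.join(pid_dir, "task")))
        except OSError:
            continue  # process exited mid-scan, or access was denied
        processes += 1
        threads += task_count
    return processes, threads


if __name__ == "__main__":
    procs, thrs = count_matching("io.cattle.platform.launcher.Main")
    print("processes: %d, threads: %d" % (procs, thrs))
```

If this reports one process with many threads, the long list would just be the threads of a single Cattle JVM rather than separate processes. It has to run where the JVM’s PIDs are visible, i.e. on the host or inside the rancher_server container.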
I tried searching through the logs in the rancher_server and rancher-agent containers, and found the following:
rancher_server: /var/lib/cattle/logs/cattle-error.log
[code]2016-07-27 00:04:41,794 ERROR [:] [] [] [] [cutorService-24] [i.c.p.e.e.i.ProcessEventListenerImpl] Unknown exception running process [instance.purge:127] on [2] org.jooq.exception.DataChangedException: Database record has been changed
at org.jooq.impl.UpdatableRecordImpl.checkIfChanged(UpdatableRecordImpl.java:550) ~[jooq-3.3.0.jar:na]
at org.jooq.impl.UpdatableRecordImpl.storeUpdate0(UpdatableRecordImpl.java:291) ~[jooq-3.3.0.jar:na]
at org.jooq.impl.UpdatableRecordImpl.access$200(UpdatableRecordImpl.java:90) ~[jooq-3.3.0.jar:na]
at org.jooq.impl.UpdatableRecordImpl$3.operate(UpdatableRecordImpl.java:260) ~[jooq-3.3.0.jar:na]
at org.jooq.impl.RecordDelegate.operate(RecordDelegate.java:123) ~[jooq-3.3.0.jar:na]
at org.jooq.impl.UpdatableRecordImpl.storeUpdate(UpdatableRecordImpl.java:255) ~[jooq-3.3.0.jar:na]
at org.jooq.impl.UpdatableRecordImpl.update(UpdatableRecordImpl.java:149) ~[jooq-3.3.0.jar:na]
at io.cattle.platform.object.impl.JooqObjectManager.persistRecord(JooqObjectManager.java:223) ~[cattle-framework-object-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.object.impl.JooqObjectManager.setFieldsInternal(JooqObjectManager.java:130) ~[cattle-framework-object-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.object.impl.JooqObjectManager$3.execute(JooqObjectManager.java:118) ~[cattle-framework-object-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.idempotent.Idempotent.change(Idempotent.java:88) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.object.impl.JooqObjectManager.setFields(JooqObjectManager.java:115) ~[cattle-framework-object-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.object.impl.JooqObjectManager.setFields(JooqObjectManager.java:110) ~[cattle-framework-object-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.process.common.generic.GenericResourceProcessState.applyData(GenericResourceProcessState.java:96) ~[cattle-iaas-logic-common-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runHandler(DefaultProcessInstanceImpl.java:440) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$4.execute(DefaultProcessInstanceImpl.java:375) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$4.execute(DefaultProcessInstanceImpl.java:369) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.idempotent.Idempotent.execute(Idempotent.java:42) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runHandlers(DefaultProcessInstanceImpl.java:369) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runLogic(DefaultProcessInstanceImpl.java:471) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runWithProcessLock(DefaultProcessInstanceImpl.java:305) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$2.doWithLockNoResult(DefaultProcessInstanceImpl.java:245) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:7) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:3) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.AbstractLockManagerImpl$3.doWithLock(AbstractLockManagerImpl.java:40) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:13) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:37) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.acquireLockAndRun(DefaultProcessInstanceImpl.java:242) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runDelegateLoop(DefaultProcessInstanceImpl.java:184) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.executeWithProcessInstanceLock(DefaultProcessInstanceImpl.java:157) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$1.doWithLock(DefaultProcessInstanceImpl.java:107) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$1.doWithLock(DefaultProcessInstanceImpl.java:104) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.AbstractLockManagerImpl$3.doWithLock(AbstractLockManagerImpl.java:40) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:13) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:37) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.execute(DefaultProcessInstanceImpl.java:104) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.eventing.impl.ProcessEventListenerImpl.processExecute(ProcessEventListenerImpl.java:74) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.eventing.impl.ProcessEventListenerImpl.processExecute(ProcessEventListenerImpl.java:56) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at sun.reflect.GeneratedMethodAccessor475.invoke(Unknown Source) ~[na:na]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_101]
at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_101]
at io.cattle.platform.eventing.annotation.MethodInvokingListener$1.doWithLockNoResult(MethodInvokingListener.java:76) [cattle-framework-eventing-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:7) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:3) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.AbstractLockManagerImpl$3.doWithLock(AbstractLockManagerImpl.java:40) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:13) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:37) [cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.eventing.annotation.MethodInvokingListener.onEvent(MethodInvokingListener.java:72) [cattle-framework-eventing-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.eventing.impl.AbstractThreadPoolingEventService$2.doRun(AbstractThreadPoolingEventService.java:135) [cattle-framework-eventing-0.5.0-SNAPSHOT.jar:na]
at org.apache.cloudstack.managed.context.NoExceptionRunnable.runInContext(NoExceptionRunnable.java:15) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:108) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_101]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_101]
2016-07-27 05:35:38,860 ERROR [:] [] [] [] [ecutorService-7] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [1] count [3][/code]
And on rancher-agent: /var/lib/rancher/agent.log
[code]2016-07-27 21:28:34,225 ERROR agent [140398882495536] [event.py:112] Error in request : 90039c50-5a1b-4b6e-8409-78db2aa6d0ac
Traceback (most recent call last):
  File "/var/lib/cattle/pyagent/cattle/agent/event.py", line 95, in _worker_main
    resp = agent.execute(req)
  File "/var/lib/cattle/pyagent/cattle/agent/__init__.py", line 15, in execute
    return self._router.route(req)
  File "/var/lib/cattle/pyagent/cattle/plugins/core/event_router.py", line 13, in route
    resp = handler.execute(req)
  File "/var/lib/cattle/pyagent/cattle/plugins/core/event_handlers.py", line 32, in execute
    type.on_ping(event, resp)
  File "/var/lib/cattle/pyagent/cattle/plugins/docker/compute.py", line 126, in on_ping
    self._add_instances(ping, pong)
  File "/var/lib/cattle/pyagent/cattle/plugins/docker/compute.py", line 138, in _add_instances
    running, nonrunning = self._get_all_containers_by_state()
  File "/var/lib/cattle/pyagent/cattle/plugins/docker/compute.py", line 171, in _get_all_containers_by_state
    for c in client.containers(all=True):
  File "/var/lib/cattle/pyagent/dist/docker/api/container.py", line 69, in containers
    res = self._result(self._get(u, params=params), True)
  File "/var/lib/cattle/pyagent/dist/docker/utils/decorators.py", line 47, in inner
    return f(self, *args, **kwargs)
  File "/var/lib/cattle/pyagent/dist/docker/client.py", line 112, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/var/lib/cattle/pyagent/dist/requests/sessions.py", line 487, in get
    return self.request('GET', url, **kwargs)
  File "/var/lib/cattle/pyagent/dist/requests/sessions.py", line 475, in request
    resp = self.send(prep, **send_kwargs)
  File "/var/lib/cattle/pyagent/dist/requests/sessions.py", line 585, in send
    r = adapter.send(request, **kwargs)
  File "/var/lib/cattle/pyagent/dist/requests/adapters.py", line 479, in send
    raise ReadTimeout(e, request=request)
ReadTimeout: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=2)
2016-07-27 21:29:32,246 INFO requests.packages.urllib3.connectionpool [140398882495536] [connectionpool.py:248] Resetting dropped connection: <hostname>[/code]
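The agent traceback shows the Docker client giving up after a 2-second read timeout while listing all containers over the local Unix socket. To check whether the daemon itself is slow to answer, something like the following could time the same kind of request directly. This is only a sketch: the socket path /var/run/docker.sock is the usual default and an assumption about this setup, and `time_container_list` is a made-up name for illustration.

```python
import socket
import time


def time_container_list(sock_path="/var/run/docker.sock", timeout=2.0):
    """Send GET /containers/json?all=1 over the Docker Unix socket.

    Returns (status_line, seconds_until_status_line). Raises
    socket.timeout if nothing arrives within `timeout` seconds.
    """
    request = (
        b"GET /containers/json?all=1 HTTP/1.0\r\n"
        b"Host: localhost\r\n"
        b"\r\n"
    )
    start = time.time()
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        sock.connect(sock_path)
        sock.sendall(request)
        data = b""
        # read only until the first line of the response is complete
        while b"\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    finally:
        sock.close()
    elapsed = time.time() - start
    status_line = data.split(b"\r\n", 1)[0].decode("ascii", "replace")
    return status_line, elapsed
```

Calling `time_container_list()` on the host (with permission to read the socket) a few times would show whether the daemon regularly takes longer than the 2 seconds the agent allows; raising `timeout` lets the call wait long enough to see how slow the response actually is.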
I’ve searched for an explanation or a solution, checked the GitHub bug reports, and searched the forums, but to no avail. Could I get some help figuring out what’s going on? I’d truly appreciate it.