Heres the network manager logs:
6/24/2018 8:54:21 AMtime="2018-06-24T12:54:21Z" level=info msg="hostnat: Applying new nat rules"
6/24/2018 8:55:02 AMtime="2018-06-24T12:55:02Z" level=info msg="arpsync: Network router changed, syncing ARP tables 1/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 8:55:02 AMtime="2018-06-24T12:55:02Z" level=info msg="hostports: : Applying new rules"
6/24/2018 8:55:04 AMtime="2018-06-24T12:55:04Z" level=info msg="Setting up resolv.conf for ContainerId [91186d04f8fbcf5a8d577cc5f6f8d8e3acc03116cc707397b36791a0ab606d0d]"
6/24/2018 8:55:07 AMtime="2018-06-24T12:55:07Z" level=info msg="arpsync: Network router changed, syncing ARP tables 2/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 8:55:09 AMtime="2018-06-24T12:55:09Z" level=info msg="Setting up resolv.conf for ContainerId [0a22dc9f315fca0b3f4b6006403f19c37c63a09ad289ff124f9f2b38c2b70e39]"
6/24/2018 8:55:09 AMtime="2018-06-24T12:55:09Z" level=info msg="CNI up: cid=0a22dc9f315fca0b3f4b6006403f19c37c63a09ad289ff124f9f2b38c2b70e39 networkMode=ipsec"
6/24/2018 8:55:10 AMtime="2018-06-24T12:55:10Z" level=info msg="CNI up done: result=IP4:{IP:{IP:10.42.8.127 Mask:ffff0000} Gateway:10.42.0.1 Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:10.42.0.1}]}, DNS:{Nameservers:[] Domain: Search:[] Options:[]} cid=0a22dc9f315fca0b3f4b6006403f19c37c63a09ad289ff124f9f2b38c2b70e39 networkMode=ipsec"
6/24/2018 8:55:10 AMtime="2018-06-24T12:55:10Z" level=info msg="hostports: : Applying new rules"
6/24/2018 8:55:12 AMtime="2018-06-24T12:55:12Z" level=info msg="arpsync: Network router changed, syncing ARP tables 3/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 8:55:17 AMtime="2018-06-24T12:55:17Z" level=info msg="arpsync: Network router changed, syncing ARP tables 4/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 8:55:23 AMtime="2018-06-24T12:55:23Z" level=info msg="arpsync: Network router changed, syncing ARP tables 5/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 8:55:28 AMtime="2018-06-24T12:55:28Z" level=info msg="arpsync: Network router changed, syncing ARP tables 6/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 8:55:34 AMtime="2018-06-24T12:55:34Z" level=info msg="iptablessync: need to hook CATTLE_NETWORK_POLICY chain"
6/24/2018 8:55:34 AMtime="2018-06-24T12:55:34Z" level=info msg="iptablessync: installing cmd: iptables -w -t filter -I FORWARD 1 -s 10.42.0.0/16 -d 10.42.0.0/16 -j CATTLE_NETWORK_POLICY"
6/24/2018 8:55:34 AMtime="2018-06-24T12:55:34Z" level=info msg="iptablessync: fixing order for CATTLE_FORWARD chain"
6/24/2018 8:55:34 AMtime="2018-06-24T12:55:34Z" level=info msg="iptablessync: installing cmd: iptables -w -t filter -I FORWARD 2 -j CATTLE_FORWARD"
6/24/2018 9:10:37 AMtime="2018-06-24T13:10:37Z" level=info msg="arpsync: Network router changed, syncing ARP tables 7/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 9:10:46 AMtime="2018-06-24T13:10:46Z" level=info msg="arpsync: Network router changed, syncing ARP tables 8/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 9:10:51 AMtime="2018-06-24T13:10:51Z" level=info msg="arpsync: Network router changed, syncing ARP tables 9/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 9:10:56 AMtime="2018-06-24T13:10:56Z" level=info msg="arpsync: Network router changed, syncing ARP tables 10/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 10:10:37 AMtime="2018-06-24T14:10:37Z" level=info msg="CNI down: cid=0a22dc9f315fca0b3f4b6006403f19c37c63a09ad289ff124f9f2b38c2b70e39 networkMode=ipsec"
6/24/2018 10:11:17 AMtime="2018-06-24T14:11:17Z" level=info msg="hostports: : Applying new rules"
6/24/2018 10:32:53 AMtime="2018-06-24T14:32:53Z" level=info msg="routesync: DOCKER_BRIDGE=docker0, METADATA_IP=169.254.169.250"
6/24/2018 10:32:53 AMtime="2018-06-24T14:32:53Z" level=info msg="Listening on /tmp/log.sock"
6/24/2018 10:32:53 AMtime="2018-06-24T14:32:53Z" level=info msg="routesync: starting monitoring on bridge: docker0, for metadataIP: 169.254.169.250 every 60 seconds"
6/24/2018 10:32:53 AMtime="2018-06-24T14:32:53Z" level=info msg="Waiting for metadata"
6/24/2018 10:32:55 AMtime="2018-06-24T14:32:55Z" level=info msg="Inspecting on start: startedAt=2018-06-24T14:07:44.232225172Z running=false cid=31873dae020f4ce4009eeccc253f54b08e505dc200aa06e9a3843d40bb07c095"
6/24/2018 10:32:55 AMtime="2018-06-24T14:32:55Z" level=info msg="Inspecting on start: startedAt=2018-06-24T14:29:55.194730429Z running=false cid=0a22dc9f315fca0b3f4b6006403f19c37c63a09ad289ff124f9f2b38c2b70e39"
6/24/2018 10:32:55 AMtime="2018-06-24T14:32:55Z" level=info msg="Inspecting on start: startedAt=2018-06-24T14:32:48.845135292Z running=true cid=91186d04f8fbcf5a8d577cc5f6f8d8e3acc03116cc707397b36791a0ab606d0d"
6/24/2018 10:32:55 AMtime="2018-06-24T14:32:55Z" level=info msg="Recording previously started: startedAt=2018-06-24T14:32:48.845135292Z cid=91186d04f8fbcf5a8d577cc5f6f8d8e3acc03116cc707397b36791a0ab606d0d"
6/24/2018 10:32:55 AMtime="2018-06-24T14:32:55Z" level=info msg="Inspecting on start: startedAt=2018-06-24T14:32:54.165369795Z running=true cid=35878070ddfa1a4de5b31020094f8eb2d98e87371bf0a17433746e236c275ebe"
6/24/2018 10:32:55 AMtime="2018-06-24T14:32:55Z" level=info msg="Recording previously started: startedAt=2018-06-24T14:32:54.165369795Z cid=35878070ddfa1a4de5b31020094f8eb2d98e87371bf0a17433746e236c275ebe"
6/24/2018 10:32:55 AMtime="2018-06-24T14:32:55Z" level=info msg="Inspecting on start: startedAt=2018-06-24T14:32:52.909784567Z running=true cid=fe323674e5bc840eef2abebb4df663e13183c89399f5ad1540e7f4af108dc5ea"
6/24/2018 10:32:55 AMtime="2018-06-24T14:32:55Z" level=info msg="Recording previously started: startedAt=2018-06-24T14:32:52.909784567Z cid=fe323674e5bc840eef2abebb4df663e13183c89399f5ad1540e7f4af108dc5ea"
6/24/2018 10:32:55 AMtime="2018-06-24T14:32:55Z" level=info msg="Inspecting on start: startedAt=2018-06-24T11:47:07.026643723Z running=true cid=e2969f5f0b767409c9b837220cf5a364eeaed91fd2e80f67e2026b8d25c99613"
6/24/2018 10:32:55 AMtime="2018-06-24T14:32:55Z" level=info msg="Recording previously started: startedAt=2018-06-24T11:47:07.026643723Z cid=e2969f5f0b767409c9b837220cf5a364eeaed91fd2e80f67e2026b8d25c99613"
6/24/2018 10:32:56 AMtime="2018-06-24T14:32:56Z" level=info msg="enabling dns setup"
6/24/2018 10:32:56 AMtime="2018-06-24T14:32:56Z" level=info msg="Starting event router."
6/24/2018 10:32:56 AMtime="2018-06-24T14:32:56Z" level=info msg="Container [0a22dc9f315fca0b3f4b6006403f19c37c63a09ad289ff124f9f2b38c2b70e39] not running. Can't setup resolv.conf."
6/24/2018 10:32:56 AMtime="2018-06-24T14:32:56Z" level=info msg="Container [31873dae020f4ce4009eeccc253f54b08e505dc200aa06e9a3843d40bb07c095] not running. Can't setup resolv.conf."
6/24/2018 10:32:56 AMtime="2018-06-24T14:32:56Z" level=info msg="Setting up resolv.conf for ContainerId [91186d04f8fbcf5a8d577cc5f6f8d8e3acc03116cc707397b36791a0ab606d0d]"
6/24/2018 10:32:56 AMtime="2018-06-24T14:32:56Z" level=info msg="hostnat: Applying new nat rules"
6/24/2018 10:32:56 AMtime="2018-06-24T14:32:56Z" level=info msg="hostports: : Applying new rules"
6/24/2018 10:32:56 AMtime="2018-06-24T14:32:56Z" level=info msg="arpsync: Network router changed, syncing ARP tables 1/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 10:32:59 AMtime="2018-06-24T14:32:59Z" level=info msg="Setting up resolv.conf for ContainerId [0a22dc9f315fca0b3f4b6006403f19c37c63a09ad289ff124f9f2b38c2b70e39]"
6/24/2018 10:32:59 AMtime="2018-06-24T14:32:59Z" level=info msg="CNI up: cid=0a22dc9f315fca0b3f4b6006403f19c37c63a09ad289ff124f9f2b38c2b70e39 networkMode=ipsec"
6/24/2018 10:32:59 AMtime="2018-06-24T14:32:59Z" level=info msg="CNI up done: result=IP4:{IP:{IP:10.42.8.127 Mask:ffff0000} Gateway:10.42.0.1 Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:10.42.0.1}]}, DNS:{Nameservers:[] Domain: Search:[] Options:[]} cid=0a22dc9f315fca0b3f4b6006403f19c37c63a09ad289ff124f9f2b38c2b70e39 networkMode=ipsec"
6/24/2018 10:33:01 AMtime="2018-06-24T14:33:01Z" level=info msg="arpsync: Network router changed, syncing ARP tables 2/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 10:33:06 AMtime="2018-06-24T14:33:06Z" level=info msg="arpsync: Network router changed, syncing ARP tables 3/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 10:33:12 AMtime="2018-06-24T14:33:12Z" level=info msg="arpsync: Network router changed, syncing ARP tables 4/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 10:33:17 AMtime="2018-06-24T14:33:17Z" level=info msg="arpsync: Network router changed, syncing ARP tables 5/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 4:39:12 PMtime="2018-06-24T20:39:12Z" level=info msg="arpsync: Network router changed, syncing ARP tables 6/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 8:17:46 PMtime="2018-06-25T00:17:46Z" level=info msg="arpsync: Network router changed, syncing ARP tables 7/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 8:17:52 PMtime="2018-06-25T00:17:52Z" level=info msg="Setting up resolv.conf for ContainerId [0e8515bf4e87918a71b28a26e54bb652bf371194457c282aa478cb5d60497687]"
6/24/2018 8:17:52 PMtime="2018-06-25T00:17:52Z" level=info msg="CNI up: cid=0e8515bf4e87918a71b28a26e54bb652bf371194457c282aa478cb5d60497687 networkMode=ipsec"
6/24/2018 8:17:52 PMtime="2018-06-25T00:17:52Z" level=info msg="CNI up done: result=IP4:{IP:{IP:10.42.212.154 Mask:ffff0000} Gateway:10.42.0.1 Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:10.42.0.1}]}, DNS:{Nameservers:[] Domain: Search:[] Options:[]} cid=0e8515bf4e87918a71b28a26e54bb652bf371194457c282aa478cb5d60497687 networkMode=ipsec"
6/24/2018 8:17:57 PMtime="2018-06-25T00:17:57Z" level=info msg="arpsync: Network router changed, syncing ARP tables 8/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 8:18:02 PMtime="2018-06-25T00:18:02Z" level=info msg="arpsync: Network router changed, syncing ARP tables 9/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 8:18:07 PMtime="2018-06-25T00:18:07Z" level=info msg="arpsync: Network router changed, syncing ARP tables 10/10 in containers, new MAC: 02:c4:2a:e7:ad:ee"
6/24/2018 8:18:17 PMtime="2018-06-25T00:18:17Z" level=info msg="CNI down: cid=0e8515bf4e87918a71b28a26e54bb652bf371194457c282aa478cb5d60497687 networkMode=ipsec"
At this point IPSec has been trying to activate since this morning:
We have one rancher server, running 4 different environments but occasionally 5 environments. The rancher server is running a t2.xlarge AWS instance ( 4 cpu / 16GB Ram ).
Sample
1 host of t2.medium
3 containers
Staging
1 host of t2.large
2 hosts of t2.medium
16 Containers
Production
6 hosts of t2.large
46 containers
The container count from above is just our applications, it doesn’t include the rancher infrastructure stack containers. Additionally, each environment runs the Janitor stack installed from the catalog.
The dev site is currently what is problematic, but that typically runs in operating environment similar to staging. Running Mysql in RDS on a db.t2.medium class machine for rancher server. Activity graphs, for RDS look okay.
The rancher server log is filled with exceptions, some that seem to coincide with the delayed processes mentioned above.
The file is large so I will just post snippets that seem relavant:
2018-06-23 17:51:42,521 ERROR [634250ec-93ca-4a53-b115-e296c18b9ec4:2125710] [port:36047] [port.purge] [] [ecutorService-5] [.p.p.c.g.GenericResourceProcessState] Resource is null, can't find resource id [36047]
2018-06-23 17:51:56,858 ERROR [:] [] [] [] [TaskScheduler-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [595] count [3]
2018-06-23 17:51:56,858 ERROR [:] [] [] [] [TaskScheduler-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [702] count [3]
2018-06-23 17:51:56,858 ERROR [:] [] [] [] [TaskScheduler-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [4130] count [3]
2018-06-23 17:51:59,263 ERROR [c97eebe9-8b5f-4eaa-a112-770875ed3d6c:2155985] [instance:73773] [instance.start->(InstanceStart)] [] [ecutorService-6] [i.c.p.process.instance.InstanceStart] Failed [Dependencies readiness error instance hasn't gotten passed creating state] for instance [73773]
2018-06-23 17:51:59,357 ERROR [1608cd10-05ea-41b7-a2a8-8f65eab8566e:2178577] [instance:74247] [instance.start->(InstanceStart)] [] [cutorService-14] [i.c.p.process.instance.InstanceStart] Failed [Dependencies readiness error instance hasn't gotten passed creating state] for instance [74247]
2018-06-23 17:51:59,505 ERROR [80895b57-46f2-42de-adc7-5a7dbd80bb34:2151718] [instance:73578] [instance.start->(InstanceStart)] [] [cutorService-33] [i.c.p.process.instance.InstanceStart] Failed [Dependencies readiness error instance hasn't gotten passed creating state] for instance [73578]
2018-06-23 17:52:01,865 ERROR [:] [] [] [] [TaskScheduler-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [595] count [4]
2018-06-23 17:52:01,865 ERROR [:] [] [] [] [TaskScheduler-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [702] count [4]
2018-06-23 17:52:01,866 ERROR [:] [] [] [] [TaskScheduler-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [4130] count [4]
2018-06-23 17:52:06,890 ERROR [:] [] [] [] [TaskScheduler-2] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [595] count [5]
2018-06-23 17:52:06,891 ERROR [:] [] [] [] [TaskScheduler-2] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [702] count [5]
2018-06-23 17:52:06,891 ERROR [:] [] [] [] [TaskScheduler-2] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [4130] count [5]
2018-06-23 17:52:11,906 ERROR [:] [] [] [] [TaskScheduler-2] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [595] count [6]
2018-06-23 17:52:11,908 ERROR [:] [] [] [] [TaskScheduler-2] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Scheduling reconnect for agent [595] host [70] count [6]
2018-06-23 17:52:13,865 ERROR [:] [] [] [] [TaskScheduler-2] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [4130] count [6]
2018-06-23 17:52:13,867 ERROR [:] [] [] [] [TaskScheduler-2] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Scheduling reconnect for agent [4130] host [263] count [6]
2018-0
2018-06-23 18:00:25,697 ERROR [:] [] [] [] [cutorService-12] [.e.s.i.ProcessInstanceDispatcherImpl] Unknown exception running process [mount.create:2193964] on [40465], canceled by [State [deactivating] is not valid for process [volume.activate:2193975] on resource [29570]]
2018-06-23 18:06:29,134 ERROR [1f96e72c-e539-4aca-8403-304cd6a5a1f0:2135483] [instance:72853] [instance.start->(InstanceStart)] [] [cutorService-17] [i.c.p.process.instance.InstanceStart] Failed [Dependencies readiness error instance is not running] for instance [72853]
2018-06-23 18:06:46,179 ERROR [1f50fd80-e499-46df-8e49-9572a2e7b77c:2099974] [instance:71475] [instance.start->(InstanceStart)] [] [cutorService-20] [i.c.p.process.instance.InstanceStart] Failed [Dependencies readiness error instance is not running] for instance [71475]
2018-06-23 18:10:16,888 ERROR [7543cafd-424d-4957-b520-44e3d83ace97:2078612] [containerEvent:254621] [containerevent.create] [] [cutorService-25] [.p.p.c.g.GenericResourceProcessState] Resource is null, can't find resource id [254621]
2018-06-23 18:10:18,072 ERROR [f6e8558b-693f-48b3-8c82-c6eac9904a96:2174538] [instance:74173] [instance.start->(InstanceStart)] [] [cutorService-27] [i.c.p.process.instance.InstanceStart] Failed [Dependencies readiness error instance hasn't gotten passed creating state] for instance [74173]
2018-06-23 18:10:33,907 ERROR [:] [] [] [] [cutorService-10] [o.a.c.m.context.NoExceptionRunnable ] Uncaught exception org.jooq.exception.DataChangedException: Database record has been changed
at org.jooq.impl.UpdatableRecordImpl.checkIfChanged(UpdatableRecordImpl.java:550)
at org.jooq.impl.UpdatableRecordImpl.storeUpdate0(UpdatableRecordImpl.java:291)
at org.jooq.impl.UpdatableRecordImpl.access$200(UpdatableRecordImpl.java:90)
at org.jooq.impl.UpdatableRecordImpl$3.operate(UpdatableRecordImpl.java:260)
at org.jooq.impl.RecordDelegate.operate(RecordDelegate.java:123)
at org.jooq.impl.UpdatableRecordImpl.storeUpdate(UpdatableRecordImpl.java:255)
at org.jooq.impl.UpdatableRecordImpl.update(UpdatableRecordImpl.java:149)
at io.cattle.platform.object.impl.JooqObjectManager.persistRecord(JooqObjectManager.java:228)
at io.cattle.platform.object.impl.JooqObjectManager.setFieldsInternal(JooqObjectManager.java:133)
at io.cattle.platform.object.impl.JooqObjectManager$3.execute(JooqObjectManager.java:121)
at io.cattle.platform.engine.idempotent.Idempotent.change(Idempotent.java:88)
at io.cattle.platform.object.impl.JooqObjectManager.setFields(JooqObjectManager.java:118)
at io.cattle.platform.object.impl.JooqObjectManager.setFields(JooqObjectManager.java:113)
at io.cattle.platform.object.impl.AbstractObjectManager.setFields(AbstractObjectManager.java:139)
at io.cattle.platform.process.progress.ProcessProgressContext.update(ProcessProgressContext.java:73)
at io.cattle.platform.process.progress.ProcessProgressContext.messsage(ProcessProgressContext.java:63)
at io.cattle.platform.process.common.handler.AgentBasedProcessLogic$1.progress(AgentBasedProcessLogic.java:167)
at io.cattle.platform.agent.impl.EventCallProgressHelper$1.progress(EventCallProgressHelper.java:27)
at io.cattle.platform.agent.impl.EventCallProgressHelper$1.progress(EventCallProgressHelper.java:27)
at io.cattle.platform.eventing.impl.FutureEventListener.onEvent(FutureEventListener.java:48)
at io.cattle.platform.eventing.impl.AbstractThreadPoolingEventService$2.doRun(AbstractThreadPoolingEventService.java:141)
at org.apache.cloudstack.managed.context.NoExceptionRunnable.runInContext(NoExceptionRunnable.java:15)
at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:108)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2018-06-23 18:12:22,126 ERROR [:] [] [] [] [ecutorService-2] [.e.s.i.ProcessInstanceDispatcherImpl] Unknown exception running process [instance.start:2151964] on [73587] io.cattle.platform.util.exception.ExecutionException: Dependencies readiness error
at io.cattle.platform.process.instance.InstanceStart.waitForDeploymentUnitCreate(InstanceStart.java:233)
at io.cattle.platform.process.instance.InstanceStart.handle(InstanceStart.java:122)
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runHandler(DefaultProcessInstanceImpl.java:470)
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$4.execute(DefaultProcessInstanceImpl.java:421)
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$4.execute(DefaultProcessInstanceImpl.java:415)
at io.cattle.platform.engine.idempotent.Idempotent.execute(Idempotent.java:42)
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runHandlers(DefaultProcessInstanceImpl.java:415)
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runLogic(DefaultProcessInstanceImpl.java:517)
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runWithProcessLock(DefaultProcessInstanceImpl.java:345)
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$2.doWithLockNoResult(DefaultProcessInstanceImpl.java:245)
at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:7)
at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:3)
at io.cattle.platform.lock.impl.AbstractLockManagerImpl$3.doWithLock(AbstractLockManagerImpl.java:40)
at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33)
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:13)
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:37)
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.acquireLockAndRun(DefaultProcessInstanceImpl.java:242)
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runDelegateLoop(DefaultProcessInstanceImpl.java:184)
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.executeWithProcessInstanceLock(DefaultProcessInstanceImpl.java:157)
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$1.doWithLock(DefaultProcessInstanceImpl.java:116)
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$1.doWithLock(DefaultProcessInstanceImpl.java:113)
at io.cattle.platform.lock.impl.AbstractLockManagerImpl$3.doWithLock(AbstractLockManagerImpl.java:40)
at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33)
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:13)
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:37)
at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.execute(DefaultProcessInstanceImpl.java:113)
at io.cattle.platform.engine.server.impl.ProcessInstanceDispatcherImpl.processExecuteWithLock(ProcessInstanceDispatcherImpl.java:136)
at io.cattle.platform.engine.server.impl.ProcessInstanceDispatcherImpl$1.doWithLock(ProcessInstanceDispatcherImpl.java:123)
at io.cattle.platform.engine.server.impl.ProcessInstanceDispatcherImpl$1.doWithLock(ProcessInstanceDispatcherImpl.java:120)
at io.cattle.platform.lock.impl.AbstractLockManagerImpl$4.doWithLock(AbstractLockManagerImpl.java:50)
at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33)
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.tryLock(AbstractLockManagerImpl.java:25)
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.tryLock(AbstractLockManagerImpl.java:47)
at io.cattle.platform.engine.server.impl.ProcessInstanceDispatcherImpl.execute(ProcessInstanceDispatcherImpl.java:120)
at io.cattle.platform.engine.server.ProcessInstanceReference.doRun(ProcessInstanceReference.java:59)
at org.apache.cloudstack.managed.context.NoExceptionRunnable.runInContext(NoExceptionRunnable.java:15)
at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:108)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
time="2018-06-23T18:14:26Z" level=info msg="Rollback Stack Event Received" eventId=85f9df2d-2da4-4ea4-b9c1-0d15ee5880e6 resourceId=1st627
time="2018-06-23T18:14:29Z" level=info msg="Timeout waiting for network-manager to finish"
time="2018-06-23T18:14:29Z" level=info msg="Timeout processing stack:1st627"
Failed [Dependencies readiness error instance hasn't gotten passed creating state] for instance [73059]
2018-06-23 19:46:42,928 ERROR [:] [] [] [] [cutorService-60] [.e.s.i.ProcessInstanceDispatcherImpl] Unknown exception running process [mount.create:2197811] on [40489], canceled by [State [deactivating] is not valid for process [volume.activate:2197816] on resource [25867]]
2018-06-23 19:46:42,928 ERROR [:] [] [] [] [cutorService-52] [.e.s.i.ProcessInstanceDispatcherImpl] Unknown exception running process [mount.create:2197810] on [40488], canceled by [State [deactivating] is not valid for process [volume.activate:2197815] on resource [25880]]
2018-06-23 19:46:42,977 ERROR [:] [] [] [] [cutorService-63] [.e.s.i.ProcessInstanceDispatcherImpl] Unknown exception running process [mount.create:2197813] on [40491], canceled by [State [deactivating] is not valid for process [volume.activate:2197817] on resource [25882]]
2018-06-23 19:47:43,541 ERROR [:] [] [] [] [cutorService-52] [.e.s.i.ProcessInstanceDispatcherImpl] Unknown exception running process [mount.create:2197845] on [40491], canceled by [State [deactivating] is not valid for process [volume.activate:2197848] on resource [25882]]
While the logs seem to indicate issues, its not very clear on how to proceed at this point. Please advise. Also if possible can you have someone reach to me at jweber@edge2web.com to discuss support options. I haven’t received anything after filling out my request online.
Also, if we there is any chance for some one to diagnose these issues in realtime I am sure we can arrange it on our end.
Thanks
J