Containers stuck in 'In progress' causing v1.1.2 upgrade to fail

We are currently running v1.1.0-dev3 and have a lot of containers stuck in ‘In progress’, which has been a long-running issue. I’m now trying to upgrade to v1.1.2, which according to the release notes should have fixed this problem.

Unfortunately, those stuck containers are preventing the new Rancher server from syncing with the hosts. All hosts are stuck at ‘RECONNECTING’ and I’m getting lots of state-related errors in the server log (example below). Is there any way I can tidy this up to allow the upgrade to work?

rancher_1   | 2016-08-19 11:22:56,031 ERROR [:] [] [] [] [ServiceReplay-8] [c.p.e.p.i.DefaultProcessInstanceImpl] final ExitReason is null, should not be
rancher_1   | 2016-08-19 11:22:56,032 ERROR [:] [] [] [] [ServiceReplay-8] [i.c.p.e.e.i.ProcessEventListenerImpl] Unknown exception running process [instance.purge:409425] on [532] java.lang.IllegalStateException: Attempt to cancel when process is still transitioning
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runDelegateLoop(DefaultProcessInstanceImpl.java:190) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.executeWithProcessInstanceLock(DefaultProcessInstanceImpl.java:157) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$1.doWithLock(DefaultProcessInstanceImpl.java:107) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$1.doWithLock(DefaultProcessInstanceImpl.java:104) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.lock.impl.AbstractLockManagerImpl$3.doWithLock(AbstractLockManagerImpl.java:40) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:13) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:37) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.execute(DefaultProcessInstanceImpl.java:104) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.eventing.impl.ProcessEventListenerImpl.processExecute(ProcessEventListenerImpl.java:74) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.server.impl.ProcessInstanceParallelDispatcher$1.runInContext(ProcessInstanceParallelDispatcher.java:27) [cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:108) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_101]
rancher_1   |   at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_101]
rancher_1   |   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_101]
rancher_1   |   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_101]
rancher_1   |   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_101]
rancher_1   | Caused by: io.cattle.platform.engine.process.impl.ProcessCancelException: State [activating] is not valid
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.preRunStateCheck(DefaultProcessInstanceImpl.java:267) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runDelegateLoop(DefaultProcessInstanceImpl.java:182) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.executeWithProcessInstanceLock(DefaultProcessInstanceImpl.java:157) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$1.doWithLock(DefaultProcessInstanceImpl.java:107) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$1.doWithLock(DefaultProcessInstanceImpl.java:104) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.lock.impl.AbstractLockManagerImpl$3.doWithLock(AbstractLockManagerImpl.java:40) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:13) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:37) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.execute(DefaultProcessInstanceImpl.java:104) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.manager.impl.DefaultProcessManager.scheduleProcessInstance(DefaultProcessManager.java:60) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.object.process.impl.DefaultObjectProcessManager.scheduleProcessInstance(DefaultObjectProcessManager.java:47) ~[cattle-framework-object-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.object.process.impl.DefaultObjectProcessManager.scheduleStandardProcess(DefaultObjectProcessManager.java:40) ~[cattle-framework-object-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.object.process.impl.DefaultObjectProcessManager.scheduleStandardProcess(DefaultObjectProcessManager.java:34) ~[cattle-framework-object-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.process.instance.InstancePurge.deleteVolumes(InstancePurge.java:73) ~[cattle-iaas-logic-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.process.instance.InstancePurge.handle(InstancePurge.java:47) ~[cattle-iaas-logic-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runHandler(DefaultProcessInstanceImpl.java:424) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$4.execute(DefaultProcessInstanceImpl.java:375) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$4.execute(DefaultProcessInstanceImpl.java:369) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.idempotent.Idempotent.execute(Idempotent.java:42) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runHandlers(DefaultProcessInstanceImpl.java:369) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runLogic(DefaultProcessInstanceImpl.java:471) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runWithProcessLock(DefaultProcessInstanceImpl.java:305) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl$2.doWithLockNoResult(DefaultProcessInstanceImpl.java:245) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:7) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:3) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.lock.impl.AbstractLockManagerImpl$3.doWithLock(AbstractLockManagerImpl.java:40) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:13) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:37) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.acquireLockAndRun(DefaultProcessInstanceImpl.java:242) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   at io.cattle.platform.engine.process.impl.DefaultProcessInstanceImpl.runDelegateLoop(DefaultProcessInstanceImpl.java:184) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
rancher_1   |   ... 20 common frames omitted
rancher_1   |

Got a little further on this. All the containers causing errors were marked as ‘purging’ in the instance table. Setting allocation_state to ‘purged’ stops the errors in the rancher logs.
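
For reference, here is a minimal sketch of that kind of change. It runs against an in-memory SQLite table as a stand-in for Rancher's real database, so it is safe to try anywhere. Only the `instance` table and the `allocation_state` column come from the description above; the `state` column name, the ids, the sample rows and the helper name `mark_stuck_as_purged` are assumptions made up for illustration. Take a database backup before running anything like this against a live server.

```python
import sqlite3

# Stand-in for the Rancher database (assumption: simplified three-column schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instance (id INTEGER PRIMARY KEY, state TEXT, allocation_state TEXT)"
)

# Invented sample rows: two containers stuck in 'purging', one healthy.
conn.executemany(
    "INSERT INTO instance (id, state, allocation_state) VALUES (?, ?, ?)",
    [(1, "purging", "active"), (2, "running", "active"), (3, "purging", "active")],
)


def mark_stuck_as_purged(conn):
    """Set allocation_state to 'purged' on rows whose state is 'purging'.

    Returns the ids of the rows that were changed.
    """
    # List the affected rows first so they can be reviewed before updating.
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM instance WHERE state = ? ORDER BY id", ("purging",)
        )
    ]
    # The connection context manager commits on success and rolls back on error.
    with conn:
        conn.executemany(
            "UPDATE instance SET allocation_state = 'purged' WHERE id = ?",
            [(i,) for i in ids],
        )
    return ids


fixed = mark_stuck_as_purged(conn)
print("updated instance ids:", fixed)
```

Selecting the ids before updating gives a list to sanity-check against the containers shown as stuck in the UI, and updating by id keeps the change limited to exactly those rows.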

However it still leaves them in the UI with cute little red bombs next to them. What can I do to make them vanish completely?

At least this means that I can proceed with the upgrade this weekend.