I just upgraded from 1.1.4 to 1.2.0 on Debian 8 (GCE base image with a few config changes).
My load balancers are currently failing to upgrade, with this message in the UI:
lb (Expected state running but got error: Timeout getting IP address)
One of my services also failed to upgrade.
Generally speaking, this has changed my expectations around Rancher upgrades compared to the smooth upgrades of the past. It's one thing for the Rancher server to have problems, but this impacted the availability of my services. I really love Rancher and what you're all building, but this kinda sucks :-(.
During the launch of the new Rancher container, I saw:
time="2016-12-02T17:24:55Z" level=info msg="Updating machine jsons for [azure packet amazonec2 azure digitalocean google]"
time="2016-12-02T17:24:57Z" level=info msg="Creating schema machine, roles [service]" id=1ds39
2016-12-02 17:24:57,977 ERROR [:] [] [] [] [TaskScheduler-1] [i.c.p.core.cleanup.TableCleanup ] SQL [delete from `instance` where `instance`.`id` in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]; Cannot delete or update a parent row: a foreign key constraint fails (`cattle`.`network_service_provider_instance_map`, CONSTRAINT `fk_network_service_provider_instance_map__instance_id` FOREIGN KEY (`instance_id`) REFERENCES `instance` (`id`) ON DELETE NO ACTIO)
time="2016-12-02T17:24:58Z" level=info msg="Creating schema host, roles [service]" id=1ds40
2016-12-02 17:24:58,552 WARN [:] [] [] [] [TaskScheduler-1] [i.c.p.core.cleanup.TableCleanup ] [Rows Skipped] volume=73 host=15 physical_host=15 service_index=100 service=100 environment=2 credential=2 storage_pool=16 agent=100 account=100
time="2016-12-02T17:25:00Z" level=info msg="Creating schema machine, roles [project member owner]" id=1ds41
After launch, I've seen these types of log errors:
time="2016-12-02T17:27:24Z" level=info msg="Stack Create Event Done" eventId=3c3f91e2-1cc8-4dc0-98bd-8cb5aba6e94e resourceId=1st54
2016-12-02 17:27:25,055 ERROR [:] [] [] [] [ecutorService-1] [o.a.c.m.context.NoExceptionRunnable ] Uncaught exception org.jooq.exception.DataChangedException: Database record has been changed
at org.jooq.impl.UpdatableRecordImpl.checkIfChanged(UpdatableRecordImpl.java:550) ~[jooq-3.3.0.jar:na]
at org.jooq.impl.UpdatableRecordImpl.storeUpdate0(UpdatableRecordImpl.java:291) ~[jooq-3.3.0.jar:na]
at org.jooq.impl.UpdatableRecordImpl.access$200(UpdatableRecordImpl.java:90) ~[jooq-3.3.0.jar:na]
at org.jooq.impl.UpdatableRecordImpl$3.operate(UpdatableRecordImpl.java:260) ~[jooq-3.3.0.jar:na]
at org.jooq.impl.RecordDelegate.operate(RecordDelegate.java:123) ~[jooq-3.3.0.jar:na]
at org.jooq.impl.UpdatableRecordImpl.storeUpdate(UpdatableRecordImpl.java:255) ~[jooq-3.3.0.jar:na]
at org.jooq.impl.UpdatableRecordImpl.update(UpdatableRecordImpl.java:149) ~[jooq-3.3.0.jar:na]
at io.cattle.platform.object.impl.JooqObjectManager.persistRecord(JooqObjectManager.java:223) ~[cattle-framework-object-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.object.impl.JooqObjectManager.setFieldsInternal(JooqObjectManager.java:130) ~[cattle-framework-object-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.object.impl.JooqObjectManager$3.execute(JooqObjectManager.java:118) ~[cattle-framework-object-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.engine.idempotent.Idempotent.change(Idempotent.java:88) ~[cattle-framework-engine-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.object.impl.JooqObjectManager.setFields(JooqObjectManager.java:115) ~[cattle-framework-object-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.object.impl.JooqObjectManager.setFields(JooqObjectManager.java:110) ~[cattle-framework-object-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.object.impl.AbstractObjectManager.setFields(AbstractObjectManager.java:135) ~[cattle-framework-object-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.systemstack.listener.SystemStackUpdate.startStacks(SystemStackUpdate.java:142) ~[cattle-system-stack-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.systemstack.listener.SystemStackUpdate.process(SystemStackUpdate.java:175) ~[cattle-system-stack-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.systemstack.listener.SystemStackUpdate$1.run(SystemStackUpdate.java:102) ~[cattle-system-stack-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.configitem.version.impl.ConfigItemStatusManagerImpl$1.doWithLock(ConfigItemStatusManagerImpl.java:92) ~[cattle-config-item-common-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.AbstractLockManagerImpl$4.doWithLock(AbstractLockManagerImpl.java:50) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.tryLock(AbstractLockManagerImpl.java:25) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.tryLock(AbstractLockManagerImpl.java:47) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.configitem.version.impl.ConfigItemStatusManagerImpl.runUpdateForEvent(ConfigItemStatusManagerImpl.java:85) ~[cattle-config-item-common-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.systemstack.listener.SystemStackUpdate.globalServiceUpdate(SystemStackUpdate.java:98) ~[cattle-system-stack-0.5.0-SNAPSHOT.jar:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_72]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_72]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_72]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_72]
at io.cattle.platform.eventing.annotation.MethodInvokingListener$1.doWithLockNoResult(MethodInvokingListener.java:76) ~[cattle-framework-eventing-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:7) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.LockCallbackNoReturn.doWithLock(LockCallbackNoReturn.java:3) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.AbstractLockManagerImpl$3.doWithLock(AbstractLockManagerImpl.java:40) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.LockManagerImpl.doLock(LockManagerImpl.java:33) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:13) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.lock.impl.AbstractLockManagerImpl.lock(AbstractLockManagerImpl.java:37) ~[cattle-framework-lock-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.eventing.annotation.MethodInvokingListener.onEvent(MethodInvokingListener.java:72) ~[cattle-framework-eventing-0.5.0-SNAPSHOT.jar:na]
at io.cattle.platform.eventing.impl.AbstractThreadPoolingEventService$2.doRun(AbstractThreadPoolingEventService.java:140) ~[cattle-framework-eventing-0.5.0-SNAPSHOT.jar:na]
at org.apache.cloudstack.managed.context.NoExceptionRunnable.runInContext(NoExceptionRunnable.java:15) ~[cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:108) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46) [cattle-framework-managed-context-0.5.0-SNAPSHOT.jar:na]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_72]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_72]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
2016-12-02 17:28:09,717 ERROR [9722e583-271a-4372-9559-9ba25f519ee3:1541878] [instance:3949] [instance.start->(InstanceStart)] [] [ecutorService-2] [i.c.p.process.instance.InstanceStart] Failed to Waiting for dependencies for instance [3949]
2016-12-02 17:28:10,228 ERROR [8f78d6b0-393d-4e94-b241-10bf4960ecd1:1541893] [instance:3952] [instance.start->(InstanceStart)] [] [cutorService-13] [i.c.p.process.instance.InstanceStart] Failed to Waiting for deployment unit instances to create for instance [3952]
2016-12-02 17:28:11,633 ERROR [8b6b5fef-9ceb-44cf-aab8-7d9c0ac58e48:1541900] [instance:3957] [instance.start->(InstanceStart)] [] [ecutorService-5] [i.c.p.process.instance.InstanceStart] Failed to Waiting for dependencies for instance [3957]
2016-12-02 17:28:16,209 ERROR [6fb2cdee-5cb4-4bf4-ab44-6dbc5f6e74d7:1541912] [instance:3960] [instance.start->(InstanceStart)] [] [ecutorService-7] [i.c.p.process.instance.InstanceStart] Failed to Waiting for deployment unit instances to create for instance [3960]
2016-12-02 17:28:16,425 ERROR [4789ca65-5af7-4dc2-92ac-ee969164fc14:1541917] [instance:3965] [instance.start->(InstanceStart)] [] [cutorService-12] [i.c.p.process.instance.InstanceStart] Failed to Waiting for dependencies for instance [3965]
2016-12-02 17:28:17,887 ERROR [32518592-3846-4df8-930e-f80352e6f068:1541864] [instance:3948->instanceHostMap:3215] [instance.start->(InstanceStart)->instancehostmap.activate] [] [ecutorService-9] [c.p.e.p.i.DefaultProcessInstanceImpl] Agent error for [compute.instance.activate.reply;agent=340]: Timeout getting IP address
2016-12-02 17:28:17,887 ERROR [32518592-3846-4df8-930e-f80352e6f068:1541864] [instance:3948] [instance.start->(InstanceStart)] [] [ecutorService-9] [i.c.p.process.instance.InstanceStart] Failed [1/2] to Starting for instance [3948]
2016-12-02 17:28:17,984 ERROR [32518592-3846-4df8-930e-f80352e6f068:1541864] [instance:3948->instanceHostMap:3215] [instance.start->(InstanceStart)->instancehostmap.activate] [] [ecutorService-9] [c.p.e.p.i.DefaultProcessInstanceImpl] Agent error for [compute.instance.activate.reply;agent=340]: Timeout getting IP address
2016-12-02 17:28:17,984 ERROR [32518592-3846-4df8-930e-f80352e6f068:1541864] [instance:3948] [instance.start->(InstanceStart)] [] [ecutorService-9] [i.c.p.process.instance.InstanceStart] Failed [2/2] to Starting for instance [3948]
2016-12-02 17:28:23,141 ERROR [3a514ddd-7033-4d0f-bc5e-086c3f790510:1541856] [instance:3945->instanceHostMap:3213] [instance.start->(InstanceStart)->instancehostmap.activate] [] [ecutorService-1] [c.p.e.p.i.DefaultProcessInstanceImpl] Agent error for [compute.instance.activate.reply;agent=340]: Timeout getting IP address
2016-12-02 17:28:23,141 ERROR [3a514ddd-7033-4d0f-bc5e-086c3f790510:1541856] [instance:3945] [instance.start->(InstanceStart)] [] [ecutorService-1] [i.c.p.process.instance.InstanceStart] Failed [1/2] to Starting for instance [3945]
2016-12-02 17:28:23,191 ERROR [3a514ddd-7033-4d0f-bc5e-086c3f790510:1541856] [instance:3945->instanceHostMap:3213] [instance.start->(InstanceStart)->instancehostmap.activate] [] [ecutorService-1] [c.p.e.p.i.DefaultProcessInstanceImpl] Agent error for [compute.instance.activate.reply;agent=340]: Timeout getting IP address
2016-12-02 17:28:23,191 ERROR [3a514ddd-7033-4d0f-bc5e-086c3f790510:1541856] [instance:3945] [instance.start->(InstanceStart)] [] [ecutorService-1] [i.c.p.process.instance.InstanceStart] Failed [2/2] to Starting for instance [3945]
2016-12-02 17:28:23,642 ERROR [1e371feb-51c6-4e05-abbe-abf884833c7e:1541853] [instance:3944->instanceHostMap:3212] [instance.start->(InstanceStart)->instancehostmap.activate] [] [cutorService-10] [c.p.e.p.i.DefaultProcessInstanceImpl] Agent error for [compute.instance.activate.reply;agent=394]: Timeout getting IP address
2016-12-02 17:28:23,642 ERROR [1e371feb-51c6-4e05-abbe-abf884833c7e:1541853] [instance:3944] [instance.start->(InstanceStart)] [] [cutorService-10] [i.c.p.process.instance.InstanceStart] Failed [1/2] to Starting for instance [3944]
2016-12-02 17:28:23,688 ERROR [1e371feb-51c6-4e05-abbe-abf884833c7e:1541853] [instance:3944->instanceHostMap:3212] [instance.start->(InstanceStart)->instancehostmap.activate] [] [cutorService-10] [c.p.e.p.i.DefaultProcessInstanceImpl] Agent error for [compute.instance.activate.reply;agent=394]: Timeout getting IP address
2016-12-02 17:28:23,690 ERROR [1e371feb-51c6-4e05-abbe-abf884833c7e:1541853] [instance:3944] [instance.start->(InstanceStart)] [] [cutorService-10] [i.c.p.process.instance.InstanceStart] Failed [2/2] to Starting for instance [3944]
2016-12-02 17:28:42,442 ERROR [f1f112b9-9c46-4e87-8334-3cc137e22a32:1541924] [instance:3968] [instance.start->(InstanceStart)] [] [ecutorService-2] [i.c.p.process.instance.InstanceStart] Failed to Waiting for deployment unit instances to create for instance [3968]
2016-12-02 17:28:46,937 ERROR [e7dc6f58-a2e1-4680-b2a8-74f1ce76dcce:1541927] [instance:3971] [instance.start->(InstanceStart)] [] [ecutorService-7] [i.c.p.process.instance.InstanceStart] Failed to Waiting for dependencies for instance [3971]
2016-12-02 17:28:52,430 ERROR [29583744-1f9f-4881-aa06-6c6b00387b15:1541933] [instance:3972] [instance.start->(InstanceStart)] [] [ecutorService-7] [i.c.p.process.instance.InstanceStart] Failed to Waiting for deployment unit instances to create for instance [3972]
2016-12-02 17:28:53,942 ERROR [e6464d27-2019-4ca4-a801-077bb448a043:1541935] [instance:3975] [instance.start->(InstanceStart)] [] [ecutorService-1] [i.c.p.process.instance.InstanceStart] Failed to Waiting for dependencies for instance [3975]
2016-12-02 17:28:56,525 ERROR [ed4e7cd1-4e6b-405f-acb8-2d2450769386:1541848] [service:252] [service.activate] [] [ecutorService-4] [c.p.e.p.i.DefaultProcessInstanceImpl] Expected state running but got error: Timeout getting IP address
2016-12-02 17:28:56,567 ERROR [f9839d9c-ec08-467e-a4b3-8d6def44c7e1:1541852] [service:251] [service.activate] [] [ecutorService-3] [c.p.e.p.i.DefaultProcessInstanceImpl] Expected state running but got error: Timeout getting IP address
2016-12-02 17:28:56,598 ERROR [:] [] [] [] [ecutorService-4] [.e.s.i.ProcessInstanceDispatcherImpl] Expected state running but got error: Timeout getting IP address
2016-12-02 17:28:56,685 ERROR [:] [] [] [] [ecutorService-3] [.e.s.i.ProcessInstanceDispatcherImpl] Expected state running but got error: Timeout getting IP address
2016-12-02 17:28:57,861 ERROR [0e88e987-6ab5-4206-9ba4-60404cd5d4f2:1541851] [service:253] [service.activate] [] [ecutorService-6] [c.p.e.p.i.DefaultProcessInstanceImpl] Expected state running but got error: Timeout getting IP address
2016-12-02 17:28:58,036 ERROR [:] [] [] [] [ecutorService-6] [.e.s.i.ProcessInstanceDispatcherImpl] Expected state running but got error: Timeout getting IP address
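To see whether the "Timeout getting IP address" failures are concentrated on particular agents (the log above mentions agent=340 and agent=394), the agent errors can be tallied per agent. This is only a rough sketch for triage: the function name `count_agent_errors` and the regular expression are my own, written against the line format shown in the log above, and are not part of Rancher.

```python
import re
from collections import Counter

# Matches the "Agent error for [...;agent=N]: <message>" fragment
# as it appears in the log lines above (format assumed from this post only).
AGENT_ERROR = re.compile(r"Agent error for \[[^\]]*;agent=(\d+)\]: (.+)$")


def count_agent_errors(lines):
    """Return a Counter keyed by (agent id, error message)."""
    counts = Counter()
    for line in lines:
        match = AGENT_ERROR.search(line.rstrip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

Feeding it the server log lines gives a count per agent ID and message, which should make it easier to tell whether one host or all hosts are failing to get IP addresses.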