While trying to update a Rancher installation I am now unable to use the UI, as it keeps timing out. If logins are enabled it just redirects back to the login page without any error. If authentication is disabled, requests often fail with timeout errors. When I started the migration process the VM that was running Rancher only had 2 vCPUs and 4 GB of RAM. It was so unusable I couldn’t even get to a login without timeouts. I’ve since upgraded the VM to 8 vCPUs and 16 GB of RAM and the UI actually loads now, but it is still not working properly.
I’m not currently certain whether it is related to all the failed containers that it’s trying to start and is unable to for various reasons, or whether it’s something unrelated to the containers.
Some sample Rancher logs:
2016-07-20 21:19:18,132 ERROR [:] [] [] [] [ecutorService-1] [i.c.p.e.e.i.ProcessEventListenerImpl] Unknown exception running process [instance.start:1483917] on [7509] io.cattle.platform.eventing.exception.EventExecutionException: 500 Server Error: Internal Server Error ("rpc error: code = 2 desc = "oci runtime error: could not synchronise with container process: not a directory"")
2016-07-20 21:19:19,881 ERROR [0e78991d-4a0f-4a7f-b7f9-6d16f0734402:1483872] [instance:7047->instanceHostMap:6574] [instance.start->(InstanceStart)->instancehostmap.activate] [] [cutorService-25] [c.p.e.p.i.DefaultProcessInstanceImpl] Unknown exception io.cattle.platform.eventing.exception.EventExecutionException: 500 Server Error: Internal Server Error ("rpc error: code = 2 desc = "oci runtime error: could not synchronise with container process: not a directory"")
2016-07-20 21:19:19,881 ERROR [0e78991d-4a0f-4a7f-b7f9-6d16f0734402:1483872] [instance:7047] [instance.start->(InstanceStart)] [] [cutorService-25] [i.c.p.process.instance.InstanceStart] Failed to Starting for instance [7047]
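To get a rough idea of how many of these failures share the same cause, something like the sketch below might help. It only assumes the log format shown in the samples above (an " ERROR " level marker and a quoted desc = "..." message); the function name and the log file name in the comment are made up for illustration.

```python
import re
from collections import Counter

# Matches the quoted rpc description, e.g. desc = "oci runtime error: ..."
_DESC = re.compile(r'desc = "([^"]+)"')


def summarize_errors(lines):
    """Count ERROR lines by their rpc error description (or 'other')."""
    counts = Counter()
    for line in lines:
        if " ERROR " not in line:
            continue  # skip non-error lines
        match = _DESC.search(line)
        counts[match.group(1) if match else "other"] += 1
    return counts


# Hypothetical usage; "cattle.log" is a placeholder file name:
# with open("cattle.log") as handle:
#     for message, count in summarize_errors(handle).most_common():
#         print(count, message)
```

If nearly every error carries the same "oci runtime error" description, that would suggest one underlying problem on the hosts rather than many unrelated ones.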
I’m unable to log in with logins enabled. With logins disabled, when I try to view running processes I get:
Timeout
API request timeout (30 sec)
GET https://mysite.url:8080/v1/processinstances?endTime_null=true&limit=100&sort=id&order=desc
Reload to try again or log out
Current orchestration is Cattle. Can I adjust the API timeout setting above? Does anyone have recommendations for getting around this so I can properly manage my environments again?
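One workaround I can think of, assuming the 30-second limit is enforced by the UI client and not by the server (I have not confirmed this), is to request the same endpoint directly with a longer client-side timeout. The sketch below uses only the Python standard library; the function names and the 300-second default are my own choices, and it assumes authentication is disabled as in my setup.

```python
import json
import urllib.parse
import urllib.request


def process_instances_url(base_url, limit=100):
    """Build the same query the UI issued in the timeout message."""
    params = {"endTime_null": "true", "limit": limit, "sort": "id", "order": "desc"}
    return base_url.rstrip("/") + "/v1/processinstances?" + urllib.parse.urlencode(params)


def fetch_process_instances(base_url, timeout_seconds=300, limit=100):
    """GET the running process instances, waiting up to timeout_seconds."""
    url = process_instances_url(base_url, limit)
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return json.load(response)


# Example (performs a real HTTP request, so it is left commented out).
# Assumes the collection is returned under a "data" key:
# result = fetch_process_instances("https://mysite.url:8080")
# print(len(result.get("data", [])))
```

This does not make the server any faster. It only shows whether the request eventually completes when the client is willing to wait longer.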
mysql> select * from setting;
+----+------------------------------+-------------------------------------------------------------------------------------------------------------------+
| id | name | value |
+----+------------------------------+-------------------------------------------------------------------------------------------------------------------+
| 1 | api.host | http://XXX.XXX.XXX.XXX:8080 |
| 2 | api.security.enabled | false |
| 3 | api.auth.provider.configured | |
| 4 | api.auth.local.access.mode | unrestricted |
| 5 | api.auth.enabler | rancher_id:16 |
| 6 | catalog.url | library=https://github.com/rancher/rancher-catalog.git,community=https://github.com/rancher/community-catalog.git |
| 7 | vm.enabled | false |
+----+------------------------------+-------------------------------------------------------------------------------------------------------------------+
I’ve tried to supply JAVA_OPTS when starting the container but it doesn’t seem to work. When I look at the processes on the host and in the container, -Xmx is still set to 8G even though I’ve bumped it to 14G.
root 17037 191 49.1 22293856 8076856 ? Ssl 21:59 122:53 java -Xms128m -Xmx8g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/cattle/logs -Dlogback.bootstrap.level=WARN -Xmx14096m
What I specified on start was:
docker run -d --volumes-from mad_brahmagupta_backupnew -p 8080:8080 --restart=always -e JAVA_OPTS="-Xmx14096m" rancher/server
If you look at the process above you’ll see that -Xmx is specified twice, so is it really setting the max Java memory limit?
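As far as I understand, the HotSpot JVM lets the last occurrence of a repeated flag win, which would make -Xmx14096m the effective limit here. The sketch below just applies that rule to the command line from the process listing; the helper name is hypothetical and the last-flag-wins behaviour is an assumption worth confirming on the running JVM (for example with jcmd <pid> VM.flags).

```python
import re

# Multipliers for the common size suffixes accepted by -Xmx (case-insensitive).
_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def effective_xmx(cmdline):
    """Return the max heap in bytes implied by the LAST -Xmx flag, or None.

    Assumption: the JVM lets the last occurrence of a repeated flag win.
    """
    matches = re.findall(r"(?<!\S)-Xmx(\d+)([kKmMgG]?)(?!\S)", cmdline)
    if not matches:
        return None
    number, unit = matches[-1]
    return int(number) * _UNITS[unit.lower()]


cmd = ("java -Xms128m -Xmx8g -XX:+HeapDumpOnOutOfMemoryError "
       "-XX:HeapDumpPath=/var/lib/cattle/logs "
       "-Dlogback.bootstrap.level=WARN -Xmx14096m")
print(effective_xmx(cmd) // 1024**2, "MiB")  # prints: 14096 MiB
```

If that assumption holds, the JAVA_OPTS value is being applied even though the default -Xmx8g still appears earlier on the command line.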