Does anyone else find the UI unresponsive and buggy?

I’ve consistently had a bad experience with the Rancher UI. This has been happening through multiple versions of the server agent (running v1.2.0-pre3 now). I’m now running the server agent on an r3.4xlarge EC2 server in AWS (16 cores, 122 GB of memory), and my database is running on an m4.4xlarge (16 cores, 64 GB of memory).

Currently there are about 20 nodes connected, running fewer than 100 containers, but I get the same performance when I run 200 nodes and 1500 containers.

I’ve tried multiple browsers and have had multiple people confirm this. Occasionally the CPU is pegged on the Rancher server when this happens, but the slowness persists even on the larger r3.4xlarge server.

Does this happen to anyone else?

Does anyone have any tips on how to speed this up?

Yes, but I live with it. My setup is on-premise. I find it slows down particularly when transitioning between screens.

Yes, I’ve got the same issue (Chrome or Firefox) since I migrated from v1.1 to v1.2.0-pre3.
The issue seems to come from JavaScript, not from network calls.
I have no clue yet how to speed up the screen display.

Charles.

The Rancher UI seems to be coded with Ember.js.
In my browser, most of the time is spent in an AMD/require.js-like function, which loads many modules.
This loading function spends many seconds loading a ‘typeify’ function.
Unfortunately, it seems there are no source maps available to pinpoint where the trouble is.

best regards,

Charles.

I also saw this issue in the 1.2.0-preX line. It made the UI completely useless. I had to constantly hard refresh the browser to do anything, and then could only get it to function for a few moments.

@vincent, does rancher-ui ship any source maps, to spot performance problems in the browser?

Charles.

The comment to load them might be missing or something, but they’re there.

We already started looking at this a while ago and you’ll see improvement in the next -pre or two. The two biggest things are typeify, as you’ve already found (converting plain objects into Ember models), and filtered-sorted-array-proxy's updateContent (sorting & filtering removed/purged resources from most screens). A combination of UI and [backwards compatible] API changes will allow typeify to do less work (recursing over only the fields that have a type that could be another model) and eliminate most filtered arrays (by not returning things we don’t want to show in the first place).
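
To illustrate the typeify change described above, here is a minimal sketch and not rancher-ui's actual code. The `Model` class, the `nestedFields` schema table and both function names are hypothetical. It contrasts a conversion that walks every value with one that only recurses into fields the schema says can hold another model.

```typescript
type Plain = { [key: string]: unknown };
type Stats = { visited: number };

// Hypothetical stand-in for an Ember model.
class Model {
  constructor(public type: string, public fields: Plain) {}
}

// Assumed schema: for each resource type, the fields that may contain
// another model. This table is invented for the example.
const nestedFields: { [type: string]: string[] } = {
  service: ["launchConfig"],
  launchConfig: [],
};

function isTyped(v: unknown): v is Plain & { type: string } {
  return (
    typeof v === "object" &&
    v !== null &&
    !Array.isArray(v) &&
    typeof (v as Plain).type === "string"
  );
}

// Naive version: recurses into every value, typed or not.
function typeifyNaive(input: unknown, stats: Stats): unknown {
  stats.visited++;
  if (Array.isArray(input)) {
    return input.map((x) => typeifyNaive(x, stats));
  }
  if (typeof input !== "object" || input === null) {
    return input;
  }
  const src = input as Plain;
  const out: Plain = {};
  for (const key of Object.keys(src)) {
    out[key] = typeifyNaive(src[key], stats);
  }
  return isTyped(out) ? new Model(out.type, out) : out;
}

// Schema-guided version: only the fields listed in nestedFields are
// recursed into; everything else is copied by reference.
function typeifySchema(input: Plain & { type: string }, stats: Stats): Model {
  stats.visited++;
  const out: Plain = { ...input };
  for (const key of nestedFields[input.type] || []) {
    const v = input[key];
    if (Array.isArray(v)) {
      out[key] = v.map((x) => (isTyped(x) ? typeifySchema(x, stats) : x));
    } else if (isTyped(v)) {
      out[key] = typeifySchema(v, stats);
    }
  }
  return new Model(input.type, out);
}

// Made-up sample resource.
const service = {
  type: "service",
  name: "web",
  labels: { a: "1", b: "2" },
  ports: ["80:80", "443:443"],
  launchConfig: { type: "launchConfig", image: "nginx", environment: { X: "1" } },
};

const naiveStats: Stats = { visited: 0 };
typeifyNaive(service, naiveStats);

const schemaStats: Stats = { visited: 0 };
const model = typeifySchema(service, schemaStats);

console.log(naiveStats.visited, schemaStats.visited);
```

With many resources and large label or environment maps, the gap between the two visit counts grows with the amount of untyped data, which fits the kind of saving described here.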

Hi @vincent,
thanks for this explanation.
I hope this fix will be in the next pre, because I cannot use the UI (Firefox and Chrome display a popup saying the script is taking too long, and my CPU is at 100%).
When the UI finally loads (after 80 sec), most of the time I cannot perform some actions (it hangs).
And I’ve tried to revert to 1.1 without success…

best regards,

Charles.

x 1,000

I see many new features being added (which is cool) but the base usage has remained flawed for some time now. My UI locks up left and right on any deployment over a handful of containers. It appears to get slower the longer the deployment exists.

Love Rancher, but stability and scale need to take the front seat or I fear users like myself may move on to Kubernetes.

Agreed. Rancher is becoming harder and harder to sell internally for me because of scalability and UI issues.

@vincent: Is there a GitHub issue for this blocking problem (UI unresponsive), or should I create a new one to track it?

best regards,

Charles.

@jonathandietz are you running the DB in RDS?

We had similar issues, and once I set up a containerized MySQL instance on the same machine as the agent, things really started to hum.

I only have about 100 containers over a half dozen hosts, so I’m not sure about your scale, but we definitely found RDS to be the culprit.

@channelape, the DB is in RDS. I would be hesitant to move the DB back to the instance at this point. The CPU frequently gets pegged on our single-node manager when scaling nodes up and down. I’ve got beefy servers for both the manager and the DB.

One thing that I found was that there were stuck processes. You can see these in https://[RANCHER_URL]/admin/processes. There are a few GitHub issues that mention a similar problem. Manually removing them from the DB seemed to fix some of the slowness (the CLI and API are at least faster), but the UI is still horribly slow most of the time.

Hi, my DB is on bare metal, and I’ve got the same issue, which comes from the UI layer (confirmed by @vincent).
This issue is painful; I’ve tried to revert to a prior version without success…
All the 1.2 pre-X releases have the same bug, and reverting to 1.1 does not work (the UI redirects to a kind of 500 error page with a plane animation).
I hope the Rancher team understands that this issue prevents users from using the cluster.
My recommendation would be to stick to 1.1 and not use the releases from 1.2-pre-1 to 1.2-pre-3…

best regards,

Charles.

For your information, I’ve created a GitHub issue describing the problem (https://github.com/rancher/rancher/issues/6053).
Hope it helps,

Charles.