Rancher HA Slow

Rancher in a single container was pretty snappy.

I’ve gone through the recommended steps to expand Rancher into an HA config with three nodes. All of the containers are running and the DB connection is active, but transitioning between screens takes a long time (>10s). Half of the time, clicking a button causes Rancher to fail and I get redirected to the /fail endpoint.

Each of the nodes running Rancher has 10%-30% CPU use and 4GB of memory free. The database is using about 25% of its memory, and when in use its CPU hovers around 35%. Network out of the DB is bursting to 4MB/s, which isn’t all that much, but for a fresh install it should be more than enough to get the data to the master nodes.

Has anyone else had this problem? Is there anything that I am missing here?

HA Master instances are t2.large
DB is db.t2.medium

I don’t get the failures you mentioned, but I definitely have had the same experience with overall performance with my HA environment. I see a lot of blue/orange balls orbiting a cow (the “working” indicator in Rancher) since upgrading to an HA environment. That said, it’s not so bad that I worry about it too much; just a lot slower than it was before HA.

I checked the clients and this container is continually restarting:

2e83e6617a76 rancher/server:v1.1.3 "rce" 40 seconds ago Up 38 seconds 3306/tcp, 8080/tcp r-management_rancher-compose-executor_2

Looks like it is the rancher-compose-executor.
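
For anyone who wants to check their own hosts for the same symptom, here is a rough sketch of one way to spot containers that keep coming back with a very short uptime. It assumes the output of `docker ps --format '{{.Names}}\t{{.Status}}'` and just parses the "Up N seconds/minutes" text. The function names and the 120 second threshold are my own choices for illustration, not anything Rancher provides, and a container that shows up once may simply have been started recently, so run it a few times before drawing conclusions.

```python
import re
import subprocess

# Approximate seconds per unit for the "Up N <unit>" strings printed by `docker ps`.
UNIT_SECONDS = {
    "second": 1, "minute": 60, "hour": 3600, "day": 86400,
    "week": 604800, "month": 2592000, "year": 31536000,
}


def uptime_seconds(status):
    """Approximate uptime in seconds for an 'Up ...' status string, else None."""
    if not status.startswith("Up "):
        return None  # exited, created, restarting, etc.
    m = re.match(r"Up (\d+) (second|minute|hour|day|week|month|year)s?\b", status)
    if m:
        return int(m.group(1)) * UNIT_SECONDS[m.group(2)]
    m = re.match(r"Up About an? (minute|hour)", status)
    if m:
        return UNIT_SECONDS[m.group(1)]
    if status.startswith("Up Less than a second"):
        return 0
    return None  # unrecognised wording; do not guess


def recently_started(ps_output, threshold=120):
    """Names of containers whose uptime is below `threshold` seconds.

    `ps_output` is expected to be tab-separated 'name<TAB>status' lines.
    The threshold is an arbitrary illustrative value.
    """
    names = []
    for line in ps_output.splitlines():
        if "\t" not in line:
            continue
        name, status = line.split("\t", 1)
        secs = uptime_seconds(status.strip())
        if secs is not None and secs < threshold:
            names.append(name)
    return names


def snapshot():
    """Run `docker ps` on the local host and return name/status lines."""
    return subprocess.check_output(
        ["docker", "ps", "--format", "{{.Names}}\t{{.Status}}"], text=True
    )
```

Calling `recently_started(snapshot())` every minute or so and watching for a name that appears on every pass should point at a container stuck in a restart loop, like the one above.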