How scalable is Rancher?

I am testing Rancher and am curious how far people have pushed it: how many nodes it can scale to, how many containers it can handle, and whether the network overlay ever has issues.

Basically I’m just curious what breaks first and what bottlenecks exist at scale. Thanks.

We haven’t pushed the boundaries that far, yet.

This is the only public reference to any large numbers that I can remember seeing:
https://github.com/rancher/rancher/issues/1824

Maybe others will post their findings here… ? :wink:
Or someone from Rancher Labs may have something interesting to share on the subject :slight_smile:

The honest answer is we haven’t done a ton of load or scale testing lately, so there are likely pieces that could use improvement. There will probably be a good round of that towards the end of the year. But nobody I know of has come to us on fire yet.

There are definitely some limits at which things need to be scaled out or modified, e.g. the managed networks are a /16 by default and there is a connection per host using an ephemeral port on the API server.
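
To put rough numbers on those two limits, here is a minimal Python sketch. Only the /16 prefix length comes from the post above; the 10.42.0.0 base address and the 32768-60999 ephemeral port range (a common Linux default) are assumptions for illustration, and whether the port pool is actually the binding constraint depends on how the connections are set up.

```python
import ipaddress

# Hypothetical base address; only the /16 prefix length is from the post.
managed_network = ipaddress.ip_network("10.42.0.0/16")

# Total addresses in a /16, and the count left after excluding the
# network and broadcast addresses.
total_addresses = managed_network.num_addresses
usable_addresses = total_addresses - 2

# Assumed Linux ephemeral port range; on a real host, check
# /proc/sys/net/ipv4/ip_local_port_range instead of trusting these values.
ephemeral_low, ephemeral_high = 32768, 60999
ephemeral_ports = ephemeral_high - ephemeral_low + 1

print(f"addresses in the managed network: {total_addresses}")
print(f"usable addresses: {usable_addresses}")
print(f"ephemeral ports under the assumed range: {ephemeral_ports}")
```

Under these assumptions, the address space caps out at roughly 65k IPs on the managed network, and the ephemeral port pool at roughly 28k ports, so either could need attention well before hardware limits do.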

In general, everything is designed to work out of the box with a single docker run ... rancher/server for most people’s needs, with various bits that can be swapped out and scaled independently for large or HA deployments. Over time those bits will become fully managed, so you can do something like click a button to enable HA or scale out.

The best public example is probably from before we had a real company, when it was still called Stampede: http://youtu.be/fmYqm7TC7GI (starting around 13m). It covers launching ~128,000 containers on ~200 hosts provided by DigitalOcean (around 45min). In between, Darren talks at length about the architecture, which is relevant here, and there are some screens of the primitive UI I put together for it in a few weekends :smile:

@vincent thanks, that is very helpful. Can you speak at all on the network overlay? Is there much overhead to it?

Are there plans to support other overlays with libnetwork becoming available in Docker 1.9?