Need advice on managing infrastructure

Looking for some tips on managing infrastructure well. Rancher has the following hierarchy:

  • Environments
    • Infrastructure
    • Stacks
      • Services

Within a single environment, stacks can be deployed onto machines in the infrastructure. However, if I have some infrastructure machines which are “storage” and others which are “load balancers”, etc., I wouldn’t want random stack services launching on those.

A solution is to use labels for services and scheduling. However, if a service does not use a label, any infrastructure host is fair game. It would be nice to have a way of white/black listing hosts, so that anything that runs on those machines MUST carry a scheduling label.
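For reference, the label approach looks roughly like this in a Rancher 1.x docker-compose.yml. This is only a sketch: the `dedicated=true` host label, the `worker` service and its image are made-up names, and the special-purpose hosts would need that label set on them beforehand.

```yaml
# docker-compose.yml (v1 format) sketch; all names are hypothetical
worker:
  image: example/worker:latest   # hypothetical image
  labels:
    # Hard anti-affinity: never place this service on a host that
    # carries the host label dedicated=true.
    io.rancher.scheduler.affinity:host_label_ne: dedicated=true
```

The catch is the one described above: every service has to carry this label, and nothing on the host side enforces it.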

Alternatively, are folks organizing machine categories using environments? Put all your load balancers into one env, all your storage in another? The issue with this is that I cannot use service linking across environments. It also has to be balanced against monitoring services, which I want to run across all of the machines.

Looking for tips or practical solutions currently being used to manage a multi-purpose infrastructure and avoid random processes showing up on special-purpose nodes.

Feel free to tell me my approach(es) are all wrong and suggest ones that are working for you! I’d like to learn what people are doing.



I have been thinking a lot about the best way to use a multi-purpose environment where several applications are deployed. Here I will share some of our current approaches and ideas.

Basically we have organised ourselves around Development Teams. Each Team has at least two environments:

  1. Prod (where trusted devops from the Team have member role, the rest have reader role)
  2. Dev (all in the Team have member role)

The Team leader has owner role on both environments.

Occasionally we will also have a “CI environment” where Jenkins will deploy with rancher-compose and run tests of the entire stack, and later delete it or leave it up for user acceptance.

In each rancher environment we plan to have three types of docker hosts.

  1. Load-balancer VMs (LB stack). These are small VMs with little RAM, used only for load balancers such as Rancher LB (haproxy) or custom nginx/apache servers. These VMs are not replaced very often. They have public IPs and are used in external DNS.

  2. Ephemeral application VMs (application stack). These are medium-RAM VMs (4-8 GB) with little need for disk. They are disposable and host only ephemeral docker containers, which rancher can scale and move at any time. No persistent data is stored here, so they can be deleted at any point without losing data. These VMs do not require public IPs.

  3. Data/DB server VMs. These VMs are a bit heavier, with more CPUs, better disk speed and more RAM (>8 GB). They may or may not use docker containers for the DBMSs, and they have attached volumes to persist data. They can also be used for varnish and other RAM-hungry applications (otherwise we could put those on another stack).
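As a rough illustration of how services could be pinned to these three host types, scheduling labels in a Rancher 1.x docker-compose.yml might look like the sketch below. It is not our actual file: the `tier` host label, the service names, the images and the volume path are all invented for the example, and each host would need the matching `tier=...` label.

```yaml
# docker-compose.yml (v1 format) sketch; names and images are hypothetical
lb:
  image: nginx:stable            # stand-in for a custom nginx load balancer
  ports:
    - "80:80"
  links:
    - app:app
  labels:
    # Only run on the small, public-facing load-balancer VMs.
    io.rancher.scheduler.affinity:host_label: tier=lb
app:
  image: example/app:latest      # hypothetical application image
  labels:
    # Only run on the disposable application VMs.
    io.rancher.scheduler.affinity:host_label: tier=app
db:
  image: postgres:9.5
  volumes:
    - /data/postgres:/var/lib/postgresql/data   # host path is an assumption
  labels:
    # Only run on the data VMs that have persistent volumes attached.
    io.rancher.scheduler.affinity:host_label: tier=db
```

With positive affinity on every service, a host with no matching `tier` label simply never receives these containers.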

Regarding scheduling on those VMs, I can share a generic setup and a rancher-compose.yml for a fictitious app, below the diagram.

Here is our generic rancher-compose example with some motivations behind our decisions.

Feel free to comment or provide feedback. I am also curious to know how others are scheduling and organizing their environments.


Thank you for the detailed response. What you describe makes sense overall, but one thing that is still unclear to me is how you monitor these systems. Let’s say you want to launch an agent that will collect system stats, etc. (scout, datadog, whatever). If you use different environments, you will need to go into each environment and set up monitoring stacks. Alternatively, maybe it is possible to add a host to multiple environments? Then you could allocate hosts to an environment but also have a “master” environment where you can deploy monitors, etc. across your entire infra.

@Roman_Shtylman wrote: “If you use different environments, you will need to go into each environment and set up monitoring stacks.”

Yes, you need to deploy a monitoring stack in each environment, but Rancher has the concept of a Catalog, which makes a stack template available in all environments, so it is a one-click operation anyway. A single rancher-compose command also makes that very easy. So I don’t see this as a big issue.

You cannot add a host to multiple environments either. That would make orchestration very complex, I guess.
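A minimal sketch of what such a monitoring stack could contain, assuming a hypothetical agent image (`example/monitoring-agent` is not a real product) and the Rancher 1.x global scheduling label, which starts one container of the service on every host in the environment:

```yaml
# docker-compose.yml (v1 format) sketch; image name is hypothetical
monitoring-agent:
  image: example/monitoring-agent:latest
  labels:
    # Global service: one container per host in this environment,
    # including hosts added later.
    io.rancher.scheduler.global: 'true'
```

Deploying it would then be something like `rancher-compose -p monitoring up -d` run once per environment, with that environment's API URL and keys.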

I use Site24x7 for monitoring, and I am on AWS. What I did is this:

Created a launch configuration that does the following:

  • Installs the rancher agent
  • Installs my monitoring agent
  • Sets ssh to run on port 2222

Created an autoscaling group that uses this launch configuration.

So each time a new instance is added into my environment it automatically gets the monitoring agent.
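Roughly, the user data for such a launch configuration could look like the cloud-config sketch below. It is not my exact script, and several things are assumptions: Docker is already installed on the AMI, `RANCHER_SERVER`, `REGISTRATION_TOKEN` and `AGENT_VERSION` are placeholders for the values shown on Rancher's Add Host screen, the monitoring installer path is hypothetical, and the optional `CATTLE_HOST_LABELS` value is just an example host label.

```yaml
#cloud-config
# Sketch only; placeholders must be replaced before use.
runcmd:
  # 1. Register the instance with Rancher (copy the real command from
  #    the Add Host screen of the target environment).
  - |
    docker run -d --privileged \
      -e CATTLE_HOST_LABELS='tier=app' \
      -v /var/run/docker.sock:/var/run/docker.sock \
      -v /var/lib/rancher:/var/lib/rancher \
      rancher/agent:AGENT_VERSION \
      http://RANCHER_SERVER:8080/v1/scripts/REGISTRATION_TOKEN
  # 2. Install the monitoring agent (hypothetical installer script).
  - /opt/install-monitoring-agent.sh
  # 3. Move sshd to port 2222; the service may be called "ssh"
  #    instead of "sshd" depending on the distribution.
  - sed -i 's/^#\?Port .*/Port 2222/' /etc/ssh/sshd_config
  - service sshd restart
```

Because the host label is set at registration time, instances from one autoscaling group all land in the same scheduling category.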

This will not work with all monitoring systems. Some systems generate a unique token for each agent and therefore require a unique install command per device.

Catalog is not ideal either, because I would prefer that the infra team handle monitoring of base systems without requiring every team to do it. But it is an interesting approach. Ideally I would be able to add hosts to multiple envs.

Rancher does not allow a single host to be in multiple environments. However, you can place your monitoring server anywhere in your infrastructure; just make sure the agent can access it. Let’s say you have three Rancher environments and your hosts for these are in three separate subnets, plus a management subnet where your monitoring server lives, like this:

  • Dev:
  • Test:
  • Prod:
  • Management:
  1. Set up your routing properly so that there is a route from each subnet to the Management network.
  2. If you are using AWS, this is easy because of the way the VPC routing is configured.
  3. You could give your Monitoring server an IP of, then point your monitoring agent at that IP.

@cloudlady911 This misses the desired feature of managing the monitoring agent(s) via rancher. Yes, I can set up the topology however I want, and yes, the monitoring servers can be in any location (even the open internet). But the original intent of the question was how to isolate certain categories of resources in rancher, or more generally how to manage different types of systems with different purposes.

That morphed into a follow-up question about common infra items across multiple environments after @demarant suggested each team be assigned an environment. The goal here is again to leverage rancher in tracking what is deployed across the infrastructure. Having a monitoring agent “autoload” on startup could work, but it isn’t as good as being able to try new stuff across the infrastructure. I am specifically interested in solutions within the rancher systems or leveraging those systems.
