Cross-datacenter clustering

I want to make my application HA by scaling it over multiple datacenters.
However, there is no private network between the datacenters, so my only option is to add the hosts in each datacenter by their public IP.

My question is whether this approach is recommended from a security standpoint, given that all traffic will go over the internet.
If I understand correctly, Rancher uses an IPsec VPN for all cross-host communication, and I can also define security groups on the hosts. Is that sufficient to set up a reliable and secure distributed environment with Rancher?

For the application itself I might be able to set up a public-facing “web” server in each datacenter (let’s call it web01 for datacenter 1 and web02 for datacenter 2) and put the rest of the application on an internal server that is only accessible by that web server (e.g. app01 for datacenter 1 and app02 for datacenter 2).

For the database, however, this is not possible, because the database needs to be available to the app in all datacenters.

In other words:

          client
         /      \
web01 (dc1)    web02 (dc2)
     |              |
app01 (dc1)    app02 (dc2)
        \        /
         database
      (dc1 and dc2)

Thanks,

Guy

Nobody?

How is everybody else setting up their Rancher environment, especially regarding the database host(s)?

I don’t have a DigitalOcean account so I can’t test it, but as far as I can see the DO integration in Rancher seems to use public IPs to register the droplets.

Does this mean that it is common to register hosts via their public IP and hence perform all communication between hosts over the internet?

Thanks.

@guyds all the traffic going over the internet between hosts over the Rancher network will be using an encrypted VPN. We’ve found that typically suffices for most security concerns.

Hi, thanks for your response.

I understand that all traffic is encrypted by the VPN, but registering hosts via their public IP also means you “expose” them to the public internet, whereas hosts registered via their private IP don’t have to be exposed at all.
In the former case you automatically have to take extra security precautions on each server.
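
As one illustration of the kind of precaution meant here: with public registration, the IPsec ports can at least be restricted to the addresses of the other hosts. Below is a minimal sketch, assuming the overlay uses UDP 500 and 4500 (the standard IKE and NAT-traversal ports; check the Rancher docs for your version). `ipsec_allow_rules` is a hypothetical helper, not part of Rancher, and the IPs are made-up documentation-range addresses.

```python
from ipaddress import IPv4Address

# Assumption: the IPsec overlay uses UDP 500 (IKE) and UDP 4500 (NAT traversal).
IPSEC_UDP_PORTS = (500, 4500)


def ipsec_allow_rules(peer_ips):
    """Build iptables commands that accept IPsec traffic only from known peers."""
    rules = []
    for ip in peer_ips:
        IPv4Address(ip)  # raises ValueError on a malformed IPv4 address
        for port in IPSEC_UDP_PORTS:
            rules.append(
                f"iptables -A INPUT -p udp -s {ip} --dport {port} -j ACCEPT"
            )
    # After the per-peer ACCEPT rules, drop IPsec traffic from everyone else.
    for port in IPSEC_UDP_PORTS:
        rules.append(f"iptables -A INPUT -p udp --dport {port} -j DROP")
    return rules


if __name__ == "__main__":
    # Hypothetical public IPs of the hosts in the other datacenter.
    for rule in ipsec_allow_rules(["203.0.113.10", "203.0.113.11"]):
        print(rule)
```

The script only prints the commands, so they can be reviewed before being applied. The same allow-list could equally be expressed as a provider security group.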

It also feels strange to me that all traffic between hosts goes over the internet, especially for a (distributed) microservices application where all services and the database run on different servers.
This means that all communication between the microservices, as well as all database operations, counts towards the monthly bandwidth limit.

This is just my feeling and therefore I would like to know how other people are deploying their multi-server applications.

@guyds, sorry I misunderstood your question.
If I understand correctly, one of your main concerns is wanting to use the private network between web and app nodes in the same DC. You want this for additional security and reduced bandwidth cost.

I’ll think on this a bit and see if I can come up with something sometime tomorrow.

Note that if you are completely within the same DC, you can set the host registration URL to a private IP. All the hosts would then register over the private network, Rancher would “discover” their private IPs, and everything would be done over the private network. It’s only when you’re doing cross-DC that you get into all-public traffic. We haven’t yet implemented a hybrid mode where we intelligently use the private network when possible and the public network when necessary, which I think is what you’re suggesting.
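
To make the hybrid idea concrete, here is a rough sketch of the selection logic: prefer a peer's private address when it answers, otherwise fall back to its public one. This is not how Rancher works today (as noted above, no such mode exists yet). The names `tcp_reachable` and `pick_address` are hypothetical, and a TCP connect is only a stand-in for whatever probe a real implementation would use.

```python
import socket


def tcp_reachable(ip, port, timeout=1.0):
    """Return True if a TCP connection to ip:port succeeds within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_address(private_ip, public_ip, port, probe=tcp_reachable):
    """Prefer the private address when it answers, else fall back to public."""
    if private_ip and probe(private_ip, port):
        return private_ip
    return public_ip


if __name__ == "__main__":
    # Stub probe for demonstration: pretend only 10.0.0.0/8 peers are reachable.
    def same_dc_only(ip, port):
        return ip.startswith("10.")

    # Hypothetical addresses; the second peer has a private IP we cannot reach.
    print(pick_address("10.0.0.5", "203.0.113.10", 22, probe=same_dc_only))
    print(pick_address("172.16.0.5", "203.0.113.11", 22, probe=same_dc_only))
```

The probe is passed in as a parameter so the decision logic can be tried out without touching the network.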

@cjellick, that’s mostly correct indeed.
I want to use the private network as much as possible, even between DCs.

However, that’s the whole point: very few affordable cloud providers offer cross-DC VLANs, and many cloud providers don’t offer VLANs at all.
And because Rancher currently can’t switch to the public IP address when the private address isn’t reachable, I am stuck with either using public IPs everywhere or staying within one DC over a VLAN.

And I have to decide on a cloud provider really soon…
So I’m wondering what my best bet is for now:
choose a provider that doesn’t offer a (cross-DC) VLAN but lets me go cross-DC using public IPs, or choose a provider that does offer a local VLAN (not cross-DC) but doesn’t let me go cross-DC.

That’s why I was / am also interested in how other users are dealing with “large” distributed applications.

@cjellick did you by chance make any progress on this architecture in Rancher v2?
That would improve the performance of our applications and still allow cross-datacenter and cross-provider deployments.

Maybe you could think about a high availability setup where every datacenter has its own cluster of Rancher instances and the services within that datacenter/environment only talk to those. Rancher would then only communicate with a publicly available database. Health checks should stay within one datacenter.
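
To show what “health checks stay within one datacenter” could look like, here is a small sketch that pairs each host only with checkers from its own datacenter. It is purely illustrative: `local_health_checks` and the host-to-datacenter mapping are hypothetical and do not reflect any Rancher API.

```python
from collections import defaultdict
from itertools import permutations


def local_health_checks(hosts):
    """Return (checker, target) pairs where both hosts share a datacenter.

    hosts maps a host name to the name of its datacenter.
    """
    by_dc = defaultdict(list)
    for name, dc in hosts.items():
        by_dc[dc].append(name)

    pairs = []
    for dc in sorted(by_dc):
        # Every host checks every other host in the same datacenter only.
        pairs.extend(permutations(sorted(by_dc[dc]), 2))
    return pairs


if __name__ == "__main__":
    # Host names borrowed from the diagram earlier in the thread.
    hosts = {"web01": "dc1", "app01": "dc1", "web02": "dc2", "app02": "dc2"}
    for checker, target in local_health_checks(hosts):
        print(f"{checker} checks {target}")
```

No pair crosses a datacenter boundary, so health-check traffic would never leave the local network.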

I’m not sure whether you already covered that with the new cluster/environment setup in Rancher v2. To support it, every environment/cluster would need its own registration URL.