Local IP for rancher agent with Scale Sets

I’ve been fiddling around, trying to integrate Rancher with the new Azure Virtual Machine Scale Sets (https://azure.microsoft.com/en-us/blog/azure-vm-scale-sets-public-preview/).

I was looking to achieve this by extending my previous ARM template: https://github.com/Azure/azure-quickstart-templates/blob/master/docker-rancher/nodes.json

For those not familiar with ARM (Azure Resource Manager): I’m basically using Docker Compose to deploy the Rancher agent:

         "compose": {
                "rancheragent": {
                  "image": "rancher/agent:v0.8.2",
                  "restart": "always",
    			  "privileged": true,
    			  "volumes": [
                    "/var/run/docker.sock:/var/run/docker.sock"
                  ],
                  "command": "[parameters('rancherApi')]"
                }
              }

The downside of this approach is that the agent always uses the external/public IP address for the server communication. If you didn’t do this, inter-host networking would fail due to the IPsec VPN setup underneath.

Since the host IP is dynamic (only known at boot), I’m having trouble setting the CATTLE_AGENT_IP environment variable, so I ended up needing the public IP. That has a huge downside, though: the number of public IP addresses in Azure is limited (and charged for), and when using scale sets you would typically scale beyond those limits.
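
For reference, this is roughly the “docker run” equivalent of the compose block above, but with the agent IP pinned explicitly via CATTLE_AGENT_IP. The IP and registration URL below are placeholders, not values from my template:

    # Same agent as in the compose block, but with the IP pinned.
    # 10.0.0.4 and the registration URL are placeholders.
    sudo docker run -d --privileged --restart=always \
      -e CATTLE_AGENT_IP=10.0.0.4 \
      -v /var/run/docker.sock:/var/run/docker.sock \
      rancher/agent:v0.8.2 \
      http://<rancher-server>:8080/v1/scripts/<registration-token>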

Any suggestions on how to tackle this? The paths I’ve considered:

  • using the variable interpolation of Docker Compose in combination with CATTLE_AGENT_IP => though I don’t think this would prove stable
  • deploying the server in the same subnet & using the internal address as host IP => not tested, not sure it would fix it
  • extending the Docker images with a bash script that fills in the IP dynamically => though this is very work-intensive when it comes to upgrades
  • extending the ARM template with a shell script as wrapper (see the sketch after this list) => at the moment this seems to be the best way, though it is far more complex than a “simple” docker compose
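
To make that last option concrete, here’s a minimal sketch of such a wrapper, assuming the ARM CustomScript extension hands over the registration URL as the first argument (that calling convention is my assumption, not something in the current template):

    #!/bin/bash
    # wrapper.sh - sketch of the shell-wrapper idea (last option above).
    # Assumption: the ARM CustomScript extension passes the Rancher
    # registration URL as $1.
    RANCHER_URL="$1"

    # Take the source IP of the default route, i.e. the Azure-internal
    # address of this scale set instance.
    LOCAL_IP=$(ip route get 8.8.8.8 \
      | awk '{for (i=1;i<=NF;i++) if ($i=="src") {print $(i+1); exit}}')

    docker run -d --privileged --restart=always \
      -e CATTLE_AGENT_IP="$LOCAL_IP" \
      -v /var/run/docker.sock:/var/run/docker.sock \
      rancher/agent:v0.8.2 "$RANCHER_URL"

This would work, but it trades one declarative compose block for an imperative script, which is exactly the extra complexity I mentioned.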

Anyhow, am I the only one experiencing these kinds of deployment issues? Or am I pushing it too far in terms of automation? Any suggestions on the best course of action to get “easy” scalability in terms of hosts?

TL;DR

  • CATTLE_AGENT_IP is needed as the # of public IPs is limited
  • setting CATTLE_AGENT_IP dynamically/automatically is not without implementation risks
  • asking for suggestions / take me to school! :wink:

Issues that I’m currently facing when working with auto scale sets:

  • When scaling down
    Hosts go into the “reconnecting” state, where I would expect a cleanup after a given period, i.e. automatically removing disconnected hosts.

  • When scaling up
    As I want to scale from 0 to …, I’m limited by the number of public IPs I can assign (in the case of Azure). So I want the inter-host network to use the Azure-internal networks and to expose services via the load balancer. The caveat here is that I can only do this via CATTLE_AGENT_IP. Suggestion: a “switch” (triggered by an environment variable) that tells the agent to use a local network interface instead of the source-NAT IP address of the host. OR (more complex) extend the agent to work in NATted environments, as the main reason for needing the public IP address is that ports 500 & 4500 are not configurable.

Rancher will most likely never automatically remove a disconnected host, as we don’t know why the host is in the reconnecting state. Also, I couldn’t find one, but please feel free to create a GitHub issue for an option to automatically clean up hosts after a certain period.

Please also feel free to open a GitHub issue on the switch, i.e. trying to use the private IP instead of the public one.

@denise: Good suggestion!

https://github.com/rancher/rancher/issues/3745
https://github.com/rancher/rancher/issues/3746