How are people deciding where to put their nodes on AWS (both the “server” nodes that “just” run the Rancher app and the more general workload cluster nodes)? I know it’s very simple to stick it all in a public subnet and depend mostly on security group enforcement, but I would expect to put all the server nodes and all the “worker” nodes in private subnets with some kind of LB in front. I think the example below is fairly typical, if I follow everything correctly. But having all the nodes/instances in private subnets seems to introduce lots of problems.
With an NLB you can’t enforce any kind of “security group-like” restriction, since NLBs don’t have security groups. If you put the restriction at the instance level instead, the rule can only name the private IPs of the NLB’s ENIs, but those can change at any time (at which point Rancher stops working, since the new IPs aren’t picked up automatically). How are people dealing with that (trying to make sure the nodes only take traffic from each other and the NLB)?
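One workaround sometimes used for the changing ENI addresses is to allow the CIDR blocks of the subnets the NLB lives in, rather than the individual ENI IPs, since any replacement ENI still gets an address from those subnets. Below is a minimal sketch of that idea, not a tested setup: the function name `nlb_ingress_rules` and the example CIDRs are made up for illustration, and the output only mirrors the `IpPermissions` shape that boto3's `authorize_security_group_ingress` accepts, without calling AWS. It only helps for traffic that really arrives from the NLB's own addresses (for example health checks, or targets registered by IP).

```python
import ipaddress


def nlb_ingress_rules(nlb_subnet_cidrs, ports=(80, 443)):
    """Build one ingress rule per port, allowing every NLB subnet CIDR.

    Hypothetical helper: it only builds plain dicts and makes no AWS calls.
    """
    # Validate each CIDR up front; ip_network raises ValueError on bad input.
    cidrs = [str(ipaddress.ip_network(cidr)) for cidr in nlb_subnet_cidrs]
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [
                {"CidrIp": cidr, "Description": "NLB subnet"} for cidr in cidrs
            ],
        }
        for port in ports
    ]


# Assumed CIDRs of the public subnets hosting the NLB (illustrative only).
rules = nlb_ingress_rules(["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"])
```

The trade-off is that anything else placed in those subnets is allowed too, so this works best when the NLB subnets are small and hold nothing but the load balancer.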
With an ALB, the nodes somehow can’t reach the Rancher server URL once they’ve done the initial registration. When you start a host, it reaches the Rancher server URL and registers itself fine, but after that neither rancher-cluster-agent nor rancher-node-agent can reach the HTTPS port of the Rancher server URL. Note that they can reach the HTTP port just fine, so it’s not that the Rancher server URL is unreachable altogether.
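To narrow down a symptom like this, it can help to check from a node (or from inside an agent container) whether the failure on 443 is at the TCP level, which points toward security groups or routing, or at the TLS level, which points toward the certificate or hostname. The following is a rough diagnostic sketch using only the Python standard library; `probe` is a made-up helper name, and nothing here is specific to Rancher.

```python
import socket
import ssl


def probe(host, port, use_tls=False, timeout=5.0):
    """Try a TCP connect (optionally followed by a TLS handshake).

    Returns a string starting with "ok" on success, or "failed: ..." otherwise.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            if not use_tls:
                return "ok"
            # Default context verifies the certificate chain and the hostname.
            context = ssl.create_default_context()
            with context.wrap_socket(sock, server_hostname=host) as tls_sock:
                return f"ok ({tls_sock.version()})"
    except OSError as exc:
        # Timeouts, DNS errors, refused connections and ssl.SSLError
        # are all subclasses of OSError.
        return f"failed: {type(exc).__name__}: {exc}"
```

Calling `probe("rancher.example.com", 80)`, `probe("rancher.example.com", 443)` and `probe("rancher.example.com", 443, use_tls=True)` in turn shows which step breaks: if the plain connect to 443 fails, the problem is before TLS; if only the TLS call fails, the error text usually names the certificate or hostname issue.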
So are people leaving the “server” hosts in public subnets but putting all the worker hosts in private subnets? Or are people jumping through all the hoops needed to make it work with everything in private subnets?
route53 A record pointing rancher.example.com to the LB below

public subnet in vpc A:
    NLB/ALB internet-facing that terminates ssl and balances across the three instances listed
private subnet in vpc A - zone us-west-2a:
    ec2 instance for the rancher server workload
private subnet in vpc A - zone us-west-2b:
    ec2 instance for the rancher server workload
private subnet in vpc A - zone us-west-2c:
    ec2 instance for the rancher server workload