How To: Deploy Rancher/Kubernetes in Amazon VPC private subnet

[SOLVED: Scroll to my third post on this thread for the exact manual procedure I used to create an AWS Kubernetes environment through my Rancher server console.]

Although there is a Catalog section on the /admin/settings page of the rancher console, there is no Catalog tab that I can find. Both the Rancher Certified Library and the Community Contributed catalogs are enabled on the /admin/settings page.

I do see in the documentation (https://docs.rancher.com/rancher/v1.4/en/catalog/, which links to https://docs.rancher.com/rancher/v1.4/en/installing-rancher/installing-server/#http-proxy) that special steps are necessary if Rancher is running behind an HTTP proxy. We are running rancher-server behind an ELB (as described in the next paragraph). Does that mean we also need those special steps? Is the http_proxy URL supposed to be reachable on a public IP from the internet? Ours wouldn’t be.

Our entire Rancher infrastructure is deployed in the private subnet of our Amazon VPC (very similar to “Deploying Rancher into a private subnet of an AWS VPC”). Is it possible that this setup has broken access to the Rancher Catalog in some way? We are using Rancher v1.4.1, Docker 1.12.6, and RancherOS v0.8.1. We created a Kubernetes environment and added a couple of hosts, which are also EC2 instances in the private subnet. There are NAT gateways defined for each private subnet, and the routing works. We provide access to our team through an OpenVPN server running in a public subnet of our Amazon VPC. The Rancher server web console has a 10.101.x.y private IP address, but it is reachable after we connect the OpenVPN tunnel from our laptops.

I do find errors in the logs:

time="2017-03-03T22:13:47Z" level=info msg="Starting Rancher Catalog service"
time="2017-03-03T22:13:47Z" level=info msg="Using catalog library=https://git.rancher.io/rancher-catalog.git"
time="2017-03-03T22:13:47Z" level=info msg="Using catalog community=https://git.rancher.io/community-catalog.git"
time="2017-03-03T22:13:47Z" level=fatal msg="Failed to configure cattle client: Get http://localhost:8080/v1: dial tcp [::1]:8080: getsockopt: connection refused"
time="2017-03-03T22:13:47Z" level=info msg="Starting rancher-compose-executor" version=v0.12.2
time="2017-03-03T22:13:47Z" level=fatal msg="Unable to create event router" error="Get http://localhost:8080/v2-beta: dial tcp [::1]:8080: getsockopt: connection refused"

And a little later:

time="2017-03-03T22:13:49Z" level=info msg="Request to refresh catalog"
time="2017-03-03T22:13:49Z" level=info msg="Using catalog library=https://git.rancher.io/rancher-catalog.git"
time="2017-03-03T22:13:49Z" level=info msg="Using catalog community=https://git.rancher.io/community-catalog.git"
time="2017-03-03T22:13:49Z" level=info msg="Starting rancher-compose-executor" version=v0.12.2
time="2017-03-03T22:13:49Z" level=fatal msg="Unable to create event router" error="Get http://localhost:8080/v2-beta: dial tcp [::1]:8080: getsockopt: connection refused"
time="2017-03-03T22:13:49Z" level=fatal msg="Failed to configure cattle client: Get http://localhost:8080/v1: dial tcp [::1]:8080: getsockopt: connection refused"
time="2017-03-03T22:13:50Z" level=info msg="Setting log level" logLevel=info
time="2017-03-03T22:13:50Z" level=info msg="Starting go-machine-service…" gitcommit=v0.35.0
time="2017-03-03T22:13:50Z" level=info msg="Waiting for handler registration (1/2)"
time="2017-03-03T22:13:50Z" level=fatal msg="Exiting go-machine-service: Get http://localhost:8080/v2-beta: dial tcp [::1]:8080: getsockopt: connection refused"
time="2017-03-03T22:13:50Z" level=error msg="Failed to pull the catalog from git repo https://git.rancher.io/rancher-catalog.git, error: exit status 1"
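
For anyone sifting through a longer rancher-server log, here is a minimal Python sketch that keeps only the error and fatal entries. It assumes the key=value layout seen in the lines above; the names `parse_line` and `problems` are my own, not part of any Rancher tooling. It accepts both straight and typographic quotes, since pasted logs often have their quotes converted.

```python
import re

# Map typographic quotes back to straight quotes before parsing.
QUOTES = str.maketrans({"“": '"', "”": '"'})

# key=value pairs, where the value is either a quoted string or a bare token.
FIELD = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_line(line):
    """Split one key=value log line into a dict."""
    fields = {}
    for key, value in FIELD.findall(line.translate(QUOTES)):
        fields[key] = value.strip('"')
    return fields


def problems(lines):
    """Return the parsed entries whose level is error or fatal."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry.get("level") in ("error", "fatal")]


# Three lines taken from the log excerpts in this post.
SAMPLE = [
    'time="2017-03-03T22:13:49Z" level=info msg="Request to refresh catalog"',
    'time=“2017-03-03T22:13:50Z” level=fatal msg="Exiting go-machine-service: '
    'Get http://localhost:8080/v2-beta: dial tcp [::1]:8080: getsockopt: connection refused"',
    'time="2017-03-03T22:13:50Z" level=error msg="Failed to pull the catalog from git repo '
    'https://git.rancher.io/rancher-catalog.git, error: exit status 1"',
]

if __name__ == "__main__":
    for entry in problems(SAMPLE):
        print(entry["time"], entry["level"], entry["msg"])
```

On these excerpts, that leaves the repeated "connection refused" failures against localhost:8080 and the single failed pull from git.rancher.io.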

I was confused between the default Cattle environment and the Kubernetes environment that I wanted to create. I found the Rancher 1.4.1 documentation somewhat scattered, but I now have reproducible instructions. In the first section, notice that I am not using the default Kubernetes template, since that won’t use the AWS cloud provider. Adding a new template seems to be a shortcut into the catalog.

This is the detailed checklist for the initial configuration of our Rancher/Kubernetes/AWS:

  1. Prepare AWS IAM policies, for a User and an InstanceProfile
  • A User in a Group with necessary policy permissions for creating EC2s. This User is only used to supply API keys for launching an EC2 host.
  • A Role (with an Instance Profile) with necessary policy permissions for ELB, ECR, etc. This InstanceProfile is assigned to every EC2 host. Without these policy permissions, your AWS cloud provider will fail when Kubernetes needs to launch an ELB, access an ECR, etc.
  2. Add “AWS Kubernetes” Template
  • Navigate to Default > Manage Environments
  • Click to “Add Template”
    • Name: AWS Kubernetes
    • Description: AWS cloud provider for Kubernetes Template
    • Orchestration: select Kubernetes
    • Click to “Edit Config”
      • Choose a version: v1.5.2-rancher1-4
      • Name: AWS Kubernetes
      • Cloud provider: aws
      • Click “Configure” at bottom of screen
  • Click “Create” at bottom of screen
  3. Add “AWS-K8s” Environment
  • Navigate to Default > Manage Environments
  • Click to “Add Environment”
    • Name: AWS-K8s
    • Description: Kubernetes Environment with AWS cloud provider
    • Environment Template: select AWS Kubernetes
      • [DON’T SELECT PLAIN: Kubernetes !!!]
  • Click “Create” at bottom of screen
  4. Make “AWS-K8s” the default environment
  • Find the “Default” row in the Environments section
  • Select “Deactivate” from the menu dropdown at far right.
  • Now “AWS-K8s” should automatically become the default.
  • It will still report as “Unhealthy” until hosts have been added and have joined the cluster.
  5. Add a pair of rancher-node-* hosts
  • Navigate to Infrastructure > Hosts
  • Click to “Add Host”
    • Machine Drivers
      • Select “Amazon EC2” for the machine drivers.
    • Account Access
      • Region: us-east-1
      • Access Key: [Copy from rancher-iam-api-keys user]
      • Secret Key: [Copy from rancher-iam-api-keys user]
    • Availability Zone & VPC
      • Availability Zone: us-east-1a
      • VPC Subnet: [Select the private subnet corresponding to your VPC and AZ]
    • Security Group
      • Custom: [Could use default SG]
    • Instance
      • Name: rancher-node-
      • Quantity: 2
      • Note that “Hosts will be named rancher-node-1 — rancher-node-2”
    • Instance Options
      • Instance Type: m4.large
      • AMI: [Click through to the RancherOS list, and find the AMI for the correct region]
      • SSH User: rancher
      • IAM Profile: [Name of pre-prepared InstanceProfile]
      • Private IP: select “Use only private IP address”
      • Rancher Labels:
        • Project: rancher
        • Component: rancher-node
    • Click “Create” at bottom of screen
    • After 5-10 minutes, the nodes will have joined the cluster, and the Kubernetes infrastructure stack should be healthy.
  6. Delete “Default” Cattle environment
  • Navigate to Default > Manage Environments
  • Select “Delete” from drop-down on Default row
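
The IAM preparation in the first step of the checklist can be sketched as two policy documents, one for the User that supplies API keys and one for the Role behind the InstanceProfile. This is only an illustration in Python: the action lists are assumptions of mine and are not the policies from my account, so treat them as a starting point to tighten or extend (the EC2 machine driver may need further ec2 actions). The helper name `build_policy` is also hypothetical.

```python
import json


def build_policy(actions, resource="*"):
    """Return a minimal IAM policy document allowing the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": sorted(actions), "Resource": resource}
        ],
    }


# ASSUMPTION: illustrative actions for the User whose API keys launch EC2 hosts.
# iam:PassRole is what allows that User to attach the InstanceProfile at launch.
LAUNCHER_ACTIONS = [
    "ec2:RunInstances",
    "ec2:DescribeInstances",
    "ec2:DescribeSubnets",
    "ec2:DescribeSecurityGroups",
    "ec2:CreateKeyPair",
    "ec2:CreateTags",
    "iam:PassRole",
]

# ASSUMPTION: illustrative actions for the Role assigned to every EC2 host,
# covering the ELB and ECR access that the AWS cloud provider relies on.
NODE_ACTIONS = [
    "ec2:Describe*",
    "elasticloadbalancing:*",
    "ecr:GetAuthorizationToken",
    "ecr:BatchCheckLayerAvailability",
    "ecr:GetDownloadUrlForLayer",
    "ecr:BatchGetImage",
]

launcher_policy = build_policy(LAUNCHER_ACTIONS)
node_policy = build_policy(NODE_ACTIONS)

if __name__ == "__main__":
    # Print both documents so they can be pasted into the IAM console.
    print(json.dumps(launcher_policy, indent=2))
    print(json.dumps(node_policy, indent=2))
```

Keeping the two policies separate mirrors the checklist: the User's keys are only needed while Rancher launches hosts, whereas the InstanceProfile permissions are used by Kubernetes for as long as the nodes run.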