Failed to get lb config

I’ve been attempting to add a globally scheduled application load balancer to my cluster (currently 2 servers), and for some reason, even when using the default template in the UI, it appears to be unable to contact the internal metadata service. I’ve also tried deploying with the Rancher CLI, without success. The balancer does start up and does its job for a while, but then it gets killed and restarted ad infinitum.

Here are the container logs for one cycle, along with the compose files (exported via the config export feature in the UI).

rancher: v1.6.2
docker: 17.03.1-ce

lb.instance.image=rancher/lb-service-haproxy:v0.7.5
newest.docker.version=v17.04.0

console:

time="2017-06-16T12:26:27Z" level=error msg="Failed to initialize Kubernetes controller: KUBERNETES_URL is not set"
time="2017-06-16T12:26:27Z" level=info msg="Starting Rancher LB service"
time="2017-06-16T12:26:27Z" level=info msg="LB controller: rancher"
time="2017-06-16T12:26:27Z" level=info msg="LB provider: haproxy"
time="2017-06-16T12:26:27Z" level=info msg="starting rancher controller"
time="2017-06-16T12:26:27Z" level=info msg="Healthcheck handler is listening on :10241"
time="2017-06-16T12:26:28Z" level=info msg=" -- starting haproxy\n * Starting haproxy haproxy\n   ...done.\n"
time="2017-06-16T12:26:29Z" level=info msg=" -- reloading haproxy config with the new config changes\n * Reloading haproxy haproxy\n[WARNING] 166/122629 (58) : config : 'option forwardfor' ignored for proxy 'default' as it requires HTTP mode.\n[WARNING] 166/122629 (60) : config : 'option forwardfor' ignored for proxy 'default' as it requires HTTP mode.\n   ...done.\n"
time="2017-06-16T12:26:39Z" level=error msg="Failed to get lb config: Get http://rancher-metadata/2015-12-19/self/service: net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
time="2017-06-16T12:26:54Z" level=error msg="Failed to get lb config: Get http://rancher-metadata/2015-12-19/self/service: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
time="2017-06-16T12:27:13Z" level=info msg=" -- reloading haproxy config with the new config changes\n * Reloading haproxy haproxy\n[WARNING] 166/122712 (76) : config : 'option forwardfor' ignored for proxy 'default' as it requires HTTP mode.\n[WARNING] 166/122713 (78) : config : 'option forwardfor' ignored for proxy 'default' as it requires HTTP mode.\n   ...done.\n"
time="2017-06-16T12:27:46Z" level=info msg="Received SIGTERM, shutting down"
time="2017-06-16T12:27:46Z" level=info msg="Shutting down rancher controller"
time="2017-06-16T12:27:46Z" level=info msg="Shutting down provider haproxy"
time="2017-06-16T12:27:46Z" level=info msg="Error during shutdown shutdown already in progress"
time="2017-06-16T12:27:46Z" level=info msg="Exiting with 1"

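A quick way to confirm it is the metadata lookup timing out (and not something else) is to exec into the balancer container and request the same URL the LB agent is polling in the errors above. Something like this (the container name is a placeholder, and this assumes wget is available in the image; substitute curl if it isn’t):

docker exec -it <balancer-container> wget -qO- -T 5 http://rancher-metadata/2015-12-19/self/service
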
docker-compose.yml:

version: '2'
services:

  master:
    privileged: true
    image: jenkinsci/jenkins:2.65-alpine
    volumes:
      - /opt/jenkins/home:/var/jenkins_home
      - /var/run/docker.sock:/var/run/docker.sock
    labels:
      io.rancher.container.pull_image: always
      io.rancher.scheduler.affinity:host_label: name=server1

  balancer:
    image: rancher/lb-service-haproxy:v0.7.5
    ports:
      - 8090:8090/tcp
    labels:
      io.rancher.container.agent.role: environmentAdmin
      io.rancher.container.create_agent: 'true'
      io.rancher.scheduler.global: 'true'

  slave-nix:
    privileged: true
    image: techtraits/jenkins-slave
    environment:
      JENKINS_USERNAME: jenkins
      JENKINS_PASSWORD: jenkins
      JENKINS_MASTER: http://master:8080
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    links:
      - master:master
    labels:
      io.rancher.container.hostname_override: container_name
      io.rancher.container.pull_image: always
      io.rancher.scheduler.global: 'true'

rancher-compose.yml:

version: '2'

services:

  master:
    scale: 1
    start_on_create: true

  balancer:
    start_on_create: true
    lb_config:
      certs: []
      port_rules:
        - priority: 1
          protocol: http
          service: master
          source_port: 8090
          target_port: 8080
    health_check:
      healthy_threshold: 2
      response_timeout: 2000
      port: 42
      unhealthy_threshold: 3
      initializing_timeout: 60000
      interval: 2000
      reinitializing_timeout: 60000

  slave-nix:
    start_on_create: true

I’ve been able to work around this somewhat by simply deleting the health_check section from rancher-compose.yml. I’m sure it still complains, but now that no one is listening, it keeps working fine.
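For reference, this is what the balancer entry in rancher-compose.yml looks like with the health_check block removed (everything else unchanged):

  balancer:
    start_on_create: true
    lb_config:
      certs: []
      port_rules:
        - priority: 1
          protocol: http
          service: master
          source_port: 8090
          target_port: 8080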