Rancher-agent cannot register with server - Rancher-server behind SSL

Hi there :wink:

I want to run rancher-server behind SSL. To do that, I use the jwilder/nginx-proxy and jrcs/letsencrypt-nginx-proxy-companion containers to proxy my rancher-server container.

The rancher-server setup works nicely, but a problem appears as soon as I try to add my first host.

The host never appears on the server and I can see these lines in the agent logs:

time="2016-11-17T09:30:29Z" level="info" msg="Host not registered yet. Sleeping 1 second and trying again." Attempt=1 reportedUuid="6bfcf58e-6562-4b65-ba51-499da1c36d67"
time="2016-11-17T09:30:30Z" level="info" msg="Host not registered yet. Sleeping 1 second and trying again." Attempt=2 reportedUuid="6bfcf58e-6562-4b65-ba51-499da1c36d67"
time="2016-11-17T09:30:31Z" level="info" msg="Host not registered yet. Sleeping 1 second and trying again." Attempt=3 reportedUuid="6bfcf58e-6562-4b65-ba51-499da1c36d67"

In the nginx-proxy container, I can see these lines in the logs:

nginx.1    | rancher.domain.com aa.bb.cc.dd - C5DDC15BA46CDB2CF9ED [17/Nov/2016:09:30:59 +0000] "POST /v1/hostapiproxytokens HTTP/1.1" 422 179 "-" "Go 1.1 package http"
nginx.1    | rancher.domain.com aa.bb.cc.dd - C5DDC15BA46CDB2CF9ED [17/Nov/2016:09:31:00 +0000] "POST /v1/hostapiproxytokens HTTP/1.1" 422 179 "-" "Go 1.1 package http"
nginx.1    | rancher.domain.com aa.bb.cc.dd - C5DDC15BA46CDB2CF9ED [17/Nov/2016:09:31:01 +0000] "POST /v1/hostapiproxytokens HTTP/1.1" 422 179 "-" "Go 1.1 package http"

The 422 status code indicates there is a problem but I don’t know what to do to fix it.

FYI, I used the following docker-compose.yml file to set up my rancher-server instance:

version: '2'
services:
    nginx-proxy:
      restart: always
      image: jwilder/nginx-proxy
      ports:
        - "80:80"
        - "443:443"

      volumes:
        - nginx-certs:/etc/nginx/certs:ro
        - nginx-conf:/etc/nginx/conf.d
        - nginx-vhost:/etc/nginx/vhost.d
        - nginx-html:/usr/share/nginx/html
        - /var/run/docker.sock:/tmp/docker.sock:ro

    nginx-proxy-companion:
      image: jrcs/letsencrypt-nginx-proxy-companion

      volumes:
        - nginx-certs:/etc/nginx/certs:rw
        - /var/run/docker.sock:/var/run/docker.sock

      volumes_from:
        - nginx-proxy

    rancher-server:
      restart: unless-stopped
      image: rancher/server

      environment:
        VIRTUAL_HOST: rancher.domain.com
        VIRTUAL_PORT: 8080
        LETSENCRYPT_HOST: rancher.domain.com
        LETSENCRYPT_EMAIL: admin@domain.com

      volumes:
        - rancher-data:/var/lib/mysql

volumes:
    nginx-certs:
        driver: local
    nginx-conf:
        driver: local
    nginx-vhost:
        driver: local
    nginx-html:
        driver: local
    rancher-data:
        driver: local

Thanks for your help!

This is my nginx conf:

upstream rancher {
    server {{ private_ipv4 }}:{{ rancher_server_port }};
}

server {
    listen 443 ssl;
    server_name  _;
    ssl_certificate /host.cert;
    ssl_certificate_key /host.key;

    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass http://rancher;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
        # This allows the ability for the execute shell window to remain open for up to 15 minutes. Without this parameter, the default is 1 minute and will automatically close.
        proxy_read_timeout 900s;
    }
}

server {
    listen 80;
    server_name _;
    if ($http_user_agent ~* (OdklBot) ) {
      return 403;
    }
    return 301 https://$host$request_uri;
}

You can ignore the OdklBot part: I have some servers in Europe and they get hit by some bots.

The problem is that my nginx configuration file is not directly accessible, as it is automatically generated by my nginx-proxy container. However, by looking inside the container, I found this one:

# If we receive X-Forwarded-Proto, pass it through; otherwise, pass along the
# scheme used to connect to this server
map $http_x_forwarded_proto $proxy_x_forwarded_proto {
  default $http_x_forwarded_proto;
  ''      $scheme;
}
# If we receive Upgrade, set Connection to "upgrade"; otherwise, delete any
# Connection header that may have been passed to this server
map $http_upgrade $proxy_connection {
  default upgrade;
  '' close;
}
gzip_types text/plain text/css application/javascript application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
log_format vhost '$host $remote_addr - $remote_user [$time_local] '
                 '"$request" $status $body_bytes_sent '
                 '"$http_referer" "$http_user_agent"';
access_log off;
# HTTP 1.1 support
proxy_http_version 1.1;
proxy_buffering off;
proxy_set_header Host $http_host;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $proxy_connection;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $proxy_x_forwarded_proto;
# Mitigate httpoxy attack (see README for details)
proxy_set_header Proxy "";
server {
	server_name _; # This is just an invalid value which will never trigger on a real hostname.
	listen 80;
	access_log /var/log/nginx/access.log vhost;
	return 503;
}
upstream rancher.domain.com {
				## Can be connect with "rancher_default" network
			# rancher_rancher-server_1
			server 172.18.0.3:8080;
}
server {
	server_name rancher.domain.com;
	listen 80 ;
	access_log /var/log/nginx/access.log vhost;
	return 301 https://$host$request_uri;
}
server {
	server_name rancher.domain.com;
	listen 443 ssl http2 ;
	access_log /var/log/nginx/access.log vhost;
	ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
	ssl_ciphers 'ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS';
	ssl_prefer_server_ciphers on;
	ssl_session_timeout 5m;
	ssl_session_cache shared:SSL:50m;
	ssl_session_tickets off;
	ssl_certificate /etc/nginx/certs/rancher.domain.com.crt;
	ssl_certificate_key /etc/nginx/certs/rancher.domain.com.key;
	ssl_dhparam /etc/nginx/certs/rancher.domain.com.dhparam.pem;
	add_header Strict-Transport-Security "max-age=31536000";
	include /etc/nginx/vhost.d/default;
	location / {
		proxy_pass http://rancher.domain.com;
	}
}

Which directives in this file could affect the problem I encountered?

You need the websocket upgrade headers etc. This is copied from the Rancher website. I am using the official nginx image.

   /usr/bin/docker run -d \
        --name=rancher-nginx \
        -p 80:80 \
        -p 443:443 \
        -v /etc/pki/cert:/host.cert:ro \
        -v /etc/pki/key:/host.key:ro \
        -v conf_file:/etc/nginx/conf.d/default.conf:ro \
        nginx:latest

@andreimc If I follow the steps in the official Rancher documentation exactly, there is no problem. The thing is, I want to use a different approach from the one described in the docs.

My goal here is to use Let’s Encrypt to manage certificate generation and renewal. For this, I use the solution proposed by jwilder (jwilder/nginx-proxy and jrcs/letsencrypt-nginx-proxy-companion). These two containers automatically detect new containers launched with specific environment variables and manage the vhost and certificate configuration.

To do this, the nginx-proxy container uses a template to generate the nginx configuration file on the fly. Because of this, I cannot modify the nginx config file directly, but I can make changes to the template.

So my goal is to adapt the nginx-proxy template file so that it works fully with rancher-server. Which parts of the nginx config file directly affect websockets?

proxy_set_header Host $host;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Forwarded-Port $server_port;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_pass http://rancher;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "Upgrade";
# This allows the ability for the execute shell window to remain open for up to 15 minutes. Without this parameter, the default is 1 minute and will automatically close.
proxy_read_timeout 900s;
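
For what it's worth, here is a rough sketch of how the directives above might be carried into the nginx-proxy setup without touching the template. It assumes the nginx-proxy version in use includes a per-host location file from the vhost.d volume; the file name rancher.domain.com_location is an assumption based on that convention, not something confirmed in this thread. Compared with the generated config posted earlier, the only additions are X-Forwarded-Port and the longer read timeout. The other headers are repeated because nginx inherits proxy_set_header from the outer level only when the current level defines none of its own. Whether this actually fixes the 422 on registration is not established here.

```nginx
# Hypothetical file: /etc/nginx/vhost.d/rancher.domain.com_location
# Assumption: the nginx-proxy template includes this file inside "location /".

# Repeated from the generated config: defining any proxy_set_header at this
# level stops nginx from inheriting the ones declared at the http level.
proxy_set_header Host $http_host;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $proxy_connection;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $proxy_x_forwarded_proto;
proxy_set_header Proxy "";

# Present in the Rancher example, absent from the generated config.
proxy_set_header X-Forwarded-Port $server_port;

# Keep long-lived connections (e.g. the execute shell window) open for up to 15 minutes.
proxy_read_timeout 900s;
```

The $proxy_connection and $proxy_x_forwarded_proto variables come from the map blocks in the generated config, so this fragment only makes sense when included by that config.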

New development! Without changing anything in my configuration, I upgraded rancher-server to the latest release, 1.2.0-pre4-rc7.

With this release, I was able to add a host without any problem.

Any idea why it did not work with the stable release?

I tried again with the stable release and ran into the same problem as before. The fact that it works with 1.2.0-pre4 makes me think the problem is not caused by a websocket issue… But why does it not work with 1.1.4?