Upgrading Rancher HA

I’m trying to upgrade Rancher Server HA to the latest version, but it doesn’t seem to work the way the docs describe for a single server (or the way I assumed, which was just re-launching the service with the latest server image version).

Does anyone know the proper upgrade path for HA? Maybe @denise, since she pinned the docs for the single server version :slight_smile:

Thanks in advance,

Re-run the HA setup script on each host with a newer image tag. The script is saved on each host, so:

/var/lib/rancher/bin/rancher-ha-start.sh rancher/server:v1.1.0-dev2
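
For anyone scripting this across several hosts, here is a minimal sketch of that step. It assumes SSH access to each host with enough privileges to run Docker, and the host names are placeholders. Going through the hosts one at a time is an assumption, not something confirmed in this thread.

```shell
# Sketch: re-run the saved HA start script on each host with a new image tag.
# Assumptions (not from the thread): SSH access to every host, hypothetical
# host names, and a one-host-at-a-time order.

HA_START=/var/lib/rancher/bin/rancher-ha-start.sh

# upgrade_hosts IMAGE HOST...
# With DRY_RUN=1 (the default) the commands are only printed.
upgrade_hosts() {
  local image=$1; shift
  local host
  for host in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "ssh $host $HA_START $image"
    else
      ssh "$host" "$HA_START" "$image" || return 1
    fi
  done
}

# Example (prints the commands only):
#   upgrade_hosts rancher/server:v1.1.0-dev2 ha1 ha2 ha3
```

Set DRY_RUN=0 to actually run the commands once the printed output looks right.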

Thanks Vincent,

I tried that, but now it’s spewing out things like:

time="2016-05-25T18:16:21Z" level=info msg="[0/10] [tunnel]: Starting "
time="2016-05-25T18:16:21Z" level=info msg="Upgrading tunnel"
time="2016-05-25T18:16:21Z" level=error msg="Failed Starting tunnel : Bad response statusCode [422]. Status [422 Unprocessable Entity]. Body: [code=InvalidAction, fieldName=Can't upgrade selector only service] from []"
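
To pull only the failures out of output like this, filtering on `level=error` is enough. A minimal sketch, assuming the lines come from the `rancher-ha` container that the start script creates:

```shell
# Print only the error-level lines from key=value log output on stdin.
ha_errors() {
  grep 'level=error'
}

# Usage (assumes the container is named rancher-ha):
#   docker logs rancher-ha 2>&1 | ha_errors
```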

Any ideas?

Don’t hurt yourself on this one, I ended up wiping out the DB since I had nothing on it and starting from scratch. The good news is, HA ELB with SSL Offloading works, weee.

Do you think you can share a little more information about your setup? I’ve been trying to stand up a very basic 3 node Rancher HA installation in AWS, but have run into problem after problem. A few of the issues I’ve run into are here: Production Rancher HA on AWS

You appear to have been successful and I don’t mind starting over from scratch as I’ve already done that a half dozen times anyway trying to get this working.

For instance, what AMI are you starting with? Docker version? Anything else that might get us off the starting block…

I’ve got my AWS + ELB + SSL working. I’ve found this post accurate in the steps to get this going:


@vincent: you mention that a version upgrade for rancher-server in HA just needs to run:

/var/lib/rancher/bin/rancher-ha-start.sh rancher/server:v1.1.0-devX

Do we need to stop the containers running the old version first? When should I run the script on the next rancher-server in the cluster? Or should they all be done at once?

I’ve tried going from dev3 -> dev4 and dev4 -> dev5 and I’ve always run into problems, having to resort to deleting all of the Docker images and the /var/lib/rancher/state directory and running the script above several times before reaching a green state. That is clearly not how the procedure should work; I would really appreciate more details on how you at Rancher Labs do it.

Best regards,


I’ve had the same experience; it’s not as simple as just running a script.

I was doing the same as you with all of the manual stuff, then I got tired of it and put my 3 masters in an ASG and am just terminating them when I need an upgrade (one by one). I can give you my cloud-init script if you’re interested. It may only work on ROS though.
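
The one-by-one termination can be sketched with the AWS CLI roughly like this. It assumes a configured AWS CLI, a hypothetical group name, and a fixed pause standing in for a real check that the replacement node has rejoined the cluster.

```shell
# Sketch: replace the members of an Auto Scaling group one at a time.
# Assumptions: AWS CLI configured; the group name is hypothetical; the fixed
# pause is a stand-in for a real health check on the replacement node.

# roll_asg GROUP_NAME [PAUSE_SECONDS]
roll_asg() {
  local group=$1 pause=${2:-600} id ids
  ids=$(aws autoscaling describe-auto-scaling-groups \
          --auto-scaling-group-names "$group" \
          --query 'AutoScalingGroups[0].Instances[].InstanceId' \
          --output text) || return 1
  for id in $ids; do
    echo "Terminating $id"
    # Keep desired capacity unchanged so the group launches a replacement.
    aws autoscaling terminate-instance-in-auto-scaling-group \
      --instance-id "$id" --no-should-decrement-desired-capacity || return 1
    sleep "$pause"
  done
}

# Usage: roll_asg my-rancher-masters 600
```

The new image tag still has to come from the launch configuration's user data (the cloud-init script), since the replacement instances boot from that.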


Thanks for the fast reply @whiteadam, it sure would be a great help if you could share your cloud-init scripts, if only to see other approaches, as I am on Debian + bare metal.


Here you go, it’s a bit hacky, but I’m sure you can figure it out :slight_smile:

This may be MUCH easier on Debian, RancherOS has some finicky little things you have to work around, which is why there is an init script to wait 300 secs before executing the HA script.
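
A fixed 300-second sleep could in principle be swapped for a poll with a timeout. The sketch below assumes the thing being waited on is the Docker daemon, which this post does not actually say, so treat it as an illustration only.

```shell
# wait_for CMD...: retry a command once per second until it succeeds,
# giving up after WAIT_TIMEOUT seconds (default 300).
# Assumption: readiness can be detected by a command such as `docker info`.
wait_for() {
  local deadline=$(( $(date +%s) + ${WAIT_TIMEOUT:-300} ))
  until "$@" >/dev/null 2>&1; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep 1
  done
}

# Usage:
#   wait_for docker info && bash /etc/rancher-ha.sh rancher/server:v1.1.0-dev5
```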

The formatting of the script below might be a little strange; I tried.

  - path: /etc/rc.local
    permissions: "0755"
    owner: root
    content: |
      sleep 300
      sudo bash /etc/rancher-ha.sh rancher/server:v1.1.0-dev5
  - path: /etc/rancher-ha.sh
    permissions: "0755"
    owner: root
    content: |
      set -e
      umask 077

      IMAGE=$1
      if [ "$IMAGE" = "" ]; then
        echo Usage: $0 DOCKER_IMAGE
        exit 1
      fi

      mkdir -p /var/lib/rancher/etc/server
      mkdir -p /var/lib/rancher/etc/ssl
      mkdir -p /var/lib/rancher/bin

      echo Creating /var/lib/rancher/etc/server.conf
      cat > /var/lib/rancher/etc/server.conf << EOF
      export CATTLE_HA_HOST_REGISTRATION_URL=https://rancher.example.com
      export CATTLE_HA_CONTAINER_PREFIX=rancher-ha-

      export CATTLE_DB_CATTLE_MYSQL_HOST=whatever.us-east-1.rds.amazonaws.com
      export CATTLE_DB_CATTLE_MYSQL_NAME=rancher
      export CATTLE_DB_CATTLE_USERNAME=rancher
      export CATTLE_DB_CATTLE_PASSWORD=<password in base64>

      export CATTLE_HA_PORT_REDIS=6379
      export CATTLE_HA_PORT_SWARM=2376
      export CATTLE_HA_PORT_HTTP=80
      export CATTLE_HA_PORT_HTTPS=443
      export CATTLE_HA_PORT_PP_HTTP=81
      export CATTLE_HA_PORT_PP_HTTPS=444
      export CATTLE_HA_PORT_ZK_CLIENT=2181
      export CATTLE_HA_PORT_ZK_QUORUM=2888
      export CATTLE_HA_PORT_ZK_LEADER=3888

      # Force HA enabled so it does not have to be set in the UI
      # (comment this out to set it in the UI instead)
      export CATTLE_HA_ENABLED=true
      EOF

      echo Creating /var/lib/rancher/etc/server/encryption.key
      if [ -e /var/lib/rancher/etc/server/encryption.key ]; then
        mv /var/lib/rancher/etc/server/encryption.key /var/lib/rancher/etc/server/encryption.key.`date '+%s'`
      fi
      cat > /var/lib/rancher/etc/server/encryption.key << EOF
      <encryption key>
      EOF

      echo Creating /var/lib/rancher/bin/rancher-ha-start.sh
      cat > /var/lib/rancher/bin/rancher-ha-start.sh << "EOF"
      #!/bin/sh
      set -e

      IMAGE=$1
      if [ "$IMAGE" = "" ]; then
        echo Usage: $0 DOCKER_IMAGE
        exit 1
      fi
      docker rm -fv rancher-ha >/dev/null 2>&1 || true
      ID=`docker run --restart=always -d -v /var/run/docker.sock:/var/run/docker.sock --name rancher-ha --net host --privileged -v /var/lib/rancher/etc:/var/lib/rancher/etc $IMAGE ha`

      echo Started container rancher-ha $ID
      echo Run the below to see the logs
      echo docker logs -f rancher-ha
      EOF

      chmod +x /var/lib/rancher/bin/rancher-ha-start.sh

      echo Running: /var/lib/rancher/bin/rancher-ha-start.sh $IMAGE
      echo To re-run please execute: /var/lib/rancher/bin/rancher-ha-start.sh $IMAGE
      exec /var/lib/rancher/bin/rancher-ha-start.sh $IMAGE
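
For the `<password in base64>` placeholder in server.conf, the value can be produced with a small helper like the one below. This is a sketch that assumes plain single-line base64 is what the setting expects; `b64` is a hypothetical helper name.

```shell
# Encode a value as single-line base64 for pasting into server.conf.
# printf avoids the trailing newline that echo would add to the input.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Usage: b64 'my-db-password'
```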