LoadBalancer stuck as degraded

My load balancers have been running for a while now with no changes. All of a sudden last night, they switched to “Degraded” and I could not connect to them via TCP.

Not knowing what happened, I tried restarting them, to no avail. Then I deleted them entirely and created new ones, and those all went straight to Degraded; now I can't get them working at all.

Is there anything I can do to figure out what happened?
I noticed there is a new version of Rancher out, and some things changed. I also noticed that the “agent” version didn't change. Is it possible the HAProxy configuration changed and is not compatible with an older version of Rancher?

My current versions:
Rancher - v0.51.0
Cattle - v0.130.0
User Interface - v0.78.0
Rancher Compose - v0.7.0

Have you confirmed that your cross-host communication is still working? Exec into a network agent and ping the IP of the other network agent.
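For anyone else hitting this, here's a rough sketch of that check (the container ID and the 10.42.x.x address below are placeholders; look yours up with `docker ps` and in the Rancher UI):

```shell
# On host A, find the network agent container.
docker ps | grep -i agent

# Exec into it and ping the network agent on the other host over the
# managed (10.42.0.0/16) network. The IP below is a placeholder.
docker exec -it <agent-container-id> ping -c 3 10.42.151.200
```

If the ping fails, cross-host traffic (the IPsec tunnel between hosts) is broken and the load balancer can't reach its targets.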

There is a health check on port 42 for load balancers, so you’ll need to make sure this port isn’t used for anything else.
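A quick way to check for a conflict on that port (a sketch, assuming Linux hosts with iproute2's `ss` available):

```shell
# List anything listening on port 42 on the host. Nothing besides the
# Rancher health-check service should show up here.
ss -lntu | grep ':42 ' || echo "port 42 is free"
```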

So, after a lot of investigation, this seems to have been mostly due to my AWS security group. I used to have only ports 500 and 4500 open between nodes, and everything used to work, but since the new networking changes it seems that containers talk to each other directly now and need more than those two ports open, so I had to allow the nodes to talk to each other on all ports.
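In case it helps anyone, that blanket fix can be expressed as a single self-referencing security-group rule (a sketch using the AWS CLI; `sg-0123abcd` is a placeholder for whatever group your nodes are in):

```shell
# Allow all protocols and ports between instances that share this
# security group, without opening anything to the outside world.
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123abcd \
  --protocol -1 \
  --source-group sg-0123abcd
```

Because the source is the group itself, this only affects node-to-node traffic; external access still goes through whatever other rules you have.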

There seems to have been a change to how the network agent and load balancer worked even without me upgrading to 0.56.1; I seem to have picked up the 0.56.1 behavior before actually upgrading. Not sure why that happened, but it caused a lot of headache.

I finally got around to testing this on AWS and had no issues with a security group that had only UDP ports 500 and 4500 open between nodes (along with TCP ports 22, 80, and 8080).

I was able to set up the following docker-compose.yml with no issues on the load balancer: basically a load balancer directing to a ghost container, which has port 2368 exposed on the container.

    # Note: the service names and the labels/ports/links keys below are
    # reconstructed; the load balancer service name is assumed.
    blog:
      labels:
        io.rancher.container.pull_image: always
      tty: true
      image: ghost
      stdin_open: true
    lb:
      ports:
      - 80:2368
      tty: true
      image: rancher/load-balancer-service
      links:
      - blog:blog
      stdin_open: true
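If it's useful, bringing that stack up and smoke-testing it looks roughly like this (the project name and `<host-ip>` are placeholders; assumes the file above is saved as docker-compose.yml in the current directory):

```shell
# Create the stack through Rancher.
rancher-compose -p lb-test up -d

# Hit the load balancer; ghost should answer on port 80.
curl -sI http://<host-ip>/ | head -n 1
```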