Can't add hosts. Agent unauthorized downloading configscripts


I seem to have gotten my Rancher server into a state where I can no longer add any new hosts. I was running 1.2.0-pre3, but I’ve gone back to latest and am still seeing the problem. I am currently on Docker 1.12.1; I can try to downgrade if you really think that is the problem.

I was trying to replace the nginx proxy running on the host with a proxy running in a container, but as soon as I did that the host would go offline and I couldn’t upgrade the container. I tried changing my server’s address to the non-standard port that Rancher server was using, but that didn’t seem to update the existing hosts, because as soon as I stopped nginx the host would show as reconnecting. I intended to remove the host and re-add it, but that was when I discovered that I couldn’t add any more hosts.

I created a new environment and was able to add that host to the new environment, but after removing the host again, I can’t re-add it anymore, even if I create new environments.

Is there some way to get this error cleared? Failing that, is there a good way to get the config for the services I have set up, so I can easily re-create them if I need to blow my db away?

Also, is it possible to run the front-end proxy for Rancher in Rancher, or am I just asking for bootstrapping issues? Do people usually run Rancher on a different IP or a non-standard port?

agent log:

INFO: Starting agent for B1F085493D965245B82B
INFO: Access Key: B1F085493D965245B82B
INFO: Config URL: http://{rancher.domain}/v1
INFO: Storage URL: http://{rancher.domain}/v1
INFO: API URL: http://{rancher.domain}/v1
INFO: IP: {rancher.public.ip}
INFO: Port:
INFO: Required Image: rancher/agent:v1.0.2
INFO: Current Image: rancher/agent:v1.0.2
INFO: Using image rancher/agent:v1.0.2
INFO: Downloading agent http://{rancher.domain}/v1/configcontent/configscripts
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

Hosts have state and remember the URL of the server they’re supposed to connect to, and their ‘identity’ in that installation.

The host opens the connection to the server, not the other way around. So if you change the server URL, the hosts need to be re-registered with the new one by re-running the registration command.

If you delete a host and then try to re-add it, it still remembers its identity, and the server will ignore it because that host is supposed to be deleted. Run rm -rf /var/lib/rancher/state on the host to clear the saved state, then register it again.
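
As a rough sketch of the cleanup step above, the removal could be wrapped in a small shell function. The function name clear_agent_state and its optional directory argument are made up for illustration; only the /var/lib/rancher/state path comes from this thread. It would need to run as root on the host, and it does not register anything itself.

```shell
# Sketch only: clear a host's saved Rancher agent state before re-registering.
# clear_agent_state is a hypothetical helper name, not part of Rancher.
clear_agent_state() {
    # Default to the state path mentioned in this thread; an explicit
    # argument lets you try the function on a scratch directory first.
    state_dir="${1:-/var/lib/rancher/state}"

    if [ -d "$state_dir" ]; then
        # Removing the directory makes the host forget its old identity.
        rm -rf -- "$state_dir"
        echo "removed $state_dir"
    else
        echo "nothing to remove at $state_dir"
    fi
}
```

Calling clear_agent_state with no argument targets the default path. Afterwards, re-run the registration command that the Rancher UI gives you for the environment you want the host in.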

Clearing the state and re-registering worked like a charm. My services are now firing back up. Thanks for that tip. I’ll have to remember that if I ever need to remove and re-add a host in the future.