Can't get rancher agent running on the same host as the rancher server


We had Rancher agent v1.2.2 running happily on the same host as a v1.5.5 Rancher server, but the box died. I tried to re-create the setup but can’t get the agent to start. I have tried upgrading the server to v1.6.7, but the outcome is the same:

Agent log outputs:

time="2017-08-21T15:19:00Z" level=info msg="Execing [/usr/bin/nsenter --mount=/proc/30927/ns/mnt -F -- /data/docker_images/aufs/mnt/40b80260068afa27344220170481e7d9ed6c3af4eb99fa74bc28656b185d527f/usr/bin/share-mnt --stage2 /var/lib/rancher/volumes /var/lib/kubelet -- norun]"
INFO: Starting agent for F3AAD6F9208CF4D69049
And that ‘Starting agent’ line just repeats.

In the Rancher Server logs we get quite a few of these:

2017-08-21 15:26:35,371 ERROR [968c9f7d-4df1-4470-9517-05af1513349c:874191] [instance:32858->instanceHostMap:32012] [instance.start->(InstanceStart)->instancehostmap.activate] [] [cutorService-12] [c.p.e.p.i.DefaultProcessInstanceImpl] Agent error for [compute.instance.activate.reply;agent=78]: Timeout getting IP address

Not quite sure how to fix it, though. We set the CATTLE_AGENT_IP variable on rancher agent startup…

Update: I have confirmed the agent container can connect to the Rancher server, as the logs mention a successful connection test. I have also checked that I can ping the specified IP from within containers running on the same host, so it's not a firewall issue. Anyone got any ideas?

Try using -e CATTLE_AGENT_IP='<ip_addr_of_host>' with the docker run command for your agent host.

Hi there,

We have already tried that, though I admit the mention was hidden at the bottom of the first post!

Sorry, missed that. Is the agent host being reused? Have you removed the ‘/var/lib/rancher’ folder?

Hi there,

The agent host is not being re-used, but I am using the old Rancher DB as that was in a separate container.

Wiping the /var/lib/rancher has not helped.

For anyone finding this issue: the cause was that the /etc/hosts file on that machine was pointing to the wrong IP for the domain name the Rancher server was advertising! Many thanks to @superseb for the help!
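
If you want to check for the same problem, here is a minimal Python sketch that looks for stale hosts-file entries. It is not part of Rancher; the function names, the hostname, and the IPs in the usage note are hypothetical placeholders. It only reads hosts-file text and reports entries for a name that differ from the IP you expect.

```python
def hosts_file_ips(hosts_text, hostname):
    """Return every IP that the given hosts-file text maps to hostname."""
    ips = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        # Hostnames are case-insensitive, so compare in lower case.
        if hostname.lower() in (name.lower() for name in names):
            ips.append(ip)
    return ips


def stale_entries(hosts_text, hostname, expected_ip):
    """Return the hosts-file IPs for hostname that differ from expected_ip."""
    return [ip for ip in hosts_file_ips(hosts_text, hostname) if ip != expected_ip]
```

To use it, read /etc/hosts into a string and call something like stale_entries(text, "rancher.example.com", "10.0.0.5"), substituting the domain name your Rancher server advertises and the IP it should resolve to. An empty list means the file has no conflicting entry for that name; anything else is a line worth fixing.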

