Logspout no route to host

Hi there,

I have two hosts in my Rancher setup, with a logstash stack successfully deployed.

When I add a logspout stack, the container deployed on the same host (host01) as the logstash-collector works, but the one deployed on host02 throws a getsockopt: no route to host error.

logspout.yml:

logspout:
  restart: on-failure
  environment:
    ROUTE_URIS: logstash+tcp://logstash:5000
  external_links:
  - logstash/logstash-collector:logstash
  volumes:
  - /var/run/docker.sock:/var/run/docker.sock
  labels:
    io.rancher.scheduler.global: 'true'
    io.rancher.container.pull_image: always
  tty: true
  image: bekt/logspout-logstash:latest
  stdin_open: true

Logs from logspout_1 container:

# logspout v3.2-dev-custom by gliderlabs
# adapters: raw udp tcp logstash
# options : persist:/mnt/routes
# jobs    : http[]:80 pump routes
# routes  :
#   ADAPTER	ADDRESS		CONTAINERS	SOURCES	OPTIONS
#   logstash	logstash:5000				map[]

Logs from logspout_2 container:

# logspout v3.2-dev-custom by gliderlabs
# adapters: logstash raw udp tcp
# options : persist:/mnt/routes
!! dial tcp 10.42.176.8:5000: getsockopt: no route to host

ping, telnet & UDP work fine from within another container deployed on host02:

$ ping 10.42.176.8
PING 10.42.176.8 (10.42.176.8): 56 data bytes
64 bytes from 10.42.176.8: seq=0 ttl=62 time=0.696 ms
64 bytes from 10.42.176.8: seq=1 ttl=62 time=0.781 ms
64 bytes from 10.42.176.8: seq=2 ttl=62 time=0.776 ms

$ telnet 10.42.176.8 5000                                                     
Test message from a different container on host02

Kibana output:

October 15th 2016, 22:07:53.957	message:Test message from a different container on host02 tags:_jsonparsefailure @version:1 @timestamp:October 15th 2016, 22:07:53.957 host:10.42.219.240 port:43030 _id:AVfJ9T6xJw9hW0ybpNep _type:logs _index:logstash-2016.10.15 _score:

$ nc -uv 10.42.176.8 5000                                                     
10.42.176.8 (10.42.176.8:5000) open                                             
UDP test message

Kibana output:

October 15th 2016, 22:09:12.237	host:10.42.219.240 message:UDP test message tags:_jsonparsefailure @version:1 @timestamp:October 15th 2016, 22:09:12.237 _id:AVfJ_NlUJw9hW0ybpNfi _type:logs _index:logstash-2016.10.15 _score:

I also have a load balancer working, so Rancher's internal DNS is working fine. I've also checked the iptables chains; everything seems to be correct.

What am I missing?! :scream:

Can anybody please tell me why logspout_2 isn't able to establish a connection with the logstash-collector?

Thank you
Adrian

EDIT:

If I manually start the logspout_2 container on host02 with host01's IP, it works!

[rancher@ip-172-31-46-131 ~]$ docker run --name="logspout_xx" \
>     --volume=/var/run/docker.sock:/var/run/docker.sock \
>     bekt/logspout-logstash:latest \
>     logstash+tcp://172.31.46.130:5000
# logspout v3.2-dev-custom by gliderlabs
# adapters: logstash raw udp tcp
# options : persist:/mnt/routes
# jobs    : http[]:80 pump routes
# routes  :
#   ADAPTER	ADDRESS			CONTAINERS	SOURCES	OPTIONS
#   logstash+tcp172.31.46.130:5000				map[]

Something really weird is going on. I've spun up a 3rd instance and left logspout retrying endlessly, and in the end it did connect. :confused:

I'm fighting this as well. Every instance of logspout on a host that hasn't got a local logstash fails with the same "no route to host" error.

I'm kind of stumped, but my googling has brought up https://github.com/weaveworks/weave/issues/1846, which appears to at least closely resemble the circumstances (if not the exact cause).

@phedoreanu

Couple of pointers:

  • To make services in different stacks resolvable, you can use <service_name>.<stack_name>; explicit links are not required. (Reference: http://docs.rancher.com/rancher/v1.2/en/rancher-services/dns-service/)
  • There are two stacks involved here, so in your logspout.yml you need to pass <service_name>.<stack_name> in the ROUTE_URIS variable.
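Applied to the compose file from the original post, that pointer would look roughly like this: a sketch, assuming the stack is named logstash and the service logstash-collector (as the later route output suggests), with the external_links entry dropped in favour of the cross-stack DNS name:

```yaml
logspout:
  restart: on-failure
  environment:
    # <service_name>.<stack_name> resolves via Rancher's internal DNS,
    # so the external_links entry is no longer needed
    ROUTE_URIS: logstash+tcp://logstash-collector.logstash:5000
  volumes:
  - /var/run/docker.sock:/var/run/docker.sock
  labels:
    io.rancher.scheduler.global: 'true'
    io.rancher.container.pull_image: always
  tty: true
  image: bekt/logspout-logstash:latest
  stdin_open: true
```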

@leodotcloud

Thanks for the tip with the <service_name>.<stack_name>.

It works better than with external_links, but there are still a few hiccups:

26/10/2016 17:40:21# logspout v3.2-dev-custom by gliderlabs
26/10/2016 17:40:21# adapters: logstash raw udp tcp
26/10/2016 17:40:21# options : persist:/mnt/routes
26/10/2016 17:40:24!! dial tcp 10.42.213.119:5000: getsockopt: no route to host
26/10/2016 17:40:40# logspout v3.2-dev-custom by gliderlabs
26/10/2016 17:40:40# adapters: raw udp tcp logstash
26/10/2016 17:40:40# options : persist:/mnt/routes
26/10/2016 17:40:43!! dial tcp 10.42.213.119:5000: getsockopt: no route to host
26/10/2016 17:41:10# logspout v3.2-dev-custom by gliderlabs
26/10/2016 17:41:10# adapters: raw udp tcp logstash
26/10/2016 17:41:10# options : persist:/mnt/routes
26/10/2016 17:41:13!! dial tcp 10.42.213.119:5000: getsockopt: no route to host
26/10/2016 17:41:40# logspout v3.2-dev-custom by gliderlabs
26/10/2016 17:41:40# adapters: logstash raw udp tcp
26/10/2016 17:41:40# options : persist:/mnt/routes
26/10/2016 17:41:43!! dial tcp 10.42.213.119:5000: getsockopt: no route to host
26/10/2016 17:42:10# logspout v3.2-dev-custom by gliderlabs
26/10/2016 17:42:10# adapters: logstash raw udp tcp
26/10/2016 17:42:10# options : persist:/mnt/routes
26/10/2016 17:42:13!! dial tcp 10.42.213.119:5000: getsockopt: no route to host
26/10/2016 17:42:40# logspout v3.2-dev-custom by gliderlabs
26/10/2016 17:42:40# adapters: logstash raw udp tcp
26/10/2016 17:42:40# options : persist:/mnt/routes
26/10/2016 17:42:43!! dial tcp 10.42.213.119:5000: getsockopt: no route to host
26/10/2016 17:43:10# logspout v3.2-dev-custom by gliderlabs
26/10/2016 17:43:10# adapters: tcp logstash raw udp
26/10/2016 17:43:10# options : persist:/mnt/routes
26/10/2016 17:43:13!! dial tcp 10.42.213.119:5000: getsockopt: no route to host
26/10/2016 17:43:40# logspout v3.2-dev-custom by gliderlabs
26/10/2016 17:43:40# adapters: logstash raw udp tcp
26/10/2016 17:43:40# options : persist:/mnt/routes
26/10/2016 17:43:40# jobs    : http[]:80 pump routes
26/10/2016 17:43:40# routes  :
26/10/2016 17:43:40#   ADAPTER	ADDRESS				CONTAINERS	SOURCES	OPTIONS
26/10/2016 17:43:40#   logstash+tcplogstash-collector.logstash:5000			map[]

This seems to be a timing issue: the network hasn't been configured by the time logspout tries to connect. The following worked as a quick fix for me. Note the entrypoint and the command:

logspout:
  image: gliderlabs/logspout:latest
  tty: true
  stdin_open: true
  entrypoint: sh
  command: -c "/bin/sleep 2; /bin/logspout"
  environment:
    ROUTE_URIS: "tcp://logstash.elk:5000"
    LOGSPOUT: 'ignore'
  volumes:
    - '/var/run/docker.sock:/var/run/docker.sock'
  labels:
    io.rancher.container.pull_image: always
    io.rancher.scheduler.global: 'true'
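A fixed sleep is racy if the network setup ever takes longer than two seconds. An alternative sketch, assuming the image's busybox provides the ip applet, is to poll until the Rancher-managed 10.42.x.x address actually appears on eth0 before starting logspout:

```yaml
logspout:
  image: gliderlabs/logspout:latest
  tty: true
  stdin_open: true
  entrypoint: sh
  # Hypothetical variant: instead of a fixed sleep, wait until the managed
  # 10.42.0.0/16 address shows up on eth0, then exec logspout
  command: -c "until ip -4 addr show eth0 | grep -q 'inet 10\.42\.'; do sleep 1; done; exec /bin/logspout"
  environment:
    ROUTE_URIS: "tcp://logstash.elk:5000"
    LOGSPOUT: 'ignore'
  volumes:
    - '/var/run/docker.sock:/var/run/docker.sock'
  labels:
    io.rancher.container.pull_image: always
    io.rancher.scheduler.global: 'true'
```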

I assume the root cause is an issue on Rancher's side?

@phedoreanu Glad it worked. The container starts with just the Docker IP address in the 172.17.0.0/16 network, and once it's discovered, the Rancher agent configures a second IP address in the 10.42.0.0/16 network. Since this initially takes a few seconds, you are seeing the expected error: there really is no route to the 10.42 network until the IP is configured. Once the IP is assigned, the application starts working.

@untoldone If the communication was broken forever, I would definitely agree there is an issue. But in this case, the initial delay of a couple of seconds to set up networking is expected. As you can observe, the issue resolves after a couple of retries and connectivity is established. logspout already has a retry mechanism built in, so there is no additional benefit to modifying the entrypoint/command to add a sleep, other than avoiding the "no route to host" messages at the beginning of the log.

It doesn't seem to be the case that logspout retries right now: I see a continuously restarting container with the current ':latest' logspout image.

I see the same output as @phedoreanu. The repeated messages are from container restarts rather than logspout retry logic; given the machines and networking I use, it never successfully connects without the sleep.

@leodotcloud By the way, is the delay for networking documented anywhere? I've had the same type of problems specifically with Go-based applications (maybe they're all starting really fast?). Assuming this is different from base Docker behavior (which it seems to be), my personal expectation would be that you wait for the networking to come up before launching the default Docker entrypoint/command.

@untoldone there are a lot of changes coming up in the 1.2 release. The whole networking stack is being revamped, and you will see a much better user experience, especially with the CNI driver integration with Cattle.
