Rancher 1.0.0 - "Failed to get ping from agent" from AWS hosts

Hi, I am trying to set up Amazon AWS machines for Rancher.
The Rancher API is reachable from the AWS VPS, and I see connections from the agent machines to the Rancher API port.

The latest Rancher logs are:

time="2016-04-07T11:33:02Z" level=info msg="Activating Machine" eventId=3159b405-c088-4782-85b2-ec06bcfd6998 resourceId=1ph103
time="2016-04-07T11:33:03Z" level=info msg="Pulling rancher/agent:v0.11.0 image."
time="2016-04-07T11:33:10Z" level=info msg="stdout: Detecting the provisioner..." resourceId: =1ph99
time="2016-04-07T11:33:10Z" level=info msg="stdout: Provisioning with rancheros..." resourceId: =1ph99
time="2016-04-07T11:33:12Z" level=info msg="stdout: Copying certs to the local machine directory..." resourceId: =1ph99
time="2016-04-07T11:33:13Z" level=info msg="stdout: Copying certs to the remote machine..." resourceId: =1ph99
time="2016-04-07T11:33:14Z" level=info msg="stdout: Setting Docker configuration on the remote daemon..." resourceId: =1ph99
time="2016-04-07T11:33:15Z" level=info msg="stdout: Checking connection to Docker..." resourceId: =1ph99
time="2016-04-07T11:33:15Z" level=info msg="stdout: Docker is up and running!" resourceId: =1ph99
time="2016-04-07T11:33:15Z" level=info msg="stdout: To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env ec2-host-push1" resourceId: =1ph99
time="2016-04-07T11:33:16Z" level=info msg="Machine Created" machineExternalId=ae8a6225-ad07-40e6-91c5-5ea3883d2dd6 resourceId=1ph99
time="2016-04-07T11:33:16Z" level=info msg="Creating and uploading extracted machine config" resourceId=1ph99
time="2016-04-07T11:33:16Z" level=info msg="Machine config file created and encoded." destFile="/var/lib/cattle/machine/ae8a6225-ad07-40e6-91c5-5ea3883d2dd6/ec2-host-push1.tar.gz" resourceId=1ph99
time="2016-04-07T11:33:16Z" level=info msg="Activating Machine" eventId=16b75e04-1c87-4838-8fd4-b052449b1db0 resourceId=1ph99
time="2016-04-07T11:33:17Z" level=info msg="Pulling rancher/agent:v0.11.0 image."
time="2016-04-07T11:33:59Z" level=info msg="Container created for machine" containerId=660e03bc3df6b5627b6e30a9e54e346403f47b9fe031da5389bef98575d10530 machineId=1ph103 resourceId=1ph103
time="2016-04-07T11:34:04Z" level=info msg="Rancher-agent for machine started" containerId=660e03bc3df6b5627b6e30a9e54e346403f47b9fe031da5389bef98575d10530 machineExternalId=94092934-f653-4c57-834b-5b50b8f0d09e resourceId=1ph103
time="2016-04-07T11:34:04Z" level=info msg="Creating and uploading extracted machine config" resourceId=1ph103
time="2016-04-07T11:34:04Z" level=info msg="Machine config file created and encoded." destFile="/var/lib/cattle/machine/94092934-f653-4c57-834b-5b50b8f0d09e/ec2-host-push2.tar.gz" resourceId=1ph103
time="2016-04-07T11:34:10Z" level=info msg="Container created for machine" containerId=b83a7be4fc8a5c048dd84e37020c2b6ee1b55aff39c9806d7260217d9a3ffd11 machineId=1ph99 resourceId=1ph99
time="2016-04-07T11:34:14Z" level=info msg="Rancher-agent for machine started" containerId=b83a7be4fc8a5c048dd84e37020c2b6ee1b55aff39c9806d7260217d9a3ffd11 machineExternalId=ae8a6225-ad07-40e6-91c5-5ea3883d2dd6 resourceId=1ph99
time="2016-04-07T11:34:14Z" level=info msg="Creating and uploading extracted machine config" resourceId=1ph99
time="2016-04-07T11:34:14Z" level=info msg="Machine config file created and encoded." destFile="/var/lib/cattle/machine/ae8a6225-ad07-40e6-91c5-5ea3883d2dd6/ec2-host-push1.tar.gz" resourceId=1ph99
2016-04-07 11:34:42,169 ERROR [:] [] [] [] [ecutorService-9] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [3] count [3]
2016-04-07 11:34:47,170 ERROR [:] [] [] [] [ecutorService-8] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [3] count [4]
2016-04-07 11:34:52,171 ERROR [:] [] [] [] [cutorService-10] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [3] count [5]
2016-04-07 11:34:52,171 ERROR [:] [] [] [] [cutorService-10] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [7] count [3]
2016-04-07 11:34:57,173 ERROR [:] [] [] [] [ecutorService-2] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [7] count [4]
2016-04-07 11:34:57,173 ERROR [:] [] [] [] [ecutorService-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [3] count [6]
2016-04-07 11:34:57,177 ERROR [:] [] [] [] [ecutorService-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Scheduling reconnect for [3]
2016-04-07 11:35:02,173 ERROR [:] [] [] [] [ecutorService-8] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [7] count [5]
2016-04-07 11:35:02,173 ERROR [:] [] [] [] [ecutorService-9] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [3] count [7]
2016-04-07 11:35:07,175 ERROR [:] [] [] [] [ecutorService-8] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [3] count [8]
2016-04-07 11:35:07,175 ERROR [:] [] [] [] [ecutorService-9] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [7] count [6]
2016-04-07 11:35:07,179 ERROR [:] [] [] [] [ecutorService-9] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Scheduling reconnect for [7]
2016-04-07 11:35:13,176 ERROR [:] [] [] [] [ecutorService-7] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [3] count [9]
2016-04-07 11:35:13,177 ERROR [:] [] [] [] [ecutorService-1] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [7] count [7]
2016-04-07 11:35:18,176 ERROR [:] [] [] [] [ecutorService-5] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [7] count [8]
2016-04-07 11:35:23,177 ERROR [:] [] [] [] [ecutorService-3] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [7] count [9]

A curl from another host in the VPC to Rancher shows:

curl https://rancher.xx.de
{"type":"collection","resourceType":"apiVersion","links":{"self":"https://rancher.xx.de/","latest":"https://rancher.xx.de/v1"},"createTypes":{},"actions":{},"data":[{"id":"v1","type":"apiVersion","links":{"self":"https://rancher.xx.de/v1"},"actions":{}}],"sortLinks":{},"pagination":null,"sort":null,"filters":{},"createDefaults":{}}
[ec2-user@ip-20-0-2-10 ~]

The UI shows "Almost there... Waiting for agent connection".

I could not get the machine config to fetch the SSH key, because the "download machine config" button does not work.

Which protocol/port is used? Any advice on how to find the error?

Trying to add a custom host produces the same error.

The agent shows:

INFO: Running Agent Registration Process, CATTLE_URL=https://rancher.xx.de/v1
INFO: Checking for Docker version >= 1.6.0
INFO: Found Server version: 1.9.1
INFO: docker version: Client version: 1.6.0
INFO: docker version: Client API version: 1.18
INFO: docker version: Go version (client): go1.4.2
INFO: docker version: Git commit (client): 4749651
INFO: docker version: OS/Arch (client): linux/amd64
INFO: docker version: Server version: 1.9.1
INFO: docker version: Server API version: 1.21
INFO: docker version: Go version (server): go1.4.2
INFO: docker version: Git commit (server): a34a1d5/1.9.1
INFO: docker version: OS/Arch (server): linux/amd64
INFO: docker info: Containers: 1
INFO: docker info: Images: 11
INFO: docker info: Storage Driver: devicemapper
INFO: docker info: Pool Name: docker-202:1-263755-pool
INFO: docker info: Pool Blocksize: 65.54 kB
INFO: docker info: Base Device Size: 107.4 GB
INFO: docker info: Backing Filesystem: xfs
INFO: docker info: Data file: /dev/loop0
INFO: docker info: Metadata file: /dev/loop1
INFO: docker info: Data Space Used: 563.1 MB
INFO: docker info: Data Space Total: 107.4 GB
INFO: docker info: Data Space Available: 6.351 GB
INFO: docker info: Metadata Space Used: 1.188 MB
INFO: docker info: Metadata Space Total: 2.147 GB
INFO: docker info: Metadata Space Available: 2.146 GB
INFO: docker info: Udev Sync Supported: true
INFO: docker info: Deferred Removal Enabled: false
INFO: docker info: Deferred Deletion Enabled: false
INFO: docker info: Deferred Deleted Device Count: 0
INFO: docker info: Data loop file: /var/lib/docker/devicemapper/devicemapper/data
INFO: docker info: Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
INFO: docker info: Library Version: 1.02.93-RHEL7 (2015-01-28)
INFO: docker info: Execution Driver: native-0.2
INFO: docker info: Kernel Version: 4.4.5-15.26.amzn1.x86_64
INFO: docker info: Operating System: Amazon Linux AMI 2016.03
INFO: docker info: CPUs: 1
INFO: docker info: Total Memory: 995.5 MiB
INFO: docker info: Name: ip-20-0-2-10
INFO: docker info: ID: KABE:4CGA:WC3P:V56L:W4ES:4HIN:TPT5:XDYD:QVWX:TQ3K:4ECL:2UXL
INFO: docker info: Http Proxy:
INFO: docker info: Https Proxy:
INFO: docker info: No Proxy:
INFO: Attempting to connect to: https://rancher.xx.de/v1
INFO: https://rancher.xx.de/v1 is accessible
INFO: Inspecting host capabilities
INFO: System: false
INFO: Host writable: true
INFO: Token: xxxxxxxx
INFO: Running registration
INFO: Printing Environment
INFO: ENV: CATTLE_ACCESS_KEY=57D7B09CAD567FD0C41B
INFO: ENV: CATTLE_AGENT_IP=20.0.2.10
INFO: ENV: CATTLE_HOME=/var/lib/cattle
INFO: ENV: CATTLE_REGISTRATION_ACCESS_KEY=registrationToken
INFO: ENV: CATTLE_REGISTRATION_SECRET_KEY=xxxxxxx
INFO: ENV: CATTLE_SECRET_KEY=xxxxxxx
INFO: ENV: CATTLE_SYSTEMD=false
INFO: ENV: CATTLE_URL=https://rancher.xx.de/v1
INFO: ENV: DETECTED_CATTLE_AGENT_IP=20.0.2.10
INFO: ENV: RANCHER_AGENT_IMAGE=rancher/agent:v0.11.0
INFO: Launched Rancher Agent: Rancher State 09beea5aa684430e638e211d8837ae7ae537b24cbcdfb823141b29d68590b824

Rancher:

2016-04-07 12:10:56,687 ERROR [:] [] [] [] [ecutorService-6] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [11] count [3]
2016-04-07 12:11:01,688 ERROR [:] [] [] [] [ecutorService-9] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [11] count [4]
2016-04-07 12:11:06,690 ERROR [:] [] [] [] [ecutorService-7] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [11] count [5]
2016-04-07 12:11:11,689 ERROR [:] [] [] [] [ecutorService-6] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [11] count [6]
2016-04-07 12:11:11,733 ERROR [:] [] [] [] [ecutorService-6] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Scheduling reconnect for [11]
2016-04-07 12:11:16,690 ERROR [:] [] [] [] [ecutorService-9] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [11] count [7]
2016-04-07 12:11:21,692 ERROR [:] [] [] [] [cutorService-10] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [11] count [8]
2016-04-07 12:11:26,694 ERROR [:] [] [] [] [ecutorService-2] [i.c.p.a.s.ping.impl.PingMonitorImpl ] Failed to get ping from agent [11] count [9]

On the host where the agent is running, can you provide the output of docker ps -a?
There should be more than one rancher/agent container. Can you provide the logs for all of them?

Since downloading the machine configuration is not working (I press the button and nothing happens), I have no SSH keys for the machines. However, I can see all machines sending requests to the Rancher host in the firewall log. An httpd with mod_proxy sits in front of Rancher (terminating SSL). Might that be a problem?

The custom hosts show the same error behavior. I have already posted those logs above.

That is quite possibly the problem. If you are running a proxy in front of rancher, that proxy has to set some headers. Also, the proxy must support websockets as the communication from agent to server is over websockets.

Here are some example configurations for SSL terminating proxies: http://docs.rancher.com/rancher/installing-rancher/installing-server/basic-ssl-config/
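As a rough illustration of what this means for Apache httpd, here is a minimal sketch of an SSL-terminating virtual host that forwards WebSocket upgrade requests and sets forwarding headers. It is not taken from this thread: the backend address rancher:8080 and the certificate paths are placeholders, and the two X-Forwarded headers are an assumption about which headers are needed. It also assumes mod_ssl, mod_proxy, mod_proxy_http, mod_proxy_wstunnel, mod_rewrite and mod_headers are loaded. Check it against the linked documentation before using it.

```apache
<VirtualHost *:443>
  ServerName rancher.xx.de

  SSLEngine on
  # Placeholder paths, replace with your own certificate and key
  SSLCertificateFile /path/to/cert.pem
  SSLCertificateKeyFile /path/to/key.pem

  ProxyRequests Off
  ProxyPreserveHost On

  # Send WebSocket upgrade requests to the backend over ws://
  # (handled by mod_proxy_wstunnel); rancher:8080 is a placeholder
  RewriteEngine On
  RewriteCond %{HTTP:Connection} Upgrade [NC]
  RewriteCond %{HTTP:Upgrade} websocket [NC]
  RewriteRule ^/(.*)$ ws://rancher:8080/$1 [P,L]

  # Tell the backend that the original request arrived over HTTPS
  RequestHeader set X-Forwarded-Proto "https"
  RequestHeader set X-Forwarded-Port "443"

  # Everything else is proxied as plain HTTP
  ProxyPass / http://rancher:8080/
  ProxyPassReverse / http://rancher:8080/
</VirtualHost>
```

Without the rewrite block, ordinary API requests such as the curl above can still succeed while the agent's WebSocket connection never gets established.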

That's it. The Apache rewrite for ws:// was missing. Thanks.

NP. Glad we got it figured out.