Rancher Agent issue

Hi,

I re-installed the latest version of Rancher, but the problem persists.
Container stats do not work, and viewing logs and executing shell commands do not work either.

When I look at the rancher-agent logs, I get the error messages below:

time="2017-05-09T19:14:14Z" level=error msg="Couldn't find container for id." error="token is expired" id=132d83bc0179f77d81f64e99ddcc117d8f6f85415bed4745a2f3ac465737ed07
time="2017-05-09T19:14:27Z" level=error msg="Couldn't find container for id." error="token is expired" id=132d83bc0179f77d81f64e99ddcc117d8f6f85415bed4745a2f3ac465737ed07
time="2017-05-09T19:14:41Z" level=error msg="Couldn't find container for id." error="token is expired" id=132d83bc0179f77d81f64e99ddcc117d8f6f85415bed4745a2f3ac465737ed07
time="2017-05-09T19:14:56Z" level=error msg="Couldn't find container for id." error="token is expired" id=132d83bc0179f77d81f64e99ddcc117d8f6f85415bed4745a2f3ac465737ed07
time="2017-05-09T19:15:12Z" level=error msg="Couldn't find container for id." error="token is expired" id=132d83bc0179f77d81f64e99ddcc117d8f6f85415bed4745a2f3ac465737ed07
time="2017-05-09T19:15:29Z" level=error msg="Couldn't find container for id." error="token is expired" id=132d83bc0179f77d81f64e99ddcc117d8f6f85415bed4745a2f3ac465737ed07
time="2017-05-09T19:15:47Z" level=error msg="Couldn't find container for id." error="token is expired" id=132d83bc0179f77d81f64e99ddcc117d8f6f85415bed4745a2f3ac465737ed07
time="2017-05-09T19:16:06Z" level=error msg="Couldn't find container for id." error="token is expired" id=132d83bc0179f77d81f64e99ddcc117d8f6f85415bed4745a2f3ac465737ed07
time="2017-05-09T19:16:26Z" level=error msg="Couldn't find container for id." error="token is expired" id=132d83bc0179f77d81f64e99ddcc117d8f6f85415bed4745a2f3ac465737ed07
time="2017-05-09T19:16:47Z" level=error msg="Couldn't find container for id." error="token is expired" id=132d83bc0179f77d81f64e99ddcc117d8f6f85415bed4745a2f3ac465737ed07
time="2017-05-09T19:31:15Z" level=error msg="Couldn't find container for id." error="token is expired" id=132d83bc0179f77d81f64e99ddcc117d8f6f85415bed4745a2f3ac465737ed07
time="2017-05-09T19:31:16Z" level=error msg="Couldn't find container for id." error="token is expired" id=132d83bc0179f77d81f64e99ddcc117d8f6f85415bed4745a2f3ac465737ed07
2017/05/09 19:31:17 "token is expired"
goroutine 1368 [running]:
github.com/rancher/agent/service/hostapi/app/common.CheckError(0xeeb9c0, 0xc42020a8a0, 0x2)
	/go/src/github.com/rancher/agent/service/hostapi/app/common/error.go:14 +0xba
github.com/rancher/agent/service/hostapi/auth.GetAndCheckToken(0xc420204339, 0x2c3, 0xa412b0, 0x5)
	/go/src/github.com/rancher/agent/service/hostapi/auth/auth.go:49 +0x22d
github.com/rancher/agent/service/hostapi/logs.(*Handler).Handle(0xf30760, 0xc420204300, 0x24, 0xc420204329, 0x2d3, 0xc420372d80, 0xc420373e00)
	/go/src/github.com/rancher/agent/service/hostapi/logs/logs.go:38 +0xf3
created by github.com/rancher/agent/vendor/github.com/rancher/websocket-proxy/backend.connectToProxyWS
	/go/src/github.com/rancher/agent/vendor/github.com/rancher/websocket-proxy/backend/backend.go:82 +0xa84

E0509 19:31:17.359373    7783 error.go:23] %q

I’m using the following versions on Rancher Server:
Rancher: v1.6.0
Cattle: v0.179.7
User Interface: v1.6.1
Rancher CLI: v0.6.0
Rancher Compose: v0.12.5

Host machine: Ubuntu 14.04.2 LTS (3.16.0) Docker v1.12.3
rancher/agent:v1.2.2
rancher/dns:v0.15.0
rancher/healthcheck:v0.3.1
rancher/metadata:v0.9.1
rancher/net:holder
rancher/net:v0.11.2
rancher/net:v0.11.2
rancher/network-manager:v0.7.0
rancher/scheduler:v0.7.5

Check the clock on the server and on the hosts. The tokens issued for those websockets are good for 5 minutes, so if one clock is significantly off, the tokens will be expired as soon as they arrive.
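To make the reasoning concrete, here is a minimal sketch of the check, not Rancher's actual code. It assumes the 5-minute lifetime mentioned above and that you collect a UTC timestamp from each machine yourself, for example with `date -u +%Y-%m-%dT%H:%M:%SZ`. The function names and the sample timestamps are hypothetical and only illustrate the arithmetic.

```python
from datetime import datetime, timedelta, timezone

# Assumption taken from the reply above: websocket tokens are valid for 5 minutes.
TOKEN_LIFETIME = timedelta(minutes=5)


def parse_utc(stamp: str) -> datetime:
    """Parse a UTC timestamp such as 2017-05-09T19:14:14Z (the format seen in the agent log)."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def token_expired_on_arrival(
    server_now: datetime,
    host_now: datetime,
    lifetime: timedelta = TOKEN_LIFETIME,
) -> bool:
    """Return True if a token issued at server_now is already expired by the host's clock."""
    expires_at = server_now + lifetime
    return host_now >= expires_at


# Hypothetical readings taken at the same real-world moment on both machines.
server = parse_utc("2017-05-09T19:14:14Z")
host_ok = parse_utc("2017-05-09T19:14:44Z")   # host clock 30 seconds ahead
host_bad = parse_utc("2017-05-09T19:20:14Z")  # host clock 6 minutes ahead

print(token_expired_on_arrival(server, host_ok))   # False
print(token_expired_on_arrival(server, host_bad))  # True
```

If the second case matches your machines, every token the server issues would look expired to the agent, which would fit the "token is expired" errors in the log.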

Hi vincent, thank you for your help.

The clocks seem to be synchronized, so that does not appear to be the problem.
Everything works fine when I add a new host via Google Cloud or AWS.
It seems to be a problem with existing servers. What else could be causing this?
What do I need to check?

# ufw status
    Status: inactive

# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
CATTLE_FORWARD  all  --  anywhere             anywhere
DOCKER-ISOLATION  all  --  anywhere             anywhere
DOCKER     all  --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
ACCEPT     all  --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain CATTLE_FORWARD (1 references)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere             mark match 0x1068
ACCEPT     all  --  anywhere             anywhere             mark match 0x4000

Chain DOCKER (1 references)
target     prot opt source               destination

Chain DOCKER-ISOLATION (1 references)
target     prot opt source               destination
RETURN     all  --  anywhere             anywhere

# docker info

Containers: 16
 Running: 16
 Paused: 0
 Stopped: 0
Images: 17
Server Version: 1.10.3
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 126
 Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Plugins:
 Volume: local
 Network: null host bridge
Kernel Version: 4.4.0-75-generic
Operating System: Ubuntu 14.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 3.859 GiB
Name: host-2
ID: YFDQ:PELR:S4FK:AYJQ:ARYF:LHES:AGPT:TKQR:BZYT:7CZD:7WJF:5GU3
WARNING: No swap limit support

P.S. I also tried different versions of Docker on the host machine:

  • v1.10.3
  • v1.12.0 - v1.12.2
  • v1.12.3
  • v1.13.x

When testing each new version, I completely removed Docker along with all images, containers, and volumes:

  • /var/lib/rancher removed
  • /var/lib/docker removed

I'm experiencing the same error:

time="2017-08-09T22:08:27Z" level=error msg="Couldn't find container for id." error="token is expired" id=08327a40c2b341394645f532e348b453fae23082f9255f324dc488e75e0f697c
time="2017-08-09T22:16:45Z" level=error msg="Couldn't find container for id." error="token is expired" id=08327a40c2b341394645f532e348b453fae23082f9255f324dc488e75e0f697c

The Rancher UI is also unable to connect to the host containers to view logs, show monitoring, or run a shell.

Clocks are pretty well synched.

Versions:
Rancher: v1.6.5
Cattle: v0.182.1
UI: v1.6.9

Any solution to this?