Docker host & no space left on device

Hi,
I’m using Rancher to manage some EC2 hosts (4 nodes in an auto-scaling group) and to orchestrate containers. Everything works fine.

But at some point I run into a recurring disk space problem, even though I remove unused and untagged (dangling) images with this command:

docker images --quiet --filter=dangling=true | xargs --no-run-if-empty docker rmi

As I said, even though I run the command above, my hosts continuously run out of space:

Filesystem      Size  Used Avail Use% Mounted on
udev            7.9G   12K  7.9G   1% /dev
tmpfs           1.6G  1.4M  1.6G   1% /run
/dev/xvda1       79G   77G     0 100% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
none            5.0M     0  5.0M   0% /run/lock
none            7.9G  7.5M  7.9G   1% /run/shm
none            100M     0  100M   0% /run/user
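
For reference, here is roughly how I check where the space is actually going; this assumes the daemon keeps its data in the default /var/lib/docker location:

# Per-directory usage inside the Docker data root
sudo du -sh /var/lib/docker/*

# Largest container directories (the JSON log files live here)
sudo du -sh /var/lib/docker/containers/* | sort -h | tail -n 10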

I’m using Rancher 1.1.4 and my hosts are running Docker 1.12.5 under Ubuntu 14.04.4 LTS.

Is there something I’m missing? What are the best practices for configuring Docker on production hosts in order to avoid this problem?

Thank you for your help.

Install the Janitor from the catalogue… it handles all this for you.

How much space should be used? What I mean is: what is the total size of the images you need available on a host? What storage requirements do the running containers have? Are there any stopped containers taking up space, etc.?

@sjiveson I’m not sure it will entirely fix the issue, but Ubuntu 14.04 uses the devicemapper Docker storage driver. We’ve had issues with it on some of our Ubuntu 14.04 Jenkins hosts: not specifically running out of space, but other storage-related problems.

The solution, it seems, is to use the overlay storage driver instead, which unfortunately won’t work on the default Ubuntu 14.04 kernel. You can install the LTS kernel from Xenial and it will work, but that requires a kernel package upgrade and a reboot.

See https://docs.docker.com/engine/userguide/storagedriver/selectadriver/ for more information.
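
As a rough sketch of what that looks like on 14.04 (package and option names as I remember them, so double-check before running anything):

# Check which storage driver the daemon is currently using
docker info | grep "Storage Driver"

# Install the Xenial LTS (HWE) kernel on Ubuntu 14.04, then reboot
sudo apt-get install linux-generic-lts-xenial
sudo reboot

# After the reboot, point the daemon at overlay, e.g. in /etc/docker/daemon.json:
#   { "storage-driver": "overlay" }
# and restart Docker. Note that images and containers created under
# devicemapper won't be visible under overlay and will need to be re-pulled.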

Interesting. Funnily enough, I had a similar problem a while back on a CentOS host, which I was specifically using to avoid a hard-links limit issue with overlay. I completely blew it up. That was also for Jenkins. I took a punt and switched back to RancherOS and overlay, and we didn’t hit the hard-link issue again, which was a pleasant surprise.

I figured it out: my problem came from a misconfigured stdout in a Logstash container, and the log file generated on the Docker side kept growing because there was no size limit on it. I don’t understand why there’s no default configuration for that in the Docker engine…
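
For anyone else hitting this: the json-file log driver can be capped per container or daemon-wide. Daemon-wide it looks something like this in /etc/docker/daemon.json (the sizes here are just example values), and it applies to containers created after a daemon restart:

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "50m",
    "max-file": "3"
  }
}

The same thing per container: docker run --log-opt max-size=50m --log-opt max-file=3 …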

I have the same situation as yours. I destroyed the stack with the Logstash container, but the space has not been reclaimed. Did you do anything else to fix this?

I have also tried the Janitor stack; it says it removed exited containers, but the space has not been reclaimed yet.
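
A few other things worth checking in this situation; the commands below assume Docker 1.12 (which predates docker system prune) and the default /var/lib/docker data root:

# Remove stopped containers
docker ps --quiet --filter=status=exited | xargs --no-run-if-empty docker rm

# Remove dangling (unreferenced) volumes, another common source of leaked space
docker volume ls --quiet --filter=dangling=true | xargs --no-run-if-empty docker volume rm

# Look for oversized container log files still sitting on disk
sudo find /var/lib/docker/containers -name "*-json.log" -size +100M -exec ls -lh {} \;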