[Rancher v1.6.18] Facing Memory Issues on Rancher Host

#1

Hello Team,
We are using Rancher v1.6.18. We have around 220 stacks running in our Rancher farm.

Infra Details:
3 Rancher Masters
1 Rancher Db Host
70 Rancher Hosts

We see the Docker process consume all of the memory on a Rancher host, which makes the host unresponsive.
To recover, we have to clean up the entire Rancher host, which involves:

Stopping all the stacks
Removing Docker images, containers, and volumes
Removing /var/lib/docker and /var/lib/rancher
Rebooting the Rancher host
Adding the host back to the Rancher farm

Current Resource Limits:
Memory: 13000
mCPU: 4

Host Details:
16GB RAM
RHEL 7

Can you let us know how we can restrict the memory used by the Docker process on a Rancher host?
Is there a way to manage CPU and memory usage?

Also, should we upgrade to version 1.6.26? Would that help resolve the memory issue on the Rancher hosts?

Regards,
Yash BRAHMANI

#2

There is not enough information to discuss this. You are saying that the Docker process is taking all the memory? Is this dockerd, or a process of a container created by Docker? And how did you determine that it takes all the memory? Can you share the data you used to determine that?
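One way to answer the dockerd-versus-container question is to total resident memory per process name from /proc. The script below is only a sketch: it assumes Python 3.6+ is available on the host, and the daemon may show up as dockerd or dockerd-current depending on the package. Summing VmRSS counts shared pages more than once, so treat the numbers as a rough ranking and not as exact usage.

```python
import os

def parse_status(text):
    """Return (name, rss_kib) from the text of a /proc/<pid>/status file."""
    name, rss_kib = "", 0
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            # Line looks like "VmRSS:     123456 kB"; kernel threads have no such line.
            rss_kib = int(line.split()[1])
    return name, rss_kib

def rss_by_name(proc_root="/proc"):
    """Sum VmRSS (in KiB) per process name across all visible processes."""
    totals = {}
    if not os.path.isdir(proc_root):
        return totals
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, rss = parse_status(fh.read())
        except OSError:
            continue  # process exited while we were scanning
        totals[name] = totals.get(name, 0) + rss
    return totals

if __name__ == "__main__":
    ranked = sorted(rss_by_name().items(), key=lambda kv: kv[1], reverse=True)
    for name, rss in ranked[:10]:
        print(f"{name:<20} {rss / 1024:10.1f} MiB")
```

If the daemon is at the top of that list, the problem is in dockerd itself. If the top entries are application processes, the memory is being used inside containers, and per-container limits are the place to look.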

In these cases it also helps to specify exactly what you are using, that includes OS version, kernel and exact Docker version (all reported by docker info).
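As a rough illustration of pulling just those details out of docker info, here is a small sketch. The list of fields in WANTED is my own selection, not an official one, and the parser assumes the plain-text "Key: value" layout that docker info prints.

```python
import subprocess

# Fields of `docker info` that identify the environment (an illustrative selection).
WANTED = ("Server Version", "Kernel Version", "Operating System",
          "Storage Driver", "Cgroup Driver", "Total Memory")

def extract_fields(info_text, wanted=WANTED):
    """Return a dict of the wanted "Key: value" lines found in docker info output."""
    found = {}
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key in wanted:
            found[key] = value.strip()
    return found

if __name__ == "__main__":
    try:
        result = subprocess.run(["docker", "info"], stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE, universal_newlines=True,
                                check=True, timeout=30)
    except (OSError, subprocess.SubprocessError) as exc:
        print("could not run docker info:", exc)
    else:
        for key, value in extract_fields(result.stdout).items():
            print(key + ": " + value)
```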

#3

Hello Superseb,

Please find below the output of the docker info command:
Containers: 24
Running: 23
Paused: 0
Stopped: 1
Images: 13
Server Version: 1.13.1
Storage Driver: overlay
Backing Filesystem: extfs
Supports d_type: true
Logging Driver: json-file
Cgroup Driver: systemd
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Authorization: rhel-push-plugin
Swarm: inactive
Runtimes: docker-runc runc
Default Runtime: docker-runc
Init Binary: /usr/libexec/docker/docker-init-current
containerd version: (expected: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1)
runc version: N/A (expected: 9df8b306d01f59d3a8029be411de015b7304dd8f)
init version: fec3683b971d9c3ef73f284f176672c44b448662 (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
seccomp
WARNING: You’re not using the default seccomp profile
Profile: /etc/docker/seccomp.json
Kernel Version: 3.10.0-957.1.3.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.6 (Maipo)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 3
CPUs: 4
Total Memory: 15.5 GiB
Name: SERVER_NAME
ID: WDSI:RYH3:IDS7:LRCU:PMRZ:3WD2:X6DT:OZLQ:A2KK:EKRK:N7PW:DAOP
Docker Root Dir: /data/datastore/docker
Debug Mode (client): false
Debug Mode (server): false
Http Proxy:
Https Proxy:
Registry: https://registry.access.redhat.com/v1/
WARNING: bridge-nf-call-ip6tables is disabled
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Registries: registry.access.redhat.com (secure), docker.io (secure), registry.fedoraproject.org (secure), quay.io (secure), registry.centos.org (secure), docker.io (secure)

It is the dockerd process that is taking all the memory. Also, the resource limit is not reflected on the Rancher host.

This makes the Rancher host unresponsive, which forces us to do a complete cleanup and reboot of the host.
We would like to know which settings we can use to restrict the Docker daemon's memory to 12 GB and allocate memory to the containers via memory_reservation in the catalog. Or is there another way to resolve this problem?
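For reference, one possible way to cap the daemon on a systemd host such as RHEL 7 is a drop-in for docker.service that sets MemoryLimit=. The sketch below only renders the drop-in text. The 12 and 15.5 figures come from this thread, while the function name and the sanity check are assumptions made for illustration, and I have not verified this on Rancher 1.6 hosts.

```python
def render_dropin(limit_gib, total_gib):
    """Return the text of a systemd drop-in that caps docker.service memory."""
    if not isinstance(limit_gib, int):
        raise ValueError("use a whole number of GiB for the limit")
    if not 0 < limit_gib < total_gib:
        raise ValueError("limit must be positive and below total host memory")
    # MemoryLimit= is the cgroup v1 directive; systemd reads the G suffix as base 1024.
    return "[Service]\nMemoryLimit={}G\n".format(limit_gib)

if __name__ == "__main__":
    # 12 is the limit asked about above; 15.5 is "Total Memory" from docker info.
    print(render_dropin(12, 15.5))
```

The output would be saved under a drop-in directory such as /etc/systemd/system/docker.service.d/ (the file name is up to you), followed by systemctl daemon-reload and a restart of docker. Two caveats: hitting the limit would most likely get dockerd OOM-killed and would not make it use less memory, so this protects the host but does not fix the growth itself. And with the systemd cgroup driver, containers usually run in their own scopes outside docker.service, so this cap would not replace per-container limits.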

We highly appreciate your help.
Thank you.

Regards,
Yash BRAHMANI

#4

Can I have an update on this please?

Regards,
Yash