We run scheduled cron jobs for logrotate inside a Docker container. The container runs on a SLES15 SP1 based host OS, and the container image itself is built on top of SLES12 SP3.
Host OS version: SLES15 SP1
Docker engine version: Docker version 19.03.1
Container base OS version: SLES12 SP3
After a few hours of system idle time, with no active operations being performed:
HOST1:~ # pstree
systemd
  ├─dockerd─┬─docker-containe─┬─docker-containe─┬─bash
  │         │                 │                 ├─start_cron_jobs─┬─cron─┬─126*[cron───sh───run-crons───mktemp]
  │         │                 │                 │                 │      ├─3401*[cron───logrotate]
  │         │                 │                 │                 │      └─5*[cron───sh]
There are many logrotate processes (triggered by cron) running, and they are stuck in an uninterruptible sleep state (D):
HOST1:~ # ps -eo ppid,pid,user,stat,pcpu,comm,wchan:32
 PPID   PID USER STAT %CPU COMMAND   WCHAN
19422 19429 root Ds    0.0 logrotate call_rwsem_down_write_failed
19423 19430 root Ds    0.0 logrotate call_rwsem_down_write_failed
.
.
19419 19431 root Ss    0.0 sh        -
19428 19433 root Ds    0.0 logrotate -
19427 19434 root Ds    0.0 logrotate call_rwsem_down_write_failed
20040 19454 root D     0.0 cron      -
20040 19455 root D     0.0 cron      call_rwsem_down_read_failed
20040 19456 root D     0.0 cron      call_rwsem_down_read_failed
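To make the ps listing easier to compare between runs, the D-state processes can be grouped by command and wait channel. This is only a minimal sketch: the function name summarize_d_state is made up for illustration, and it assumes the procps ps that ships with SLES, using the same wchan:32 column as above.

```shell
#!/bin/sh
# Count processes in uninterruptible sleep (STAT starting with "D"),
# grouped by command name and kernel wait channel.
# summarize_d_state is a hypothetical helper name, not an existing tool.
summarize_d_state() {
    # Expects "STAT WCHAN COMMAND" lines on stdin, header line first.
    awk 'NR > 1 && $1 ~ /^D/ {
             cmd = $3
             # COMMAND may contain spaces, so join the remaining fields.
             for (i = 4; i <= NF; i++) cmd = cmd " " $i
             count[cmd " " $2]++
         }
         END { for (k in count) print count[k], k }' | sort -rn
}

# WCHAN is placed before COMMAND so that the last column can hold spaces.
ps -eo stat,wchan:32,comm | summarize_d_state
```

Each output line is a count followed by the command and its wait channel, so a growing number next to "logrotate call_rwsem_down_write_failed" shows the pile-up directly.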
Stack traces of the blocked tasks in /var/log/messages, captured with:
echo w > /proc/sysrq-trigger
2019-10-15T18:34:01.456520+00:00 ESDNAS1 kernel: [100293.155679] logrotate D 0 28738 28729 0x00000000
2019-10-15T18:34:01.456522+00:00 ESDNAS1 kernel: [100293.155684] Call Trace:
2019-10-15T18:34:01.456524+00:00 ESDNAS1 kernel: [100293.155688] ? __schedule+0x27f/0x830
2019-10-15T18:34:01.456526+00:00 ESDNAS1 kernel: [100293.155692] schedule+0x28/0x80
2019-10-15T18:34:01.456556+00:00 ESDNAS1 kernel: [100293.155756] do_syscall_64+0x74/0x150
2019-10-15T18:34:01.458395+00:00 ESDNAS1 kernel: [100293.157556] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000040d52a
2019-10-15T18:34:01.458556+00:00 ESDNAS1 kernel: [100293.157675] ? kmem_cache_alloc+0xea/0x510
2019-10-15T18:34:01.460556+00:00 ESDNAS1 kernel: [100293.159732] ? kmem_cache_alloc+0xea/0x510
2019-10-15T18:34:01.462494+00:00 ESDNAS1 kernel: [100293.161556] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
2019-10-15T18:34:01.463400+00:00 ESDNAS1 kernel: [100293.162556] ? __schedule+0x27f/0x830
2019-10-15T18:34:01.463556+00:00 ESDNAS1 kernel: [100293.162674] do_syscall_64+0x74/0x150
2019-10-15T18:34:01.464556+00:00 ESDNAS1 kernel: [100293.163748] down_write+0x20/0x30
2019-10-15T18:34:01.465556+00:00 ESDNAS1 kernel: [100293.164671] logrotate D 0 29442 29432 0x00000000
2019-10-15T18:34:01.465560+00:00 ESDNAS1 kernel: [100293.164681] ? __schedule+0x27f/0x830
2019-10-15T18:34:01.465567+00:00 ESDNAS1 kernel: [100293.164685] schedule+0x28/0x80
2019-10-15T18:34:01.465569+00:00 ESDNAS1 kernel: [100293.164689] rwsem_down_write_failed+0x153/0x320
2019-10-15T18:34:01.466426+00:00 ESDNAS1 kernel: [100293.165556] RAX: ffffffffffffffda RBX: 0000000001f2f2c0 RCX: 00007f978ab7c0e0
2019-10-15T18:34:01.466443+00:00 ESDNAS1 kernel: [100293.165560] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
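The sysrq output above interleaves lines from several tasks, which makes a single trace hard to follow. As an alternative, the kernel stack of each blocked process can be read one at a time from /proc/<pid>/stack. The sketch below assumes it is run as root on the host and that the kernel exposes /proc/<pid>/stack (this depends on the kernel configuration); d_state_pids is a hypothetical helper name.

```shell
#!/bin/sh
# Print the kernel stack of every process currently in D state.
# d_state_pids is a hypothetical helper name used only in this sketch.
d_state_pids() {
    # Expects "PID STAT" lines on stdin, header line first;
    # prints the PIDs whose state starts with "D".
    awk 'NR > 1 && $2 ~ /^D/ { print $1 }'
}

ps -eo pid,stat | d_state_pids | while read -r pid; do
    # The process may have gone away in the meantime, so errors are tolerated.
    echo "== PID $pid ($(cat "/proc/$pid/comm" 2>/dev/null)) =="
    cat "/proc/$pid/stack" 2>/dev/null || echo "(stack not readable)"
done
```

This prints one complete call trace per PID, which should make it easier to see which path ends in rwsem_down_write_failed.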
Inside the Docker container, the first script does the following:
- Start cron
- crontab -l = 0 1,7,13,19 * * * /opt/bin/myscript
- logrotate -f /etc/logrotate.d/1.conf … 4.conf &
- /etc/cron.d/logrotate.sh contains:
  o */5 * * * * root /usr/sbin/logrotate /etc/logrotate.d/xxtools.conf
  o */5 * * * * root /usr/sbin/logrotate /etc/logrotate.d/postgres.conf
.
.
.
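Because these entries start a new logrotate every five minutes, every run that blocks leaves one more stuck process behind. One thing we could try to cap the pile-up is to serialise each entry with flock(1) from util-linux, so a new run is skipped while an earlier one still holds the lock. This is an untested assumption on our side, not something we have deployed; the run_once name and the lock file path are made up for illustration, and it would not explain or fix the hang itself.

```shell
#!/bin/sh
# run_once LOCKFILE COMMAND [ARGS...]
# Runs COMMAND only if nobody else holds a lock on LOCKFILE.
# run_once is a hypothetical wrapper name used only in this sketch.
run_once() {
    lock=$1
    shift
    # -n: give up immediately (non-zero exit) instead of waiting for the lock.
    flock -n "$lock" "$@"
}

# Hypothetical cron.d entry using the same idea (lock path is an assumption):
# */5 * * * * root flock -n /var/lock/logrotate-xxtools.lock /usr/sbin/logrotate /etc/logrotate.d/xxtools.conf
```

The trade-off is that a logrotate stuck in D state keeps its lock, so rotation for that config file would stop until the stuck process is gone, but the number of stuck processes would stay at one per entry instead of growing into the thousands.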
The same Docker container works smoothly on a SLES12 SP3 based host OS. No cron processes stuck in D state have been reported there.
Rebooting the node to clear the stuck processes is not always an option for us.
Any help would be appreciated.
Thanks.