Hi,
One of our SuSE11 SP1 VMs hung last week. We were unable to ping or ssh to the server, and we could not get a login prompt at the console either.
So we decided to restart it.
However, we cannot find any clues in /var/log/messages about what caused the server to hang / stop responding.
From /var/log/boot.msg we can see that the boot started at 14:38. In /var/log/messages the last entry before the restart is at 13:55:02 and the first entry after it is at 14:41:43, but neither before nor after that gap do we find any clue about what caused the hang. Please comment or point us in the right direction.
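To make the silent period easier to spot, here is a minimal Python sketch that scans syslog-style lines and reports any gap between consecutive timestamps longer than a threshold. The function name `find_gaps`, the 10-minute threshold, and the year 2016 are assumptions for illustration (classic syslog lines carry no year, so it is taken from the boot.msg line); the sample lines are copied from the excerpts in this post.

```python
from datetime import datetime, timedelta


def find_gaps(lines, year=2016, threshold=timedelta(minutes=10)):
    """Return (previous, current) timestamp pairs that are more than
    `threshold` apart. `year` is an assumption: syslog omits it."""
    gaps = []
    previous = None
    for line in lines:
        try:
            # Classic syslog prefix, e.g. "May 27 13:55:02" (15 characters).
            current = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current))
        previous = current
    return gaps


# Sample lines taken from the /var/log/messages excerpt in this post.
sample = [
    "May 27 13:54:01 node123 cron[19828]: (oramp3) CMD (...)",
    "May 27 13:55:02 node123 cron[19916]: (root) CMD (/var/tmp/pshistory >& /dev/null)",
    "May 27 14:41:43 node123 syslog-ng[4498]: syslog-ng starting up",
    "May 27 14:41:44 node123 rpc.statd[4564]: Version 1.2.1 Starting",
]

for start, end in find_gaps(sample):
    print(f"no log entries between {start} and {end} ({end - start})")
```

On the real file, the same function could be fed `open("/var/log/messages")` instead of the sample list. It only shows when logging stopped, not why.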
**** excerpts from /var/log/boot.msg ****
Boot logging started on /dev/tty1(/dev/console) at Fri May 27 14:38:53 2016
**** excerpts from /var/log/messages ****
May 27 13:54:01 node123 cron[19828]: (oramp3) CMD (/oracle/local/bin/RMAN_arch_backup_BO.ksh MP3 n 2 35 >/oracle/local/log/RMAN_arch_backup_MP3.log 2>&1)
May 27 13:55:02 node123 cron[19916]: (root) CMD (/var/tmp/pshistory >& /dev/null)
May 27 14:41:43 node123 syslog-ng[4498]: syslog-ng starting up; version='2.0.9'
May 27 14:41:44 node123 sm-notify[4552]: sm-notify running as root. chown /var/lib/nfs/sm to choose different user
May 27 14:41:44 node123 rpc.statd[4564]: Version 1.2.1 Starting
May 27 14:41:44 node123 rpc.statd[4564]: Flags:
May 27 14:41:44 node123 rpc.statd[4564]: statd running as root. chown /var/lib/nfs/sm to choose different user
May 27 14:41:46 node123 /usr/bin/crontab[4921]: (root) LIST (root)
May 27 14:41:47 node123 kernel: klogd 1.4.1, log source = /proc/kmsg started.
May 27 14:41:47 node123 kernel: [ 167.911562] type=1505 audit(1464381697.920:2): operation="profile_load" pid=2837 name=/bin/ping
May 27 14:41:47 node123 kernel: [ 167.934592] type=1505 audit(1464381697.944:3): operation="profile_load" pid=2838 name=/sbin/klogd
May 27 14:41:47 node123 kernel: [ 167.962602] type=1505 audit(1464381697.972:4): operation="profile_load" pid=2839 name=/sbin/syslog-ng
May 27 14:41:47 node123 kernel: [ 167.990218] type=1505 audit(1464381697.996:5): operation="profile_load" pid=2840 name=/sbin/syslogd
May 27 14:41:47 node123 kernel: [ 168.021737] type=1505 audit(1464381698.028:6): operation="profile_load" pid=2841 name=/usr/sbin/avahi-daemon
May 27 14:41:47 node123 kernel: [ 168.047974] type=1505 audit(1464381698.056:7): operation="profile_load" pid=2842 name=/usr/sbin/identd
May 27 14:41:47 node123 kernel: [ 168.076396] type=1505 audit(1464381698.084:8): operation="profile_load" pid=2843 name=/usr/sbin/mdnsd
May 27 14:41:47 node123 kernel: [ 168.114908] type=1505 audit(1464381698.124:9): operation="profile_load" pid=2844 name=/usr/sbin/nscd
May 27 14:41:47 node123 kernel: [ 168.170875] type=1505 audit(1464381698.180:10): operation="profile_load" pid=2845 name=/usr/sbin/ntpd
May 27 14:41:47 node123 kernel: [ 168.195495] type=1505 audit(1464381698.204:11): operation="profile_load" pid=2846 name=/usr/sbin/traceroute
May 27 14:41:47 node123 kernel: [ 169.073458] microcode: CPU0 sig=0x206d2, pf=0x1, revision=0x710
May 27 14:41:47 node123 kernel: [ 169.077201] microcode: CPU1 sig=0x206d2, pf=0x1, revision=0x710
May 27 14:41:47 node123 kernel: [ 169.079331] microcode: CPU2 sig=0x206d2, pf=0x1, revision=0x710
May 27 14:41:47 node123 kernel: [ 169.081387] microcode: CPU3 sig=0x206d2, pf=0x1, revision=0x710
May 27 14:41:47 node123 kernel: [ 169.083456] microcode: CPU4 sig=0x206d2, pf=0x1, revision=0x710
May 27 14:41:47 node123 kernel: [ 169.085451] microcode: CPU5 sig=0x206d2, pf=0x1, revision=0x710
May 27 14:41:47 node123 kernel: [ 169.087626] microcode: CPU6 sig=0x206d2, pf=0x1, revision=0x710
May 27 14:41:47 node123 kernel: [ 169.089683] microcode: CPU7 sig=0x206d2, pf=0x1, revision=0x710
May 27 14:41:47 node123 kernel: [ 169.091719] Microcode Update Driver: v2.00 tigran@aivazian.fsnet.co.uk, Peter Oruba
May 27 14:41:47 node123 kernel: [ 169.392499] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
May 27 14:41:47 node123 kernel: [ 169.392991] acpiphp: Slot [32] registered
May 27 14:41:47 node123 kernel: [ 169.393007] acpiphp: Slot [33] registered
May 27 14:41:47 node123 kernel: [ 169.393021] acpiphp: Slot [34] registered
May 27 14:41:47 node123 kernel: [ 169.393036] acpiphp: Slot [35] registered
May 27 14:41:47 node123 kernel: [ 169.393051] acpiphp: Slot [36] registered
May 27 14:41:47 node123 kernel: [ 169.393064] acpiphp: Slot [37] registered
May 27 14:41:47 node123 kernel: [ 169.393078] acpiphp: Slot [38] registered
May 27 14:41:47 node123 kernel: [ 169.393092] acpiphp: Slot [39] registered
May 27 14:41:47 node123 kernel: [ 169.393107] acpiphp: Slot [40] registered
May 27 14:41:47 node123 kernel: [ 169.393121] acpiphp: Slot [41] registered
May 27 14:41:47 node123 kernel: [ 169.393134] acpiphp: Slot [42] registered
May 27 14:41:47 node123 kernel: [ 169.393148] acpiphp: Slot [43] registered
May 27 14:41:47 node123 kernel: [ 169.393162] acpiphp: Slot [44] registered
May 27 14:41:47 node123 kernel: [ 169.393184] acpiphp: Slot [45] registered
May 27 14:41:47 node123 kernel: [ 169.393198] acpiphp: Slot [46] registered
May 27 14:41:47 node123 kernel: [ 169.393212] acpiphp: Slot [47] registered
May 27 14:41:47 node123 kernel: [ 169.393225] acpiphp: Slot [48] registered
May 27 14:41:47 node123 kernel: [ 169.393239] acpiphp: Slot [49] registered
May 27 14:41:47 node123 kernel: [ 169.393253] acpiphp: Slot [50] registered
May 27 14:41:47 node123 kernel: [ 169.393267] acpiphp: Slot [51] registered
May 27 14:41:47 node123 kernel: [ 169.393280] acpiphp: Slot [52] registered
May 27 14:41:47 node123 kernel: [ 169.393294] acpiphp: Slot [53] registered
May 27 14:41:47 node123 kernel: [ 169.393308] acpiphp: Slot [54] registered
May 27 14:41:47 node123 kernel: [ 169.393321] acpiphp: Slot [55] registered
May 27 14:41:47 node123 kernel: [ 169.393335] acpiphp: Slot [56] registered
May 27 14:41:47 node123 kernel: [ 169.393348] acpiphp: Slot [57] registered
May 27 14:41:47 node123 kernel: [ 169.393362] acpiphp: Slot [58] registered
May 27 14:41:47 node123 kernel: [ 169.393375] acpiphp: Slot [59] registered
May 27 14:41:47 node123 kernel: [ 169.393389] acpiphp: Slot [60] registered
May 27 14:41:47 node123 kernel: [ 169.393403] acpiphp: Slot [61] registered
May 27 14:41:47 node123 kernel: [ 169.393417] acpiphp: Slot [62] registered
May 27 14:41:47 node123 kernel: [ 169.393432] acpiphp: Slot [63] registered
May 27 14:41:47 node123 kernel: [ 169.393606] acpiphp: Slot [160] registered
May 27 14:41:47 node123 kernel: [ 169.393633] acpiphp: Slot [192] registered
May 27 14:41:47 node123 kernel: [ 169.393658] acpiphp: Slot [224] registered
May 27 14:41:47 node123 kernel: [ 169.393684] acpiphp: Slot [256] registered
May 27 14:41:47 node123 kernel: [ 169.393731] acpiphp: Slot [161] registered
May 27 14:41:47 node123 kernel: [ 169.393763] acpiphp: Slot [162] registered
May 27 14:41:47 node123 kernel: [ 169.393788] acpiphp: Slot [163] registered
May 27 14:41:47 node123 kernel: [ 169.393814] acpiphp: Slot [164] registered
May 27 14:41:47 node123 kernel: [ 169.393839] acpiphp: Slot [165] registered
May 27 14:41:47 node123 kernel: [ 169.393864] acpiphp: Slot [166] registered
May 27 14:41:47 node123 kernel: [ 169.393888] acpiphp: Slot [167] registered
May 27 14:41:47 node123 kernel: [ 169.393914] acpiphp: Slot [193] registered
May 27 14:41:47 node123 kernel: [ 169.393939] acpiphp: Slot [194] registered
May 27 14:41:47 node123 kernel: [ 169.393964] acpiphp: Slot [195] registered
May 27 14:41:47 node123 kernel: [ 169.393991] acpiphp: Slot [196] registered
May 27 14:41:47 node123 kernel: [ 169.394017] acpiphp: Slot [197] registered
May 27 14:41:47 node123 kernel: [ 169.394042] acpiphp: Slot [198] registered
May 27 14:41:47 node123 kernel: [ 169.394068] acpiphp: Slot [199] registered
May 27 14:41:47 node123 kernel: [ 169.394094] acpiphp: Slot [225] registered
May 27 14:41:47 node123 kernel: [ 169.394120] acpiphp: Slot [226] registered
May 27 14:41:47 node123 kernel: [ 169.394146] acpiphp: Slot [227] registered
May 27 14:41:47 node123 kernel: [ 169.394172] acpiphp: Slot [228] registered
May 27 14:41:47 node123 kernel: [ 169.394199] acpiphp: Slot [229] registered
May 27 14:41:47 node123 kernel: [ 169.394225] acpiphp: Slot [230] registered
May 27 14:41:47 node123 kernel: [ 169.394250] acpiphp: Slot [231] registered
May 27 14:41:47 node123 kernel: [ 169.394275] acpiphp: Slot [257] registered
May 27 14:41:47 node123 kernel: [ 169.394301] acpiphp: Slot [258] registered
May 27 14:41:47 node123 kernel: [ 169.394327] acpiphp: Slot [259] registered
May 27 14:41:47 node123 kernel: [ 169.394353] acpiphp: Slot [260] registered
May 27 14:41:47 node123 kernel: [ 169.394381] acpiphp: Slot [261] registered
May 27 14:41:47 node123 kernel: [ 169.394407] acpiphp: Slot [262] registered
May 27 14:41:47 node123 kernel: [ 169.394432] acpiphp: Slot [263] registered
May 27 14:41:47 node123 kernel: [ 170.167846] vmmemctl: started kernel thread pid=3374
May 27 14:41:47 node123 kernel: [ 170.167854] VMware memory control driver initialized
May 27 14:41:47 node123 kernel: [ 171.239466] eth0: intr type 3, mode 0, 1 vectors allocated
May 27 14:41:47 node123 kernel: [ 171.240434] eth0: NIC Link is Up 10000 Mbps
May 27 14:41:47 node123 kernel: [ 171.574371] eth1: intr type 3, mode 0, 1 vectors allocated
May 27 14:41:47 node123 kernel: [ 171.575266] eth1: NIC Link is Up 10000 Mbps
May 27 14:41:47 node123 kernel: [ 172.241427] NET: Registered protocol family 10
May 27 14:41:47 node123 kernel: [ 172.241856] lo: Disabled Privacy Extensions
May 27 14:41:47 node123 kernel: [ 173.599468] RPC: Registered udp transport module.
May 27 14:41:47 node123 kernel: [ 173.599472] RPC: Registered tcp transport module.
May 27 14:41:47 node123 kernel: [ 173.599474] RPC: Registered tcp NFSv4.1 backchannel transport module.
May 27 14:41:47 node123 kernel: [ 173.626821] Slow work thread pool: Starting up
May 27 14:41:47 node123 kernel: [ 173.626925] Slow work thread pool: Ready
May 27 14:41:47 node123 kernel: [ 173.627006] FS-Cache: Loaded
May 27 14:41:47 node123 kernel: [ 173.688520] FS-Cache: Netfs 'nfs' registered for caching
May 27 14:41:47 node123 kernel: [ 174.004294] ip_tables: (C) 2000-2006 Netfilter Core Team
May 27 14:41:53 node123 kernel: [ 182.248422] eth1: no IPv6 routers present
May 27 14:41:54 node123 kernel: [ 183.183770] eth0: no IPv6 routers present
May 27 14:41:57 node123 sshd[5153]: Server listening on 0.0.0.0 port 22.
May 27 14:41:57 node123 sshd[5153]: Server listening on :: port 22.
May 27 14:41:57 node123 rhnsd[5246]: Red Hat Network Services Daemon starting up, check in interval 240 minutes.
May 27 14:41:57 node123 vasd[5268]: vasd parent process (PID 5268) starting - 4.0.3.184
May 27 14:41:59 node123 vasd[5319]: vasd dispatcher (PID 5319) starting
May 27 14:41:59 node123 vasd[5326]: vasd (PID 5326) starting
May 27 14:41:59 node123 vasd[5326]: vasd-database (PID 5326) starting
May 27 14:41:59 node123 /usr/sbin/cron[5333]: (CRON) STARTUP (V5.0)
May 27 14:41:59 node123 smartd[5344]: smartd 5.39 2008-10-24 22:33 [x86_64-suse-linux-gnu] (openSUSE RPM) Copyright (C) 2002-8 by Bruce Allen, http://smartmontools.sourceforge.net
May 27 14:41:59 node123 smartd[5344]: Opened configuration file /etc/smartd.conf
May 27 14:41:59 node123 smartd[5344]: Drive: DEVICESCAN, implied '-a' Directive on line 26 of file /etc/smartd.conf
May 27 14:41:59 node123 smartd[5344]: Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
May 27 14:41:59 node123 smartd[5344]: Device: /dev/sda, opened
May 27 14:41:59 node123 smartd[5344]: Device: /dev/sda, IE (SMART) not enabled, skip device Try 'smartctl -s on /dev/sda' to turn on SMART features
May 27 14:41:59 node123 smartd[5344]: Unable to monitor any SMART enabled devices. Try debug (-d) option. Exiting...
May 27 14:42:01 node123 vasd[5328]: vasd (PID 5328) starting
May 27 14:42:01 node123 vasd[5327]: vasd (PID 5327) starting
May 27 14:42:01 node123 .vgptool[5355]: [NOTICE controller.cpp:397] Group Policy Apply - CallType: SYSTEM START
May 27 14:43:46 node123 sshd[5508]: Accepted publickey for root from 10.200.104.200 port 42211 ssh2
May 27 14:45:01 node123 cron[5734]: (root) CMD (/var/tmp/pshistory >& /dev/null)
May 27 14:46:26 node123 sshd[5508]: Received disconnect from 10.200.104.200: 11: disconnected by user
May 27 14:47:33 node123 sshd[5796]: Failed publickey for oramp3 from 10.204.3.213 port 61217 ssh2
May 27 14:47:33 node123 sshd[5796]: Accepted publickey for oramp3 from 10.204.3.213 port 61217 ssh2
May 27 14:48:00 node123 kernel: [ 549.091617] ------------[ cut here ]------------
May 27 14:48:00 node123 kernel: [ 549.091622] WARNING: at /usr/src/packages/BUILD/kernel-default-2.6.32.12/linux-2.6.32/fs/hugetlbfs/inode.c:936 hugetlb_file_setup+0x21e/0x250()
May 27 14:48:00 node123 kernel: [ 549.091623] Hardware name: VMware Virtual Platform
May 27 14:48:00 node123 kernel: [ 549.091624] Using mlock ulimits for SHM_HUGETLB deprecated
May 27 14:48:00 node123 kernel: [ 549.091625] Modules linked in: autofs4 iptable_filter ip_tables x_tables nfs lockd fscache nfs_acl auth_rpcgss sunrpc ipv6 vsock(X) vmmemctl(X) acpiphp microcode fuse loop acpi_memhotplug ppdev i2c_piix4 shpchp tpm_tis parport_pc sg i2c_core vmci(X) pci_hotplug tpm pcspkr intel_agp rtc_cmos floppy tpm_bios parport sr_mod mptctl rtc_core cdrom rtc_lib container button ac dm_mirror dm_region_hash dm_log linear sd_mod crc_t10dif dm_snapshot vmxnet(X) vmw_pvscsi vmxnet3 edd dm_mod ext3 mbcache jbd fan thermal processor thermal_sys hwmon ide_pci_generic piix ide_core ata_generic ata_piix libata mptspi mptscsih mptbase scsi_transport_spi scsi_mod
May 27 14:48:00 node123 kernel: [ 549.091653] Supported: Yes
May 27 14:48:00 node123 kernel: [ 549.091655] Pid: 5870, comm: oracle Tainted: G X 2.6.32.12-0.7-default #1
May 27 14:48:00 node123 kernel: [ 549.091656] Call Trace:
May 27 14:48:00 node123 kernel: [ 549.091663] [] dump_trace+0x6c/0x2d0
May 27 14:48:00 node123 kernel: [ 549.091669] [] dump_stack+0x69/0x71
May 27 14:48:00 node123 kernel: [ 549.091672] [] warn_slowpath_common+0x74/0xd0
May 27 14:48:00 node123 kernel: [ 549.091675] [] warn_slowpath_fmt+0x40/0x50
May 27 14:48:00 node123 kernel: [ 549.091678] [] hugetlb_file_setup+0x21e/0x250
May 27 14:48:00 node123 kernel: [ 549.091682] [] newseg+0x126/0x250
May 27 14:48:00 node123 kernel: [ 549.091685] [] ipcget+0x62/0xe0
May 27 14:48:00 node123 kernel: [ 549.091688] [] sys_shmget+0x55/0x60
May 27 14:48:00 node123 kernel: [ 549.091692] [] system_call_fastpath+0x16/0x1b
May 27 14:48:00 node123 kernel: [ 549.091696] [<00007f25c5821867>] 0x7f25c5821867
May 27 14:48:00 node123 kernel: [ 549.091697] ---[ end trace a7ec335f08946054 ]---
May 27 14:48:16 node123 sshd[5948]: Accepted publickey for root from 10.200.104.200 port 42216 ssh2
May 27 14:48:19 node123 sshd[5948]: Received disconnect from 10.200.104.200: 11: disconnected by user
May 27 14:49:06 node123 sshd[6049]: Accepted publickey for root from 10.200.104.200 port 42217 ssh2
May 27 14:50:01 node123 cron[6104]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
May 27 14:50:02 node123 cron[6108]: (root) CMD (/var/tmp/pshistory >& /dev/null)
May 27 14:54:01 node123 cron[6204]: (oramp3) CMD (/oracle/local/bin/RMAN_arch_backup_BO.ksh MP3 n 2 35 >/oracle/local/log/RMAN_arch_backup_MP3.log 2>&1)
May 27 14:55:01 node123 cron[6283]: (root) CMD (/var/tmp/pshistory >& /dev/null)
May 27 14:56:39 node123 sshd[6463]: Address 10.200.19.106 maps to cil12080368.hq.huskyenergy.com, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
May 27 14:56:42 node123 sshd[6472]: pam_vas: Authentication for user: account: olkowm@HQ.HUSKYENERGY.COM service: reason: <N/A> Access Control Identifier(NT Name):<HQ\olkowm>
May 27 14:56:42 node123 sshd[6463]: Accepted keyboard-interactive/pam for olkowm from 10.200.19.106 port 56422 ssh2
May 27 14:56:46 node123 sudo: olkowm : TTY=pts/2 ; PWD=/export/home/olkowm ; USER=root ; COMMAND=/bin/su - root
May 27 14:56:46 node123 su: (to root) olkowm on /dev/pts/2
May 27 15:00:01 node123 cron[6667]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
May 27 15:00:01 node123 cron[6668]: (root) CMD (/var/tmp/pshistory >& /dev/null)
May 27 15:05:01 node123 cron[6870]: (root) CMD (/var/tmp/pshistory >& /dev/null)
May 27 15:09:01 node123 cron[7011]: (oramp3) CMD (/oracle/local/bin/RMAN_arch_backup_BO.ksh MP3 n 2 35 >/oracle/local/log/RMAN_arch_backup_MP3.log 2>&1)
May 27 15:10:01 node123 cron[7096]: (root) CMD (/var/tmp/pshistory >& /dev/null)
May 27 15:10:01 node123 cron[7095]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
May 27 15:15:01 node123 cron[7333]: (root) CMD (/var/tmp/pshistory >& /dev/null)
May 27 15:19:33 node123 sshd[6049]: Received disconnect from 10.200.104.200: 11: disconnected by user
May 27 15:20:01 node123 cron[7533]: (root) CMD (/var/tmp/pshistory >& /dev/null)
May 27 15:20:01 node123 cron[7538]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
May 27 15:24:01 node123 cron[7678]: (oramp3) CMD (/oracle/local/bin/RMAN_arch_backup_BO.ksh MP3 n 2 35 >/oracle/local/log/RMAN_arch_backup_MP3.log 2>&1)
May 27 15:25:01 node123 cron[7758]: (root) CMD (/var/tmp/pshistory >& /dev/null)
May 27 15:30:01 node123 cron[7936]: (root) CMD (/var/tmp/pshistory >& /dev/null)
May 27 15:30:01 node123 cron[7937]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
May 27 15:35:01 node123 cron[8134]: (root) CMD (/var/tmp/pshistory >& /dev/null)
May 27 15:39:01 node123 cron[8277]: (oramp3) CMD (/oracle/local/bin/RMAN_arch_backup_BO.ksh MP3 n 2 35 >/oracle/local/log/RMAN_arch_backup_MP3.log 2>&1)
May 27 15:40:01 node123 cron[8370]: (root) CMD (/var/tmp/pshistory >& /dev/null)
May 27 15:40:01 node123 cron[8375]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 -S ALL 1 1)
May 27 15:41:43 node123 syslog-ng[4498]: Log statistics; dropped='pipe(/dev/xconsole)=0', dropped='pipe(/dev/tty10)=0', dropped='udp(AF_INET(10.204.3.102:514))=0', processed='center(queued)=426', processed='center(received)=183', processed='destination(messages)=181', processed='destination(mailinfo)=2', processed='destination(mailwarn)=0', processed='destination(localmessages)=0', processed='destination(newserr)=0', processed='destination(mailerr)=0', processed='destination(netmgm)=0', processed='destination(warn)=20', processed='destination(console)=19', processed='destination(null)=0', processed='destination(mail)=2', processed='destination(xconsole)=19', processed='destination(firewall)=0', processed='destination(acpid)=0', processed='destination(allmessages)=183', processed='destination(newscrit)=0', processed='destination(newsnotice)=0', processed='source(src)=183'
May 27 15:45:01 node123 cron[8617]: (root) CMD (/var/tmp/pshistory >& /dev/null)
Thanks