syslog-ng setup issue

I have approximately 32 SLES 11 SP3 servers running on System z (mainframe) and am trying to determine why a few of them are not processing their log files the same way as the majority.

Syslog-ng is working properly on most of the servers in that the various messages are being logged in /var/log into files such as firewall, mail, mail.info, mail.warn, messages, etc. Logrotate is being invoked and is creating .bz2 files.

.
.
.
-rw-r----- 1 root root 1678154 Feb 4 06:49 firewall
-rw-r----- 1 root root 162674 Nov 29 00:00 firewall-20131129.bz2
-rw-r----- 1 root root 173509 Dec 6 00:00 firewall-20131206.bz2
-rw-r----- 1 root root 185288 Dec 13 00:00 firewall-20131213.bz2
-rw-r----- 1 root root 184341 Dec 20 00:00 firewall-20131220.bz2
-rw-r----- 1 root root 178501 Dec 28 00:00 firewall-20131228.bz2
-rw-r----- 1 root root 164442 Jan 4 00:00 firewall-20140104.bz2
-rw-r----- 1 root root 193302 Jan 11 00:00 firewall-20140111.bz2
-rw-r----- 1 root root 193204 Jan 18 00:00 firewall-20140118.bz2
-rw-r----- 1 root root 186124 Jan 25 00:00 firewall-20140125.bz2
-rw-r----- 1 root root 190887 Feb 1 00:00 firewall-20140201.bz2
.
.
.
-rw-r----- 1 root root 2878985 Feb 4 06:50 messages
-rw-r----- 1 root root 251563 Apr 12 2013 messages-20130412.bz2
-rw-r----- 1 root root 190486 May 18 2013 messages-20130518.bz2
-rw-r----- 1 root root 191482 Jun 23 2013 messages-20130623.bz2
-rw-r----- 1 root root 186454 Jul 28 2013 messages-20130728.bz2
-rw-r----- 1 root root 191182 Sep 1 00:00 messages-20130901.bz2
-rw-r----- 1 root root 192293 Oct 7 00:00 messages-20131007.bz2
-rw-r----- 1 root root 198968 Nov 11 00:00 messages-20131111.bz2
-rw-r----- 1 root root 179388 Dec 5 00:00 messages-20131205.bz2
-rw-r----- 1 root root 183243 Dec 28 00:00 messages-20131228.bz2
-rw-r----- 1 root root 182494 Jan 20 00:00 messages-20140120.bz2
.
.
.

The servers showing the problem are not logging to the /var/log files firewall, mail, mail.info, mail.warn, messages, etc.; those files all have a size of zero bytes. The messages are being logged to other files instead.

.
.
.
-rw-r----- 1 root root 0 Feb 2 00:00 firewall
-rw-r----- 1 root root 5367051 Jan 26 07:44 firewall-20140112
-rw-r----- 1 root root 0 Jan 12 00:00 firewall-20140119
-rw-r----- 1 root root 0 Jan 19 00:00 firewall-20140126
-rw-r----- 1 root root 2796508 Feb 4 06:57 firewall-20140202
.
.
.
-rw-r----- 1 root root 0 Feb 2 00:00 messages
-rw-r----- 1 root root 2828304 Jan 26 07:52 messages-20140112
-rw-r----- 1 root root 0 Jan 12 00:00 messages-20140119
-rw-r----- 1 root root 0 Jan 19 00:00 messages-20140126
-rw-r----- 1 root root 1473506 Feb 4 06:55 messages-20140202
.
.
.

In the file list directly above, the firewall messages are being logged into firewall-20140202 and the system messages are being logged into messages-20140202.

I have compared the /etc/syslog-ng/syslog-ng.conf files and don’t see anything that could be causing the difference. The logging problem is not resolved by a reboot. The only way that I can get the messages logged into /var/log/firewall and /var/log/messages is to issue the command ‘syslog-ng restart’, and the problem comes back after the next reboot.

What other files can I check to determine the cause of this?

Hi x0500hl,

[QUOTE]the firewall messages are being logged into firewall-20140202 and the system messages are being logged into messages-20140202.
I have compared the /etc/syslog-ng/syslog-ng.conf files and don’t see anything that could be causing the difference.[/QUOTE]

I would not suspect the syslog configuration, but that of logrotate. Either the syslog-related config file in /etc/logrotate.d is bad or the syslog restart issued in there does not work, for some reason.

Logrotate will rename the log output file, “touch” a new one (without the date extension), and should then restart syslog before compressing the renamed file. The latter two steps seem not to happen.

Regards,
Jens

Hi
What about the /etc/sysconfig/syslog file, all the same? When you say compared the conf files, is that via a visual look or via the diff command?

Jens,

The final two steps not occurring makes sense, but why are the messages still being logged into the *-20140202 files? Is it because the file(s) were renamed but the syslog-ng daemon is still writing to the renamed file(s)? If I manually run ‘syslog-ng restart’ the messages are logged to both places, i.e. messages and messages-20140202. There are two instances of syslog-ng running (ps -efH | less).

I will look harder into logrotate being the problem though.

Malcolm,

I compared the conf files by copying them to my PC and using the WinMerge program to compare them. I also used WinMerge to compare the /etc/sysconfig/syslog files and it said that they were identical.

I will run logrotate in debug mode to see what messages are issued.

Harley

Hi Harley,

[QUOTE=x0500hl;19047]Jens,

The final two steps not occurring makes sense, but why are the messages still being logged into the *-20140202 files? Is it because the file(s) were renamed but the syslog-ng daemon is still writing to the renamed file(s)? If I manually run ‘syslog-ng restart’ the messages are logged to both places, i.e. messages and messages-20140202. There are two instances of syslog-ng running (ps -efH | less).

I will look harder into logrotate being the problem though.[/QUOTE]

A simple rename of an (already opened) file will not actually change the file: only the directory entry is modified. Therefore, any process holding an open file handle, as syslog-ng does, can keep using that handle. To you it may appear as a different file, but it’s not… it only got a new “label” :wink:

Same thing goes for deleting files: You’re just removing the directory entry, not the open file handles… the file (in terms of “ordered collection of disk space”) only gets deleted once all references to it (directory entries AKA “hard links”, open file handles) are closed. That’s why your free disk space isn’t increasing after deleting that monster /var/log/messages file - until after a restart of syslog-ng.
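If it helps to see this in action, here is a minimal sketch that imitates the rotation steps in a temporary directory. It assumes a POSIX file system (as on SLES); the function name and the file names are made up for illustration and have nothing to do with syslog-ng’s actual code.

```python
import os
import tempfile


def write_after_rename(directory):
    """Keep writing through an old handle after the file was renamed."""
    live = os.path.join(directory, "messages")
    rotated = os.path.join(directory, "messages-20140202")

    # The "daemon" opens its log file once and keeps the handle.
    handle = open(live, "a")
    handle.write("before rotation\n")
    handle.flush()

    # Rotation: rename the file, then create an empty replacement.
    os.rename(live, rotated)
    open(live, "a").close()

    # The daemon was never told to reopen, so this write follows the
    # renamed file, not the new empty one.
    handle.write("after rotation\n")
    handle.close()

    with open(rotated) as f:
        return os.path.getsize(live), f.read()


with tempfile.TemporaryDirectory() as tmp:
    new_size, rotated_text = write_after_rename(tmp)
    print(new_size)               # 0
    print(rotated_text, end="")   # both lines end up in the renamed file
```

The new “messages” stays at zero bytes while the dated file keeps growing, which matches the listing from the problem servers.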

Regards,
Jens

PS: See the “tail” command, too: With “-f” (lower-case), you’ll follow the actual file (at file system level), even if the file gets renamed or its directory entry deleted. With “-F” (capital), “tail” will check whether the file has been renamed and switch to following any new file of the original name, if available.
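The check behind that kind of name-following can be sketched by comparing device and inode numbers. This is only an illustration of the idea, not how “tail” is actually implemented; the helper name is hypothetical.

```python
import os
import tempfile


def handle_follows_path(handle, path):
    """Return True if `path` still names the file that `handle` has open."""
    try:
        on_disk = os.stat(path)
    except FileNotFoundError:
        return False
    opened = os.fstat(handle.fileno())
    # Same device and same inode means it is the same underlying file.
    return (opened.st_dev, opened.st_ino) == (on_disk.st_dev, on_disk.st_ino)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "firewall")
    with open(path, "a") as handle:
        print(handle_follows_path(handle, path))   # True
        os.rename(path, path + "-20140202")        # rotate
        open(path, "a").close()                    # new empty file
        print(handle_follows_path(handle, path))   # False
```

A process that sees False here would need to reopen the path to pick up the new file.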

Hi Harley,

I forgot to answer that part: It is indeed strange that you have two instances running. Might it be that the pid file (/var/run/syslog-ng.pid) somehow gets deleted, e.g. by some over-active clean-up job? Then the rc scripts wouldn’t find the currently active instance (to stop it) and would just start another one.
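A quick way to test that theory is to check whether the pid file exists and whether the pid in it belongs to a live process. The sketch below is a hypothetical helper, not part of the rc scripts, and the demo uses a temporary file rather than the real pid file.

```python
import os
import tempfile


def pid_file_status(pid_file):
    """Classify a pid file as 'missing', 'invalid', 'stale' or 'running'."""
    try:
        with open(pid_file) as f:
            text = f.read().strip()
    except FileNotFoundError:
        return "missing"
    if not text.isdigit() or int(text) <= 0:
        return "invalid"
    try:
        os.kill(int(text), 0)   # signal 0 only checks that the process exists
    except ProcessLookupError:
        return "stale"
    except PermissionError:
        return "running"        # process exists but belongs to another user
    return "running"


with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.pid")
    print(pid_file_status(demo))          # missing
    with open(demo, "w") as f:
        f.write(str(os.getpid()))
    print(pid_file_status(demo))          # running
```

On an affected server you would point it at /var/run/syslog-ng.pid and compare the result with what “ps” shows.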

Regards,
Jens

I believe that I found the cause of my issue, though I won’t know for sure until one of the files exceeds the size for rotation to automatically occur. I found a message in /var/log/messages-20140202 stating that there was a duplicate entry in a line (I should have checked the messages for logrotate before I opened this thread).

On the servers where everything was working fine, the /etc/logrotate.d/syslog file contained the line

/var/log/warn /var/log/messages /var/log/allmessages /var/log/localmessages /var/log/firewall /var/log/acpid /var/log/NetworkManager {

On the servers that had the problem, the /etc/logrotate.d/syslog file contained the line

/var/log/firewall /var/log/warn /var/log/messages /var/log/allmessages /var/log/localmessages /var/log/firewall /var/log/acpid /var/log/NetworkManager {

Note that there are two ‘/var/log/firewall’ entries in the line.
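For anyone wanting to check many servers for the same mistake, a small script along these lines could flag repeated paths in such a line. It is only a sketch: the function name is my own, and it assumes plain whitespace-separated paths, so quoted paths and wildcards are not handled.

```python
from collections import Counter


def duplicate_log_paths(stanza_header):
    """Return the paths listed more than once in a logrotate stanza header."""
    # Drop the opening brace, then split the rest on whitespace.
    paths = stanza_header.replace("{", " ").split()
    counts = Counter(paths)
    return [path for path in counts if counts[path] > 1]


good = ("/var/log/warn /var/log/messages /var/log/allmessages "
        "/var/log/localmessages /var/log/firewall /var/log/acpid "
        "/var/log/NetworkManager {")
bad = "/var/log/firewall " + good

print(duplicate_log_paths(good))   # []
print(duplicate_log_paths(bad))    # ['/var/log/firewall']
```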

Maybe this will help someone else who runs into the same problem in the future.