We are using SuSE Linux 11.1 in our environment and all the servers are virtualized on VMware ESX servers.
Recently there was a huge time drift on all the servers ranging from 1 minute to about 20 minutes.
But for some reason we were unable to find the cause of the drift.
So I was wondering whether there is any way we can monitor the time drift on all these servers, maybe by using a script or by monitoring a certain log file?
Sounds to me like the typical “VM/guest time drift” problem. AFAIR, this happens when the guest runs its own wall clock and depends on accurate timer interrupts from the host to increment that clock.
The subject line of this thread contains “NTP”, but you don’t mention whether you’re actually running ntpd inside the VMs or how your VMs’ wall clock is set up (independent, or dependent on the host).
If you’re not (yet) running ntpd inside the VMs: To properly monitor the time drift, you need a “reference point”, a system whose clock you trust. Then you need to take a timestamp snapshot on both the reference system and the monitored system and compare the two. Depending on the accuracy you want to achieve, you might do this by running
ssh <vmhost> date +%s from the reference system against the VM and comparing the result with the output of “date +%s” on the reference system. This only resolves to whole seconds and includes the ssh round trip, which is good enough for spotting drifts in the range of minutes.
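The comparison described above could be scripted roughly like this. It is only a sketch, assuming Python 3.7+ on the reference system and key-based ssh access to the VMs; the function names, the 5 second threshold and the host names in the comment are made up for illustration.

```python
import subprocess
import time


def remote_epoch(host, timeout=10):
    """Return the epoch seconds reported by `date +%s` on host, via ssh."""
    out = subprocess.run(
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        ["ssh", "-o", "BatchMode=yes", host, "date", "+%s"],
        capture_output=True, text=True, timeout=timeout, check=True,
    ).stdout
    return int(out.strip())


def measure_offset(host, fetch=remote_epoch, clock=time.time):
    """Estimate (remote clock - local clock) in seconds.

    The local clock is read before and after the remote call and the
    midpoint is used, so the ssh round trip mostly cancels out.
    """
    before = clock()
    remote = fetch(host)
    after = clock()
    return remote - (before + after) / 2.0


def check_hosts(hosts, threshold=5.0, fetch=remote_epoch):
    """Return {host: offset} for hosts whose |offset| exceeds threshold."""
    drifted = {}
    for host in hosts:
        try:
            offset = measure_offset(host, fetch=fetch)
        except (subprocess.SubprocessError, ValueError, OSError) as exc:
            print(f"{host}: could not read time ({exc})")
            continue
        if abs(offset) > threshold:
            drifted[host] = offset
    return drifted


# Example call (hypothetical host names, replace with your own VMs):
#   for host, offset in check_hosts(["vm1", "vm2"]).items():
#       print(f"{host}: clock is off by {offset:+.1f}s")
```

Run from cron on the reference system, something like this would at least tell you when and on which VMs the drift starts.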
If you’re not after monitoring, but rather want a more accurate time source within your VMs, then I suggest running ntpd with independent wall clocks inside the VMs and a physical server or two (or even a true NTP device) as time source. That is still not as accurate as having a physically triggered RTC interrupt, but much better than what you have now.
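Once ntpd is running inside a VM, its own view of the offset can be monitored as well. A minimal sketch, assuming the “ntpq” tool that ships with ntpd is installed on the VM and using its usual “ntpq -pn” peer listing, where the line of the currently selected peer starts with “*” and the ninth column is the offset in milliseconds (the function names are made up for illustration):

```python
import subprocess


def parse_ntpq_peers(text):
    """Return (remote, offset_ms) for the peer ntpd has selected, or None.

    Columns of `ntpq -pn`: remote refid st t when poll reach delay offset jitter
    """
    for line in text.splitlines():
        if line.startswith("*"):
            fields = line[1:].split()
            if len(fields) >= 10:
                return fields[0], float(fields[8])
    return None


def current_offset_ms():
    """Run `ntpq -pn` locally and return the selected peer and its offset."""
    out = subprocess.run(
        ["ntpq", "-pn"], capture_output=True, text=True, check=True
    ).stdout
    return parse_ntpq_peers(out)
```

If no peer is marked with “*”, ntpd has not (yet) selected a time source, which is itself worth alerting on.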