Output of "top" against "ps"

Hello,

we are using a monitoring solution from BMC Software to monitor our
servers and services, including several SLES servers. We have run into a
problem monitoring a single process on a SLES 11 SP1 server. This
process (the Sybase DB for our ZCM system) often has high processor
utilization. The output of top is (the process ID is 3633):

top -p3633

top - 10:24:26 up 91 days, 8 min, 2 users, load average: 0.00, 0.00, 0.00
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.5%us, 0.3%sy, 0.0%ni, 99.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1910912k total, 1825672k used, 85240k free, 81132k buffers
Swap: 2104472k total, 63872k used, 2040600k free, 357408k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3633 root 20 0 1678m 1.0g 1752 S 2 56.2 21193:51 dbsrv12

This shows, in my opinion, the correct CPU utilization. But the BMC
software reports different CPU utilization values. I asked BMC support,
and they told me that they use the following command to monitor process
utilization:

UNIX95= ps -eo pid,comm,user,pcpu,vsz,etime,args | cat

This gives me the following output:

3633 dbsrv12 root 16.1 1719092 91-00:05:31 /opt/novell/zenworks/share/sybase/bin32/dbsrv12 @/etc/opt/novell/zenworks/zenworks_database.conf

The CPU utilization reported by this command differs from the top
output. This is a real problem: when the actual CPU utilization is close
to 100%, the ps command still reports 16.1% and the monitoring system
does not alert us.
BMC support told me that this is the standard command for monitoring
processes in their products and that it works on all Unix systems.
They advised me to ask Novell support about this behaviour.
Does anybody know why the second output does not show the correct
CPU utilization?

Kind regards,
Thomas

They’re insane. See the ‘ps’ manpage for details:

%cpu %CPU cpu utilization of the process in “##.#” format.
Currently, it is the CPU time used divided by the time the process has
been running (cputime/realtime ratio), expressed as a percentage. It
will not add up to 100% unless you are lucky. (alias pcpu).
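
The figures posted in this thread are enough to check that formula by hand. Below is a minimal sketch in Python, assuming that the TIME+ value 21193:51 from top is minutes:seconds of CPU time and that the etime value 91-00:05:31 from ps is days-hours:minutes:seconds. The helper names are made up for illustration.

```python
def parse_top_time(text):
    """Convert top's TIME+ field (minutes:seconds) to seconds of CPU time."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def parse_etime(text):
    """Convert a ps etime value ([[dd-]hh:]mm:ss) to elapsed seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in text.split(":")]
    # Pad missing leading fields (hours) with zero.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def lifetime_cpu_percent(cpu_seconds, elapsed_seconds):
    """cputime/realtime ratio as a percentage, as the man page describes."""
    return 100.0 * cpu_seconds / elapsed_seconds


cpu_seconds = parse_top_time("21193:51")      # 1,271,631 s of CPU time
elapsed_seconds = parse_etime("91-00:05:31")  # 7,862,731 s since start
print(f"{lifetime_cpu_percent(cpu_seconds, elapsed_seconds):.1f}%")
```

This prints 16.2%, which is close to the 16.1 that ps reported. The two commands were run at slightly different moments, so an exact match is not expected. In other words, roughly 14.7 days of CPU time spread over 91 days of runtime.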

The important caveat here is that this is an average over the ENTIRE
LIFETIME of the process. If it runs for a day and spends only the last
hour at 100% utilization, you may show about 5% by the end of that hour,
and you will never see anything near 100% while it is happening. This
is, as far as I know, consistent across all *nix platforms, so use of
this value for anything in realtime is folly.
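
For a value that reflects current load, the CPU time has to be sampled twice and divided by the interval between the samples. top's %CPU column works this way: it reports the share of CPU time used since the last screen update. Here is a rough sketch of the idea, assuming a Linux system with /proc mounted. The function names are hypothetical, and this is not a description of how top or the BMC product is implemented.

```python
import os
import time


def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split after the LAST closing parenthesis.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is the state (field 3 in proc(5)), so utime (field 14)
    # and stime (field 15) are at indexes 11 and 12.
    return int(fields[11]) + int(fields[12])


def interval_cpu_percent(ticks_before, ticks_after, interval_seconds, ticks_per_second):
    """CPU usage over one sampling interval, as a percentage of one CPU."""
    used_seconds = (ticks_after - ticks_before) / ticks_per_second
    return 100.0 * used_seconds / interval_seconds


def sample_cpu_percent(pid, interval_seconds=3.0):
    """Sample a process twice and report its CPU usage in between."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    with open(f"/proc/{pid}/stat") as f:
        before = parse_cpu_ticks(f.read())
    time.sleep(interval_seconds)
    with open(f"/proc/{pid}/stat") as f:
        after = parse_cpu_ticks(f.read())
    return interval_cpu_percent(before, after, interval_seconds, ticks_per_second)


# Example call (needs a running process with this PID):
#   sample_cpu_percent(3633)
```

A multi-threaded process can exceed 100% with this calculation, because the result is expressed relative to a single CPU.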

Good luck.

Thanks for the info, you’re right. I never read the man page all the way
to the bottom, sorry…
I’ll pass this information on to BMC.

Regards,
Thomas