Mystery RX packet drops on SLES11 SP2 every 30 sec

[color=blue]

The issue is that the kernel devs could have made their own
“counter” for their “unknown” drops, which are, in reality, the
kernel devs not keeping up with network protocols, not the other way
around.[/color]

What protocol, specifically, is on the wire and should be supported by
the Linux kernel on your SLES machine? Have you searched for relevant
kernel modules to support that protocol? Have you submitted this to the
kernel developers? For some reason, perusing this thread, I thought you
were looking for an explanation on dropped packets, not for missing
functionality because of a lack of support for third-party protocols
which do not affect your server (other than by incrementing a counter of
packets dropped).
[color=blue]

We’ve found the SLES11 kernel doesn’t understand several Cisco
protocols (really??), or even bonding protocols.[/color]

Again, ones that you need? Which ones specifically? If support for
something useful is missing then I’m sure engineering will consider
adding it if possible. For many of these cases, though, protocols are
not supported out of the box because they do not matter to the purpose
of the box. Cisco protocols… if you’re talking about proprietary
protocols then this doesn’t seem very important at all. If you’re
talking about protocols that Cisco happens to implement along with the
rest of the world (BGP, RIP, etc.) then I’d expect Linux can handle
those, but those modules may be omitted out of the box to save
time/space. If that’s the case you’re welcome to change that by adding
more modules.
[color=blue]

If we need to “see” unknown packets, put it in a different counter.
There are already LEGITIMATE drops in the DROP counter, don’t need to
see the unknown ones here too.[/color]

Sounds like a valid enhancement request.
[color=blue]

Nowhere that I can find does it say that, to be a kernel 3.0+
user, you need to clean up the unknown tiny packets that may be
lurking on your network because we think it’s important to count
them…[/color]

Indeed, there is not, nor should there be. Your system works
just fine dropping unknown packets, as it always has. I’ve heard of two
customers now who cared (you were the second, and the first prompted the
TID), so perhaps this will become a bigger issue as more customers move
to SLES 11 SP2 (or as non-SLES users move to the late 2.x and 3+
kernels). If that’s the case, an enhancement request, or even a bug
report, may be the right course of action to get a separate counter for
these packets. I doubt the developer implementing this did so to cause
problems, but rather to get a better picture of what really happens on
the wire. Not accounting for people who watch the dropped-packet counter
as a kind of network health check seems like an honest mistake, if
that’s really what happened.

Good luck.

The “OUI Unknown” frames shown in tcpdump seem to be the ones that trigger the drops. We’ve eliminated a few that we found, but it seems things like spanning tree, bonding, and the bnx2 loopback packets all make these happen (there was also a protocol from a Microsoft terminal server that was causing one, but I can’t recall its name). I have a Fedora box that has the same issue with bonding and dropping “unknown” packets (3.x kernel again).
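For anyone trying to work out which frames are involved, here is a rough Python sketch (not from the thread, just an illustration) that labels a raw Ethernet frame by its EtherType or, for 802.3 frames, by its LLC/SNAP header. The function name `describe_frame` is made up for this example, and it assumes untagged frames (no VLAN header). It only describes what is on the wire; whether the kernel on a given box has a handler for that protocol is a separate question.

```python
import struct


def describe_frame(frame: bytes) -> str:
    """Summarise the link-layer protocol of one raw, untagged Ethernet frame."""
    if len(frame) < 14:
        return "truncated"
    # Bytes 12-13 hold either an EtherType (>= 0x0600) or an 802.3 length.
    (type_or_len,) = struct.unpack("!H", frame[12:14])
    if type_or_len >= 0x0600:
        return f"ethertype 0x{type_or_len:04x}"
    # Treated as an 802.3 length field: an LLC header (DSAP, SSAP, control) follows.
    if len(frame) < 17:
        return "truncated"
    dsap, ssap, control = frame[14], frame[15], frame[16]
    # DSAP/SSAP 0xAA with control 0x03 marks a SNAP header: 3-byte OUI + 2-byte PID.
    if dsap == 0xAA and ssap == 0xAA and control == 0x03 and len(frame) >= 22:
        oui = frame[17:20].hex()
        (pid,) = struct.unpack("!H", frame[20:22])
        return f"snap oui={oui} pid=0x{pid:04x}"
    return f"llc dsap=0x{dsap:02x} ssap=0x{ssap:02x}"
```

Tallying these labels over a capture (for example with `collections.Counter`) should give a quick list of the protocols worth checking against the drop counter.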
It’s really an issue of monitoring. I use the drop counter (and others) to look for drops of important packets that indicate problems with data we care about. I don’t need to see a drop of a small ARP packet that the kernel doesn’t understand. Those drops need their own counter, and the useful counters should be left alone.
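To make the monitoring point concrete, here is a minimal Python sketch of the kind of check described above: it reads the per-interface receive `drop` column from `/proc/net/dev` and reports how much it grew between two samples. The helper names `parse_rx_drops` and `drop_deltas` are made up for illustration, and the 30-second interval is just the period from the thread title. It cannot tell "unknown protocol" drops apart from any other drop, which is exactly the complaint.

```python
def parse_rx_drops(proc_net_dev_text: str) -> dict:
    """Return {interface: rx drop count} from the text of /proc/net/dev."""
    drops = {}
    # The first two lines of /proc/net/dev are column headers.
    for line in proc_net_dev_text.splitlines()[2:]:
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if len(fields) < 4:
            continue
        # Receive columns are: bytes, packets, errs, drop, ...
        drops[name.strip()] = int(fields[3])
    return drops


def drop_deltas(before: dict, after: dict) -> dict:
    """Growth of the drop counter per interface present in both samples."""
    return {iface: after[iface] - before[iface] for iface in after if iface in before}


if __name__ == "__main__":
    import time

    def sample() -> dict:
        with open("/proc/net/dev") as handle:
            return parse_rx_drops(handle.read())

    first = sample()
    time.sleep(30)  # assumed interval, matching the period in the thread title
    print(drop_deltas(first, sample()))
```

If the delta on an otherwise idle interface keeps ticking up by a small, steady amount, that is consistent with the periodic unknown-protocol frames discussed in this thread rather than real data loss.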

Has any resolution been found for this? This is still the absolute dumbest non-error I’ve ever seen in an OS.

Have you asked the developers for a fix or change?

(That’s a change in the upstream Linux kernel, neither SLES- nor SuSE-specific, so lkml-net might be a better place to ask your question.)