(crossposted between linux-kernel and netfilter users because some traces do go into nf_* and I'm using the conntrack module) Hello; I had a very bizarre situation where four boxes in the same rack all simultaneously (within 30 minutes) hard-locked with Oops messages. The boxes don't even have the same function - two of them are MXs that face the Internet, and the other two are spamassassin spamd boxes that receive messages, put a tag header up top, and send them back to the MX for filtering. It was during a very (ridiculously) high-load situation where all four boxes were running at their limit for a few hours. The kernel running on them was 2.4.23, patched from netfilter's patch-o-matic to include the ipt_connlimit module. Connlimit was active on the MXs, but not on the spamd boxes. .config-file is attached. I captured four panic messages. The results of ksymoops on these are attached. The one with a 'kernel BUG at buffer.c:539!' was a spamd box, the other three are from MXs (RESULT-4 is from one of the MXs crashing after this incident). Sadly, the stack traces don't appear to necessarily have anything to do with each other. The boxes have since had sparodic crashes, but not in unison like the first time. The MXs have been backgraded to 2.4.22+connlimit patch. One of them has crashed since, I could not grab the output as it was rebooted via a remote power box. Hardware: MXs: Dell PE 1650, dual P3 1.13GHz, 1024MB RAM, aacraid scsi controller with 15k rpm drives. spamd: Dell PE 1750, dual P4 'Xeon' 2.4GHz, 1024MB RAM, Fusion MPT scsi controller. The MXs have been functioning for a year as either nameservers or MXs-that-don't-do-spamassassin for some time flawlessly. The spamd boxes are new. Software: All debian woody, linux 2.4.23 with ipt_connlimit as module (not loaded on the spamd machines). MXs run qmail and thrash the disk quite a bit. All filesystems reiserfs. SMP kernels, no ACPI or APM etc. Is there a kernel problem here? If not, does anyone familiar with the chi of kernel stacktraces have any advice? Thanks, - Erik Bourget
Attachment:
config-2.4.23
Description: config
Attachment:
RESULT-1
Description: Binary data
Attachment:
RESULT-2
Description: Binary data
Attachment:
RESULT-4
Description: Binary data