Dear list, I run a Compaq Proliant 1500 (dual Pentium 75.200) with hardware raid (Smart2) with two ethernet cards 3com905 (b or c, I can't tell you right now) as a firewall and web/mail virus scanner which (needless to say) needs to be up 7d/24h. Recently, during a pretty fast download the machine (ethernet technically, you could login on the console, even ping the ethernet ip address) locked up with the following error log: Oct 9 17:29:02 fwintern kernel: eth0: transmit timed out, tx_status 00 status e681. Oct 9 17:29:02 fwintern kernel: eth0: Interrupt posted but not delivered -- IRQ blocked by another device? Oct 9 17:29:02 fwintern kernel: Flags; bus-master 1, full 0; dirty 6543269 current 6543269. Oct 9 17:29:02 fwintern kernel: Transmit list 00000000 vs. cbef4250. Oct 9 17:29:02 fwintern kernel: 0: @cbef4200 length 800005ea status 000105ea Oct 9 17:29:02 fwintern kernel: 1: @cbef4210 length 800005ea status 000105ea Oct 9 17:29:02 fwintern kernel: 2: @cbef4220 length 800005ea status 000105ea Oct 9 17:29:02 fwintern kernel: 3: @cbef4230 length 800005ea status 800105ea Oct 9 17:29:02 fwintern kernel: 4: @cbef4240 length 800005ea status 800105ea Oct 9 17:29:02 fwintern kernel: 5: @cbef4250 length 800005ea status 000105ea Oct 9 17:29:02 fwintern kernel: 6: @cbef4260 length 800005ea status 000105ea Oct 9 17:29:02 fwintern kernel: 7: @cbef4270 length 800005ea status 000105ea Oct 9 17:29:02 fwintern kernel: 8: @cbef4280 length 800005ea status 000105ea Oct 9 17:29:02 fwintern kernel: 9: @cbef4290 length 800005ea status 000105ea Oct 9 17:29:02 fwintern kernel: 10: @cbef42a0 length 800005ea status 000105ea Oct 9 17:29:02 fwintern kernel: 11: @cbef42b0 length 800005ea status 000105ea Oct 9 17:29:02 fwintern kernel: 12: @cbef42c0 length 800005ea status 000105ea Oct 9 17:29:02 fwintern kernel: 13: @cbef42d0 length 800005ea status 000105ea Oct 9 17:29:02 fwintern kernel: 14: @cbef42e0 length 800005ea status 000105ea Oct 9 17:29:02 fwintern kernel: 15: @cbef42f0 length 800005ea status 000105ea Oct 9 17:29:02 fwintern kernel: eth0: Resetting the Tx ring pointer. Oct 9 17:29:02 fwintern kernel: Packet log: input DENY eth1 PROTO=17 10.1.1.200:138 10.1.2.2:138 L=257 S=0x00 I=34892 F=0x0000 T=126 (#3) Oct 9 17:29:12 fwintern kernel: eth0: transmit timed out, tx_status 00 status e601. Oct 9 17:29:12 fwintern kernel: eth0: Interrupt posted but not delivered -- IRQ blocked by another device? Oct 9 17:29:12 fwintern kernel: Flags; bus-master 1, full 0; dirty 6543285 current 6543285. Oct 9 17:29:12 fwintern kernel: Transmit list 00000000 vs. cbef4250. ... repeating ... The problem was reproducible (several times) with the same download (a 300MB file) after a reboot. There should be no irq collision on irq 9, /proc/interrupts: CPU0 CPU1 0: 4444438 4611949 IO-APIC-edge timer 1: 2077 2163 IO-APIC-edge keyboard 2: 0 0 XT-PIC cascade 5: 390236 391327 IO-APIC-level eth1 8: 2 0 IO-APIC-edge rtc 9: 435924 436646 IO-APIC-level eth0 10: 17 17 IO-APIC-level ncr53c8xx 11: 24525 24737 IO-APIC-level ida0 13: 0 0 XT-PIC fpu NMI: 0 ERR: 0 No other hardware is installed. Another observation: Normally the free (non buffer) memory of that machine is 3MB (of 192MB). At the point of the crash it was down to 1.7MB. I don't know if this is the cause or just a symptom of the locked up ethernet card. Is this a known problem? If it's not a hardware/driver bug, I suspect an SMP related race condition. Maybe the bios did set up the cpu's / io-apics wrong? I should mention that now, two days later, I can no longer reproduce the crash. This might be due to 2 reasons: 1) The bandwidth when downloading from the web server was measurably slower (about 1/2) this time (about 600kbit (of a 2Mb link)) today. 2) the machine was power cycled. Unfortunately the personnel cannot remember if the problem was only reproducible after a warm reboot or also after a power cycle. Finally: /proc/version: Linux version 2.2.14 (root@fwintern) (gcc version 2.95.2 19991024 (release)) #4 SMP Thu Aug 3 20:32:32 CEST 2000 The recent discussion on linux-kernel seems to indicate that gcc 2.95 cannot compile a kernel at all. Could this be the cause here (the system was up since that Aug 3 flawlessly). Unfortunately there is no other version available on that system and the distributor (Suse) seems to deny any gcc 2.95 - kernel incompatibilies. Please advise, Michael. -- Michael Weller: eowmob@exp-math.uni-essen.de, eowmob@ms.exp-math.uni-essen.de, or even mat42b@spi.power.uni-essen.de. If you encounter an eowmob account on any machine in the net, it's very likely it's me. - : send the line "unsubscribe linux-net" in the body of a message to majordomo@vger.kernel.org