Hi, I had fun with a remote system (custom vanilla kernel 2.6.24.4), a NIC sort-of died all of a sudden. (note: eth4 is udev's idea, not mine - the box has just 2 NICs). /var/log/messages: Jan 19 09:25:59 A kernel: NETDEV WATCHDOG: eth4: transmit timed out Jan 19 09:26:02 A kernel: eth4: link up, 100Mbps, full-duplex, lpa 0xFFFF Jan 19 09:26:11 A kernel: NETDEV WATCHDOG: eth4: transmit timed out Jan 19 09:26:14 A kernel: eth4: link up, 100Mbps, full-duplex, lpa 0xFFFF [goes on for 8 more minutes] /var/log/debug: Jan 19 09:26:02 A kernel: eth4: Transmit timeout, status ff ffff ffff media ff. Jan 19 09:26:02 A kernel: eth4: Tx queue start entry 73297348 dirty entry 73297344. Jan 19 09:26:02 A kernel: eth4: Tx descriptor 0 is ffffffff. (queue head) Jan 19 09:26:02 A kernel: eth4: Tx descriptor 1 is ffffffff. Jan 19 09:26:02 A kernel: eth4: Tx descriptor 2 is ffffffff. Jan 19 09:26:02 A kernel: eth4: Tx descriptor 3 is ffffffff. Jan 19 09:26:14 A kernel: eth4: Transmit timeout, status ff ffff ffff media ff. Jan 19 09:26:14 A kernel: eth4: Tx queue start entry 4 dirty entry 0. Jan 19 09:26:14 A kernel: eth4: Tx descriptor 0 is ffffffff. (queue head) Jan 19 09:26:14 A kernel: eth4: Tx descriptor 1 is ffffffff. Jan 19 09:26:14 A kernel: eth4: Tx descriptor 2 is ffffffff. Jan 19 09:26:14 A kernel: eth4: Tx descriptor 3 is ffffffff. [goes on for 8 more minutes] (FWIW, there also was a BUG with trace info via dmesg but I did not save that info, nor did it show up in any logfile, doh) System was rebooted but instead of getting rid of the problem, it showed eth4 was not usable anymore. It did not show up in dmesg/via ifconfig and the corresponding lspci -vvv entry states: 00:09.0 Ethernet controller: Realtek Semiconductor Co., Ltd. Unknown device 8119 (rev 10) Subsystem: Realtek Semiconductor Co., Ltd. Unknown device 8119 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR+ INTx- Latency: 32 (16000ns max) Interrupt: pin A routed to IRQ 12 Region 0: I/O ports at a400 [size=256] Region 1: Memory at d5800000 (32-bit, non-prefetchable) [size=256] [virtual] Expansion ROM at 30000000 [disabled] [size=64K] Capabilities: [50] Power Management version 2 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0-,D1+,D2+,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=0 PME- I replaced the NIC and everything works as expected: 00:09.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10) Subsystem: Allied Telesyn International AT-2500TX/ACPI Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 32 (8000ns min, 16000ns max) Interrupt: pin A routed to IRQ 12 Region 0: I/O ports at a400 [size=256] Region 1: Memory at d5800000 (32-bit, non-prefetchable) [size=256] Capabilities: [50] Power Management version 2 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=0 PME+ Kernel driver in use: 8139too Kernel modules: 8139too, 8139cp Just for the fun of it, I plugged the faulty NIC into a sole win machine, it worked. Doesn't work in the linux machine afterwards, though. First time that I observe such strange behaviour, any ideas what could be the cause? -- left blank, right bald
Attachment:
pgpJeF2uDHqPP.pgp
Description: PGP signature