Re: [PATCH] 2.6.9-rc2 tulip_stop_rxtx

Grant Grundler wrote:
Andrew,
Patch #2 deals with how to make sure Tulip is no longer
doing DMA before deallocating the TX or RX descriptor rings.

IIRC, this bug was found by "ifconfig down" on an active link.
ISTR this patch also fixes the MCA caused by "yank the LAN cable out"
(on active load too).
In both cases, the RX/TX descriptors were deallocated BEFORE pending DMA
had completed and caused an MCA on ia64. I expect parisc and other
arches with IOMMUs are susceptible to the same problem.

There are two possible fixes:
a) poll in tulip_stop_rxtx() until chip says DMA is done.
   Credit goes to Charlie Brett (HP) for figuring this out.

b) disable PCI_COMMAND bus master enable before deallocating
   descriptor rings.

Jeff didn't like (a) for the same reason he didn't like the tulip
phy reset patch (a delay loop while interrupts are blocked).

(b) was my first hack, and I could not find in my mail archive an
explanation from him of why he didn't like it.  Maybe WoL support - but WoL
seems to use sideband signals to wake the host and may not need bus master.
Barring some reason, it seems prudent to just completely disable
the device on shutdown. But I've not included this fix here.
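
For illustration only, option (b) amounts to clearing the bus master
enable bit in the PCI command register before the rings are freed. The
sketch below is plain userspace C operating on a copy of the register
value; the helper name disable_bus_master() is hypothetical and is not
taken from the patch. In a driver, the read and write would go through
pci_read_config_word() and pci_write_config_word() on PCI_COMMAND.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bus master enable: bit 2 of the PCI command register (config offset 0x04). */
#define PCI_COMMAND_MASTER 0x0004u

/*
 * Hypothetical helper: clear the bus master enable bit in a copy of the
 * command register and return the new value. Once the bit is clear the
 * device can no longer initiate DMA, so freeing the rings becomes safe.
 */
uint16_t disable_bus_master(uint16_t *cmd_reg)
{
    assert(cmd_reg != NULL);
    *cmd_reg = (uint16_t)(*cmd_reg & ~PCI_COMMAND_MASTER);
    return *cmd_reg;
}
```

All other command bits (I/O and memory decode, for example) are left
untouched, so the device still responds to register accesses.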

"DEC" 21143 HW ref manual is available from:
    http://www.intel.com/design/network/products/lan/docs/21143_docs.htm

The patch below uses CSR5 because that's what I'm told HPUX uses.
HPUX has been exclusively using tulip chips with parisc since
tulip became available in the mid-1990s. I know CSR5 works.
Table 3-68 (TX Process State) and Table 3-69 (RX Process State)
on page 3-44 describe CSR5 bits.
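
A rough sketch of the CSR5 polling idea follows, written as userspace C
against a fake register so it can be run outside the kernel. It is not
the patch itself. The masks assume the TX process state sits in CSR5
bits 22:20 and the RX process state in bits 19:17, with 0 meaning
"stopped" in both fields; wait_rxtx_stopped(), fake_chip and
fake_read_csr5() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CSR5_TS 0x00700000u /* TX process state, assumed bits 22:20 */
#define CSR5_RS 0x000e0000u /* RX process state, assumed bits 19:17 */

typedef uint32_t (*csr5_read_fn)(void *ctx);

/*
 * Poll CSR5 until both process-state fields read 0 (stopped) or the
 * poll budget is exhausted. Returns the remaining budget (nonzero) on
 * success and 0 on timeout.
 */
unsigned wait_rxtx_stopped(csr5_read_fn read_csr5, void *ctx, unsigned max_polls)
{
    assert(read_csr5 != NULL);
    while (max_polls) {
        if ((read_csr5(ctx) & (CSR5_TS | CSR5_RS)) == 0)
            return max_polls;
        max_polls--;
        /* A driver would insert a short udelay() here between reads. */
    }
    return 0;
}

/* Fake chip: reports non-stopped states for a fixed number of reads. */
struct fake_chip {
    unsigned busy_reads;
};

uint32_t fake_read_csr5(void *ctx)
{
    struct fake_chip *chip = ctx;
    if (chip->busy_reads) {
        chip->busy_reads--;
        return CSR5_TS | CSR5_RS; /* both engines still active */
    }
    return 0; /* both engines stopped */
}
```

The bounded budget matters: an unbounded loop would hang the machine if
the chip never reports "stopped", which is the kind of delay-loop concern
raised against option (a).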

But CSR6 could be polled as well. Charlie Brett's original patch
used CSR6 and he tested it.  Table 3-71, page 3-47 describes
the ST/SR bits in CSR6. The code looks basically the same though.


Last I heard from you on this issue, _you_ agreed the problem was solved by proper ordering of unregister_netdevice() and pci_disable_device() in tulip_remove_one(), thereby eliminating the need for this patch.

Incorrect ordering of unregister_netdevice() in earlier kernels caused activity on the device when there should not have been any.
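
As a sketch of the ordering argument only: the stub below records the teardown steps in the order a remove routine might perform them. It assumes the intended order is to unregister the network device first (which closes the interface and stops RX/TX), then free the descriptor rings, then disable the PCI device. The message does not spell that order out, so treat it as an assumption. All names here are hypothetical stand-ins, not the driver's code.

```c
#include <assert.h>
#include <stddef.h>

enum step {
    STEP_UNREGISTER_NETDEV, /* stands in for unregistering the net device */
    STEP_FREE_RINGS,        /* stands in for freeing the RX/TX descriptor rings */
    STEP_DISABLE_DEVICE     /* stands in for pci_disable_device() */
};

struct teardown_log {
    enum step steps[3];
    size_t count;
};

static void record(struct teardown_log *log, enum step s)
{
    assert(log->count < 3);
    log->steps[log->count++] = s;
}

/* Hypothetical remove routine: quiesce first, free memory, disable last. */
void remove_one_sketch(struct teardown_log *log)
{
    record(log, STEP_UNREGISTER_NETDEV);
    record(log, STEP_FREE_RINGS);
    record(log, STEP_DISABLE_DEVICE);
}
```

With this order, nothing can restart the chip after the interface is gone, so the rings are idle by the time they are freed.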

Further, I don't see the need to poll the chip state given all this...

	Jeff

