Andrew, Patch #2 deals with how to make sure Tulip is no longer doing DMA before deallocating the TX or RX descriptor rings.
IIRC, this bug was found by "ifconfig down" on an active link. ISTR this patch also fixes the MCA caused by "yank the LAN cable out" (under active load too). In both cases, the RX/TX descriptors were deallocated BEFORE pending DMA had completed, which caused an MCA on ia64. I expect parisc and other arches with IOMMUs are susceptible to the same problem.
There are two possible fixes:
a) poll in tulip_stop_rxtx() until the chip says DMA is done. Credit goes to Charlie Brett (HP) for figuring this out.
b) disable PCI_COMMAND bus master enable before deallocating descriptor rings.
Jeff didn't like (a) for the same reason he didn't like the tulip phy reset patch (a delay loop while interrupts are blocked).
(b) was my first hack, and I could not find an explanation in my mail archive of why he didn't like it. Maybe WoL support - but WoL seems to use sideband signals to wake the host and may not need bus mastering. Barring some such reason, it seems prudent to just completely disable the device on shutdown. But I've not included this fix here.
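For reference, (b) amounts to something like the sketch below. This is a user-space model, not the code from my earlier hack: struct fake_pci_dev, cfg_read16(), cfg_write16() and disable_bus_master() are made-up names that stand in for real PCI config space access. Only the PCI_COMMAND offset and the bus master enable bit are the standard PCI values.

```c
#include <stdint.h>

#define PCI_COMMAND        0x04    /* config-space offset of the command register */
#define PCI_COMMAND_MASTER 0x0004  /* bus master enable bit */

/* Hypothetical stand-in for a device's PCI configuration space. */
struct fake_pci_dev {
    uint8_t config[256];
};

/* Read a little-endian 16-bit value from the modeled config space. */
static uint16_t cfg_read16(const struct fake_pci_dev *d, unsigned off)
{
    return (uint16_t)(d->config[off] | (d->config[off + 1] << 8));
}

/* Write a little-endian 16-bit value to the modeled config space. */
static void cfg_write16(struct fake_pci_dev *d, unsigned off, uint16_t v)
{
    d->config[off] = (uint8_t)(v & 0xff);
    d->config[off + 1] = (uint8_t)(v >> 8);
}

/*
 * Clear bus master enable so the device can no longer start DMA.
 * In the driver this would have to happen before the descriptor
 * rings are freed.
 */
static void disable_bus_master(struct fake_pci_dev *d)
{
    uint16_t cmd = cfg_read16(d, PCI_COMMAND);

    if (cmd & PCI_COMMAND_MASTER)
        cfg_write16(d, PCI_COMMAND, (uint16_t)(cmd & ~PCI_COMMAND_MASTER));
}
```

The other command bits (I/O and memory decode) are left alone, so the chip's registers stay reachable after bus mastering is turned off.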
"DEC" 21143 HW ref manual is available from: http://www.intel.com/design/network/products/lan/docs/21143_docs.htm
The patch below uses CSR5 because that's what I'm told HPUX uses. HPUX has been exclusively using tulip chips with parisc since tulip became available in the mid 1990's. I know CSR5 works. Table 3-68 (TX Process State) and Table 3-69 (RX Process State) on page 3-44 describe the CSR5 bits.
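To show the shape of the CSR5 poll without the driver around it, here is a rough user-space sketch. It is not the patch itself: struct fake_tulip, read_csr5() and stop_rxtx_and_wait() are invented for illustration, and the "chip" is a fake that finishes its in-flight DMA after a fixed number of register reads. The masks are the TS/RS process state fields in CSR5 and the ST/SR bits in CSR6; a process state of 0 means stopped.

```c
#include <stdbool.h>
#include <stdint.h>

#define CSR5_TS 0x00700000u  /* CSR5<22:20>: TX process state, 0 == stopped */
#define CSR5_RS 0x000e0000u  /* CSR5<19:17>: RX process state, 0 == stopped */
#define CSR6_ST 0x00002000u  /* CSR6<13>: start/stop transmit */
#define CSR6_SR 0x00000002u  /* CSR6<1>:  start/stop receive */

/* Hypothetical model of the chip, for illustration only. */
struct fake_tulip {
    uint32_t csr5;
    uint32_t csr6;
    unsigned drain_polls;  /* CSR5 reads left before in-flight DMA completes */
};

/*
 * Modeled register read: once ST/SR are cleared, the process state
 * fields drop to "stopped" only after the pending DMA has drained.
 */
static uint32_t read_csr5(struct fake_tulip *t)
{
    if (!(t->csr6 & (CSR6_ST | CSR6_SR))) {
        if (t->drain_polls > 0)
            t->drain_polls--;
        else
            t->csr5 &= ~(CSR5_TS | CSR5_RS);
    }
    return t->csr5;
}

/*
 * Clear ST/SR, then poll CSR5 until both processes report stopped.
 * Returns true if they stopped within max_polls reads; only then is
 * it safe to free the descriptor rings.
 */
static bool stop_rxtx_and_wait(struct fake_tulip *t, unsigned max_polls)
{
    t->csr6 &= ~(CSR6_ST | CSR6_SR);

    while (max_polls--) {
        if (!(read_csr5(t) & (CSR5_TS | CSR5_RS)))
            return true;
        /* a real driver would delay a few microseconds between reads */
    }
    return false;
}
```

The poll is bounded on purpose: if the chip never reports stopped, the caller gets false back instead of spinning forever with interrupts blocked.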
But CSR6 could be polled as well. Charlie Brett's original patch used CSR6 and he tested it. Table 3-71, page 3-47 describes the ST/SR bits in CSR6. The code looks basically the same though.
Last I heard from you on this issue, _you_ agreed the problem was solved by proper ordering of unregister_netdevice() and pci_disable_device() in tulip_remove_one(), thereby eliminating the need for this patch.
Incorrect ordering of unregister_netdevice() in earlier kernels was causing activity when there should not have been any.
Further, I don't see the need to poll the chip state given all this...
Jeff