Some OHCI controllers (most notably those made by NVIDIA, but others too) sometimes lose track of completed Transfer Descriptors. When a TD completes, the controller is supposed to add it to the start of the Done Queue, to let the driver know the transfer is finished. The buggy controllers occasionally fail to do this.

ohci-hcd already contains a couple of ad-hoc mechanisms for dealing with these failures. One of the quirk handlers (for Compaq ZF Micro) looks for lost TDs on an interrupt endpoint. In addition, the driver recognizes that whenever a TD is on the Done Queue, all the earlier TDs for the same endpoint must have completed as well, even if they aren't on the Done Queue.

Still, these mechanisms don't handle all the possible scenarios. Lost TDs have been observed for non-interrupt endpoints, and if the lost TD is the last one in a transfer then there might not be anything following it in the Done Queue.

This patch series replaces the ad-hoc mechanisms with a general approach. A new I/O watchdog routine runs every 200 ms as long as there are any active URBs. The routine scans the lists of TDs, looking for any which have completed but haven't shown up in the Done Queue, and takes care of them. (This will add a small amount of overhead, but OHCI has never been high-throughput.) The routine also checks for controllers malfunctioning so badly that they are unusable, and declares them dead.

Making these changes requires a certain amount of care, because the controller might add a TD to the Done Queue any time up to a millisecond after the TD completes. The watchdog routine has to make sure it doesn't race with the hardware, and the done list (the driver's equivalent of the hardware's Done Queue) has to be treated differently from the way it is now. Also, there will be two pathways by which URBs may complete: the hardware IRQ handler and the watchdog routine. This requires the driver to make sure that URB completions are always single-threaded.
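
As a rough illustration of the race-avoidance idea (a toy model, not the code in the patches), the sketch below treats a TD as lost only if the controller had already retired it at the previous watchdog scan and it still has not been reported. With scans 200 ms apart, that is far longer than the 1 ms the hardware may take to put a TD on the Done Queue. The names `struct ep_model`, its fields, and `watchdog_scan()` are hypothetical and chosen for this example; TDs are modeled as plain sequence numbers, and the counters are assumed not to wrap.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical per-endpoint bookkeeping, invented for this sketch.
 * TDs on an endpoint are numbered 0, 1, 2, ... in queueing order.
 */
struct ep_model {
	unsigned next_unreported;      /* oldest TD not yet given back */
	unsigned hw_retired;           /* TDs the controller has moved past */
	unsigned retired_at_last_scan; /* hw_retired as seen one scan ago */
};

/*
 * One watchdog pass over a single endpoint.
 *
 * TDs in [next_unreported, retired_at_last_scan) were already retired
 * by the hardware one full scan interval ago and have still not been
 * reported, so they are treated as lost and reclaimed.  TDs retired
 * since the previous scan may simply not have reached the Done Queue
 * yet, so they are left alone until the next pass.
 *
 * Returns the number of TDs reclaimed on this pass.
 */
int watchdog_scan(struct ep_model *ep)
{
	int lost = 0;

	assert(ep != NULL);

	if (ep->retired_at_last_scan > ep->next_unreported) {
		lost = (int)(ep->retired_at_last_scan - ep->next_unreported);
		/* A real driver would complete these TDs here. */
		ep->next_unreported = ep->retired_at_last_scan;
	}

	/* Remember how far the hardware has got, for the next pass. */
	ep->retired_at_last_scan = ep->hw_retired;
	return lost;
}
```

In this model the IRQ path would advance `next_unreported` as TDs arrive on the Done Queue, and both paths would need to run under the same lock so that completions stay single-threaded.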

The first four patches in this series remove the ad-hoc zfmicro quirk and make other preliminary adjustments. The last two patches add the I/O watchdog and add to it a check for a non-updating frame counter (another type of hardware problem observed in the field).

In the past, users have reported controller failures like these that ended up hanging the kernel's USB stack. With these changes in place, the hardware problems will show up as graceful failures, leaving the rest of the USB subsystem intact.

Alan Stern