Stratus fault-tolerant servers encounter hangs of khubd when there is a
surprise removal of a UHCI controller from the PCI bus accessible to the
CPU. When a device breaks, PCI busses are disconnected to avoid
corruption of the system state by an errant device. Within a few
milliseconds, the Stratus management software detects the break and
calls pci_remove_bus_device() for the associated devices.

Following surprise removal, khubd hangs in usb_kill_urb(), waiting for
the urb to be released. The call stack at this point is:

    usb_remove_hcd()
      --> usb_disconnect()
      --> usb_disable_device()
      --> usb_disable_endpoint()
      --> usb_flush_endpoint()
      --> usb_kill_urb()
      --> wait_event()

Since the device has been removed from the PCI bus, there will be no
more interrupts from the device, so uhci_scan_schedule() is not called.
uhci_unlink_qh() does the right thing by forcing an interrupt. However,
usb_hcd_poll_rh_status() does no polling either, because
hcd->rh_registered has already been set to 0 in usb_remove_hcd(). When
that happens, the urb is never released and khubd waits forever.

The following patch addresses this issue.

Signed-off-by: Robert N. Evans <evans.robert.n@xxxxxxxxx>

--- linux-2.6.orig/drivers/usb/core/hcd.c	2010-05-30 20:21:02.000000000 +0000
+++ linux-2.6/drivers/usb/core/hcd.c	2010-06-03 02:45:22.000000000 +0000
@@ -667,8 +667,6 @@ void usb_hcd_poll_rh_status(struct usb_h
 	unsigned long	flags;
 	char		buffer[6];	/* Any root hubs with > 31 ports? */
 
-	if (unlikely(!hcd->rh_registered))
-		return;
 	if (!hcd->uses_new_polling && !hcd->status_urb)
 		return;
 
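
For readers who want to see the ordering problem in isolation, here is a
small user-space sketch of the hang described above. It is not kernel
code: every toy_* name is hypothetical, and the model assumes that the
root-hub status poll is what ends up running the schedule scan that
gives the urb back once device interrupts have stopped. The
guard_enabled flag stands in for the rh_registered check removed by the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; these are not the kernel's real structures. */
struct toy_urb {
	bool in_flight;			/* true until the urb is given back */
};

struct toy_hcd {
	bool rh_registered;		/* cleared early during removal */
	bool guard_enabled;		/* true models the pre-patch check */
	struct toy_urb *pending;	/* urb that usb_kill_urb() waits on */
};

/* Models the schedule scan that gives back an unlinked urb. */
static void toy_scan_schedule(struct toy_hcd *hcd)
{
	if (hcd->pending) {
		hcd->pending->in_flight = false;
		hcd->pending = NULL;
	}
}

/* Models the timer-driven root-hub poll (assumed to reach the scan). */
static void toy_poll_rh_status(struct toy_hcd *hcd)
{
	if (hcd->guard_enabled && !hcd->rh_registered)
		return;			/* pre-patch: poll is skipped */
	toy_scan_schedule(hcd);
}

/*
 * Models removal after the device has left the bus: no interrupts
 * arrive, so only the poll can make progress. Returns the tick on
 * which the urb was released, or -1 if it never was (the hang).
 */
static int toy_remove_hcd(bool guard_enabled, int max_ticks)
{
	struct toy_urb urb = { .in_flight = true };
	struct toy_hcd hcd = {
		.rh_registered = true,
		.guard_enabled = guard_enabled,
		.pending = &urb,
	};

	assert(max_ticks > 0);
	hcd.rh_registered = false;	/* happens before the urb is killed */

	for (int tick = 1; tick <= max_ticks; tick++) {
		toy_poll_rh_status(&hcd);
		if (!urb.in_flight)
			return tick;
	}
	return -1;
}
```

In this model, toy_remove_hcd(true, 100) returns -1 because the poll
bails out on every tick, while toy_remove_hcd(false, 100) returns 1
because the first poll releases the urb.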