On Wed, 19 Aug 2009, Pete Zaitcev wrote:

> Hi, All:
>
> Our friends at Stratos found that uhci-hcd crashes if they remove
> the UHCI hardware, like so:
>
> ACPI: PCI interrupt for device 0000:7c:00.1 disabled
> Trying to free nonexistent resource <00000000a8000000-00000000afffffff>
> Trying to free nonexistent resource <00000000a4800000-00000000a480ffff>
> uhci_hcd 0000:7e:1d.0: remove, state 1
> usb usb2: USB disconnect, address 1
> usb 2-1: USB disconnect, address 2
> Unable to handle kernel paging request at 0000000000100100 RIP:
>  [<ffffffff88021950>] :uhci_hcd:uhci_scan_schedule+0xa2/0x89c
>
>  #4 [ffff81011de17e50] uhci_scan_schedule at ffffffff88021918
>  #5 [ffff81011de17ed0] uhci_irq at ffffffff88023cb8
>  #6 [ffff81011de17f10] usb_hcd_irq at ffffffff801f1c1f
>  #7 [ffff81011de17f20] handle_IRQ_event at ffffffff8001123b
>  #8 [ffff81011de17f50] __do_IRQ at ffffffff800ba749
>
> They found that when uhci_stop is running, an interrupt may come
> and run over the schedule which was already freed, thus the crash.
>
> The usual complication is that they are testing an ancient 2.6.18
> based kernel, but to my eye it looks like the 2.6.31-rc6 should
> be affected as well.
>
> It looks to me that the issue must be an interrupt that's delivered
> too late for some reason, perhaps because the uhci_hc_died, although
> stops the HC correctly, does not flush a pending interrupt. But
> since testing is limited, I'm tempted to apply some kind of hammer,
> like this:
>
> --- a/drivers/usb/host/uhci-hcd.c
> +++ b/drivers/usb/host/uhci-hcd.c
> @@ -734,6 +735,7 @@ static void uhci_stop(struct usb_hcd *hcd)
>  	if (test_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags) && !uhci->dead)
>  		uhci_hc_died(uhci);
>  	uhci_scan_schedule(uhci);
> +	uhci->scan_in_progress = 1;	/* Trick: racing IRQs can crash */
>  	spin_unlock_irq(&uhci->lock);
>
>  	del_timer_sync(&uhci->fsbr_timer);
>
> Any other thoughts how to fix this?

I don't like this approach because there are other pathways that can
cause similar errors, for example where uhci_irq() calls
sprint_schedule() or usb_hcd_poll_rh_status().

How about calling synchronize_irq() in uhci_stop() just before
del_timer_sync() instead?  Once that completes, the controller should
be totally idle with no pending interrupts.  Any further calls to
uhci_irq() should return IRQ_NONE immediately.

It looks like similar problems might affect the other HCDs.

Alan Stern
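
For illustration only, the synchronize_irq() suggestion could be
sketched roughly as below.  This is untested and not meant to be
applied as-is: the hunk header carries no line numbers on purpose,
the context lines are copied from the hunk quoted above, and passing
hcd->irq as the interrupt number is an assumption about how the IRQ
is stored for this controller.

--- a/drivers/usb/host/uhci-hcd.c
+++ b/drivers/usb/host/uhci-hcd.c
@@ static void uhci_stop(struct usb_hcd *hcd)
 		uhci_hc_died(uhci);
 	uhci_scan_schedule(uhci);
 	spin_unlock_irq(&uhci->lock);
+	synchronize_irq(hcd->irq);	/* assumption: hcd->irq is this HC's IRQ */
 
 	del_timer_sync(&uhci->fsbr_timer);

The call sits after spin_unlock_irq() because synchronize_irq() waits
for any handler already running on that IRQ to finish, and uhci_irq()
may itself need uhci->lock, so waiting with the lock held could
deadlock.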