On Wed, 19 Aug 2009 15:59:10 -0400 (EDT) Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> On Wed, 19 Aug 2009, Pete Zaitcev wrote:

> > Our friends at Stratos found that uhci-hcd crashes if they remove
> > the UHCI hardware, like so:
> >
> > ACPI: PCI interrupt for device 0000:7c:00.1 disabled
> > Trying to free nonexistent resource <00000000a8000000-00000000afffffff>
> > Trying to free nonexistent resource <00000000a4800000-00000000a480ffff>
> > uhci_hcd 0000:7e:1d.0: remove, state 1
> > usb usb2: USB disconnect, address 1
> > usb 2-1: USB disconnect, address 2
> > Unable to handle kernel paging request at 0000000000100100 RIP:
> >  [<ffffffff88021950>] :uhci_hcd:uhci_scan_schedule+0xa2/0x89c
> >
> >  #4 [ffff81011de17e50] uhci_scan_schedule at ffffffff88021918
> >  #5 [ffff81011de17ed0] uhci_irq at ffffffff88023cb8
> >  #6 [ffff81011de17f10] usb_hcd_irq at ffffffff801f1c1f
> >  #7 [ffff81011de17f20] handle_IRQ_event at ffffffff8001123b
> >  #8 [ffff81011de17f50] __do_IRQ at ffffffff800ba749
> >
> > They found that when uhci_stop is running, an interrupt may come
> > and run over the schedule which was already freed, thus the crash.
> >
> > The usual complication is that they are testing an ancient 2.6.18
> > based kernel, but to my eye it looks like the 2.6.31-rc6 should
> > be affected as well.
> >
> > It looks to me that the issue must be an interrupt that's delivered
> > too late for some reason, perhaps because the uhci_hc_died, although
> > stops the HC correctly, does not flush a pending interrupt. But
> > since testing is limited, I'm tempted to apply some kind of hammer,
> > like this:
> >
> > --- a/drivers/usb/host/uhci-hcd.c
> > +++ b/drivers/usb/host/uhci-hcd.c
> > @@ -734,6 +735,7 @@ static void uhci_stop(struct usb_hcd *hcd)
> >  	if (test_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags) && !uhci->dead)
> >  		uhci_hc_died(uhci);
> >  	uhci_scan_schedule(uhci);
> > +	uhci->scan_in_progress = 1;	/* Trick: racing IRQs can crash */
> >  	spin_unlock_irq(&uhci->lock);
> >
> >  	del_timer_sync(&uhci->fsbr_timer);
> >
> > Any other thoughts how to fix this?
>
> I don't like this approach because there are other pathways that can
> cause similar errors.  For example, where uhci_irq() calls
> sprint_schedule() or usb_hcd_poll_rh_status().
>
> How about calling synchronize_irq() in uhci_stop() just before
> del_timer_sync() instead?  Once that completes, the controller should
> be totally idle with no pending interrupts.  Any further calls to
> uhci_irq() should return IRQ_NONE immediately.

The suggestion of synchronize_irq was tested to work (albeit on 2.6.18),
I'll post a patch shortly.

-- Pete
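
For readers following the thread, the ordering Alan suggests (stop the
controller, wait for any in-flight interrupt handler, and only then free
the schedule) can be illustrated with a small user-space model. This is a
sketch under assumptions, not the driver code and not the patch mentioned
above: every model_* name is hypothetical, the "schedule" is a plain
array, and the wait is a busy loop on a C11 atomic counter rather than
the kernel's synchronize_irq().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* All names below are hypothetical stand-ins, not kernel APIs. */
enum model_irqreturn { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

struct model_hc {
	atomic_int dead;        /* set once the controller has been stopped */
	atomic_int in_handler;  /* handler invocations currently in flight */
	int *schedule;          /* stands in for the schedule freed at stop */
};

static void model_init(struct model_hc *hc)
{
	atomic_init(&hc->dead, 0);
	atomic_init(&hc->in_handler, 0);
	hc->schedule = calloc(8, sizeof(*hc->schedule));
	assert(hc->schedule != NULL);
}

/* Models uhci_irq(): walks the schedule only while the controller is alive. */
static enum model_irqreturn model_irq(struct model_hc *hc)
{
	enum model_irqreturn ret = MODEL_IRQ_NONE;

	atomic_fetch_add(&hc->in_handler, 1);
	if (!atomic_load(&hc->dead)) {
		hc->schedule[0]++;	/* the kind of access that hit freed memory */
		ret = MODEL_IRQ_HANDLED;
	}
	atomic_fetch_sub(&hc->in_handler, 1);
	return ret;
}

/* Models synchronize_irq(): return only when no handler is running. */
static void model_synchronize_irq(struct model_hc *hc)
{
	while (atomic_load(&hc->in_handler) > 0)
		;	/* busy-wait; the kernel primitive waits differently */
}

/* Models the suggested uhci_stop() ordering: quiesce, drain, then free. */
static void model_stop(struct model_hc *hc)
{
	atomic_store(&hc->dead, 1);	/* stands in for uhci_hc_died() */
	model_synchronize_irq(hc);	/* the suggested extra step */
	free(hc->schedule);		/* stands in for freeing the schedule */
	hc->schedule = NULL;
}
```

In this model a handler that read dead == 0 has already bumped in_handler,
so model_stop() cannot reach free() until that handler has finished; a
handler that starts later sees dead == 1 and returns MODEL_IRQ_NONE
without touching the schedule.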