On Mon, 15 Oct 2012, Don Zickus wrote: > Hi Alan, > > I am seeing an odd panic with uhci when a 160 cpu box panics and starts > running a kdump kernel (which is the same exact image as the boot kernel) > for our RHEL-6 (2.6.32) kernel. Now I understand 2.6.32 is not something > upstream supports. > > However, my question is what is expected to happen in the following > situtation. The attached panic below shows > > usb_hcd_pci_probe-> > usb_add_hcd-> > <snip> > /* enable irqs just before we start the controller, > * if the BIOS provides legacy PCI irqs. > */ > if (usb_hcd_is_primary_hcd(hcd) && irqnum) { > retval = usb_hcd_request_irqs(hcd, irqnum, irqflags); > if (retval) > goto err_request_irq; > } > > hcd->state = HC_STATE_RUNNING; > retval = hcd->driver->start(hcd); > if (retval < 0) { > dev_err(hcd->self.controller, "startup error %d\n", > retval); > goto err_hcd_driver_start; > } > <snip> > > That interrupts are setup and enabled before hcd->driver_start is called. > > Now in > > uhci_start-> > <snip> > uhci->term_td = uhci_alloc_td(uhci); > <snip> > > Happens. But what if an interrupt comes in before that call runs, for > example: > > uhci_irq-> > <snip> > if (status & USBSTS_RD) > usb_hcd_poll_rh_status(hcd); > else { > spin_lock(&uhci->lock); > uhci_scan_schedule(uhci); > spin_unlock(&uhci->lock); > } > <snip> > > In uhci_scan_schedule(uhci) > > uhci_scan_schedule-> > uhci_clear_next_interrupt-> > uhci->term_td->status &= ~cpu_to_hc32(uhci, TD_CTRL_IOC); > > This panics becase term_td is not allocated yet. > > Now I could be wrong about the interrupts and the uhci_start routine and > perhaps this is prevented somehow. This is why I am asking what is the > expectation for the above scenario. > I was just wondering if you had a quick thought about this or not. I suspect you're getting a shared interrupt, that is, one generated by another device using the same IRQ line and not by the UHCI controller itself. Can you check the IRQ assignments to see if that's possible? I'm reasonably sure that the controller didn't generate the bad interrupt, because it's not allowed to generate any IRQs until the USBINTR register is set to a nonzero value. That doesn't happen until start_rh() is called near the end of uhci_start(). Since uhci_irq() doesn't check to see whether the uhci->is_initialized flag has been set, a shared interrupt arriving too early could cause exactly the sort of problem you see here. I dislike adding an extra test to a hot path, but there doesn't seem to be any way around it. Some of the other HCDs may need a similar change. Does this patch fix the problem? Alan Stern Index: usb-3.6/drivers/usb/host/uhci-hcd.c =================================================================== --- usb-3.6.orig/drivers/usb/host/uhci-hcd.c +++ usb-3.6/drivers/usb/host/uhci-hcd.c @@ -447,6 +447,10 @@ static irqreturn_t uhci_irq(struct usb_h return IRQ_NONE; uhci_writew(uhci, status, USBSTS); /* Clear it */ + spin_lock(&uhci->lock); + if (!uhci->is_initialized) /* not yet configured */ + goto done; + if (status & ~(USBSTS_USBINT | USBSTS_ERROR | USBSTS_RD)) { if (status & USBSTS_HSE) dev_err(uhci_dev(uhci), "host system error, " @@ -455,7 +459,6 @@ static irqreturn_t uhci_irq(struct usb_h dev_err(uhci_dev(uhci), "host controller process " "error, something bad happened!\n"); if (status & USBSTS_HCH) { - spin_lock(&uhci->lock); if (uhci->rh_state >= UHCI_RH_RUNNING) { dev_err(uhci_dev(uhci), "host controller halted, " @@ -473,15 +476,15 @@ static irqreturn_t uhci_irq(struct usb_h * pending unlinks */ mod_timer(&hcd->rh_timer, jiffies); } - spin_unlock(&uhci->lock); } } - if (status & USBSTS_RD) + if (status & USBSTS_RD) { + spin_unlock(&uhci->lock); usb_hcd_poll_rh_status(hcd); - else { - spin_lock(&uhci->lock); + } else { uhci_scan_schedule(uhci); + done: spin_unlock(&uhci->lock); } @@ -662,9 +665,9 @@ static int uhci_start(struct usb_hcd *hc */ mb(); + spin_lock_irq(&uhci->lock); configure_hc(uhci); uhci->is_initialized = 1; - spin_lock_irq(&uhci->lock); start_rh(uhci); spin_unlock_irq(&uhci->lock); return 0; -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html