On Mon, Sep 24, 2012 at 12:59:51PM -0400, Alan Stern wrote: > On Mon, 17 Sep 2012, Alan Stern wrote: > > > On Mon, 17 Sep 2012, Don Zickus wrote: > > > > > I have added some in-depth analysis from our customer. > > > > > > > > > The problem is that the failing routine was called with a pointer to usbdev in > > > memory that has already been freed and overwritten with the pool poison pattern. > > > This causes an access violation on line 502: > > > > The stack trace shows that the failure occurred in usb_device_dump(), > > which was called from usb_device_read(). Since it wasn't a recursive > > call, the usbdev argument must have pointed to the root-hub device. > > > > I don't see how this could have happened if the kernel included the > > patch I sent earlier. usb_device_read() holds the usb_bus_list_lock > > mutex throughout, and it tests that the root hub's devnum field is not > > equal to 0 before calling usb_dump_device(). > > > > Meanwhile, usb_remove_hcd() sets hcd->self.bus.root_hub->devnum to 0 > > while holding usb_bus_list_lock, and it doesn't deallocate the root hub > > device until after the mutex is dropped. > > > > Are you certain that the test was conducted with the patch in place? > > > > If you want to track down what's going wrong, you'll have to add some > > debugging code to usb_device_read() and usb_remove_hcd(). By the time > > usb_device_dump() starts running, it's already too late. > > After thinking about this some more, I realized that my patch still > leaves a race -- although the oops would occur in a different place > (where usb_device_read checks bus->root_hub->devnum). > > Here's a different patch which should work better. It relies on the > rh_registered flag in the usb_hcd structure, which persists as long as > the usb_bus structure does, rather than on anything stored in the > root-hub device structure. Thanks! I will pass this on. Sorry for the lack of feedback. The customer I am coordinating with was on vacation last week. I am still trying to set up their machine on my end to help debug this faster. I appreciate you still thinking about this. Cheers, Don > > Alan Stern > > > > Index: usb-3.6/drivers/usb/core/devices.c > =================================================================== > --- usb-3.6.orig/drivers/usb/core/devices.c > +++ usb-3.6/drivers/usb/core/devices.c > @@ -624,7 +624,7 @@ static ssize_t usb_device_read(struct fi > /* print devices for all busses */ > list_for_each_entry(bus, &usb_bus_list, bus_list) { > /* recurse through all children of the root hub */ > - if (!bus->root_hub) > + if (!bus_to_hcd(bus)->rh_registered) > continue; > usb_lock_device(bus->root_hub); > ret = usb_device_dump(&buf, &nbytes, &skip_bytes, ppos, > Index: usb-3.6/drivers/usb/core/hcd.c > =================================================================== > --- usb-3.6.orig/drivers/usb/core/hcd.c > +++ usb-3.6/drivers/usb/core/hcd.c > @@ -1011,10 +1011,7 @@ static int register_root_hub(struct usb_ > if (retval) { > dev_err (parent_dev, "can't register root hub for %s, %d\n", > dev_name(&usb_dev->dev), retval); > - } > - mutex_unlock(&usb_bus_list_lock); > - > - if (retval == 0) { > + } else { > spin_lock_irq (&hcd_root_hub_lock); > hcd->rh_registered = 1; > spin_unlock_irq (&hcd_root_hub_lock); > @@ -1023,6 +1020,7 @@ static int register_root_hub(struct usb_ > if (HCD_DEAD(hcd)) > usb_hc_died (hcd); /* This time clean up */ > } > + mutex_unlock(&usb_bus_list_lock); > > return retval; > } > -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html