Re: [PATCH] usb/core: Fix race condition when removing EHCI PCI devices

On Mon, Sep 24, 2012 at 12:59:51PM -0400, Alan Stern wrote:
> > If you want to track down what's going wrong, you'll have to add some 
> > debugging code to usb_device_read() and usb_remove_hcd().  By the time 
> > usb_device_dump() starts running, it's already too late.
> 
> After thinking about this some more, I realized that my patch still 
> leaves a race -- although the oops would occur in a different place 
> (where usb_device_read checks bus->root_hub->devnum).
> 
> Here's a different patch which should work better.  It relies on the
> rh_registered flag in the usb_hcd structure, which persists as long as
> the usb_bus structure does, rather than on anything stored in the
> root-hub device structure.

Hi Alan,

This patch appears to be successful.  Below is the response from our
customer (we backported the patch to RHEL-6/2.6.32):

"
Testing of this new patch went very well.  In previous tests, the kernel panicked
during the first or second surprise removal of the ehci_hcd PCI device on an
idle system.  I ran an entire night of surprise and polite device removals with
no kernel panic or Oops.  slub_debug=FZPU was used to poison deallocated
storage blocks and check for use after free.  No BUGs were logged by the
allocator.

Numerous messages like the following were seen at the console, indicating that
the stimulus that previously led to the panic was still occurring:
 cat: /proc/bus/usb/001/006: No such device

Surprise removals occur when the device is electrically disconnected from the
PCI bus while in use.  During a following clean-up operation, the driver's
remove function is called.

Polite removals occur when the driver's remove function is called while the
device is in use.  After return from the driver, the device is electrically
disconnected from the PCI bus.

I ran 5 hours of surprise and polite removals on the same idle system that was
used for the previous tests, about 60 trials of each type of removal.  Then a
workload was started to cause mid-level CPU, memory and disk stress; the test
continued to run for 8.5 more hours, executing about 38 more trials of each
type of PCI removal.
"

Is there anything else you need from us or can we move forward with this
patch?

Thanks again!

Cheers,
Don

> 
> 
> 
> Index: usb-3.6/drivers/usb/core/devices.c
> ===================================================================
> --- usb-3.6.orig/drivers/usb/core/devices.c
> +++ usb-3.6/drivers/usb/core/devices.c
> @@ -624,7 +624,7 @@ static ssize_t usb_device_read(struct fi
>  	/* print devices for all busses */
>  	list_for_each_entry(bus, &usb_bus_list, bus_list) {
>  		/* recurse through all children of the root hub */
> -		if (!bus->root_hub)
> +		if (!bus_to_hcd(bus)->rh_registered)
>  			continue;
>  		usb_lock_device(bus->root_hub);
>  		ret = usb_device_dump(&buf, &nbytes, &skip_bytes, ppos,
> Index: usb-3.6/drivers/usb/core/hcd.c
> ===================================================================
> --- usb-3.6.orig/drivers/usb/core/hcd.c
> +++ usb-3.6/drivers/usb/core/hcd.c
> @@ -1011,10 +1011,7 @@ static int register_root_hub(struct usb_
>  	if (retval) {
>  		dev_err (parent_dev, "can't register root hub for %s, %d\n",
>  				dev_name(&usb_dev->dev), retval);
> -	}
> -	mutex_unlock(&usb_bus_list_lock);
> -
> -	if (retval == 0) {
> +	} else {
>  		spin_lock_irq (&hcd_root_hub_lock);
>  		hcd->rh_registered = 1;
>  		spin_unlock_irq (&hcd_root_hub_lock);
> @@ -1023,6 +1020,7 @@ static int register_root_hub(struct usb_
>  		if (HCD_DEAD(hcd))
>  			usb_hc_died (hcd);	/* This time clean up */
>  	}
> +	mutex_unlock(&usb_bus_list_lock);
>  
>  	return retval;
>  }
> 
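
For anyone reading along, here is a rough userspace sketch of the idea behind
the quoted patch: the reader tests a flag that lives in the bus object itself
instead of trusting the root-hub pointer, and the flag is only changed while
the list lock is held.  This is a simplified model and not the kernel code.
The names model_bus, model_register, model_unregister and model_read are
made up for illustration, a pthread mutex stands in for usb_bus_list_lock, and
the real hcd.c also protects rh_registered with a separate spinlock that this
sketch leaves out.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for usb_bus/usb_hcd; not a kernel type. */
struct model_bus {
	bool rh_registered;	/* lives as long as the bus object does */
	int *root_hub;		/* may go away when the device is removed */
};

/* Stand-in for usb_bus_list_lock (assumption: one global lock). */
static pthread_mutex_t bus_list_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Publish the root hub and set the flag before dropping the lock,
 * so a reader never sees a half-registered bus.
 */
static void model_register(struct model_bus *bus, int *hub)
{
	pthread_mutex_lock(&bus_list_lock);
	bus->root_hub = hub;
	bus->rh_registered = true;
	pthread_mutex_unlock(&bus_list_lock);
}

/* Clear the flag and drop the pointer under the same lock. */
static void model_unregister(struct model_bus *bus)
{
	pthread_mutex_lock(&bus_list_lock);
	bus->rh_registered = false;
	bus->root_hub = NULL;
	pthread_mutex_unlock(&bus_list_lock);
}

/*
 * Reader: skip the bus unless the flag says the root hub is registered.
 * Returns the hub value, or -1 if the bus was skipped.
 */
static int model_read(struct model_bus *bus)
{
	int value = -1;

	pthread_mutex_lock(&bus_list_lock);
	if (bus->rh_registered)
		value = *bus->root_hub;
	pthread_mutex_unlock(&bus_list_lock);
	return value;
}
```

In this model a reader that runs before registration or after removal gets -1
and never dereferences root_hub, which is the behaviour the rh_registered test
in usb_device_read() is meant to give.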

