Re: race in usb_get_from_anchor?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jul 31, 2010 at 06:24:33PM +0200, Christian Lamparter wrote:
> (No maintainer for usb/core?)

Yes, did you look in the MAINTAINERS file?  Or use
scripts/get_maintainer.pl?  It should show you the proper name...

> First a bit of background:
> 
> I'm developing a driver for the AR9170 USB 802.11n Wireless device.
> And if you are interested in this thing, the driver code + fw is available at:
> http://www.kernel.org/pub/linux/kernel/people/chr/carl9170/carl9170-1.7.4-pre/
> 
> Now, back to my problem:
> 
> The driver uses a sophisticated 3 anchor layout:
> 	1. rx_pool => a pool of prepared urbs, which can be
> 		submitted to the device anytime.

Why a pool?  Is allocating new ones somehow too slow?  If so, why?

> 	2. rx_work => storage for received urbs, which will be
> 		processed in the rx tasklet.
> 
> 	3. rx_anchor => all active rx urbs (i.e.: those which have been
> 		submitted to the device)
> 
> This setup allows the rx urb callback, the rx processing tasklet
> routine (and obviously: the initialization code) to restock
> rx_anchor, whenever they are running... 
> 
> So far so good. I tested this approach and it worked as long
> as I was testing it on a single-core machine...
> But as soon as I moved to a SMP, I got:
> 	* freezes or Oopses from usb core routines:
> 		usb_poison_anchor_urbs,
> 		usb_kill_anchored_urbs,
> 		usb_get_from_anchor
> 
> 	* CONFIG_DEBUG_LIST constantly complained
> 	  about mysterious link corruptions
> 
> 	* unexplainable rx data corruptions / loss
> 
> and a long time, I assumed it was caused by a bug in the driver, but
> I couldn't find it, so I moved to the usb core and found this:
> 
>  0.	struct urb *usb_get_from_anchor(struct usb_anchor *anchor)
>  1.	{
>  2.		struct urb *victim;
>  3.		unsigned long flags;
>  4.
>  5.		spin_lock_irqsave(&anchor->lock, flags);
>  6.		if (!list_empty(&anchor->urb_list)) {
>  7.			victim = list_entry(anchor->urb_list.next, struct urb,
>  8.								anchor_list);
>  9.				usb_get_urb(victim);
> 10.				spin_unlock_irqrestore(&anchor->lock, flags);
> 11.				usb_unanchor_urb(victim);
> 12.		} else {
> 13.			spin_unlock_irqrestore(&anchor->lock, flags);
> 14.			victim = NULL;
> 15.		}
> 16.
> 17.		return victim;
> 18.}
> 
> Take a close look at line 10. and 11. 
> 
> We are releasing the anchor->lock, unanchor the urb and return
> the "victim". But we never tested if usb_unanchor_urb actually
> removed the urb from the anchor, or if "we've lost the race
> to another thread --- see comment from usb_unanchor_urb" and
> a diffrent usb_get_from_anchor in a elevated context
> (or on a different CPU) went through the same loop and
> stole the urb (and might have put it into a different anchor already)! 
> 
> I've attached a patch which does fix the problem (for me).
> Let me know what you guys think, is this a genuine bug,
> or if I'm doing this all wrong.
> 
> Regards,
> 	Chr
> ---
> diff --git a/drivers/usb/core/urb.c b/drivers/usb/core/urb.c
> index 7c05555..93f8f4c 100644
> --- a/drivers/usb/core/urb.c
> +++ b/drivers/usb/core/urb.c
> @@ -137,6 +137,16 @@ void usb_anchor_urb(struct urb *urb, struct usb_anchor *anchor)
>  }
>  EXPORT_SYMBOL_GPL(usb_anchor_urb);
>  
> +static void __usb_unanchor_urb(struct urb *urb, struct usb_anchor *anchor)
> +{
> +	urb->anchor = NULL;
> +
> +	list_del(&urb->anchor_list);
> +	usb_put_urb(urb);
> +	if (list_empty(&anchor->urb_list))
> +		wake_up(&anchor->wait);

You are calling wake_up with a spinlock held, is that ok?

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux