Re: race in usb_get_from_anchor?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, 31 Jul 2010, Christian Lamparter wrote:

> (No maintainer for usb/core?)

The core is maintained by multiple people.  This mailing list is the 
correct place for bug reports concerning usbcore.

> Hello,
> 
> First a bit of background:
> 
> I'm developing a driver for the AR9170 USB 802.11n Wireless device.
> And if you are interested in this thing, the driver code + fw is available at:
> http://www.kernel.org/pub/linux/kernel/people/chr/carl9170/carl9170-1.7.4-pre/
> 
> Now, back to my problem:
> 
> The driver uses a sophisticated 3 anchor layout:
> 	1. rx_pool => a pool of prepared urbs, which can be
> 		submitted to the device anytime.
> 
> 	2. rx_work => storage for received urbs, which will be
> 		processed in the rx tasklet.
> 
> 	3. rx_anchor => all active rx urbs (i.e.: those which have been
> 		submitted to the device)
> 
> This setup allows the rx urb callback, the rx processing tasklet
> routine (and obviously: the initialization code) to restock
> rx_anchor, whenever they are running... 
> 
> So far so good. I tested this approach and it worked as long
> as I was testing it on a single-core machine...
> But as soon as I moved to a SMP, I got:
> 	* freezes or Oopses from usb core routines:
> 		usb_poison_anchor_urbs,
> 		usb_kill_anchored_urbs,
> 		usb_get_from_anchor
> 
> 	* CONFIG_DEBUG_LIST constantly complained
> 	  about mysterious link corruptions
> 
> 	* unexplainable rx data corruptions / loss
> 
> and a long time, I assumed it was caused by a bug in the driver, but
> I couldn't find it, so I moved to the usb core and found this:
> 
>  0.	struct urb *usb_get_from_anchor(struct usb_anchor *anchor)
>  1.	{
>  2.		struct urb *victim;
>  3.		unsigned long flags;
>  4.
>  5.		spin_lock_irqsave(&anchor->lock, flags);
>  6.		if (!list_empty(&anchor->urb_list)) {
>  7.			victim = list_entry(anchor->urb_list.next, struct urb,
>  8.								anchor_list);
>  9.				usb_get_urb(victim);
> 10.				spin_unlock_irqrestore(&anchor->lock, flags);
> 11.				usb_unanchor_urb(victim);
> 12.		} else {
> 13.			spin_unlock_irqrestore(&anchor->lock, flags);
> 14.			victim = NULL;
> 15.		}
> 16.
> 17.		return victim;
> 18.}
> 
> Take a close look at line 10. and 11. 
> 
> We are releasing the anchor->lock, unanchor the urb and return
> the "victim". But we never tested if usb_unanchor_urb actually
> removed the urb from the anchor, or if "we've lost the race
> to another thread --- see comment from usb_unanchor_urb" and
> a diffrent usb_get_from_anchor in a elevated context
> (or on a different CPU) went through the same loop and
> stole the urb (and might have put it into a different anchor already)! 
> 
> I've attached a patch which does fix the problem (for me).
> Let me know what you guys think, is this a genuine bug,
> or if I'm doing this all wrong.

Yes, this is definitely a bug.  Your solution looks basically okay.  A
few comments are below.

> Regards,
> 	Chr
> ---
> diff --git a/drivers/usb/core/urb.c b/drivers/usb/core/urb.c
> index 7c05555..93f8f4c 100644
> --- a/drivers/usb/core/urb.c
> +++ b/drivers/usb/core/urb.c
> @@ -137,6 +137,16 @@ void usb_anchor_urb(struct urb *urb, struct usb_anchor *anchor)
>  }
>  EXPORT_SYMBOL_GPL(usb_anchor_urb);
>  
> +static void __usb_unanchor_urb(struct urb *urb, struct usb_anchor *anchor)
> +{
> +	urb->anchor = NULL;
> +

Unnecessary blank line.

> +	list_del(&urb->anchor_list);
> +	usb_put_urb(urb);
> +	if (list_empty(&anchor->urb_list))
> +		wake_up(&anchor->wait);
> +}

Although it may be slightly less efficient to call usb_put_urb() and
wake_up() while still holding the spinlock, I think the difference
won't matter.

> +
>  /**
>   * usb_unanchor_urb - unanchors an URB
>   * @urb: pointer to the urb to anchor
> @@ -161,12 +171,9 @@ void usb_unanchor_urb(struct urb *urb)
>  		spin_unlock_irqrestore(&anchor->lock, flags);
>  		return;
>  	}
> -	urb->anchor = NULL;
> -	list_del(&urb->anchor_list);
> +
> +	__usb_unanchor_urb(urb, anchor);
>  	spin_unlock_irqrestore(&anchor->lock, flags);
> -	usb_put_urb(urb);
> -	if (list_empty(&anchor->urb_list))
> -		wake_up(&anchor->wait);
>  }
>  EXPORT_SYMBOL_GPL(usb_unanchor_urb);

Here you can rearrange the logic so that it makes more sense:

	...
	spin_lock_irqsave(&anchor->lock, flags)
	if (likely(anchor == urb->anchor))
		__usb_unanchor_urb(urb);
	spin_unlock_irqrestore(&anchor->lock, flags);
}


> @@ -799,8 +806,8 @@ struct urb *usb_get_from_anchor(struct usb_anchor *anchor)
>  		victim = list_entry(anchor->urb_list.next, struct urb,
>  				    anchor_list);
>  		usb_get_urb(victim);
> +		__usb_unanchor_urb(victim, anchor);
>  		spin_unlock_irqrestore(&anchor->lock, flags);
> -		usb_unanchor_urb(victim);
>  	} else {
>  		spin_unlock_irqrestore(&anchor->lock, flags);
>  		victim = NULL;

Here you can hoist the spin_unlock_irqrestore calls out of the "if" 
statement.

> @@ -827,9 +834,9 @@ void usb_scuttle_anchored_urbs(struct usb_anchor *anchor)
>  		victim = list_entry(anchor->urb_list.prev, struct urb,
>  				    anchor_list);
>  		usb_get_urb(victim);
> +		__usb_unanchor_urb(victim, anchor);
>  		spin_unlock_irqrestore(&anchor->lock, flags);
>  		/* this may free the URB */
> -		usb_unanchor_urb(victim);
>  		usb_put_urb(victim);
>  		spin_lock_irqsave(&anchor->lock, flags);
>  	}

This routine no longer needs the usb_{get,put}_urb calls or the 
internal spinlock manipulations.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux