Re: Spinlock recursion in usbnet + rndis-wlan

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wednesday 14 January 2009, Williams, Jeremy (Xetron) wrote:
> I am seeing a spinlock recursion bug when usbnet is 
> attempting to unlink the urbs from the device.

Maybe this will help ... all I did is refresh an older patch
so it builds again.  I've not re-tested.  I *expect* that on
a uniprocessor system it should be pretty much OK; but as I
note in the patch comments, usbcore isn't much cooperating
with this.

- Dave

=================== CUT HERE
This fixes a problem seen with one newish HCD, which doesn't have hardware
DMA queues and thus implements all urb_dequeue() methods synchronously.

The problem is that unlink_urbs() issues its unlink calls while holding the
same lock the urb completion callback uses (in this case, while the unlink
is running!) to safely move skbs to the destination queue.  Recent kernels
can provide a runtime "spinlock recursion" lock debug message.

The workaround just uses a bitflag to stop the relevant queue from growing,
instead of holding that lock ... but doesn't completely resolve the issue.

REVISIT: restore walking of the queue from back-to-front, to minimize
interference with the HCD's concurrent (but probably inoperative)
activity of processing entries from the front of the queue.

INCOMPLETE ... this approach isn't fully safe.

---
 drivers/net/usb/usbnet.c   |   42 ++++++++++++++++++++++++++++++++++++++----
 include/linux/usb/usbnet.h |    3 +++
 2 files changed, 41 insertions(+), 4 deletions(-)

--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -302,6 +302,11 @@ static void rx_submit (struct usbnet *de
 	unsigned long		lockflags;
 	size_t			size = dev->rx_urb_size;
 
+	if (test_bit (PURGE_RXQ, &dev->flags)) {
+		usb_free_urb (urb);
+		return;
+	}
+
 	if ((skb = alloc_skb (size + NET_IP_ALIGN, flags)) == NULL) {
 		if (netif_msg_rx_err (dev))
 			devdbg (dev, "no rx skb");
@@ -510,8 +515,28 @@ static int unlink_urbs (struct usbnet *d
 	unsigned long		flags;
 	struct sk_buff		*skb, *skbnext;
 	int			count = 0;
+	int			bit;
 
-	spin_lock_irqsave (&q->lock, flags);
+	/* Setting the PURGE bit prevents appending to the queue,
+	 * but urb completions can still modify it.
+	 */
+	bit = (q == &dev->rxq) ? PURGE_RXQ : PURGE_TXQ;
+	if (test_and_set_bit(bit, &dev->flags) != 0)
+		return q->qlen;
+
+	/* Blocking irqs prevents _this_ cpu from removing entries from
+	 * the queue, except when usb_unlink_urb() works synchronously.
+	 * (When the HCD has a deep DMA queue, it's asynchronous; that's
+	 * the normal case for EHCI, OHCI, and UHCI.)
+	 *
+	 * FIXME remove from back to front, so there's at most one entry
+	 * being processed by the hardware; front-to-back means some
+	 * hosts will be feeding every URB to the hardware...
+	 *
+	 * FIXME smp systems can't guarantee that this doesn't walk
+	 * through invalid pointers...
+	 */
+	local_irq_save(flags);
 	skb_queue_walk_safe(q, skb, skbnext) {
 		struct skb_data		*entry;
 		struct urb		*urb;
@@ -520,15 +545,18 @@ static int unlink_urbs (struct usbnet *d
 		entry = (struct skb_data *) skb->cb;
 		urb = entry->urb;
 
-		// during some PM-driven resume scenarios,
-		// these (async) unlinks complete immediately
+		/* these (async) unlinks _may_ complete immediately, so
+		 * we can't be holding any lock the completion handler
+		 * will be using...
+		 */
 		retval = usb_unlink_urb (urb);
 		if (retval != -EINPROGRESS && retval != 0)
 			devdbg (dev, "unlink urb err, %d", retval);
 		else
 			count++;
 	}
-	spin_unlock_irqrestore (&q->lock, flags);
+	local_irq_restore(flags);
+	clear_bit(bit, &dev->flags);
 	return count;
 }
 
@@ -930,6 +958,9 @@ static int usbnet_start_xmit (struct sk_
 	struct driver_info	*info = dev->driver_info;
 	unsigned long		flags;
 
+	if (test_bit (PURGE_TXQ, &dev->flags))
+		goto drop;
+
 	// some devices want funky USB-level framing, for
 	// win32 driver (usually) and/or hardware quirks
 	if (info->tx_fixup) {
@@ -1272,6 +1303,9 @@ int usbnet_suspend (struct usb_interface
 	struct usbnet		*dev = usb_get_intfdata(intf);
 
 	if (!dev->suspend_count++) {
+
+/* REVISIT:  status transfer and timer stay active ... bug? */
+
 		/*
 		 * accelerate emptying of the rx and queues, to avoid
 		 * having everything error out.
--- a/include/linux/usb/usbnet.h
+++ b/include/linux/usb/usbnet.h
@@ -64,6 +64,9 @@ struct usbnet {
 #		define EVENT_RX_MEMORY	2
 #		define EVENT_STS_SPLIT	3
 #		define EVENT_LINK_RESET	4
+
+#		define PURGE_RXQ	16
+#		define PURGE_TXQ	17
 };
 
 static inline struct usb_driver *driver_of(struct usb_interface *intf)
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux