Re: Possible BUG where mac80211 fails to stop queues

On Sun, 2009-07-26 at 17:52 -0500, Larry Finger wrote:
> While stress testing the newest version of the open-source firmware
> for BCM43XX devices with the latest pull of wireless-testing, I ran
> into a problem of DMA TX queue overrun. Initially I thought this was
> due to the firmware change; however, I got the same error with the
> standard firmware. I have not seen this before, but it may not be a
> regression as it seems to occur only under special circumstances.

I've also seen it under extreme stress on Intel hardware, cf.
http://thread.gmane.org/gmane.linux.kernel.wireless.general/36497

> The critical code is in b43_dma_tx(), which is called by the .tx
> callback routine registered with mac80211.
> 
> After the fragment is transmitted by a call to dma_tx_fragment() at
> line 1353, the routine checks to see if there are sufficient free
> slots (2) to transmit another fragment using the code below:
> 
>         if ((free_slots(ring) < TX_SLOTS_PER_FRAME) ||
>             should_inject_overflow(ring)) {
>                 /* This TX ring is full. */
>                 ieee80211_stop_queue(dev->wl->hw,
>                                      skb_get_queue_mapping(skb));
>                 ring->stopped = 1;
>                 if (b43_debug(dev, B43_DBG_DMAVERBOSE)) {
>                         b43dbg(dev->wl, "Stopped TX ring %d\n",
>                                ring->index);
>                 }
>         }
> 
> 
> The problem shows up at line 1340 for the next fragment:
> 
>         B43_WARN_ON(ring->stopped);
> 
>         if (unlikely(free_slots(ring) < TX_SLOTS_PER_FRAME)) {
>                 b43warn(dev->wl, "DMA queue overflow\n");
>                 err = -ENOSPC;
>                 goto out_unlock;
>         }
> 
> The system generates the warning for ring->stopped and prints the "DMA
> queue overflow" message.

Right. Exactly the same behaviour as I'm seeing on Intel hardware.

> My understanding is that mac80211 serializes the calls for each TX
> queue, and that the TX callback should not have been entered for this
> case.
> 
> If I am not understanding the way that mac80211 works, please correct
> me. I would also appreciate any suggestions for further debugging.

I stared at the mac80211 code for a long time and concluded that it was
a race condition and couldn't really be fixed; see my analysis in the
iwlwifi patch. I'd love to be proved wrong though.
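
As an illustration only, here is a toy C model of one way such a window
could arise. It is not mac80211 or b43 code: every model_* name is
hypothetical, and the idea that two contexts can both pass the "queue
stopped?" check before either one reaches the driver is an assumption made
for the sketch; the thread does not pin down the real interleaving. Only
TX_SLOTS_PER_FRAME and the shape of the driver path follow the code quoted
above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TX_SLOTS_PER_FRAME 2	/* same name and value as in the quoted code */

/* Hypothetical, simplified stand-in for a DMA TX ring. */
struct model_ring {
	int nr_slots;
	int used_slots;
	bool stopped;		/* mirrors ring->stopped */
	bool queue_stopped;	/* what the stack sees after a stop request */
	int warnings;		/* times TX was entered while stopped */
	int overflows;		/* times "DMA queue overflow" would print */
};

static int model_free_slots(const struct model_ring *ring)
{
	return ring->nr_slots - ring->used_slots;
}

/* Driver TX path, following the structure of the quoted b43_dma_tx(). */
static int model_driver_tx(struct model_ring *ring)
{
	if (ring->stopped)
		ring->warnings++;	/* B43_WARN_ON(ring->stopped) */

	if (model_free_slots(ring) < TX_SLOTS_PER_FRAME) {
		ring->overflows++;	/* "DMA queue overflow" */
		return -ENOSPC;
	}

	ring->used_slots += TX_SLOTS_PER_FRAME;
	assert(ring->used_slots <= ring->nr_slots);

	/* No room for another frame: ask the stack to stop the queue. */
	if (model_free_slots(ring) < TX_SLOTS_PER_FRAME) {
		ring->queue_stopped = true;
		ring->stopped = true;
	}
	return 0;
}

/* Stack-side check, kept separate so the interleaving is explicit. */
static bool model_stack_check(const struct model_ring *ring)
{
	return !ring->queue_stopped;
}

/* Serialized case: each check is immediately followed by its TX call. */
static int model_serialized(struct model_ring *ring)
{
	int err = 0;

	if (model_stack_check(ring))
		err = model_driver_tx(ring);
	if (err == 0 && model_stack_check(ring))
		err = model_driver_tx(ring);
	return err;
}

/* Assumed racy case: contexts A and B both check before either transmits. */
static int model_race(struct model_ring *ring)
{
	bool a_ok = model_stack_check(ring);
	bool b_ok = model_stack_check(ring);	/* stale once A has run */
	int err = 0;

	if (a_ok)
		err = model_driver_tx(ring);
	if (err == 0 && b_ok)
		err = model_driver_tx(ring);
	return err;
}
```

With a 3-slot ring, model_serialized() never enters the driver while the
queue is stopped, whereas model_race() trips both the warning and the
overflow path on the second frame, which matches the symptoms described
above.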

Are you seeing this multiple times? I don't think you have fragmentation
on, do you? At least I didn't and still saw the problem, which seemed a
bit strange, but I really couldn't see any other way for it to happen.

johannes
