Search Linux Wireless

Re: [PATCH v2 01/12] ath5k: Switch from read-and-clear to write-to-clear method when handling PISR/SISR registers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



2011/11/26 Adrian Chadd <adrian@xxxxxxxxxxx>:
> Hi,
>
> Review #2:
>
> One thing that I discovered whilst debugging some silly races in
> FreeBSD was that the update of the txq active mask wasn't done
> atomically.
>
> Ie, when you check this:
>
> +++ b/drivers/net/wireless/ath/ath5k/base.c
> @@ -1689,7 +1689,7 @@ ath5k_tasklet_tx(unsigned long data)
>       struct ath5k_hw *ah = (void *)data;
>
>       for (i = 0; i < AR5K_NUM_TX_QUEUES; i++)
> -               if (ah->txqs[i].setup && (ah->ah_txq_isr & BIT(i)))
> +               if (ah->txqs[i].setup && (ah->ah_txq_isr_txok_all & BIT(i)))
>                       ath5k_tx_processq(ah, &ah->txqs[i]);
>
> The way that the original code did it was:
>
> * update the HAL TX mask bits inside the HAL getisr routine, so it
> would just update the HAL idea of txqactive;
> * Then, ath_hal_gettxintrtxqs() would pass back to the driver the set
> of txq bits currently active, and reset those that are active (with a
> mask, so you can read only a few)
> * the ath_intr() routine would simply schedule a tasklet call when a
> TX interrupt came in;
> * .. and then the TX completion tasklet would call
> ath_hal_gettxintrtxqs() for each task bit to see if it needed
> servicing.

I'm not aware of ath_hal_gettxintrqs etc, this was the patch I wrote on this...

http://www.mail-archive.com/ath5k-devel@xxxxxxxxxxxxxxx/msg01312.html

> Now, in previous generations of hardware/kernel, this may not have
> raced - ath_intr() may have been a "fast" interrupt handler and thus
> would occur atomically on a single CPU - so you don't have any races.
> But these days, ath_intr() is just a software interrupt and can occur
> in parallel with another CPU executing the completion handler.
>
> So since there was no locking around the TXQ bit updates and clear, it
> was very possible that a very busy TX device would "miss" TX queue
> notifications, as the CPU running the TX completion would
> read-and-clear the TXQ active bits in the HAL, whilst another CPU was
> running ath_intr() which was trying to do a read/modify/write of those
> same bits.
>
> I _think_ felix fixed this in ath9k by hiding this stuff behind the
> PCU lock. I've also fixed it in FreeBSD. I think it's worth
> re-evaluating what is going on in ath5k and maybe add some similar
> locking.
>
> Oh wait. Where are you clearing these bits? It doesn't look like
> you're even clearing the ah_txq_isr bits in the most recent ath5k tree
> I have here. This means you're likely not missing events; but you're
> likely always scanning those queues. Making this code a bit pointless.
> :)

Yup we don't clean them but that's not a bug, we still only check
active queues that produced interrupts, not everything. Anyway we have
txq->lock already so it's not a big deal to implement this I'll post a
patch on top of this patchset.

> Anyway, I'd tidy this all up in a subsequent patchset. Get the
> interrupt changes in, then figure out how to properly lock the
> interrupt TXQ bit set path (the isr) and the clear path (once you've
> tested whether the TXQ needed servicing and then cleared the bit.)
>
> HTH,
>
>
> adrian
>

Thanks again for your comments ;-)


-- 
GPG ID: 0xEE878588
As you read this post global entropy rises. Have Fun ;-)
Nick
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]
  Powered by Linux