On Fri, Oct 15, 2010 at 4:42 PM, Ben Greear <greearb@xxxxxxxxxxxxxxx> wrote:
> On 10/15/2010 04:38 PM, Luis R. Rodriguez wrote:
>>
>> On Fri, Oct 15, 2010 at 04:33:50PM -0700, Ben Greear wrote:
>>>
>>> On 10/15/2010 04:21 PM, Luis R. Rodriguez wrote:
>>>>
>>>> Ben, please give this patch a shot. It addresses three races on the PCU:
>>>>
>>>>   * When we were stopping the PCU for non-EDMA cards we never locked
>>>>     against anything starting the PCU again
>>>>
>>>>   * ath9k_hw_startpcureceive() was being called without locking
>>>>
>>>>   * Although we lock on the rxbuf lock for contention against
>>>>     starting/stopping the PCU, we also need to lock on the driver in
>>>>     the places where we start/stop the PCU, otherwise we end up in
>>>>     inconsistent states and the hardware may end up processing an
>>>>     incorrect buffer for DMA. To protect against this we use a new PCU
>>>>     lock on the main part of the driver to ensure each start/stop/reset
>>>>     operation is done atomically.
>>>>
>>>> And it fixes one issue as a side effect:
>>>>
>>>>   * No more packet loss on ping flood when you have one STA associated :)
>>>>
>>>> The only issue I see with this is that I eventually run out of memory
>>>> and my box becomes useless, unless I am mistaking that for some other
>>>> issue.
>>>>
>>>> Please give this a shot, and if it cures your woes I'll split it up
>>>> into three separate patches, or maybe just two: one for the first two
>>>> races and one for the last issue.
>>>
>>> Sounds good, but this lockdep splat happens almost immediately upon
>>> starting my app:
>>>
>>> =======================================================
>>> [ INFO: possible circular locking dependency detected ]
>>> 2.6.36-rc8-wl+ #32
>>> -------------------------------------------------------
>>> swapper/0 is trying to acquire lock:
>>>  (&(&sc->rx.pcu_lock)->rlock){+.-...}, at: [<fa16e5c7>] ath9k_tasklet+0x7e/0x140 [ath9k]
>>>
>>> but task is already holding lock:
>>>  (&(&sc->rx.rxflushlock)->rlock){+.-...}, at: [<fa16e5b9>] ath9k_tasklet+0x70/0x140 [ath9k]
>>>
>>> which lock already depends on the new lock.
>>>
>>> the existing dependency chain (in reverse order) is:
>>>
>>> -> #1 (&(&sc->rx.rxflushlock)->rlock){+.-...}:
>>>        [<c0457639>] lock_acquire+0x5a/0x78
>>>        [<c075f6ed>] _raw_spin_lock_bh+0x20/0x2f
>>>        [<fa170513>] ath_flushrecv+0x14/0x61 [ath9k]
>>
>> Ah, we just need to nuke the flush lock, one second. Also remove my
>> skb_copy(), otherwise you really will run out of memory quickly,
>> unless you really want to test with it :)
>
> Since you free the pkt, it shouldn't really consume, eh?

What we want is to use a kernel routine that gets free memory *and* does
a poison check against the memory area, so it does not matter that we
free it immediately.

> Seems like we should be able to handle an extra 5 skbs floating
> around the system..

It does consume 5 times more memory for each RX buffer, so for 130 STAs
it would be about 650 skbs instead of 130 if each RX'd an skb.

> I'll try your new patch now...

Thanks,

  Luis
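
For reference, the locking change described above boils down to roughly
the sketch below. The sketch_* names are made up and only stand in for
the real ath_softc / stop-RX / chip-reset / start-RX paths; this is the
shape of the idea, not the actual patch:

#include <linux/spinlock.h>

struct sketch_softc {                        /* hypothetical stand-in for ath_softc */
	spinlock_t pcu_lock;                 /* serializes PCU start/stop/reset     */
};

/* Hypothetical helpers; in the driver these would be the existing
 * stop-RX, hardware-reset and start-RX routines. */
static void sketch_stop_pcu_rx(struct sketch_softc *sc)  { /* stop PCU RX, drain DMA */ }
static void sketch_hw_reset(struct sketch_softc *sc)     { /* full chip reset        */ }
static void sketch_start_pcu_rx(struct sketch_softc *sc) { /* re-arm RX, start PCU   */ }

static void sketch_reset(struct sketch_softc *sc)
{
	/*
	 * The whole stop/reset/start sequence is one critical section:
	 * nothing else (the RX tasklet, another reset path) may restart
	 * the PCU while we are halfway through, otherwise the hardware
	 * can end up DMA'ing into a buffer we no longer consider valid.
	 */
	spin_lock_bh(&sc->pcu_lock);
	sketch_stop_pcu_rx(sc);
	sketch_hw_reset(sc);
	sketch_start_pcu_rx(sc);
	spin_unlock_bh(&sc->pcu_lock);
}

The _bh variant is only there because RX completion runs from a tasklet;
the important part is that stop/reset/start happens under one lock so
nothing can re-enable the PCU halfway through.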
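
As for the splat: "possible circular locking dependency" is lockdep's
AB-BA ordering check. The chain Ben pasted suggests one path takes the
PCU lock and then the flush lock (via ath_flushrecv), while the tasklet
takes the flush lock and then the PCU lock. A generic illustration of
that kind of inversion, not the exact ath9k call chain:

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(lock_a);          /* think: pcu_lock     */
static DEFINE_SPINLOCK(lock_b);          /* think: rxflushlock  */

static void path_one(void)               /* e.g. a reset path   */
{
	spin_lock_bh(&lock_a);
	spin_lock_bh(&lock_b);           /* order: A then B     */
	spin_unlock_bh(&lock_b);
	spin_unlock_bh(&lock_a);
}

static void path_two(void)               /* e.g. the RX tasklet */
{
	spin_lock_bh(&lock_b);
	spin_lock_bh(&lock_a);           /* order: B then A, lockdep flags this */
	spin_unlock_bh(&lock_a);
	spin_unlock_bh(&lock_b);
}

Dropping the flush lock entirely removes one side of that inversion,
which is why just nuking it is the simple fix here.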