Search Linux Wireless

Re: [RFC v2] mac80211: add A-MSDU tx support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2016-02-07 11:22, Emmanuel Grumbach wrote:
> On Sun, Feb 7, 2016 at 12:06 PM, Emmanuel Grumbach <egrumbach@xxxxxxxxx> wrote:
>> On Sun, Feb 7, 2016 at 11:08 AM, Felix Fietkau <nbd@xxxxxxxxxxx> wrote:
>>> On 2016-02-07 08:25, Emmanuel Grumbach wrote:
>>>> On Fri, Feb 5, 2016 at 12:41 PM, Felix Fietkau <nbd@xxxxxxxxxxx> wrote:
>>>>> Requires software tx queueing support. frag_list support (for zero-copy)
>>>>> is optional.
>>>>>
>>>>> Signed-off-by: Felix Fietkau <nbd@xxxxxxxxxxx>
>>>>
>>>> Looks nice!
>>>> This would allow us to create aggregates of TCP Acks, the problem is
>>>> that when you are mostly receiving data, the hardware queues are
>>>> pretty much empty (nothing besides the TCP Acks which should go out
>>>> quickly) so that packets don't pile up in the software queues and
>>>> hence you don't have enough material to build an A-MSDU.
>>>> I guess that for AP oriented devices, this is ideal solution since you
>>>> can't rely on TSO (packets are not locally generated) and this allows
>>>> to build an A-MSDU without adding more latency since you build an
>>>> A-MSDU with packets that are already in the queue waiting instead of
>>>> delaying transmission of the first packet.
>>>> IIRC, the latter was the approach chose by the new Marvell driver
>>>> posted a few weeks ago. This approach is better in my eyes.
>>>> For iwlwifi which is much more station oriented (of GO which is
>>>> basically an AP with locally generated traffic), I took the TSO
>>>> approach. I guess we could try to change iwlwifi to use your tx
>>>> queues, and check how that works. This would allow us to have A-MSDU
>>>> on bridged traffic as well, although this use case is much less common
>>>> for Intel devices.
>>
>>
>>> Can the iwlwifi firmware maintain per-sta per-tid queues? Because that
>>> way you would get the most benefits from using that tx queueing
>>> infrastructure.
>>
>> Well... iwlwifi and athXk are very different. iwlwifi really has the
>> concept of queues and not flat descriptors.
>> Any Tx descriptor lives in the context of a Tx queue which is
>> per-sta-per tid when A-MPDUs is enabled.
>> We are now moving to a scheme in which we will have per-sta-per-tid
>> queue regardless of the A-MPDU state which will make much sense to tie
>> to the tx queueing infrastructure.
>> Thing is that in that case I am afraid we will not have enough packets
>> in the software tx queue to get A-MSDUs from your code. with TSO, it
>> is easier :) Still worth trying to work with this instead of TSO and
>> see how it goes. That won't happen anytime soon though.
>>
>>>
>>>>> +
>>>>> +static bool ieee80211_amsdu_aggregate(struct ieee80211_sub_if_data *sdata,
>>>>> +                                     struct sta_info *sta,
>>>>> +                                     struct ieee80211_fast_tx *fast_tx,
>>>>> +                                     struct sk_buff *skb)
>>>>> +{
>>>>> +       struct ieee80211_local *local = sdata->local;
>>>>> +       u8 tid = skb->priority & IEEE80211_QOS_CTL_TAG1D_MASK;
>>>>> +       struct ieee80211_txq *txq = sta->sta.txq[tid];
>>>>> +       struct txq_info *txqi;
>>>>> +       struct sk_buff **frag_tail, *head;
>>>>> +       int subframe_len = skb->len - ETH_ALEN;
>>>>> +       int max_amsdu_len;
>>>>> +       __be16 len;
>>>>> +       void *data;
>>>>> +       bool ret = false;
>>>>> +       int n = 1;
>>>>> +
>>>>> +       if (!ieee80211_hw_check(&local->hw, TX_AMSDU))
>>>>> +               return false;
>>>>> +
>>>>> +       if (!txq)
>>>>> +               return false;
>>>>> +
>>>>> +       txqi = to_txq_info(txq);
>>>>> +       if (test_bit(IEEE80211_TXQ_NO_AMSDU, &txqi->flags))
>>>>> +               return false;
>>>>> +
>>>>> +       /*
>>>>> +        * A-MPDU limits maximum MPDU size to 4095 bytes. Since aggregation
>>>>> +        * sessions are started/stopped without txq flush, use the limit here
>>>>> +        * to avoid having to de-aggregate later.
>>>>> +        */
>>>>> +       max_amsdu_len = min_t(int, sta->sta.max_amsdu_len, 4095);
>>>>
>>>> So you can't get 10K A-MSDUs? I don't see where you check that you
>>>> have an A-MPDU session here. You seem to be applying the 4095 limit
>>>> also for streams that are not an A-MPDU?
>>>> I guess you could check if the sta is a VHT peer, in that case, no
>>>> limit applies.
>>> The explanation for the missing A-MPDU change is in that comment -
>>> checking for an active A-MPDU session would make it unnecessarily complex.
>>> Good point about checking for VHT capabilities to remove this limit, I
>>> will add that.
> 
> Yes - I read the comment, but it seemed very sub-optimal to limit all
> the A-MSDUs to 4K. With TSO I can get up to 10K and it really helps
> TPT.
This was built with the assumption that most scenarios use A-MPDU anyway
and thus don't need really large A-MSDUs.

> One more point. In VHT, there may be a limit on the numbers of
> subframes in the A-MSDU. I don't see you handle that. Maybe I missed
> it?
I haven't looked at that much yet. Right now the driver can only specify
a limit for the number of subframes.

> And... in case the driver doesn't handle frag_list, you linearize the
> skb which is pretty much the only thing you can do at this stage. But,
> when you'll lift the 4095 bytes limit, you'll get 11K A-MSDU,
> linarizing such a long packet is really putting the memory manager
> under pressure. 
I added no-frag_list support primarily for debugging purposes, it's not
supposed to perform well.

> This is an order 4 allocation, for each A-MSDU. Note
> that iwlwifi (and probably other drivers) can handle gather DMA in Tx,
> but they have a limited number of frags they can handle. iwlwifi e.g.
> can handle up to 20 frags, but 3 are taken for "paperwork". You'll
> have 2 frags per subframe at least (assuming that each subframe's
> payload is nicely contiguous and not on a page boundary). I think that
> it may be worthwhile to ask the driver how many frags it is supposed
> to handle. I can't promise iwlwifi will use it, but I guess it will be
> useful for someone.
You mean an extra frag limit in addition to the driver subframe limit,
in case individual subframes are fragmented as well?

- Felix
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux