Search Linux Wireless

Re: Regarding .wake_tx_queue() model

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Maxime Bizon <mbizon@xxxxxxxxxx> writes:

> Hello,
>
> Currently switching a driver to .wake_tx_queue() model

Yay :)

> and I would appreciate some guidance over a couple of issues.
>
> My hardware exposes 1 FIFO per ac, so the current driver basically
> queue skb in the correct fifo from drv_tx(), and once a FIFO is big
> "enough" (packet count or total duration), I use
> ieee80211_stop_queue(), and the corresponding ieee80211_wait_queue()
> in tx completion.
>
> Current driver TX flow is:
>  - drv_tx() => push into FIFO
>  - drv_tx() => push into FIFO
>  - drv_tx() => push into FIFO => FIFO full => ieee80211_stop_queue()
>  - [drv_tx won't be called]
>  - tx completion event => ieee80211_wake_queue()
>  - drv_tx()
>  [...]
>
>
> 1) drv_tx() & drv_wake_tx_queue() concurrency
>
> With the .wake_tx_queue model, there are now two entry points in the
> driver, how does the stack ensure that drv_tx() is not blocked forever
> if there is concurrent traffic on the same AC ?
>
>
> for example:
>
>  - .wake_tx_queue() => ieee80211_next_txq() => ieee80211_tx_dequeue() => FIFO now full => ieee80211_stop_queue()
>  - tx completion event => ieee80211_wake_queue()
>  - .wake_tx_queue() => ieee80211_next_txq() => ieee80211_tx_dequeue() => FIFO now full => ieee80211_stop_queue()
>  - tx completion event => ieee80211_wake_queue()
>  - [...]
>
> ieee80211_wake_queue() will schedule both tx_pending_tasklet and
> wake_txqs_tasklet, but I don't think there is any guarantee in the
> call ordering.
>
> Is it the driver's duty to leave a bit of room during
> drv_wake_tx_queue() scheduling and not stop the queues from there ?

Yeah, this is basically up to the driver. I'm mostly familiar with
ath9k, and I think basically what that does is that it doesn't fill the
HW FIFO in normal operation: For data packets being pulled off
ieee80211_tx_dequeue() it'll only queue two aggregates in the hardware
at a time. This is a good thing! We want the packets to be queued on the
mac80211 TXQs not in a dumb HW FIFO causing bufferbloat!

Given that you're building aggregates in the driver, you could just do
the same thing as ath9k and likely get pretty good results, I think :)

> 2) ieee80211_stop_queue() vs drv_wake_tx_queue()
>
> Commit 21a5d4c3a45ca608477a083096cfbce76e449a0c made it so that
> ieee80211_tx_dequeue() will return nothing if hardware queue is
> stopped, but drv_wake_tx_queue() is still called for ac whose queue is
> stopped.
>
>
> so should I do this ?
>
>  - .wake_tx_queue() => ieee80211_next_txq() => ieee80211_tx_dequeue() => FIFO now full => ieee80211_stop_queue()
>  - .wake_tx_queue() => ieee80211_next_txq() => ieee80211_tx_dequeue() => NULL => return
>  - tx completion event => ieee80211_wake_queue()
>  - .wake_tx_queue() => ieee80211_next_txq() => ieee80211_tx_dequeue() => FIFO now full => ieee80211_stop_queue()
>  - [...]
>
> or this ?
>
>  - .wake_tx_queue() => ieee80211_queue_stopped() => ieee80211_next_txq() => ieee80211_tx_dequeue() => FIFO now full => ieee80211_stop_queue()
>  - .wake_tx_queue() => ieee80211_queue_stopped() => return
>
> associated commit log only mentions edge cases (channel switch, DFS),
> so I'm not sure if ieee80211_stop_queue() for txqs was intended for
> "fast path", also see 3)

I don't think ieee80211_stop_queue() is meant to be used this way at all
in the wake_tx_queue case. Rather, when you get a wake_tx_queue()
callback, you just queue as many frames as you feel like (see '2
aggregate' limit above), and then return.  Then, on a TX completion you
just call your internal driver function that tries to pull more frames
from the mac80211 TXQs.

You'll keep getting wake_tx_queue callbacks from mac80211, but there's
nothing saying you have to pull any frames on each one.

See ath_txq_schedule() for how ath9k does this :)

> 3) _ieee80211_wake_txqs() looks buggy:
>
> If the cab_queue is not stopped, this loop will unconditionally wake
> up all txqs, which I guess is not what was intended:
>
>         for (i = 0; i < local->hw.queues; i++) {
>                 if (local->queue_stop_reasons[i])
>                         continue;
>
>                         for (ac = 0; ac < n_acs; ac++) {
>                                 int ac_queue = sdata->vif.hw_queue[ac];
>
>                                 if (ac_queue == i ||
>                                     sdata->vif.cab_queue == i)
>                                         __ieee80211_wake_txqs(sdata, ac);
>                         }


(not sure about this none)

> 4) building aggregation in the driver:
>
> I'm doing aggregation on the host side rather than in the firmware,
> which will ends up with more or less the same code as ath9k, is there
> any on-going effort to move that code from the driver into the stack ?

Not aware of any on-going efforts, no. Something like this usually
happens because someone feels it would make their life easier. Say, if
they're writing a new driver and wants to re-use code :)

Looking at the ath9k code, ath_tx_form_aggr() is tied into the internal
driver buffer representations, so I'm not sure how much work it would be
to generalise and split out parts of it. It need not be a complete
"build me an aggregate" function that you move into mac80211, though,
even some utility functions to calculate padding etc might be shareable?
I guess that if you're copying code from there I guess you'll find out :)

-Toke




[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Wireless Regulations]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux