Regarding .wake_tx_queue() model

Maxime Bizon <mbizon@xxxxxxxxxx> · Mon, 4 May 2020 21:39:59 +0200

Hello,

Currently switching a driver to .wake_tx_queue() model, and I would
appreciate some guidance over a couple of issues.

My hardware exposes 1 FIFO per ac, so the current driver basically
queue skb in the correct fifo from drv_tx(), and once a FIFO is big
"enough" (packet count or total duration), I use
ieee80211_stop_queue(), and the corresponding ieee80211_wait_queue()
in tx completion.

Current driver TX flow is:
 - drv_tx() => push into FIFO
 - drv_tx() => push into FIFO
 - drv_tx() => push into FIFO => FIFO full => ieee80211_stop_queue()
 - [drv_tx won't be called]
 - tx completion event => ieee80211_wake_queue()
 - drv_tx()
 [...]

1) drv_tx() & drv_wake_tx_queue() concurrency

With the .wake_tx_queue model, there are now two entry points in the
driver, how does the stack ensure that drv_tx() is not blocked forever
if there is concurrent traffic on the same AC ?

for example:

 - .wake_tx_queue() => ieee80211_next_txq() => ieee80211_tx_dequeue() => FIFO now full => ieee80211_stop_queue()
 - tx completion event => ieee80211_wake_queue()
 - .wake_tx_queue() => ieee80211_next_txq() => ieee80211_tx_dequeue() => FIFO now full => ieee80211_stop_queue()
 - tx completion event => ieee80211_wake_queue()
 - [...]

ieee80211_wake_queue() will schedule both tx_pending_tasklet and
wake_txqs_tasklet, but I don't think there is any guarantee in the
call ordering.

Is it the driver's duty to leave a bit of room during
drv_wake_tx_queue() scheduling and not stop the queues from there ?

2) ieee80211_stop_queue() vs drv_wake_tx_queue()

Commit 21a5d4c3a45ca608477a083096cfbce76e449a0c made it so that
ieee80211_tx_dequeue() will return nothing if hardware queue is
stopped, but drv_wake_tx_queue() is still called for ac whose queue is
stopped.

so should I do this ?

 - .wake_tx_queue() => ieee80211_next_txq() => ieee80211_tx_dequeue() => FIFO now full => ieee80211_stop_queue()
 - .wake_tx_queue() => ieee80211_next_txq() => ieee80211_tx_dequeue() => NULL => return
 - tx completion event => ieee80211_wake_queue()
 - .wake_tx_queue() => ieee80211_next_txq() => ieee80211_tx_dequeue() => FIFO now full => ieee80211_stop_queue()
 - [...]

or this ?

 - .wake_tx_queue() => ieee80211_queue_stopped() => ieee80211_next_txq() => ieee80211_tx_dequeue() => FIFO now full => ieee80211_stop_queue()
 - .wake_tx_queue() => ieee80211_queue_stopped() => return

associated commit log only mentions edge cases (channel switch, DFS),
so I'm not sure if ieee80211_stop_queue() for txqs was intended for
"fast path", also see 3)

3) _ieee80211_wake_txqs() looks buggy:

If the cab_queue is not stopped, this loop will unconditionally wake
up all txqs, which I guess is not what was intended:

        for (i = 0; i < local->hw.queues; i++) {
                if (local->queue_stop_reasons[i])
                        continue;

                        for (ac = 0; ac < n_acs; ac++) {
                                int ac_queue = sdata->vif.hw_queue[ac];

                                if (ac_queue == i ||
                                    sdata->vif.cab_queue == i)
                                        __ieee80211_wake_txqs(sdata, ac);
                        }

4) building aggregation in the driver:

I'm doing aggregation on the host side rather than in the firmware,
which will ends up with more or less the same code as ath9k, is there
any on-going effort to move that code from the driver into the stack ?

Thanks,

-- 
Maxime