Re: debugging TXQs being empty

Ben Greear <greearb@xxxxxxxxxxxxxxx> · Thu, 5 Dec 2019 08:37:06 -0800

On 12/5/19 8:34 AM, Johannes Berg wrote:
Hi Toke, all,

I'm debugging some throughput issues and wondered if you had a hint.
This is at HE rates 2x2 80 MHz, so you'd expect ~1Gbps or a bit more,
I'm getting ~900 Mbps. Just to set the stage.

What I think is (part of) the problem is that I see in the logs that our
hardware queues become empty every once in a while.

This seems to be when/because ieee80211_tx_dequeue() returns NULL, and
we hit the
        skb = ieee80211_tx_dequeue(hw, txq);

        if (!skb) {
                if (txq->sta)
                        IWL_DEBUG_TX(mvm,
                                     "TXQ of sta %pM tid %d is now empty\n",
                                     txq->sta->addr,
                                     txq->tid);

printout, e.g.
iwlwifi 0000:00:14.3: I iwl_mvm_mac_itxq_xmit TXQ of sta 0c:9d:92:03:12:44 tid 0 is now empty

This isn't always bad, but in most cases where I see it happen, the
hardware queue is actually rather shallow at the time, say only 57
packets in one instance. Then we can basically send all the packets in
the queue in one or two aggregations (I see here an example with 57
packets in the queue: ieee80211_tx_dequeue() returns NULL, and we then
send an A-MPDU with 38 packets followed by one with 19, making the HW
queue empty).
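
To put the 57-packet figure in perspective, here is a rough
back-of-the-envelope sketch in C (not part of the original mail). It
assumes HE MCS 11 (1024-QAM, rate 5/6) with a 0.8 us guard interval,
980 data subcarriers for an 80 MHz channel, and 1500-byte packets; none
of those values are stated above, and the function names are made up
for illustration. It ignores preambles, block acks and all other MAC
overhead, so it only gives a lower bound on the drain time.

```c
#include <assert.h>

/*
 * Nominal HE PHY data rate in Mbit/s.
 * An HE OFDM symbol lasts 12.8 us plus the guard interval, so
 * (bits per symbol) / (symbol duration in us) is directly Mbit/s.
 */
static double he_phy_rate_mbps(int data_subcarriers, int bits_per_subcarrier,
                               double coding_rate, int nss, double gi_us)
{
    const double symbol_us = 12.8 + gi_us;

    assert(data_subcarriers > 0 && bits_per_subcarrier > 0);
    assert(nss > 0 && gi_us > 0.0);

    return data_subcarriers * bits_per_subcarrier * coding_rate * nss
           / symbol_us;
}

/*
 * Time in milliseconds needed to send `packets` packets of
 * `bytes_per_packet` bytes at `rate_mbps`, payload bits only.
 */
static double drain_time_ms(int packets, int bytes_per_packet,
                            double rate_mbps)
{
    assert(packets >= 0 && bytes_per_packet > 0 && rate_mbps > 0.0);

    /* rate_mbps Mbit/s equals rate_mbps * 1000 bits per millisecond */
    return packets * bytes_per_packet * 8.0 / (rate_mbps * 1000.0);
}
```

Under these assumptions, he_phy_rate_mbps(980, 10, 5.0 / 6.0, 2, 0.8)
comes out at roughly 1201 Mbit/s and drain_time_ms(57, 1500, 1201.0) at
a bit under 0.6 ms, so a queue that shallow holds well under a
millisecond of airtime.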

This is with 10 simultaneous TCP streams, so there *shouldn't* be any
issues with that. I did try lowering the pacing shift, and it had no
effect. I couldn't try with just one or two streams (actually one
stream is not enough because the AP has only GBit LAN ... so in the
ideal case wireless is faster than ethernet!!) - somehow the test hangs
then, but I'll get back to that later.

Anyhow, do you have any tips on debugging this? This is still without
AQL code. The AQM stats for the AP look fine, basically everything is 0
except for "new-flows", "tx-bytes" and "tx-packets".

One thing that does seem odd is that the new-flows counter is increasing
very rapidly - shouldn't we expect it to be like 10 new flows for 10 TCP
sessions? I see this counter increase by the thousands per second.
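
One possible reading of that counter, offered as an assumption rather
than a confirmed diagnosis: if "new-flows" is bumped every time a flow
that had gone idle receives a packet again (rather than once per TCP
connection), then queues that keep running empty would inflate it. The
toy model below is only a sketch of that idea in C; the toy_* names and
the structure are hypothetical and are not the kernel's fq code.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NUM_FLOWS 10

struct toy_tin {
    int backlog[TOY_NUM_FLOWS];  /* packets currently queued per flow */
    unsigned long new_flows;     /* empty -> non-empty transitions */
};

/* Queue one packet; count the flow as "new" if it was idle before. */
static void toy_enqueue(struct toy_tin *tin, int flow)
{
    assert(flow >= 0 && flow < TOY_NUM_FLOWS);

    if (tin->backlog[flow] == 0)
        tin->new_flows++;
    tin->backlog[flow]++;
}

/* Remove one packet; returns false if the flow was already empty. */
static bool toy_dequeue(struct toy_tin *tin, int flow)
{
    assert(flow >= 0 && flow < TOY_NUM_FLOWS);

    if (tin->backlog[flow] == 0)
        return false;
    tin->backlog[flow]--;
    return true;
}

/*
 * Each round, every flow receives `burst` packets and the driver then
 * pulls up to `drain` packets per flow. Returns the final counter.
 */
static unsigned long toy_run(int rounds, int burst, int drain)
{
    struct toy_tin tin = { .new_flows = 0 };

    for (int r = 0; r < rounds; r++) {
        for (int f = 0; f < TOY_NUM_FLOWS; f++)
            for (int i = 0; i < burst; i++)
                toy_enqueue(&tin, f);
        for (int f = 0; f < TOY_NUM_FLOWS; f++)
            for (int i = 0; i < drain; i++)
                if (!toy_dequeue(&tin, f))
                    break;
    }
    return tin.new_flows;
}
```

In this model toy_run(1000, 4, 4), where every flow is fully drained
each round, ends with the counter at 10000, while toy_run(1000, 4, 3),
where a backlog builds up, leaves it at 10. If the real counter behaves
similarly, its rapid growth would be another symptom of the queues
running dry rather than a separate problem.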

I don't see any calls to __ieee80211_stop_queue() either, as expected
(per trace-cmd).

CPU load is not an issue AFAICT, even with all the debugging being
written into the syslog (or journal or something). That logging is the
only thing that takes noticeable CPU time: ~50% for systemd-journal and
~20% for rsyslogd, <10% for the throughput testing program, and that's
about it.
The system has 4 threads and seems mostly idle.

All this seems to mean that the TCP stack isn't feeding us fast enough,
but is that really possible?

Does UDP work better?

or pktgen?

Thanks,
Ben

Any other ideas?

Thanks,
johannes

--
Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc  http://www.candelatech.com