Re: [PATCH RFC v5 3/4] mac80211: Add airtime accounting and scheduling to TXQs

Toke Høiland-Jørgensen <toke@xxxxxxx> · Wed, 10 Oct 2018 13:15:55 +0200

Rajkumar Manoharan <rmanohar@xxxxxxxxxxxxxx> writes:

> On 2018-10-09 05:32, Toke Høiland-Jørgensen wrote:
>> This adds airtime accounting and scheduling to the mac80211 TXQ
>> scheduler. A new callback, ieee80211_sta_register_airtime(), is added
>> that drivers can call to report airtime usage for stations.
>> 
>> When airtime information is present, mac80211 will schedule TXQs
>> (through ieee80211_next_txq()) in a way that enforces airtime fairness
>> between active stations. This scheduling works the same way as the 
>> ath9k
>> in-driver airtime fairness scheduling. If no airtime usage is reported
>> by the driver, the scheduler will default to round-robin scheduling.
>> 
>> For drivers that don't control TXQ scheduling in software, a new API
>> function, ieee80211_txq_may_transmit(), is added which the driver can 
>> use
>> to check if the TXQ is eligible for transmission, or should be 
>> throttled to
>> enforce fairness. Calls to this function must also be enclosed in
>> ieee80211_txq_schedule_{start,end}() calls to ensure proper locking. 
>> TXQs
>> that are throttled by ieee802111_txq_may_transmit() will be woken up 
>> again
>> by a check added to the ieee80211_wake_txqs() tasklet.
>> 
>
> Toke,
>
> I am observing soft lockup issues again with this new series while
> running traffic with 50 clients. I am continuing testing with earlier
> series along with snippet I shared.

Are these new lockups (that was not in your patched previous version),
or did I just not get all your lock-related fixes incorporated?

> When driver operates in pull-mode, throttled txqs are marked and
> refilled in airtime_tasklet. This is causing major throughput drops
> and packet loss and I am suspecting the latency in replenishing
> deficit. Whereas in push-mode or in ath9k model, refill happens
> quicker at every packet indication as well as tx completion.

Yeah, the tasklet shouldn't be the main source of deficit replenishing.
Can see why that would give bad performance :)

> I am planning to get rid of tasklet completely as it is only meant for
> pull-mode. It would be better to refill in may_transmit() itself.

Hmm, right. So the way to do this correctly (from a fairness point of
view) would be something like this (in max_tx()):

if (this_txq.stn.deficit > 0)
  return true;

else if (any queued TXQ currently have positive deficit)
  return false; /* other TXQ should try may_tx() later and get permission */

else /* all deficits < 0 */
  return replenish_deficits(this_txq);

And replenish_deficits() would be something like:

replenish_deficits(this_txq) {
repeat:
  for (txq in queued txqs) {
    txq.stn.deficit += stn.weight;
    if (txq.stn.deficit > 0 && !wake_txq)
      wake_txq = txq;
  }
  if not wake_txq:
    goto repeat;

  if (this_txq.stn.deficit > 0)
    return true;
  else
    drv_wake_tx_queue(wake_txq);
}

The wake_tx_queue call may have to be delegated to a tasklet still, to
avoid the infinite recursion problem I mentioned earlier. But the
tasklet could be made simpler and wouldn't have to be called so often...

Does the above make sense?

-Toke