Felix Fietkau <nbd@xxxxxxxx> writes: > On 2018-11-12 23:51, Rajkumar Manoharan wrote: >> From: Toke Høiland-Jørgensen <toke@xxxxxxx> >> >> This adds airtime accounting and scheduling to the mac80211 TXQ >> scheduler. A new callback, ieee80211_sta_register_airtime(), is added >> that drivers can call to report airtime usage for stations. >> >> When airtime information is present, mac80211 will schedule TXQs >> (through ieee80211_next_txq()) in a way that enforces airtime fairness >> between active stations. This scheduling works the same way as the ath9k >> in-driver airtime fairness scheduling. If no airtime usage is reported >> by the driver, the scheduler will default to round-robin scheduling. >> >> For drivers that don't control TXQ scheduling in software, a new API >> function, ieee80211_txq_may_transmit(), is added which the driver can use >> to check if the TXQ is eligible for transmission, or should be throttled to >> enforce fairness. Calls to this function must also be enclosed in >> ieee80211_txq_schedule_{start,end}() calls to ensure proper locking. >> >> The API ieee80211_txq_may_transmit() also ensures that TXQ list will be >> aligned aginst driver's own round-robin scheduler list. i.e it rotates >> the TXQ list till it makes the requested node becomes the first entry >> in TXQ list. Thus both the TXQ list and driver's list are in sync. >> >> Co-Developed-by: Rajkumar Manoharan <rmanohar@xxxxxxxxxxxxxx> >> Signed-off-by: Toke Høiland-Jørgensen <toke@xxxxxxx> >> Signed-off-by: Rajkumar Manoharan <rmanohar@xxxxxxxxxxxxxx> >> --- >> include/net/mac80211.h | 59 ++++++++++++++++++++++++++++++ >> net/mac80211/cfg.c | 3 ++ >> net/mac80211/debugfs.c | 3 ++ >> net/mac80211/debugfs_sta.c | 50 ++++++++++++++++++++++++-- >> net/mac80211/ieee80211_i.h | 2 ++ >> net/mac80211/main.c | 4 +++ >> net/mac80211/sta_info.c | 44 +++++++++++++++++++++-- >> net/mac80211/sta_info.h | 13 +++++++ >> net/mac80211/status.c | 6 ++++ >> net/mac80211/tx.c | 90 +++++++++++++++++++++++++++++++++++++++++++--- >> 10 files changed, 264 insertions(+), 10 deletions(-) >> >> diff --git a/net/mac80211/status.c b/net/mac80211/status.c >> index aa4afbf0abaf..a1f1256448f5 100644 >> --- a/net/mac80211/status.c >> +++ b/net/mac80211/status.c >> @@ -818,6 +818,12 @@ static void __ieee80211_tx_status(struct ieee80211_hw *hw, >> ieee80211_sta_tx_notify(sta->sdata, (void *) skb->data, >> acked, info->status.tx_time); >> >> + if (info->status.tx_time && >> + wiphy_ext_feature_isset(local->hw.wiphy, >> + NL80211_EXT_FEATURE_AIRTIME_FAIRNESS)) >> + ieee80211_sta_register_airtime(&sta->sta, tid, >> + info->status.tx_time, 0); >> + >> if (ieee80211_hw_check(&local->hw, REPORTS_TX_ACK_STATUS)) { >> if (info->flags & IEEE80211_TX_STAT_ACK) { >> if (sta->status_stats.lost_packets) > I think the same is needed in ieee80211_tx_status_ext. Right, good point. >> diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c >> index 305965283506..3f417e80e041 100644 >> --- a/net/mac80211/tx.c >> +++ b/net/mac80211/tx.c >> @@ -3660,12 +3680,74 @@ void ieee80211_return_txq(struct ieee80211_hw *hw, >> lockdep_assert_held(&local->active_txq_lock[txq->ac]); >> >> if (list_empty(&txqi->schedule_order) && >> - (!skb_queue_empty(&txqi->frags) || txqi->tin.backlog_packets)) >> - list_add_tail(&txqi->schedule_order, >> - &local->active_txqs[txq->ac]); >> + (!skb_queue_empty(&txqi->frags) || txqi->tin.backlog_packets)) { >> + /* If airtime accounting is active, always enqueue STAs at the >> + * head of the list to ensure that they only get moved to the >> + * back by the airtime DRR scheduler once they have a negative >> + * deficit. A station that already has a negative deficit will >> + * get immediately moved to the back of the list on the next >> + * call to ieee80211_next_txq(). >> + */ >> + if (txqi->txq.sta && >> + wiphy_ext_feature_isset(local->hw.wiphy, >> + NL80211_EXT_FEATURE_AIRTIME_FAIRNESS)) >> + list_add(&txqi->schedule_order, >> + &local->active_txqs[txq->ac]); >> + else >> + list_add_tail(&txqi->schedule_order, >> + &local->active_txqs[txq->ac]); >> + } >> } > This part doesn't really make much sense to me, but maybe I'm > misunderstanding how the code works. > Let's assume we have a driver like ath9k or mt76, which tries to keep a > number of aggregates in the hardware queue, and the hardware queue is > currently empty. > If the current txq entry is kept at the head of the schedule list, > wouldn't the code just pull from that one over and over again, until > enough packets are transmitted by the hardware and their tx status > processed? > It seems to me that while fairness is still preserved in the long run, > this could lead to rather bursty scheduling, which may not be > particularly latency friendly. Yes, it'll be a bit more bursty when the hardware queue is completely empty. However, when a TX completion comes back, that will adjust the deficit of that sta and cause it to be rotated on the next dequeue. This obviously relies on the fact that the lower-level hardware queue is sufficiently shallow to not add a lot of latency. But we want that to be the case anyway. In practice, it works quite well for ath9k, but not so well for ath10k because it has a large buffer in firmware. If we requeue the TXQ at the end of the list, a station that is taking up too much airtime will fail to be throttled properly, so the queue-at-head is kinda needed to ensure fairness... -Toke