Rajkumar Manoharan <rmanohar@xxxxxxxxxxxxxx> writes:

> On 2018-10-10 04:15, Toke Høiland-Jørgensen wrote:
>> Rajkumar Manoharan <rmanohar@xxxxxxxxxxxxxx> writes:
>>
>>> On 2018-10-09 05:32, Toke Høiland-Jørgensen wrote:
>>>> This adds airtime accounting and scheduling to the mac80211 TXQ
>>>> scheduler. A new callback, ieee80211_sta_register_airtime(), is
>>>> added that drivers can call to report airtime usage for stations.
>>>>
>>>> When airtime information is present, mac80211 will schedule TXQs
>>>> (through ieee80211_next_txq()) in a way that enforces airtime
>>>> fairness between active stations. This scheduling works the same
>>>> way as the ath9k in-driver airtime fairness scheduling. If no
>>>> airtime usage is reported by the driver, the scheduler will
>>>> default to round-robin scheduling.
>>>>
>>>> For drivers that don't control TXQ scheduling in software, a new
>>>> API function, ieee80211_txq_may_transmit(), is added which the
>>>> driver can use to check if the TXQ is eligible for transmission,
>>>> or should be throttled to enforce fairness. Calls to this function
>>>> must also be enclosed in ieee80211_txq_schedule_{start,end}()
>>>> calls to ensure proper locking. TXQs that are throttled by
>>>> ieee80211_txq_may_transmit() will be woken up again by a check
>>>> added to the ieee80211_wake_txqs() tasklet.
>>>>
>>>
>>> Toke,
>>>
>>> I am observing soft lockup issues again with this new series while
>>> running traffic with 50 clients. I am continuing testing with the
>>> earlier series along with the snippet I shared.
>>
>> Are these new lockups (that were not in your patched previous
>> version), or did I just not get all your lock-related fixes
>> incorporated?
>>
>>> When the driver operates in pull-mode, throttled txqs are marked
>>> and refilled in the airtime tasklet. This is causing major
>>> throughput drops and packet loss, and I suspect the latency in
>>> replenishing the deficit. Whereas in push-mode, or in the ath9k
>>> model, the refill happens more quickly: at every packet indication
>>> as well as on tx completion.
>>
>> Yeah, the tasklet shouldn't be the main source of deficit
>> replenishing. I can see why that would give bad performance :)
>>
>>> I am planning to get rid of the tasklet completely as it is only
>>> meant for pull-mode. It would be better to refill in
>>> may_transmit() itself.
>>
>> Hmm, right. So the way to do this correctly (from a fairness point
>> of view) would be something like this (in may_tx()):
>>
>> if (this_txq.stn.deficit > 0)
>>         return true;
>> else if (any queued TXQ currently has a positive deficit)
>>         return false; /* other TXQ should try may_tx() later and
>>                        * get permission */
>> else /* all deficits <= 0 */
>>         return replenish_deficits(this_txq);
>>
>> And replenish_deficits() would be something like:
>>
>> replenish_deficits(this_txq) {
>> repeat:
>>         for (txq in queued txqs) {
>>                 txq.stn.deficit += txq.stn.weight;
>>                 if (txq.stn.deficit > 0 && !wake_txq)
>>                         wake_txq = txq;
>>         }
>>         if (!wake_txq)
>>                 goto repeat;
>>
>>         if (this_txq.stn.deficit > 0)
>>                 return true;
>>
>>         drv_wake_tx_queue(wake_txq);
>>         return false;
>> }
>>
>> The wake_tx_queue call may still have to be delegated to a tasklet,
>> to avoid the infinite recursion problem I mentioned earlier. But
>> the tasklet could be made simpler and wouldn't have to be called so
>> often...
>>
>> Does the above make sense?
>>
> Hmm... mine is a bit different. txqs are refilled only once for all
> txqs. That will give more opportunity to non-served txqs.
> drv_wake_tx_queue won't be called from may_tx, as the driver anyway
> will not push packets in pull-mode.
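Before getting to your variant: to make my sketch above a bit more
concrete, here is roughly what it would look like as self-contained,
compilable C. To be clear, the types and names below (struct sta,
struct txq, queued_txqs, may_tx(), and so on) are simplified stand-ins
mirroring the pseudocode, not the actual mac80211 structures or API:

#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the real mac80211 structures. */
struct sta {
        int deficit;    /* airtime budget; goes negative as airtime is used */
        int weight;     /* share of airtime this station should get */
};

struct txq {
        struct sta *stn;
        struct txq *next;       /* next TXQ on the schedule list */
};

/* Head of the list of currently queued (schedulable) TXQs. */
static struct txq *queued_txqs;

/* Stub; in real code the wake-up would likely be deferred (e.g. to a
 * tasklet) to avoid recursing into the driver. */
static void drv_wake_tx_queue(struct txq *txq)
{
        (void)txq;
}

static bool replenish_deficits(struct txq *this_txq)
{
        struct txq *wake_txq = NULL;
        struct txq *txq;

        /* Keep adding each station's weight to its deficit until at
         * least one queued TXQ becomes eligible. This terminates
         * because this_txq itself is on the queued list and weights
         * are positive. */
        while (!wake_txq) {
                for (txq = queued_txqs; txq; txq = txq->next) {
                        txq->stn->deficit += txq->stn->weight;
                        if (txq->stn->deficit > 0 && !wake_txq)
                                wake_txq = txq;
                }
        }

        if (this_txq->stn->deficit > 0)
                return true;

        /* Another TXQ became eligible first; wake it and throttle the
         * caller until its own deficit goes positive. */
        drv_wake_tx_queue(wake_txq);
        return false;
}

static bool may_tx(struct txq *this_txq)
{
        struct txq *txq;

        if (this_txq->stn->deficit > 0)
                return true;

        /* Some other queued TXQ still has budget; it should call
         * may_tx() later and get permission instead. */
        for (txq = queued_txqs; txq; txq = txq->next)
                if (txq->stn->deficit > 0)
                        return false;

        /* All deficits are <= 0; replenish until someone is eligible. */
        return replenish_deficits(this_txq);
}

Whether drv_wake_tx_queue() can really be called inline like this, or
has to be deferred to a tasklet, depends on the recursion issue I
mentioned above.

As for your version, where drv_wake_tx_queue() is never called from
may_tx() at all: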
So, as far as I can tell, this requires the hardware to "keep
trying"? I.e., if it just stops scheduling a TXQ after may_transmit()
returns false, there is no guarantee that that TXQ will ever get
re-awoken unless a new packet arrives for it?

-Toke