Felix Fietkau <nbd@xxxxxxxx> writes:

> On 2019-03-13 23:55, Toke Høiland-Jørgensen wrote:
>> Felix Fietkau <nbd@xxxxxxxx> writes:
>>
>>> Holding the lock around the entire duration of tx scheduling can
>>> create some nasty lock contention, especially when processing airtime
>>> information from the tx status or the rx path.
>>
>> Right, I can see how that could become an issue at higher loads than
>> what I tested with :)
>
> I stumbled onto this when I was doing perf runs with mt76 (before even
> adding support for this API) and I noticed that a visible amount of lock
> contention was caused by mac80211 tx calls being blocked by mt76 tx
> scheduling.
> I wanted to fix these issues by switching to the mac80211 txq scheduling
> API, but when reading the code I noticed that using this API had the
> very same issue I was trying to fix. That's why I made this patch :)

Ah, I see. Well, I applaud you for making this fix, plus switching over
mt76, your solution :)

>>> Improve locking by only holding the active_txq_lock for lookups /
>>> scheduling list modifications.
>>
>> So the (potential) issue I see with this modification is that it
>> requires the driver to ensure that it will not interleave two different
>> scheduling rounds for the same AC. I.e., another call to
>> schedule_start() will reset the round counter, and something needs to
>> guarantee that this doesn't happen until the driver has actually
>> finished the previous round.
>>
>> I am not sure to what extent this would *actually* be a problem. For
>> ath9k, for instance, there's already the in-driver chan_lock (although
>> the call to ieee80211_txq_schedule_start() would then have to be moved
>> into the section covered by that lock). But it does introduce an
>> implicit dependency in the API, which should at least be documented.
>
> With ath9k it's also protected by the per-txq lock.

Ah, right, that was just moved, not removed entirely. Great!
> With ath10k it's protected by having scheduling calls come from the NAPI
> poll function.

Cool.

>> If memory serves, avoiding this implicit dependency was the original
>> reason I had for adding the full lock around everything. I can't think
>> of any other reason right now, but if I do think of something I'll be
>> sure to let you know :)
>
> I'll change the patch to make this more explicit and resubmit.
> Thanks for your comments.

Sounds good!

-Toke