Hi Toke, Thanks for continuing to work on this! I just did a quick reading of the code and haven't tested this yet - I might have some more comments in the next few days. On 2021-03-18 22:31, Toke Høiland-Jørgensen wrote: > This switches the airtime scheduler in mac80211 to use a virtual time-based > scheduler instead of the round-robin scheduler used before. This has a > couple of advantages: > > - No need to sync up the round-robin scheduler in firmware/hardware with > the round-robin airtime scheduler. > > - If several stations are eligible for transmission we can schedule both of > them; no need to hard-block the scheduling rotation until the head of the > queue has used up its quantum. > > - The check of whether a station is eligible for transmission becomes > simpler (in ieee80211_txq_may_transmit()). > > The drawback is that scheduling becomes slightly more expensive, as we need > to maintain an rbtree of TXQs sorted by virtual time. This means that > ieee80211_register_airtime() becomes O(logN) in the number of currently > scheduled TXQs because it can change the order of the scheduled stations. > We mitigate this overhead by only resorting when a station changes position > in the tree, and hopefully N rarely grows too big (it's only TXQs currently > backlogged, not all associated stations), so it shouldn't be too big of an > issue. > > To prevent divisions in the fast path, we maintain both station sums and > pre-computed reciprocals of the sums. This turns the fast-path operation > into a multiplication, with divisions only happening as the number of > active stations change (to re-compute the current sum of all active station > weights). To prevent this re-computation of the reciprocal from happening > too frequently, we use a time-based notion of station activity, instead of > updating the weight every time a station gets scheduled or de-scheduled. As > queues can oscillate between empty and occupied quite frequently, this can > significantly cut down on the number of re-computations. It also has the > added benefit of making the station airtime calculation independent on > whether the queue happened to have drained at the time an airtime value was > accounted. > > Co-developed-by: Yibo Zhao <yiboz@xxxxxxxxxxxxxx> > Signed-off-by: Yibo Zhao <yiboz@xxxxxxxxxxxxxx> > Signed-off-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx> > --- > Respinning this has taken way too long, but assuming anyone actually remembers > the previous version from a bit over a year ago, here's the changelog: > > Changes since v5: > Rebase on latest mac80211-next. > > Fix issue with scheduling hanging because the schedule position was not > cleared when starting a new scheduling round. > > Switch the reciprocal calculation to use u32 (split 19/13) for per-station > weights and a u64 only for the weight sum (to cut down on the number of 64-bit > operations performed) > > Introduce the notion of time-based station activity when calculating weight > sums. This also gets rid of the need for a "grace time" when catching up > stations, since we now have a direct notion of when a station has been > inactive for a while. Not sure if I'm misunderstanding the code, but this does not seem enough to me. From what I can see, you consider a station active if it has been scheduled in the last 100ms. Let's say we keep sending a single small packet to a particular sta every 90ms (thus keeping it active) for a long period of time and then suddenly start a really huge transfer. What keeps it from then taking an unreasonably large share of the airtime for as long as it takes for the virtual time to catch up? Am I missing something or should we maybe use the new notion of time-based activity *and* do a grace time catch up? - Felix