Search Linux Wireless

Re: [PATCH 3/6] mac80211: Add airtime accounting and scheduling to TXQs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Rajkumar Manoharan <rmanohar@xxxxxxxxxxxxxx> writes:

> On 2018-11-07 06:53, Toke Høiland-Jørgensen wrote:
>> Rajkumar Manoharan <rmanohar@xxxxxxxxxxxxxx> writes:
>> 
>>> Meanwhile we did some more experiments with both modes. The experiment
>>> was done in open environment and fixed rate and UDP traffic ran for 60
>>> seconds.
>>> 
>>> Seems like push mode not honoring the configured weight. Always the
>>> airtime was almost same whereas in pull-mode airtime is changing based
>>> on configured weight. Hence I would like to know your results.
>> 
>> Right, so I verified that the current version of the patch set still
>> works with ath9k. However, the ath10k card I have doesn't seem to
>> support peer stats, so I can't test ath10k.
>> 
>> $ lspci | grep Qualcomm
>> 03:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac
>> Wireless Network Adapter
>> 
>> $ cat /sys/kernel/debug/ieee80211/phy1/ath10k/chip_id
>> 0x043202ff
>> 
>> $ cat /sys/kernel/debug/ieee80211/phy1/ath10k/wmi_services  | grep PEER
>> WMI_SERVICE_PEER_CACHING                 -
>> WMI_SERVICE_PEER_STATS                   -
>> 
>
> Oops... Yeah 988x firmware (10.2.4) does not have peer stats support.
>
>> 
>> Is there a way to force-enable airtime support, is this a hardware 
>> issue?
>> 
> Unfortunately not. There is one more pending change that handles
> airtime report from HTT tx-compl. Again it depends firmware support.
> These experiments are taken with this f/w interface. Will post the
> change.

Right, thought so. Too bad. Guess you are doing all the ath10k testing,
then ;)

>>> 		sta1		sta2		sta3		sta4
>>> pull-mode	8s(205us)	18s(3.2ms)	8s(205us)	14s(410us)
>>> 		12s(256us)	12s(256us)	13s(256us)	12s(256us)
>>> 		14s(4ms)	13s(4ms)	14s(4ms)	13s(4ms)
>>> 
>>> push-mode	15s(205us)	12s(3.2ms)	16s(205us)	12s(410us)
>>> 		15s(256us)	12s(256us)	16s(256us)	12s(256us)
>>> 		14s(4ms)	13s(4ms)	16s(4ms)	12s(4ms)
>> 
>> Right, so the pull-mode results are encouraging! *Something* is
>> happening, at least, even though the aggregate airtime doesn't quite
>> match the ratios of the configured weights.
>> 
>> Are you running the UDP generator on the AP itself, or on a separate
>> device, BTW? If it's on the AP, the userspace socket can get throttled,
>> which will interfere with results, so it's better to have it on a
>> separate device (connected via ethernet).
>> 
> Traffic b/w wired client (via ethernet) and wireless clients.

Cool.

>> As for push-mode, could this be because of bad buffer management? I.e.,
>> because there's a lag between the time airtime is registered, and the
>> time that airtime usage is reported, the driver just pushes a whole
>> bunch of packets to the firmware when it gets the chance, which 
>> prevents
>> the scheduler from throttling properly. This could also explain the
>> better, but not quite perfect, results in pull mode, assuming that pull
>> mode results in better firmware buffer management which reduces, but
>> doesn't quite remove, the lag.
>> 
> Hmm... I agree that lag in reporting airtime may cause more data push
> to hw. Right now both tx and tx-compl are serialized by
> active_txq_lock. So once lock acquired by tx path, it will download
> all frames. i.e it is even true for ath9k driver. Hence I am wondering
> how it is working only with ath9k.

If enough packets are dequeued at once that the TXQ empties and is not
put back on the scheduling loop, the next_txq() loop is just going to
loop through the remaining TXQs and immediately increase their quantum.
In ath9k, there's a maximum of two outstanding aggregates below the TXQ
level, but I think you mentioned that ath10k can queue thousands in
firmware?

Your patch removes the 'max 16 packets at a time' check before the call
to ath10k_mac_tx_push_txq(); try adding that back and see if it helps?

>> If this is indeed the reason, the queue limit patches should hopefully
>> be a solution. So guess we need to get those working as well :)
>> 
> I would prefer to baseline the basic infra into upstream first and do
> enhancement on top of that.

Sure, I'm fine with merging this first and building on top.

> I request you to revisit maintaining per driver default. Otherwise
> there would be performance impact with 256us. :(

Hmm, how about we make it a driver-specified multiplier instead (which
could be 4, 8 or 16 for ath10k)? That way it would still work even if
userspace changes the weights. It would affect the minimum possible
granularity, but that is probably acceptable; and we would be free to
change the mechanism later, without affecting the userspace API.

-Toke




[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Wireless Regulations]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux