On Wed, Mar 08, 2023 at 12:36:15AM +0000, Ping-Ke Shih wrote:
>
>
> > -----Original Message-----
> > From: Larry Finger <larry.finger@xxxxxxxxx> On Behalf Of Larry Finger
> > Sent: Tuesday, March 7, 2023 10:44 PM
> > To: Ping-Ke Shih <pkshih@xxxxxxxxxxx>; Sascha Hauer <s.hauer@xxxxxxxxxxxxxx>
> > Cc: linux-wireless <linux-wireless@xxxxxxxxxxxxxxx>
> > Subject: Re: Performance of rtw88_8822bu
> >
> > On 3/6/23 19:39, Ping-Ke Shih wrote:
> > >
> > >
> > >> -----Original Message-----
> > >> From: Sascha Hauer <s.hauer@xxxxxxxxxxxxxx>
> > >> Sent: Monday, March 6, 2023 9:00 PM
> > >> To: Larry Finger <Larry.Finger@xxxxxxxxxxxx>
> > >> Cc: Ping-Ke Shih <pkshih@xxxxxxxxxxx>; linux-wireless <linux-wireless@xxxxxxxxxxxxxxx>
> > >> Subject: Re: Performance of rtw88_8822bu
> > >>
> > >> On Mon, Mar 06, 2023 at 10:18:45AM +0100, Sascha Hauer wrote:
> > >>> Hi Larry,
> > >>>
> > >>> On Sat, Mar 04, 2023 at 08:52:26PM -0600, Larry Finger wrote:
> > >>>> Sascha an Ping-Ke,
> > >>>>
> > >>>> I have been testing the RTW8822BU driver found in my rtw88 GitHub repo. This
> > >>>> code matches the code found in wireless-next. I created 9 files of 5.8 GiB
> > >>>> each and used a for loop to copy them from the test computer to/from my
> > >>>> server. The wireless connection is on the 5 GHz band (channel 153) connected
> > >>>> to an ax1500 Wifi 6 router, which in turn is connected to the server via a
> > >>>> 1G ethernet cable. The connection has not crashed, but I see strange
> > >>>> behavior.
> > >>>
> > >>> What chipset are you using? Is it a RTL8822bu or some other chipset
> > >>> reported by the driver?
> > >>>
> > >>>>
> > >>>> With both TX and RX, the rate is high at 13.5 MiB/s for RX and 11.1 MiB/s
> > >>>> for TX for about 1/3 of the time, but then the driver reports "timed out to
> > >>>> flush queue 3" and the rate drops to 3-5 MiB/s for RX and 2-3 MiB/s for TX.
> > >>>> These low rates are in effect for 2/3 of the time. The 5G bands are
> > >>>> relatively unused in my house, thus I do not suspect interference.
> > >>>
> > >>> I've received a very similar report this weekend. About 3-4 messages per
> > >>> second, "timed out to flush queue 3", but driver continues to work.
> > >>> I've also seen it this morning by accident and once again while writing
> > >>> this mail. This was on a RTL8821CU.
> > >>>
> > >>> So far I have no idea what the problem might be.
> > >>
> > >> The "timed out to flush queue %d\n" message comes from
> > >> __rtw_mac_flush_prio_queue(). Here some registers are read which show
> > >> the number of reserved pages for a queue and the number of available
> > >> pages of a queue. I used the debugfs interface to observe these
> > >> registers from time to time:
> > >>
> > >> f=$(echo /sys/kernel/debug/ieee80211/phy*/rtw88/read_reg); for i in 0x230 0x234 0x238 0x23c; do echo "$i 4" > $f; cat $f; done
> > >>
> > >> This is what they show:
> > >>
> > >> reg 0x230: 0x00230040
> > >> reg 0x234: 0x00400040
> > >> reg 0x238: 0x00400040
> > >> reg 0x23c: 0x00000000
> > >>
> > >> The upper 16bit contain the number of available pages and the lower
> > >> 16bit contain the number of reserved pages (Note these are the registers
> > >> on a RTL8822CU, on other chipsets the number of available pages is
> > >> lower, like 0x10 on RTL8821CU). Register 0x230 is the interesting one
> > >> for us, it has the values for queue 3.
> > >>
> > >> What I can see is that for the other queues the number of reserved pages
> > >> usually matches the number of available pages. It happens sometimes that
> > >> the number of available pages goes down to 0x3f, but with the next
> > >> register read it goes back to 0x40. For 0x230 this is different though.
> > >> Here the number of available pages continuously decreases over time and
> > >> never goes back up.
> > >>
> > >> I don't know what this is trying to tell me. It seems that things queued
> > >> to queue RTW_DMA_MAPPING_HIGH are sometimes (always?) stuck.
> > >> Unfortunately I also don't know how the different priority queues relate
> > >> to the different USB endpoints and how these in turn go together with
> > >> the qsel settings. Maybe Ping-Ke can shed some light on this.
> > >>
> > >
> > > To quickly check if RTW_DMA_MAPPING_HIGH get stuck, changing qsel_to_ep[]
> > > to different priority queue would be helpful to identify the problem.
> > > If only this queue works not well, we may dig MAC settings. Otherwise,
> > > it may be a RF performance problem.
> > >
> > > 0x240 is another queue called public queue. If 0x230/0x234/0x238/0x23c
> > > become full, packets are queued into this queue. From view of MAC circuit,
> > > it fetches these queues in specific order (from high to low conceptually;
> > > I'm 100% sure.), and apply EDCA contention parameters for internal and
> > > external contention.
> > >
> > > I don't have much useful ideas to this problem for now.
> >
> > Ping-Ke and Sasha,
> >
> > I made a discovery this morning. I set up a transfer from my NFS server to the
> > computer over an rtw8822bu link using rsync with the --progress option. In a
> > second window, I ran Sasha's register dump in a loop using a 5 second delay
> > between readouts. A third window showed was running 'dmesg -w'.
> >
> > The transfer ran to completion on a 5.8 GiB file with all incremental speeds
> > reported as 11-12 MB/s. No timeouts on flushing the queue were logged, until I
> > opened the NetworkManager applet! At that point, I got many queue timeouts
> > logged, and the instantaneous throughput dropped to 2-3 MB/s as I reported
> > earlier. Surprisingly, there were no changes in the registers when the errors
> > happened.
> >
> > The NM applet is going to be reading the transfer rate from the device, which
> > apparently messes up the data flow to/from the device.
> >
> > As long as I do not cause the NM applet to display the connections, I get
> > nothing logged.
> >
>
> I think NM triggers scan operation when turning it on. Then, driver switches channels
> between AP and scan channels with flushing queue that causes timeout. The cause is
> still hard to transmit packets out, so TX buffer gets jammed.
>
> If you enlarge the retry count or timeout value of __rtw_mac_flush_prio_queue(),
> the timeout flushing could be disappear. Also, if we can implement
> rtwdev->hci.ops->flush_queues for USB, the flushing log can be reduced.

I don't have nm-applet available on my box, but with a
'nmcli dev wifi list --rescan yes' I run into problems quite fast. That
also happens on an otherwise idle wifi link.

Sascha

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
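
For anyone following the register readout quoted earlier in the thread, here is a minimal sketch that splits one value into its two fields. It assumes the layout described there (upper 16 bits are the available pages, lower 16 bits the reserved pages). decode_fifo_reg is a made-up helper name, not part of rtw88 or its debugfs interface, and it only does arithmetic on the offset and value passed to it.

```shell
#!/bin/sh
# Hypothetical helper: split a 32-bit page-count register value into fields.
# Assumption taken from the thread: bits 31..16 = available pages,
# bits 15..0 = reserved pages.
decode_fifo_reg() {
    # $1 = register offset (used only as a label), $2 = 32-bit value in hex
    val=$(( $2 ))
    avail=$(( (val >> 16) & 0xffff ))
    rsvd=$(( val & 0xffff ))
    printf '%s: available=0x%02x reserved=0x%02x\n' "$1" "$avail" "$rsvd"
}

# Values copied from the readout quoted in the thread.
decode_fifo_reg 0x230 0x00230040   # prints "0x230: available=0x23 reserved=0x40"
decode_fifo_reg 0x234 0x00400040
```

With the quoted values, 0x230 decodes to 0x23 available against 0x40 reserved, while 0x234 shows 0x40 for both, which matches the observation that only the queue behind 0x230 is running short of pages.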