On Wed, 2023-03-08 at 00:36 +0000, Ping-Ke Shih wrote: > > > > -----Original Message----- > > From: Larry Finger <larry.finger@xxxxxxxxx> On Behalf Of Larry > > Finger > > Sent: Tuesday, March 7, 2023 10:44 PM > > To: Ping-Ke Shih <pkshih@xxxxxxxxxxx>; Sascha Hauer > > <s.hauer@xxxxxxxxxxxxxx> > > Cc: linux-wireless <linux-wireless@xxxxxxxxxxxxxxx> > > Subject: Re: Performance of rtw88_8822bu > > > > On 3/6/23 19:39, Ping-Ke Shih wrote: > > > > > > > > > > -----Original Message----- > > > > From: Sascha Hauer <s.hauer@xxxxxxxxxxxxxx> > > > > Sent: Monday, March 6, 2023 9:00 PM > > > > To: Larry Finger <Larry.Finger@xxxxxxxxxxxx> > > > > Cc: Ping-Ke Shih <pkshih@xxxxxxxxxxx>; linux-wireless > > > > <linux-wireless@xxxxxxxxxxxxxxx> > > > > Subject: Re: Performance of rtw88_8822bu > > > > > > > > On Mon, Mar 06, 2023 at 10:18:45AM +0100, Sascha Hauer wrote: > > > > > Hi Larry, > > > > > > > > > > On Sat, Mar 04, 2023 at 08:52:26PM -0600, Larry Finger wrote: > > > > > > Sascha an Ping-Ke, > > > > > > > > > > > > I have been testing the RTW8822BU driver found in my rtw88 > > > > > > GitHub repo. This > > > > > > code matches the code found in wireless-next. I created 9 > > > > > > files of 5.8 GiB > > > > > > each and used a for loop to copy them from the test > > > > > > computer to/from my > > > > > > server. The wireless connection is on the 5 GHz band > > > > > > (channel 153) connected > > > > > > to an ax1500 Wifi 6 router, which in turn is connected to > > > > > > the server via a > > > > > > 1G ethernet cable. The connection has not crashed, but I > > > > > > see strange > > > > > > behavior. > > > > > > > > > > What chipset are you using? Is it a RTL8822bu or some other > > > > > chipset > > > > > reported by the driver? > > > > > > > > > > > > > > > > > With both TX and RX, the rate is high at 13.5 MiB/s for RX > > > > > > and 11.1 MiB/s > > > > > > for TX for about 1/3 of the time, but then the driver > > > > > > reports "timed out to > > > > > > flush queue 3" and the rate drops to 3-5 MiB/s for RX and > > > > > > 2-3 MiB/s for TX. > > > > > > These low rates are in effect for 2/3 of the time. The 5G > > > > > > bands are > > > > > > relatively unused in my house, thus I do not suspect > > > > > > interference. > > > > > > > > > > I've received a very similar report this weekend. About 3-4 > > > > > messages per > > > > > second, "timed out to flush queue 3", but driver continues to > > > > > work. > > > > > I've also seen it this morning by accident and once again > > > > > while writing > > > > > this mail. This was on a RTL8821CU. > > > > > > > > > > So far I have no idea what the problem might be. > > > > > > > > The "timed out to flush queue %d\n" message comes from > > > > __rtw_mac_flush_prio_queue(). Here some registers are read > > > > which show > > > > the number of reserved pages for a queue and the number of > > > > available > > > > pages of a queue. I used the debugfs interface to observe these > > > > registers from time to time: > > > > > > > > f=$(echo /sys/kernel/debug/ieee80211/phy*/rtw88/read_reg); for > > > > i in 0x230 0x234 0x238 0x23c; do echo > > "$i > > > > 4" > $f; cat $f; done > > > > > > > > This is what they show: > > > > > > > > reg 0x230: 0x00230040 > > > > reg 0x234: 0x00400040 > > > > reg 0x238: 0x00400040 > > > > reg 0x23c: 0x00000000 > > > > > > > > The upper 16bit contain the number of available pages and the > > > > lower > > > > 16bit contain the number of reserved pages (Note these are the > > > > registers > > > > on a RTL8822CU, on other chipsets the number of available pages > > > > is > > > > lower, like 0x10 on RTL8821CU). Register 0x230 is the > > > > interesting one > > > > for us, it has the values for queue 3. > > > > > > > > What I can see is that for the other queues the number of > > > > reserved pages > > > > usually matches the number of available pages. It happens > > > > sometimes that > > > > the number of available pages goes down to 0x3f, but with the > > > > next > > > > register read it goes back to 0x40. For 0x230 this is different > > > > though. > > > > Here the number of available pages continuously decreases over > > > > time and > > > > never goes back up. > > > > > > > > I don't know what this is trying to tell me. It seems that > > > > things queued > > > > to queue RTW_DMA_MAPPING_HIGH are sometimes (always?) stuck. > > > > Unfortunately I also don't know how the different priority > > > > queues relate > > > > to the different USB endpoints and how these in turn go > > > > together with > > > > the qsel settings. Maybe Ping-Ke can shed some light on this. > > > > > > > > > > To quickly check if RTW_DMA_MAPPING_HIGH get stuck, changing > > > qsel_to_ep[] > > > to different priority queue would be helpful to identify the > > > problem. > > > If only this queue works not well, we may dig MAC settings. > > > Otherwise, > > > it may be a RF performance problem. > > > > > > 0x240 is another queue called public queue. If > > > 0x230/0x234/0x238/0x23c > > > become full, packets are queued into this queue. From view of MAC > > > circuit, > > > it fetches these queues in specific order (from high to low > > > conceptually; > > > I'm 100% sure.), and apply EDCA contention parameters for > > > internal and > > > external contention. > > > > > > I don't have much useful ideas to this problem for now. > > > > Ping-Ke and Sasha, > > > > I made a discovery this morning. I set up a transfer from my NFS > > server to the > > computer over an rtw8822bu link using rsync with the --progress > > option. In a > > second window, I ran Sasha's register dump in a loop using a 5 > > second delay > > between readouts. A third window showed was running 'dmesg -w'. > > > > The transfer ran to completion on a 5.8 GiB file with all > > incremental speeds > > reported as 11-12 MB/s. No timeouts on flushing the queue were > > logged, until I > > opened the NetworkManager applet! At that point, I got many queue > > timeouts > > logged, and the instantaneous throughput dropped to 2-3 MB/s as I > > reported > > earlier. Surprisingly, there were no changes in the registers when > > the errors > > happened. > > > > The NM applet is going to be reading the transfer rate from the > > device, which > > apparently messes up the data flow to/from the device. > > > > As long as I do not cause the NM applet to display the connections, > > I get > > nothing logged. > > > > I think NM triggers scan operation when turning it on. Then, driver > switches channels > between AP and scan channels with flushing queue that causes timeout. > The cause is > still hard to transmit packets out, so TX buffer gets jammed. Yes, (at least historically) nm-applet requests that NM perform a scan when you interact with it, on the theory that when you open the WiFi network menu you probably want to see recent scan results. Similar to MacOS's AirPort menu. Most drivers handle that OK with intermittent traffic, but it will cause disruption if for high throughput and/or latency-sensitive traffic. Dan > > If you enlarge the retry count or timeout value of > __rtw_mac_flush_prio_queue(), > the timeout flushing could be disappear. Also, if we can implement > rtwdev->hci.ops->flush_queues for USB, the flushing log can be > reduced. > > Ping-Ke >