Hello Ping-Ke, sorry again for the long waiting time. I'll be quicker next time. On Tue, Jun 20, 2023 at 7:26 AM Ping-Ke Shih <pkshih@xxxxxxxxxxx> wrote: [...] > > > The unit of BIT_RXDMA_AGG_PG_TH is 1k bytes, so I think you can > > > set mmc_host->max_req_size/1024. > > I tried this but I got a result that I don't understand. > > I've been testing with three BIT_RXDMA_AGG_PG_TH values on a SoC that > > can handle 255 * 1024 bytes. Each time I connected to the same AP and > > downloaded a bigger file over http(s). > > BIT_RXDMA_AGG_PG_TH: biggest observed rx_len in rtw_sdio_rxfifo_recv() > > 255: 20968 > > 6: 5122 > > 1: 1602 > > Please also print out number of packets you receive, and then we can see how > many packets aggregate. sure - here are the results: BIT_RXDMA_AGG_PG_TH: biggest observed rx_len in rtw_sdio_rxfifo_recv() / number of (aggregated) packets 255: 20824 / 12 6: 5128 / 4 1: 3132 / 1 (these were a few exceptions and I'm not able to reliably reproduce it, 1602 / 1 is what I can easily reproduce) > > The biggest rx_len I have observed for BIT_RXDMA_AGG_PG_TH 1 looks suspicious: > > My understanding is that I shouldn't be seeing rx_len larger than > > BIT_RXDMA_AGG_PG_TH * 1024. > > BIT_RXDMA_AGG_PG_TH = 6 is within this limit but BIT_RXDMA_AGG_PG_TH = > > 1 isn't (I'm seeing 578 extra bytes in addition to the 1024 bytes that > > I was expecting). > > Assume threshold is 1k, and single one packet is larger than 1k. Hardware > will not split it into two. Also, please make sure 0x280[29] BIT_EN_PRE_CALC > is 1. Otherwise, it will possibly aggregate additional one packet to over > the threshold. >From the numbers above it seems most likely that we're hitting the "one packet is larger than 1k" case. Also I'm seeing: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 My interface's MTU is 1500 bytes. Seeing 1602 bytes rx_len with one packet is already odd (that would mean 102 bytes for overhead like RX descriptor and other headers/metadata). But 3132 bytes rx_len is very odd. BIT_EN_PRE_CALC is set, see also the attached diff (it's not meant to be applied anywhere - it's just so you understand what I've been testing with). > 0x280[15:8] is timeout time in unit of 1us for SDIO interface. When set > threshold to 255, you can enlarge this to see if it can aggregate more as > expected. I did not experiment with this yet as I'd like to understand the above findings first. Best regards, Martin
diff --git a/drivers/net/wireless/realtek/rtw88/sdio.c b/drivers/net/wireless/realtek/rtw88/sdio.c index 2c1fb2dabd40..0bb872ef1296 100644 --- a/drivers/net/wireless/realtek/rtw88/sdio.c +++ b/drivers/net/wireless/realtek/rtw88/sdio.c @@ -654,23 +654,35 @@ static void rtw_sdio_init(struct rtw_dev *rtwdev) static void rtw_sdio_enable_rx_aggregation(struct rtw_dev *rtwdev) { - u8 size, timeout; + struct rtw_sdio *rtwsdio = (struct rtw_sdio *)rtwdev->priv; + struct mmc_host *host = rtwsdio->sdio_func->card->host; + unsigned int host_max_pages; + u8 chip_max_pages, timeout; if (rtw_chip_wcpu_11n(rtwdev)) { - size = 0x6; + chip_max_pages = 0x6; timeout = 0x6; } else { - size = 0xff; + chip_max_pages = 0xff; timeout = 0x1; } + host_max_pages = host->max_req_size / SZ_1K; + + rtw_err(rtwdev, "Max RX pages - chip aggregation limit: %u, host controller limit: %u\n", chip_max_pages, host_max_pages);// HACK + rtw_dbg(rtwdev, RTW_DBG_SDIO, + "Max RX pages - chip aggregation limit: %u, host controller limit: %u\n", + chip_max_pages, host_max_pages); + /* Make the firmware honor the size limit configured below */ rtw_write32_set(rtwdev, REG_RXDMA_AGG_PG_TH, BIT_EN_PRE_CALC); rtw_write8_set(rtwdev, REG_TXDMA_PQ_MAP, BIT_RXDMA_AGG_EN); rtw_write16(rtwdev, REG_RXDMA_AGG_PG_TH, - FIELD_PREP(BIT_RXDMA_AGG_PG_TH, size) | + FIELD_PREP(BIT_RXDMA_AGG_PG_TH, min_t(unsigned int, + chip_max_pages, + host_max_pages)) | FIELD_PREP(BIT_DMA_AGG_TO_V1, timeout)); rtw_write8_set(rtwdev, REG_RXDMA_MODE, BIT_DMA_MODE); @@ -936,6 +948,7 @@ static void rtw_sdio_rxfifo_recv(struct rtw_dev *rtwdev, u32 rx_len) { struct rtw_sdio *rtwsdio = (struct rtw_sdio *)rtwdev->priv; const struct rtw_chip_info *chip = rtwdev->chip; + unsigned int num_pkt = 0, orig_rx_len = rx_len; u32 pkt_desc_sz = chip->rx_pkt_desc_sz; struct ieee80211_rx_status rx_status; struct rtw_rx_pkt_stat pkt_stat; @@ -974,6 +987,7 @@ static void rtw_sdio_rxfifo_recv(struct rtw_dev *rtwdev, u32 rx_len) */ rtw_sdio_rx_skb(rtwdev, skb, pkt_offset, &pkt_stat, &rx_status); + num_pkt++; break; } @@ -981,6 +995,7 @@ static void rtw_sdio_rxfifo_recv(struct rtw_dev *rtwdev, u32 rx_len) if (!split_skb) { rtw_sdio_rx_skb(rtwdev, skb, pkt_offset, &pkt_stat, &rx_status); + num_pkt++; break; } @@ -989,11 +1004,14 @@ static void rtw_sdio_rxfifo_recv(struct rtw_dev *rtwdev, u32 rx_len) rtw_sdio_rx_skb(rtwdev, split_skb, pkt_offset, &pkt_stat, &rx_status); + num_pkt++; /* Move to the start of the next RX descriptor */ skb_reserve(skb, curr_pkt_len); rx_len -= curr_pkt_len; } + + pr_err("%s(%u) - number of packets: %u\n", __func__, orig_rx_len, num_pkt);// HACK } static void rtw_sdio_rx_isr(struct rtw_dev *rtwdev)