On 6/3/24 11:02 AM, Paul Barker wrote:
[...]
>>> This patch makes multiple changes that can't be separated:
>>>
>>>  1) Allocate plain RX buffers via a page pool instead of allocating
>>>     SKBs, then use build_skb() when a packet is received.
>>>  2) For GbEth IP, reduce the RX buffer size to 2kB.
>>>  3) For GbEth IP, merge packets which span more than one RX descriptor
>>>     as SKB fragments instead of copying data.
>>>
>>> Implementing (1) without (2) would require the use of an order-1 page
>>> pool (instead of an order-0 page pool split into page fragments) for
>>> GbEth.
>>>
>>> Implementing (2) without (3) would leave us no space to re-assemble
>>> packets which span more than one RX descriptor.
>>>
>>> Implementing (3) without (1) would not be possible as the network stack
>>> expects to use put_page() or page_pool_put_page() to free SKB fragments
>>> after an SKB is consumed.
>>>
>>> RX checksum offload support is adjusted to handle both linear and
>>> nonlinear (fragmented) packets.
>>>
>>> This patch gives the following improvements during testing with iperf3.
>>>
>>>   * RZ/G2L:
>>>     * TCP RX: same bandwidth at -43% CPU load (70% -> 40%)
>>>     * UDP RX: same bandwidth at -17% CPU load (88% -> 74%)
>>>
>>>   * RZ/G2UL:
>>>     * TCP RX: +30% bandwidth (726Mbps -> 941Mbps)
>>>     * UDP RX: +417% bandwidth (108Mbps -> 558Mbps)
>>>
>>>   * RZ/G3S:
>>>     * TCP RX: +64% bandwidth (562Mbps -> 920Mbps)
>>>     * UDP RX: +420% bandwidth (90Mbps -> 468Mbps)
>>>
>>>   * RZ/Five:
>>>     * TCP RX: +217% bandwidth (145Mbps -> 459Mbps)
>>>     * UDP RX: +470% bandwidth (20Mbps -> 114Mbps)
>>>
>>> There is no significant impact on bandwidth or CPU load in testing on
>>> RZ/G2H or R-Car M3N.
>>>
>>> Signed-off-by: Paul Barker <paul.barker.ct@xxxxxxxxxxxxxx>
[...]
>>> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
[...]
>>> @@ -298,13 +269,14 @@ static void ravb_ring_free(struct net_device *ndev, int q)
>>>  		priv->tx_ring[q] = NULL;
>>>  	}
>>>
>>> -	/* Free RX skb ringbuffer */
>>> -	if (priv->rx_skb[q]) {
>>> -		for (i = 0; i < priv->num_rx_ring[q]; i++)
>>> -			dev_kfree_skb(priv->rx_skb[q][i]);
>>> +	/* Free RX buffers */
>>> +	for (i = 0; i < priv->num_rx_ring[q]; i++) {
>>> +		if (priv->rx_buffers[q][i].page)
>>> +			page_pool_put_page(priv->rx_pool[q], priv->rx_buffers[q][i].page, 0, true);
>>
>> nit: Networking still prefers code to be 80 columns wide or less.
>>      It looks like that can be trivially achieved here.
>>
>>      Flagged by checkpatch.pl --max-line-length=80
>
> Sergey has asked me to wrap to 100 cols [1]. I can only find a reference
> to 80 in the docs though [2], so I guess you may be right.
>
> [1]: https://lore.kernel.org/all/611a49b8-ecdb-6b91-9d3e-262bf3851f5b@xxxxxx/
> [2]: https://www.kernel.org/doc/html/latest/process/coding-style.html

   Note that I (mostly) meant the comments...

[...]

MBR, Sergey
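
For illustration only, one way the flagged call could be brought under 80 columns is to take the page pointer into a local first. This is a sketch, not code from the patch: the helper name free_rx_buffers(), the local `page`, the stand-in struct definitions and the stub page_pool_put_page() are all assumptions made so that the fragment builds in userspace; only the parameter order of the stub is meant to mirror the kernel helper.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types (assumptions for this sketch). */
struct page { int unused; };
struct page_pool { unsigned int pages_returned; };
struct ravb_rx_buffer { struct page *page; };

/* Stub mirroring the kernel helper's parameter order; it only counts calls. */
static void page_pool_put_page(struct page_pool *pool, struct page *page,
			       unsigned int dma_sync_size, bool allow_direct)
{
	(void)page;
	(void)dma_sync_size;
	(void)allow_direct;
	pool->pages_returned++;
}

/* Hypothetical helper: the loop from the hunk, wrapped to 80 columns. */
static void free_rx_buffers(struct page_pool *pool,
			    struct ravb_rx_buffer *bufs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct page *page = bufs[i].page;

		/* Empty slots are skipped, as in the original hunk. */
		if (page)
			page_pool_put_page(pool, page, 0, true);
	}
}
```

In the driver itself the same effect could presumably be had without a helper, by breaking the argument list after the first parameter; the local variable is simply one option that keeps the call on a single line.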