Re: [net-next PATCH v4 7/7] net: ravb: Allocate RX buffers via page pool

Sergey Shtylyov <s.shtylyov@xxxxxx> · Mon, 3 Jun 2024 23:45:51 +0300

On 6/3/24 11:02 AM, Paul Barker wrote:
[...]
>>> This patch makes multiple changes that can't be separated:
>>>
>>>   1) Allocate plain RX buffers via a page pool instead of allocating
>>>      SKBs, then use build_skb() when a packet is received.
>>>   2) For GbEth IP, reduce the RX buffer size to 2kB.
>>>   3) For GbEth IP, merge packets which span more than one RX descriptor
>>>      as SKB fragments instead of copying data.
>>>
>>> Implementing (1) without (2) would require the use of an order-1 page
>>> pool (instead of an order-0 page pool split into page fragments) for
>>> GbEth.
>>>
>>> Implementing (2) without (3) would leave us no space to re-assemble
>>> packets which span more than one RX descriptor.
>>>
>>> Implementing (3) without (1) would not be possible as the network stack
>>> expects to use put_page() or page_pool_put_page() to free SKB fragments
>>> after an SKB is consumed.
>>>
>>> RX checksum offload support is adjusted to handle both linear and
>>> nonlinear (fragmented) packets.
>>>
>>> This patch gives the following improvements during testing with iperf3.
>>>
>>>   * RZ/G2L:
>>>     * TCP RX: same bandwidth at -43% CPU load (70% -> 40%)
>>>     * UDP RX: same bandwidth at -17% CPU load (88% -> 74%)
>>>
>>>   * RZ/G2UL:
>>>     * TCP RX: +30% bandwidth (726Mbps -> 941Mbps)
>>>     * UDP RX: +417% bandwidth (108Mbps -> 558Mbps)
>>>
>>>   * RZ/G3S:
>>>     * TCP RX: +64% bandwidth (562Mbps -> 920Mbps)
>>>     * UDP RX: +420% bandwidth (90Mbps -> 468Mbps)
>>>
>>>   * RZ/Five:
>>>     * TCP RX: +217% bandwidth (145Mbps -> 459Mbps)
>>>     * UDP RX: +470% bandwidth (20Mbps -> 114Mbps)
>>>
>>> There is no significant impact on bandwidth or CPU load in testing on
>>> RZ/G2H or R-Car M3N.
>>>
>>> Signed-off-by: Paul Barker <paul.barker.ct@xxxxxxxxxxxxxx>

[...]

>>> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
[...]
>>> @@ -298,13 +269,14 @@ static void ravb_ring_free(struct net_device *ndev, int q)
>>>  		priv->tx_ring[q] = NULL;
>>>  	}
>>>  
>>> -	/* Free RX skb ringbuffer */
>>> -	if (priv->rx_skb[q]) {
>>> -		for (i = 0; i < priv->num_rx_ring[q]; i++)
>>> -			dev_kfree_skb(priv->rx_skb[q][i]);
>>> +	/* Free RX buffers */
>>> +	for (i = 0; i < priv->num_rx_ring[q]; i++) {
>>> +		if (priv->rx_buffers[q][i].page)
>>> +			page_pool_put_page(priv->rx_pool[q], priv->rx_buffers[q][i].page, 0, true);
>>
>> nit: Networking still prefers code to be 80 columns wide or less.
>>      It looks like that can be trivially achieved here.
>>
>>      Flagged by checkpatch.pl --max-line-length=80
> 
> Sergey has asked me to wrap to 100 cols [1]. I can only find a reference
> to 80 in the docs though [2], so I guess you may be right.
> 
> [1]: https://lore.kernel.org/all/611a49b8-ecdb-6b91-9d3e-262bf3851f5b@xxxxxx/
> [2]: https://www.kernel.org/doc/html/latest/process/coding-style.html

   Note that, when asking for 100 columns, I (mostly) meant the comments...
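
   For illustration, a minimal sketch of how the flagged
page_pool_put_page() call could fit within 80 columns by caching the page
pointer in a local variable. It is a userspace mock, not driver code:
struct mock_rx_buffer, rx_buffers_free() and the stubbed
page_pool_put_page() are assumptions made up for this example, and only
the argument order of the call is taken from the hunk quoted above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types (assumptions), not the kernel definitions. */
struct page { int unused; };
struct page_pool { unsigned int pages_returned; };
struct mock_rx_buffer { struct page *page; };

/* Stub taking the same arguments as the call in the quoted hunk;
 * it only counts how many pages were handed back to the pool.
 */
static void page_pool_put_page(struct page_pool *pool, struct page *page,
			       unsigned int dma_sync_size, bool allow_direct)
{
	(void)page;
	(void)dma_sync_size;
	(void)allow_direct;
	pool->pages_returned++;
}

/* Free RX buffers: the local pointer keeps every line under 80 columns. */
static void rx_buffers_free(struct page_pool *pool,
			    struct mock_rx_buffer *rx_buffers,
			    unsigned int num_rx_ring)
{
	unsigned int i;

	for (i = 0; i < num_rx_ring; i++) {
		struct page *page = rx_buffers[i].page;

		if (page)
			page_pool_put_page(pool, page, 0, true);
	}
}
```

   Breaking the argument list after the second argument would work as
well; the local variable just avoids repeating the array lookup.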

[...]

MBR, Sergey