Hi Paolo, Niklas,

On 20/06/2024 16:23, Paolo Abeni wrote:
> On Thu, 2024-06-20 at 13:50 +0200, Niklas Söderlund wrote:
>> On 2024-06-20 13:13:21 +0200, Paolo Abeni wrote:
>>>
>>> skb allocation is preferred at receive time, so that the sk_buff itself
>>> is hot in the cache. Adapting to such style would likely require a
>>> larger refactor, so feel free to avoid it.
>>
>> This is good feedback. There are advanced features in TSN that I would
>> like to work on in the future. One of them is to improve the Rx path to
>> support split descriptors allowing for larger MTU. That too would
>> require invasive changes in this code. I will make a note of it and try
>> to do both.
>
> In the context of a largish refactor, then I suggest additional
> investigating replacing napi_gro_receive() with napi_gro_frags().
>
> The latter should provide the best performances for GRO-ed traffic.

This prompted me to try converting ravb_rx_gbeth() in the ravb driver to
use napi_get_frags()/napi_gro_frags(). The result of that change was no
improvement in TCP RX performance and a roughly 10% loss in UDP RX
performance on the RZ/G2UL, i.e. napi_gro_frags() is worse than
napi_gro_receive() in this driver.

I guess using napi_gro_frags() removes the need to copy data if you need
to add space to the first fragment for a struct skb_shared_info. For the
GbEth IP, we reserve space for the shared info structure in every
fragment buffer anyway. For <=1500 byte packets there is no benefit to
changing this, but for larger packets perhaps we would see better
efficiency if all of each 2kB fragment buffer could be used for packet
data, with space for the shared info being allocated separately via
napi_get_frags().

Some thoughts for the future I guess.

Am I missing anything here about why napi_gro_frags() should be better?

Thanks,

--
Paul Barker
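[Editor's note: for readers unfamiliar with the API being discussed, the napi_get_frags()/napi_gro_frags() receive pattern looks roughly like the sketch below. This is a hedged illustration, not the actual ravb code; the function name example_rx_poll_one and its parameters are invented for illustration, and descriptor handling, buffer refill, and error paths are omitted.]

```c
/* Illustrative sketch of the napi_get_frags()/napi_gro_frags() RX
 * pattern (hypothetical helper, not taken from the ravb driver).
 */
static int example_rx_poll_one(struct napi_struct *napi,
			       struct page *page, unsigned int offset,
			       unsigned int len, unsigned int truesize)
{
	struct sk_buff *skb;

	/* Get (or reuse) the per-NAPI skb. Its linear area is empty;
	 * the packet data will live entirely in page fragments. */
	skb = napi_get_frags(napi);
	if (!skb)
		return -ENOMEM;

	/* Attach the received buffer as a page fragment. No data is
	 * copied into the skb head, which is why the driver does not
	 * need to reserve room for struct skb_shared_info inside the
	 * packet buffer itself. */
	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page, offset,
			len, truesize);

	/* Hand the frag-based skb to GRO; the stack pulls the headers
	 * it needs from the first fragment. */
	napi_gro_frags(napi);
	return 0;
}
```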