[...]

> > > > Yes, precisely.
> >
> > I distinctly remember that I tried to poke you and Eric on this approach
> > earlier, but I cannot find a link to that email.
> >
> > I would really appreciate it if you, Alex, could give the approach in
> > veth_convert_skb_to_xdp_buff() some review, as I believe there is huge
> > potential for improvements there that will lead to large performance
> > gains. (I'm sure Maryam will be eager to help re-test performance for
> > her use-cases.)
>
> Well, just looking at it, the quick and dirty answer would be to look at
> making use of something like page_frag_cache. I won't go into details
> since it isn't too different from the frag allocator, but it is much
> simpler since it is just doing reference count hacks instead of having
> to do the extra overhead to keep the DMA mapping in place. The veth
> would then just be sitting on at most an order-3 page while it is
> waiting to fully consume it rather than waiting on a full pool of
> pages.

Hi,

I did some experiments using page_frag_cache/page_frag_alloc() instead of
page_pool in a simple environment I use to test XDP for the veth driver.
In particular, I allocate a new buffer in veth_convert_skb_to_xdp_buff()
from the page_frag_cache and copy the full skb into it, actually
"linearizing" the packet (since we know the original skb length); a rough
sketch of the idea is at the bottom of this mail.

I ran an iperf TCP connection over a veth pair where the remote device
runs the xdp_rxq_info sample (available in the kernel source tree, with
action XDP_PASS):

TCP client -- v0 === v1 (xdp_rxq_info) -- TCP server

net-next (page_pool):
- MTU 1500B: ~ 7.5 Gbps
- MTU 8000B: ~ 15.3 Gbps

net-next + page_frag_alloc:
- MTU 1500B: ~ 8.4 Gbps
- MTU 8000B: ~ 14.7 Gbps

It seems there is no clear "win" here (at least in this environment and
with this simple approach). Moreover:

- can the linearization introduce any issue whenever we perform
  XDP_REDIRECT into a destination device?
- can the page_frag_cache introduce more memory fragmentation? (IIRC we
  were experiencing this issue in mt76 before switching to page_pool.)

What do you think?

Regards,
Lorenzo

> Alternatively, it could do something similar to page_frag_alloc_align()
> itself and just bypass doing a custom allocator. If it went that route,
> it could do something almost like a ring buffer and greatly improve the
> throughput, since it would be able to allocate a higher-order page and
> just copy the entire skb in, so the entire thing would be linear rather
> than having to allocate a bunch of single pages.
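For reference, a rough sketch of the approach described above (hypothetical
code, not the actual patch that was tested; the function name
veth_frag_linearize() and the headroom parameter are made up for
illustration, and error paths/recycling are simplified):

#include <linux/gfp.h>      /* page_frag_cache, page_frag_alloc() */
#include <linux/skbuff.h>   /* skb_copy_bits(), SKB_DATA_ALIGN() */

static void *veth_frag_linearize(struct page_frag_cache *nc,
				 struct sk_buff *skb,
				 unsigned int headroom)
{
	unsigned int size = SKB_DATA_ALIGN(headroom + skb->len) +
			    SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
	void *data;

	/* carve a fragment out of the cache's current (order-3) page;
	 * page_frag_alloc() only grabs a fresh page when the current
	 * one is exhausted.
	 */
	data = page_frag_alloc(nc, size, GFP_ATOMIC | __GFP_NOWARN);
	if (!data)
		return NULL;

	/* copy the whole (possibly non-linear) skb into the new buffer,
	 * so the resulting xdp_buff is fully linear.
	 */
	if (skb_copy_bits(skb, 0, data + headroom, skb->len)) {
		page_frag_free(data);
		return NULL;
	}

	return data;
}

The caller would then build the xdp_buff on top of the returned buffer
(e.g. via xdp_prepare_buff()) and free the original skb.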