> On 12/6/23 00:58, Jakub Kicinski wrote:
> > On Wed, 6 Dec 2023 00:08:15 +0100 Lorenzo Bianconi wrote:
> > > v00 (NS:ns0 - 192.168.0.1/24) <---> (NS:ns1 - 192.168.0.2/24) v01 ==(XDP_REDIRECT)==> v10 (NS:ns1 - 192.168.1.1/24) <---> (NS:ns2 - 192.168.1.2/24) v11
> > >
> > > - v00: iperf3 client (pinned on core 0)
> > > - v11: iperf3 server (pinned on core 7)
> > >
> > > net-next veth codebase (page_pool APIs):
> > > ========================================
> > > - MTU 1500:  ~ 5.42 Gbps
> > > - MTU 8000:  ~ 14.1 Gbps
> > > - MTU 64000: ~ 18.4 Gbps
> > >
> > > net-next veth codebase + page_frag_cache APIs [0]:
> > > ==================================================
> > > - MTU 1500:  ~ 6.62 Gbps
> > > - MTU 8000:  ~ 14.7 Gbps
> > > - MTU 64000: ~ 19.7 Gbps
> > >
> > > xdp_generic codebase + page_frag_cache APIs (current proposed patch):
> > > =====================================================================
> > > - MTU 1500:  ~ 6.41 Gbps
> > > - MTU 8000:  ~ 14.2 Gbps
> > > - MTU 64000: ~ 19.8 Gbps
> > >
> > > xdp_generic codebase + page_frag_cache APIs [1]:
> > > ================================================
> >
> > This one should say page pool?

yep, sorry

> > > - MTU 1500:  ~ 5.75 Gbps
> > > - MTU 8000:  ~ 15.3 Gbps
> > > - MTU 64000: ~ 21.2 Gbps
> > >
> > > It seems the page_pool APIs work better for the xdp_generic codebase
> > > (except in the MTU 1500 case) while the page_frag_cache APIs are better
> > > for the veth driver. What do you think? Am I missing something?
> >
> > IDK the details of veth XDP very well but IIUC they are pretty much
> > the same. Are there any clues in perf -C 0 / 7?
> >
> > > [0] Here I have just used napi_alloc_frag() instead of
> > >     page_pool_dev_alloc_va()/page_pool_dev_alloc() in
> > >     veth_convert_skb_to_xdp_buff()
> > >
> > > [1] I developed this PoC to use page_pool APIs for the xdp_generic code:
> >
> > Why not put the page pool in softnet_data?
>
> First I thought: cool that Jakub is suggesting softnet_data, which would
> make page_pool (PP) even more central as the netstack's memory layer.
>
> BUT then I realized that PP has a weakness, which is the return/free
> path that needs to take a normal spin_lock, as it can be called from
> any CPU (unlike the RX/alloc case). Thus, I fear that making multiple
> devices share a page_pool via softnet_data increases the chance of lock
> contention when packets are "freed"/returned/recycled.

yep, afaik skb_attempt_defer_free() is used just by the TCP stack so far
(so e.g. for UDP we would still have that contention). Moreover, it seems
the page_pool return path is not really optimized for the per-CPU approach
(we have a lot of atomic read/write operations, and page_pool stats are
already implemented as percpu variables).

Regards,
Lorenzo

>
> --Jesper
>
> p.s. PP has the page_pool_put_page_bulk() API, but only XDP (NIC drivers)
> leverage this.
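
right, and that bulking is only wired up for the XDP frame-return side today.
The pattern NIC drivers use in their TX completion loop is roughly the sketch
below (descriptor handling elided, variable names purely illustrative):

	struct xdp_frame_bulk bq;

	xdp_frame_bulk_init(&bq);
	rcu_read_lock();	/* xdp_return_frame_bulk() must run under RCU */

	while (/* completed XDP TX descriptors left */) {
		struct xdp_frame *xdpf = /* frame stashed in the descriptor */;

		/* queues up to 16 frames per memory allocator, then flushes
		 * them with a single page_pool_put_page_bulk() call
		 */
		xdp_return_frame_bulk(xdpf, &bq);
	}

	xdp_flush_frame_bulk(&bq);	/* return whatever is still queued */
	rcu_read_unlock();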
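
For completeness, this is roughly what [0] above boils down to inside
veth_convert_skb_to_xdp_buff(). A minimal sketch only: error paths and the
frag-filling loop are elided, and the constant/field names (rq->page_pool,
VETH_XDP_HEADROOM, max_head_size) are assumed from the current net-next
veth.c:

	struct sk_buff *nskb;
	u32 size, truesize;
	void *va;

	size = min_t(u32, skb->len, max_head_size);
	truesize = SKB_HEAD_ALIGN(size) + VETH_XDP_HEADROOM;

	/* net-next allocates the new skb head from the rq page_pool:
	 *	va = page_pool_dev_alloc_va(rq->page_pool, &truesize);
	 * [0] just takes it from the per-CPU napi page_frag_cache instead:
	 */
	va = napi_alloc_frag(truesize);
	if (!va)
		goto drop;

	nskb = napi_build_skb(va, truesize);
	if (!nskb) {
		skb_free_frag(va);	/* page_pool_free_va() in the page_pool case */
		goto drop;
	}

	skb_reserve(nskb, VETH_XDP_HEADROOM);
	skb_copy_header(nskb, skb);
	/* note: no skb_mark_for_recycle() here, the head is no longer
	 * page_pool memory
	 */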
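
And this is how I read Jakub's softnet_data suggestion: a per-CPU pool
created at boot that the generic-XDP path could allocate from. Just a rough
sketch of the idea; the field name, the init function and the page_pool
parameters below are made up:

	/* include/linux/netdevice.h */
	struct softnet_data {
		...
		struct page_pool	*page_pool;	/* hypothetical field */
	};

	/* net/core/dev.c, e.g. called from net_dev_init() */
	static int __init net_page_pool_init(void)
	{
		int cpu;

		for_each_possible_cpu(cpu) {
			struct page_pool_params page_pool_params = {
				.pool_size	= 256,		/* arbitrary */
				.nid		= cpu_to_node(cpu),
				.dev		= NULL,		/* no DMA mapping */
			};
			struct softnet_data *sd = &per_cpu(softnet_data, cpu);

			sd->page_pool = page_pool_create(&page_pool_params);
			if (IS_ERR(sd->page_pool)) {
				sd->page_pool = NULL;
				return -ENOMEM;
			}
		}
		return 0;
	}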