"Samudrala, Sridhar" <sridhar.samudrala@xxxxxxxxx> writes: > On 8/15/2019 4:12 AM, Toke Høiland-Jørgensen wrote: >> Sridhar Samudrala <sridhar.samudrala@xxxxxxxxx> writes: >> >>> This patch series introduces XDP_SKIP_BPF flag that can be specified >>> during the bind() call of an AF_XDP socket to skip calling the BPF >>> program in the receive path and pass the buffer directly to the socket. >>> >>> When a single AF_XDP socket is associated with a queue and a HW >>> filter is used to redirect the packets and the app is interested in >>> receiving all the packets on that queue, we don't need an additional >>> BPF program to do further filtering or lookup/redirect to a socket. >>> >>> Here are some performance numbers collected on >>> - 2 socket 28 core Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz >>> - Intel 40Gb Ethernet NIC (i40e) >>> >>> All tests use 2 cores and the results are in Mpps. >>> >>> turbo on (default) >>> --------------------------------------------- >>> no-skip-bpf skip-bpf >>> --------------------------------------------- >>> rxdrop zerocopy 21.9 38.5 >>> l2fwd zerocopy 17.0 20.5 >>> rxdrop copy 11.1 13.3 >>> l2fwd copy 1.9 2.0 >>> >>> no turbo : echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo >>> --------------------------------------------- >>> no-skip-bpf skip-bpf >>> --------------------------------------------- >>> rxdrop zerocopy 15.4 29.0 >>> l2fwd zerocopy 11.8 18.2 >>> rxdrop copy 8.2 10.5 >>> l2fwd copy 1.7 1.7 >>> --------------------------------------------- >> >> You're getting this performance boost by adding more code in the fast >> path for every XDP program; so what's the performance impact of that for >> cases where we do run an eBPF program? > > The no-skip-bpf results are pretty close to what i see before the > patches are applied. As umem is cached in rx_ring for zerocopy the > overhead is much smaller compared to the copy scenario where i am > currently calling xdp_get_umem_from_qid(). I meant more for other XDP programs; what is the performance impact of XDP_DROP, for instance? >> Also, this is basically a special-casing of a particular deployment >> scenario. Without a way to control RX queue assignment and traffic >> steering, you're basically hard-coding a particular app's takeover of >> the network interface; I'm not sure that is such a good idea... > > Yes. This is mainly targeted for application that create 1 AF_XDP > socket per RX queue and can use a HW filter (via ethtool or TC flower) > to redirect the packets to a queue or a group of queues. Yeah, and I'd prefer it if the handling of this to be unified somehow... -Toke