On 8/15/2019 4:12 AM, Toke Høiland-Jørgensen wrote:
Sridhar Samudrala <sridhar.samudrala@xxxxxxxxx> writes:
This patch series introduces XDP_SKIP_BPF flag that can be specified
during the bind() call of an AF_XDP socket to skip calling the BPF
program in the receive path and pass the buffer directly to the socket.
When a single AF_XDP socket is associated with a queue and a HW
filter is used to redirect the packets and the app is interested in
receiving all the packets on that queue, we don't need an additional
BPF program to do further filtering or lookup/redirect to a socket.
Here are some performance numbers collected on
- 2 socket 28 core Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
- Intel 40Gb Ethernet NIC (i40e)
All tests use 2 cores and the results are in Mpps.
turbo on (default)
---------------------------------------------
no-skip-bpf skip-bpf
---------------------------------------------
rxdrop zerocopy 21.9 38.5
l2fwd zerocopy 17.0 20.5
rxdrop copy 11.1 13.3
l2fwd copy 1.9 2.0
no turbo : echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
---------------------------------------------
no-skip-bpf skip-bpf
---------------------------------------------
rxdrop zerocopy 15.4 29.0
l2fwd zerocopy 11.8 18.2
rxdrop copy 8.2 10.5
l2fwd copy 1.7 1.7
---------------------------------------------
You're getting this performance boost by adding more code in the fast
path for every XDP program; so what's the performance impact of that for
cases where we do run an eBPF program?
The no-skip-bpf results are pretty close to what i see before the
patches are applied. As umem is cached in rx_ring for zerocopy the
overhead is much smaller compared to the copy scenario where i am
currently calling xdp_get_umem_from_qid().
Also, this is basically a special-casing of a particular deployment
scenario. Without a way to control RX queue assignment and traffic
steering, you're basically hard-coding a particular app's takeover of
the network interface; I'm not sure that is such a good idea...
Yes. This is mainly targeted for application that create 1 AF_XDP socket
per RX queue and can use a HW filter (via ethtool or TC flower) to
redirect the packets to a queue or a group of queues.
-Toke