Re: [PATCH bpf-next 0/5] Add support for SKIP_BPF flag for AF_XDP sockets

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 8/15/2019 10:11 AM, Toke Høiland-Jørgensen wrote:
"Samudrala, Sridhar" <sridhar.samudrala@xxxxxxxxx> writes:

On 8/15/2019 4:12 AM, Toke Høiland-Jørgensen wrote:
Sridhar Samudrala <sridhar.samudrala@xxxxxxxxx> writes:

This patch series introduces XDP_SKIP_BPF flag that can be specified
during the bind() call of an AF_XDP socket to skip calling the BPF
program in the receive path and pass the buffer directly to the socket.

When a single AF_XDP socket is associated with a queue and a HW
filter is used to redirect the packets and the app is interested in
receiving all the packets on that queue, we don't need an additional
BPF program to do further filtering or lookup/redirect to a socket.

Here are some performance numbers collected on
    - 2 socket 28 core Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
    - Intel 40Gb Ethernet NIC (i40e)

All tests use 2 cores and the results are in Mpps.

turbo on (default)
---------------------------------------------	
                        no-skip-bpf    skip-bpf
---------------------------------------------	
rxdrop zerocopy           21.9         38.5
l2fwd  zerocopy           17.0         20.5
rxdrop copy               11.1         13.3
l2fwd  copy                1.9          2.0

no turbo :  echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
---------------------------------------------	
                        no-skip-bpf    skip-bpf
---------------------------------------------	
rxdrop zerocopy           15.4         29.0
l2fwd  zerocopy           11.8         18.2
rxdrop copy                8.2         10.5
l2fwd  copy                1.7          1.7
---------------------------------------------

You're getting this performance boost by adding more code in the fast
path for every XDP program; so what's the performance impact of that for
cases where we do run an eBPF program?

The no-skip-bpf results are pretty close to what i see before the
patches are applied. As umem is cached in rx_ring for zerocopy the
overhead is much smaller compared to the copy scenario where i am
currently calling xdp_get_umem_from_qid().

I meant more for other XDP programs; what is the performance impact of
XDP_DROP, for instance?

Will run xdp1 with and without the patches and include that data with the next revision.




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux