On 8/15/2019 5:51 AM, Björn Töpel wrote:
On 2019-08-15 05:46, Sridhar Samudrala wrote:
This patch series introduces an XDP_SKIP_BPF flag that can be specified
during the bind() call of an AF_XDP socket to skip calling the BPF
program in the receive path and pass the buffer directly to the socket.
When a single AF_XDP socket is associated with a queue and a HW
filter is used to redirect the packets and the app is interested in
receiving all the packets on that queue, we don't need an additional
BPF program to do further filtering or lookup/redirect to a socket.
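As a rough illustration of how an application might request this, the sketch below fills in the struct sockaddr_xdp that is handed to bind(). The struct and the XDP_ZEROCOPY/XDP_COPY flags come from <linux/if_xdp.h>. The numeric value of XDP_SKIP_BPF is not given in this mail, so XDP_SKIP_BPF_PLACEHOLDER is an assumed stand-in, and make_xsk_addr() is a hypothetical helper that is not part of the series.

```c
#include <string.h>
#include <sys/socket.h>
#include <linux/if_xdp.h>

#ifndef AF_XDP
#define AF_XDP 44 /* address family number, for libc headers that predate AF_XDP */
#endif

/*
 * ASSUMPTION: placeholder bit only. The real XDP_SKIP_BPF value would come
 * from the patched include/uapi/linux/if_xdp.h and is not stated in this mail.
 */
#define XDP_SKIP_BPF_PLACEHOLDER (1u << 15)

/* Hypothetical helper: build the bind() address for one netdev queue. */
struct sockaddr_xdp make_xsk_addr(unsigned int ifindex, unsigned int queue_id,
                                  int zerocopy, int skip_bpf)
{
    struct sockaddr_xdp sxdp;

    memset(&sxdp, 0, sizeof(sxdp));
    sxdp.sxdp_family = AF_XDP;
    sxdp.sxdp_ifindex = ifindex;
    sxdp.sxdp_queue_id = queue_id;
    /* Pick zerocopy or copy mode explicitly. */
    sxdp.sxdp_flags = zerocopy ? XDP_ZEROCOPY : XDP_COPY;
    /* Ask the kernel to bypass the BPF program for this queue. */
    if (skip_bpf)
        sxdp.sxdp_flags |= XDP_SKIP_BPF_PLACEHOLDER;
    return sxdp;
}
```

The result would be passed as bind(fd, (struct sockaddr *)&sxdp, sizeof(sxdp)) on a socket created with socket(AF_XDP, SOCK_RAW, 0), once the UMEM and rings are registered. A kernel without the series would be expected to reject the unknown flag bit.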
Here are some performance numbers collected on:
- 2 socket 28 core Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
- Intel 40Gb Ethernet NIC (i40e)
All tests use 2 cores and the results are in Mpps.
turbo on (default)
---------------------------------------------
                    no-skip-bpf    skip-bpf
---------------------------------------------
rxdrop zerocopy         21.9         38.5
l2fwd  zerocopy         17.0         20.5
rxdrop copy             11.1         13.3
l2fwd  copy              1.9          2.0
---------------------------------------------

no turbo : echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
---------------------------------------------
                    no-skip-bpf    skip-bpf
---------------------------------------------
rxdrop zerocopy         15.4         29.0
l2fwd  zerocopy         11.8         18.2
rxdrop copy              8.2         10.5
l2fwd  copy              1.7          1.7
---------------------------------------------
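To make the relative gains easier to read off, the following sketch computes the percentage improvement of skip-bpf over no-skip-bpf for each row. The figures are copied from the tables above. The struct and function names (result, gain_percent, print_gains) are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* One table row: throughput in Mpps without and with skip-bpf. */
struct result {
    const char *test;
    double no_skip_bpf;
    double skip_bpf;
};

/* Figures copied from the "turbo on" table. */
const struct result turbo_on[] = {
    { "rxdrop zerocopy", 21.9, 38.5 },
    { "l2fwd zerocopy",  17.0, 20.5 },
    { "rxdrop copy",     11.1, 13.3 },
    { "l2fwd copy",       1.9,  2.0 },
};

/* Figures copied from the "no turbo" table. */
const struct result no_turbo[] = {
    { "rxdrop zerocopy", 15.4, 29.0 },
    { "l2fwd zerocopy",  11.8, 18.2 },
    { "rxdrop copy",      8.2, 10.5 },
    { "l2fwd copy",       1.7,  1.7 },
};

/* Relative improvement of skip over base, in percent. */
double gain_percent(double base, double skip)
{
    assert(base > 0.0);
    return (skip - base) / base * 100.0;
}

/* Print one line per row with the computed gain. */
void print_gains(const struct result *rows, int n)
{
    for (int i = 0; i < n; i++)
        printf("%-16s %5.1f -> %5.1f Mpps (%+.1f%%)\n",
               rows[i].test, rows[i].no_skip_bpf, rows[i].skip_bpf,
               gain_percent(rows[i].no_skip_bpf, rows[i].skip_bpf));
}
```

Calling print_gains(turbo_on, 4) and print_gains(no_turbo, 4) prints one line per test. The largest gains are in the rxdrop zerocopy rows (roughly 76% with turbo on), while l2fwd copy barely changes.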
This work is somewhat similar to the XDP_ATTACH work [1]. Avoiding the
retpoline in the XDP program call is a nice performance boost! I like
the numbers! :-) I also like the idea of adding a flag that just does
what most AF_XDP Rx users want -- just getting all packets of a
certain queue into the XDP sockets.
In addition to Toke's mail, I have some more concerns with the series:
* AFAIU the SKIP_BPF only works for zero-copy enabled sockets. IMO, it
  should work for all modes (including XDP_SKB).
This patch enables SKIP_BPF for AF_XDP sockets where an XDP program is
attached at driver level (both zerocopy and copy modes).
I tried a quick hack to see the perf benefit with generic XDP mode, but
I didn't see any significant improvement in performance in that
scenario, so I didn't include that mode.
* In order to work, a user still needs an XDP program running. That's
  clunky. I'd like the behavior that if no XDP program is attached,
  and the option is set, the packets for that queue end up in the
  socket. If there's an XDP program attached, the program has
  precedence.
I think this would require more changes in the drivers to take the XDP
datapath even when there is no XDP program loaded.
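The difference between the two behaviors discussed here can be written down as a small decision function. This is only a simplified per-queue model of what the mails describe, not kernel code. The enum and function names are invented for the illustration.

```c
/* Where a received packet on the queue ends up (simplified model). */
enum rx_path {
    RX_STACK,      /* regular network stack, no XDP involved */
    RX_XDP_PROG,   /* the attached XDP program decides */
    RX_XSK_DIRECT  /* delivered straight to the AF_XDP socket */
};

/*
 * Behavior of the posted series as described in this thread: an XDP
 * program must be attached, and the flag bypasses it for the queue.
 */
enum rx_path rx_path_series(int prog_attached, int skip_bpf)
{
    if (!prog_attached)
        return RX_STACK;
    return skip_bpf ? RX_XSK_DIRECT : RX_XDP_PROG;
}

/*
 * Behavior suggested in the review: the flag works without any XDP
 * program, and an attached program takes precedence over the flag.
 */
enum rx_path rx_path_suggested(int prog_attached, int skip_bpf)
{
    if (prog_attached)
        return RX_XDP_PROG;
    return skip_bpf ? RX_XSK_DIRECT : RX_STACK;
}
```

The two functions disagree exactly when the flag is set: with a program attached the series goes direct while the suggestion runs the program, and without a program the series falls back to the stack while the suggestion goes direct.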
* It requires changes in all drivers. Not nice, and scales badly. Try
  making it generic (xdp_do_redirect/xdp_flush), so it Just Works for
  all XDP capable drivers.
I tried to make this as generic as possible and keep the changes to the
driver minimal, but could not find a way to avoid driver changes
altogether. xdp_do_direct() gets called after the call to
bpf_prog_run_xdp() in the drivers.
Thanks for working on this!
Björn
[1] https://lore.kernel.org/netdev/20181207114431.18038-1-bjorn.topel@xxxxxxxxx/
Sridhar Samudrala (5):
  xsk: Convert bool 'zc' field in struct xdp_umem to a u32 bitmap
  xsk: Introduce XDP_SKIP_BPF bind option
  i40e: Enable XDP_SKIP_BPF option for AF_XDP sockets
  ixgbe: Enable XDP_SKIP_BPF option for AF_XDP sockets
  xdpsock_user: Add skip_bpf option
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   | 22 +++++++++-
 drivers/net/ethernet/intel/i40e/i40e_xsk.c    |  6 +++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 20 ++++++++-
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c  | 16 ++++++-
 include/net/xdp_sock.h                        | 21 ++++++++-
 include/uapi/linux/if_xdp.h                   |  1 +
 include/uapi/linux/xdp_diag.h                 |  1 +
 net/xdp/xdp_umem.c                            |  9 ++--
 net/xdp/xsk.c                                 | 43 ++++++++++++++++---
 net/xdp/xsk_diag.c                            |  5 ++-
 samples/bpf/xdpsock_user.c                    |  8 ++++
 11 files changed, 135 insertions(+), 17 deletions(-)