Please see the first patch in the series for the overall
design and use-cases.

See the following email from Toke for the per-packet metadata overhead:
https://lore.kernel.org/bpf/20221206024554.3826186-1-sdf@xxxxxxxxxx/T/#m49d48ea08d525ec88360c7d14c4d34fb0e45e798

Recent changes:

- Drop _supported kfuncs, return status from the existing ones,
  return the actual payload via arguments (Jakub)

- Use 'device-bound' instead of 'offloaded' in existing error message (Jakub)

- Move offload init into late_initcall (Jakub)

- Separate xdp_metadata_ops to host netdev kfunc pointers (Jakub)

- Remove forward declarations (Jakub)

- Rename more offload routines to dev_bound (Jakub)

  bpf_offload_resolve_kfunc -> bpf_dev_bound_resolve_kfunc
  bpf_offload_bound_netdev_unregister -> bpf_dev_bound_netdev_unregister
  bpf_prog_offload_init -> bpf_prog_dev_bound_init
  bpf_prog_offload_destroy -> bpf_prog_dev_bound_destroy
  maybe_remove_bound_netdev -> bpf_dev_bound_try_remove_netdev

- Move bpf_prog_is_dev_bound check into bpf_prog_map_compatible (Toke)

- Prohibit metadata kfuncs unless device-bound (Toke)

- Adjust selftest to exercise freplace + include the path (Toke)

- Take rtnl in bpf_prog_offload_destroy to avoid the race (Martin)

- BPF_F_XDP_HAS_METADATA -> BPF_F_XDP_DEV_BOUND_ONLY (Martin)

- Prohibit only metadata kfuncs, not all (Alexei/Martin)

- Try to fix xdp_hw_metadata.c build issue on CI (Alexei)

  Wasn't able to reproduce it locally, so trying my best guess...

- mlx4 -> mlx4_en (Tariq)

  Plus other issues like using net/mlx4 prefix and using
  mlx4_en_xdp_buff instead of mlx4_xdp_buff. Applied those same
  patterns to mlx5.

- Separate device-bound changes into a separate patch to make them
  easier to review

Prior art (to record pros/cons for different approaches):

- Stable UAPI approach:
  https://lore.kernel.org/bpf/20220628194812.1453059-1-alexandr.lobakin@xxxxxxxxx/
- Metadata+BTF_ID approach:
  https://lore.kernel.org/bpf/166256538687.1434226.15760041133601409770.stgit@firesoul/
- v3:
  https://lore.kernel.org/bpf/20221206024554.3826186-1-sdf@xxxxxxxxxx/
- v2:
  https://lore.kernel.org/bpf/20221121182552.2152891-1-sdf@xxxxxxxxxx/
- v1:
  https://lore.kernel.org/bpf/20221115030210.3159213-1-sdf@xxxxxxxxxx/
- kfuncs v2 RFC:
  https://lore.kernel.org/bpf/20221027200019.4106375-1-sdf@xxxxxxxxxx/
- kfuncs v1 RFC:
  https://lore.kernel.org/bpf/20221104032532.1615099-1-sdf@xxxxxxxxxx/

Cc: John Fastabend <john.fastabend@xxxxxxxxx>
Cc: David Ahern <dsahern@xxxxxxxxx>
Cc: Martin KaFai Lau <martin.lau@xxxxxxxxx>
Cc: Jakub Kicinski <kuba@xxxxxxxxxx>
Cc: Willem de Bruijn <willemb@xxxxxxxxxx>
Cc: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>
Cc: Anatoly Burakov <anatoly.burakov@xxxxxxxxx>
Cc: Alexander Lobakin <alexandr.lobakin@xxxxxxxxx>
Cc: Magnus Karlsson <magnus.karlsson@xxxxxxxxx>
Cc: Maryam Tahhan <mtahhan@xxxxxxxxxx>
Cc: xdp-hints@xxxxxxxxxxxxxxx
Cc: netdev@xxxxxxxxxxxxxxx

Stanislav Fomichev (11):
  bpf: Document XDP RX metadata
  bpf: Rename bpf_{prog,map}_is_dev_bound to is_offloaded
  bpf: Introduce device-bound XDP programs
  selftests/bpf: Update expected test_offload.py messages
  bpf: XDP metadata RX kfuncs
  veth: Introduce veth_xdp_buff wrapper for xdp_buff
  veth: Support RX XDP metadata
  selftests/bpf: Verify xdp_metadata xdp->af_xdp path
  net/mlx4_en: Introduce wrapper for xdp_buff
  net/mlx4_en: Support RX XDP metadata
  selftests/bpf: Simple program to dump XDP RX metadata

Toke Høiland-Jørgensen (4):
  bpf: Support consuming XDP HW metadata from fext programs
  xsk: Add cb area to struct xdp_buff_xsk
  net/mlx5e: Introduce wrapper for xdp_buff
  net/mlx5e: Support RX XDP metadata

 Documentation/bpf/xdp-rx-metadata.rst         |  90 ++++
 drivers/net/ethernet/mellanox/mlx4/en_clock.c |  13 +-
 .../net/ethernet/mellanox/mlx4/en_netdev.c    |   6 +
 drivers/net/ethernet/mellanox/mlx4/en_rx.c    |  63 ++-
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h  |   5 +
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  11 +-
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c  |  26 +-
 .../net/ethernet/mellanox/mlx5/core/en/xdp.h  |  11 +-
 .../ethernet/mellanox/mlx5/core/en/xsk/rx.c   |  35 +-
 .../ethernet/mellanox/mlx5/core/en/xsk/rx.h   |   2 +
 .../net/ethernet/mellanox/mlx5/core/en_main.c |   6 +
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   |  99 +++--
 drivers/net/netdevsim/bpf.c                   |   4 -
 drivers/net/veth.c                            |  80 ++--
 include/linux/bpf.h                           |  39 +-
 include/linux/netdevice.h                     |   7 +
 include/net/xdp.h                             |  25 ++
 include/net/xsk_buff_pool.h                   |   5 +
 include/uapi/linux/bpf.h                      |   5 +
 kernel/bpf/core.c                             |  11 +-
 kernel/bpf/offload.c                          | 360 +++++++++------
 kernel/bpf/syscall.c                          |  35 +-
 kernel/bpf/verifier.c                         |  56 ++-
 net/core/dev.c                                |   9 +-
 net/core/filter.c                             |   2 +-
 net/core/xdp.c                                |  44 ++
 tools/include/uapi/linux/bpf.h                |   5 +
 tools/testing/selftests/bpf/.gitignore        |   1 +
 tools/testing/selftests/bpf/Makefile          |   8 +-
 .../selftests/bpf/prog_tests/xdp_metadata.c   | 412 ++++++++++++++++++
 .../selftests/bpf/progs/xdp_hw_metadata.c     |  81 ++++
 .../selftests/bpf/progs/xdp_metadata.c        |  56 +++
 .../selftests/bpf/progs/xdp_metadata2.c       |  23 +
 tools/testing/selftests/bpf/test_offload.py   |  10 +-
 tools/testing/selftests/bpf/xdp_hw_metadata.c | 405 +++++++++++++++++
 tools/testing/selftests/bpf/xdp_metadata.h    |  15 +
 36 files changed, 1780 insertions(+), 285 deletions(-)
 create mode 100644 Documentation/bpf/xdp-rx-metadata.rst
 create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
 create mode 100644 tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
 create mode 100644 tools/testing/selftests/bpf/progs/xdp_metadata.c
 create mode 100644 tools/testing/selftests/bpf/progs/xdp_metadata2.c
 create mode 100644 tools/testing/selftests/bpf/xdp_hw_metadata.c
 create mode 100644 tools/testing/selftests/bpf/xdp_metadata.h

--
2.39.0.rc1.256.g54fd8350bd-goog