Please see the first patch in the series for the overall
design and use-cases.

Changes since v2:

- Rework bpf_prog_aux->xdp_netdev refcnt (Martin)

  Switched to dropping the count early, after loading / verification is
  done. At attach time, the pointer value is used only for comparing
  the actual netdev at attach vs netdev at load.

  (potentially can be a problem if the same slub slot is reused
  for another netdev later on?)

- Use correct RX queue number in xdp_hw_metadata (Toke / Jakub)

- Fix wrongly placed '*cnt=0' in fixup_kfunc_call after merge (Toke)

- Fix sorted BTF_SET8_START (Toke)

  Introduce old-school unsorted BTF_ID_LIST for lookup purposes.

- Zero-initialize mlx4_xdp_buff (Tariq)

- Separate common timestamp handling into mlx4_en_get_hwtstamp (Tariq)

- mlx5 patches (Toke)

  Note, I've renamed the following for consistency with the rest:
  - s/mlx5_xdp_ctx/mlx5_xdp_buff/
  - s/mctx/mxbuf/

Changes since v1:

- Drop xdp->skb metadata path (Jakub)

  No consensus yet on exposing xdp_skb_metadata in UAPI. Exploring
  whether everyone would be ok with a kfunc to access that part.
  Will follow up separately.

- Drop kfunc unrolling (Alexei)

  Starting with simple code to resolve per-device ndo kfuncs.
  We can always go back to unrolling and keep the same kfuncs
  interface in the future.

- Add rx hash metadata (Toke)

  Not adding the rest (csum/hash_type/etc), I'd like us to agree on
  the framework.

- Use dev_get_by_index and add proper refcnt (Toke)

Changes since last RFC:

- drop ice/bnxt example implementation (Alexander)

  -ENOHARDWARE to test

- fix/test mlx4 implementation

  Confirmed that I get a reasonable-looking timestamp.
  The last patch in the series is the small xsk program that can
  be used to dump incoming metadata.

- bpf_push64/bpf_pop64 (Alexei)

  x86_64+arm64(untested)+disassembler

- struct xdp_to_skb_metadata -> struct xdp_skb_metadata (Toke)

  s/xdp_to_skb/xdp_skb/

- Documentation/bpf/xdp-rx-metadata.rst

  Documents functionality, assumptions and limitations.

- bpf_xdp_metadata_export_to_skb returns true/false (Martin)

  Plus xdp_md->skb_metadata field to access it.

- BPF_F_XDP_HAS_METADATA flag (Toke/Martin)

  Drop magic, use the flag instead.

- drop __randomize_layout

  Not sure it's possible to sanely expose it via UAPI. Because every
  .o potentially gets its own randomized layout, test_progs
  refuses to link.

- remove __net_timestamp in veth driver (John/Jesper)

  Instead, calling ktime_get from the kfunc; enough for the selftests.

Future work on RX side:

- Support more devices besides veth and mlx4
- Support more metadata besides RX timestamp.
- Convert skb_metadata_set() callers to xdp_convert_skb_metadata()
  which handles extra xdp_skb_metadata

Prior art (to record pros/cons for different approaches):

- Stable UAPI approach:
  https://lore.kernel.org/bpf/20220628194812.1453059-1-alexandr.lobakin@xxxxxxxxx/
- Metadata+BTF_ID approach:
  https://lore.kernel.org/bpf/166256538687.1434226.15760041133601409770.stgit@firesoul/
- v1:
  https://lore.kernel.org/bpf/20221115030210.3159213-1-sdf@xxxxxxxxxx/T/#t
- kfuncs v2 RFC:
  https://lore.kernel.org/bpf/20221027200019.4106375-1-sdf@xxxxxxxxxx/
- kfuncs v1 RFC:
  https://lore.kernel.org/bpf/20221104032532.1615099-1-sdf@xxxxxxxxxx/

Cc: John Fastabend <john.fastabend@xxxxxxxxx>
Cc: David Ahern <dsahern@xxxxxxxxx>
Cc: Martin KaFai Lau <martin.lau@xxxxxxxxx>
Cc: Jakub Kicinski <kuba@xxxxxxxxxx>
Cc: Willem de Bruijn <willemb@xxxxxxxxxx>
Cc: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>
Cc: Anatoly Burakov <anatoly.burakov@xxxxxxxxx>
Cc: Alexander Lobakin <alexandr.lobakin@xxxxxxxxx>
Cc: Magnus Karlsson <magnus.karlsson@xxxxxxxxx>
Cc: Maryam Tahhan <mtahhan@xxxxxxxxxx>
Cc: xdp-hints@xxxxxxxxxxxxxxx
Cc: netdev@xxxxxxxxxxxxxxx

Stanislav Fomichev (8):
  bpf: Document XDP RX metadata
  bpf: XDP metadata RX kfuncs
  veth: Introduce veth_xdp_buff wrapper for xdp_buff
  veth: Support RX XDP metadata
  selftests/bpf: Verify xdp_metadata xdp->af_xdp path
  mlx4: Introduce mlx4_xdp_buff wrapper for xdp_buff
  mxl4: Support RX XDP metadata
  selftests/bpf: Simple program to dump XDP RX metadata

Toke Høiland-Jørgensen (3):
  xsk: Add cb area to struct xdp_buff_xsk
  mlx5: Introduce mlx5_xdp_buff wrapper for xdp_buff
  mlx5: Support RX XDP metadata

 Documentation/bpf/xdp-rx-metadata.rst         |  90 ++++
 drivers/net/ethernet/mellanox/mlx4/en_clock.c |  13 +-
 .../net/ethernet/mellanox/mlx4/en_netdev.c    |  10 +
 drivers/net/ethernet/mellanox/mlx4/en_rx.c    |  68 ++-
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h  |   1 +
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  11 +-
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c  |  32 +-
 .../net/ethernet/mellanox/mlx5/core/en/xdp.h  |  13 +-
 .../ethernet/mellanox/mlx5/core/en/xsk/rx.c   |  35 +-
 .../ethernet/mellanox/mlx5/core/en/xsk/rx.h   |   2 +
 .../net/ethernet/mellanox/mlx5/core/en_main.c |   4 +
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   |  92 ++--
 drivers/net/veth.c                            |  88 ++--
 include/linux/bpf.h                           |   4 +
 include/linux/mlx4/device.h                   |   7 +
 include/linux/netdevice.h                     |   5 +
 include/net/xdp.h                             |  25 ++
 include/net/xsk_buff_pool.h                   |   5 +
 include/uapi/linux/bpf.h                      |   5 +
 kernel/bpf/syscall.c                          |  24 +-
 kernel/bpf/verifier.c                         |  37 +-
 net/core/dev.c                                |   5 +
 net/core/xdp.c                                |  58 +++
 tools/include/uapi/linux/bpf.h                |   5 +
 tools/testing/selftests/bpf/.gitignore        |   1 +
 tools/testing/selftests/bpf/Makefile          |   8 +-
 .../selftests/bpf/prog_tests/xdp_metadata.c   | 365 ++++++++++++++++
 .../selftests/bpf/progs/xdp_hw_metadata.c     |  93 ++++
 .../selftests/bpf/progs/xdp_metadata.c        |  57 +++
 tools/testing/selftests/bpf/xdp_hw_metadata.c | 405 ++++++++++++++++++
 tools/testing/selftests/bpf/xdp_metadata.h    |   7 +
 31 files changed, 1467 insertions(+), 108 deletions(-)
 create mode 100644 Documentation/bpf/xdp-rx-metadata.rst
 create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
 create mode 100644 tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
 create mode 100644 tools/testing/selftests/bpf/progs/xdp_metadata.c
 create mode 100644 tools/testing/selftests/bpf/xdp_hw_metadata.c
 create mode 100644 tools/testing/selftests/bpf/xdp_metadata.h

-- 
2.38.1.584.g0f3c55d4c2-goog