Re: [PATCH RFC bpf-next 07/20] xdp: Track if metadata is supported in xdp_frame <> xdp_buff conversions



On 05/03/2025 18.02, Arthur Fabre wrote:
On Wed Mar 5, 2025 at 4:24 PM CET, Alexander Lobakin wrote:
From: Arthur <arthur@xxxxxxxxxxxxxxx>
Date: Wed, 05 Mar 2025 15:32:04 +0100

From: Arthur Fabre <afabre@xxxxxxxxxxxxxx>

xdp_buff stores whether metadata is supported by a NIC by setting
data_meta to be greater than data.

But xdp_frame only stores the metadata size (as metasize), so converting
between xdp_frame and xdp_buff is lossy.

Steal an unused bit in xdp_frame to track whether metadata is supported
or not.

This will let us have "generic" functions for setting skb fields from
either xdp_frame or xdp_buff from drivers.
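
To make the lossy conversion described above concrete, here is a minimal userspace sketch. The mock_* structs and helpers are simplified stand-ins invented for illustration, not the kernel definitions; they only assume the convention stated above (data_meta > data means "unsupported") and that the frame keeps nothing but a metadata size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins (assumption): only the fields relevant here. */
struct mock_xdp_buff {
	unsigned char *data;
	unsigned char *data_meta;
};

struct mock_xdp_frame {
	unsigned char *data;
	uint32_t metasize;
};

/* "Unsupported" is encoded by pointing data_meta past data. */
static void mock_set_data_meta_invalid(struct mock_xdp_buff *xdp)
{
	xdp->data_meta = xdp->data + 1;
}

static bool mock_data_meta_unsupported(const struct mock_xdp_buff *xdp)
{
	return xdp->data_meta > xdp->data;
}

/* buff -> frame: only a size survives, so "unsupported" collapses to 0. */
static void mock_buff_to_frame(const struct mock_xdp_buff *xdp,
			       struct mock_xdp_frame *frame)
{
	ptrdiff_t metasize = xdp->data - xdp->data_meta;

	assert(metasize <= 255); /* the frame field only uses 8 bits */
	frame->data = xdp->data;
	frame->metasize = metasize > 0 ? (uint32_t)metasize : 0;
}

/* frame -> buff: metasize 0 becomes "supported, but empty". */
static void mock_frame_to_buff(const struct mock_xdp_frame *frame,
			       struct mock_xdp_buff *xdp)
{
	xdp->data = frame->data;
	xdp->data_meta = frame->data - frame->metasize;
}

/* Returns whether "unsupported" survives buff -> frame -> buff. */
static bool mock_round_trip_keeps_unsupported(void)
{
	static unsigned char pkt[64];
	struct mock_xdp_buff in = { .data = pkt + 32 }, out;
	struct mock_xdp_frame frame;

	mock_set_data_meta_invalid(&in);
	mock_buff_to_frame(&in, &frame);
	mock_frame_to_buff(&frame, &out);
	return mock_data_meta_unsupported(&out);
}
```

In this sketch the round trip returns false: after the conversion the buff looks like it supports metadata with a length of zero, which is the information loss the patch is trying to avoid.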

Signed-off-by: Arthur Fabre <afabre@xxxxxxxxxxxxxx>
  include/net/xdp.h | 10 +++++++++-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index 58019fa299b56dbd45c104fdfa807f73af6e4fa4..84afe07d09efdb2ab0cb78b904f02cb74f9a56b6 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -116,6 +116,9 @@ static __always_inline void xdp_buff_set_frag_pfmemalloc(struct xdp_buff *xdp)
+static bool xdp_data_meta_unsupported(const struct xdp_buff *xdp);
+static void xdp_set_data_meta_invalid(struct xdp_buff *xdp);
  static __always_inline void *xdp_buff_traits(const struct xdp_buff *xdp)
  	return xdp->data_hard_start + _XDP_FRAME_SIZE;
@@ -270,7 +273,9 @@ struct xdp_frame {
  	void *data;
  	u32 len;
  	u32 headroom;
-	u32 metasize; /* uses lower 8-bits */
+	u32	:23, /* unused */
+		meta_unsupported:1,
+		metasize:8;

See the history of this structure for how we got rid of bitfields here
and why.

...because of performance.

Even though metasize uses only 8 bits, 1-byte access is slower than
32-bit access.

Interesting, thanks!

I agree with Olek, we should not use bitfields.  Thanks for catching this.

(The xdp_frame has a flags member...)
Why don't we use the flags member for storing this information?
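
A rough sketch of what storing this in the flags member could look like, as a userspace mock. The MOCK_* names and the choice of bit 2 are hypothetical and not taken from the series; the first two bits are only loosely modelled on the existing frags flags. The point is that setting and testing the bit are plain 32-bit mask operations on a whole word, with no bitfield involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Loosely modelled on the existing frags flags (assumption). */
#define MOCK_FLAGS_HAS_FRAGS		(1U << 0)
#define MOCK_FLAGS_FRAGS_PF_MEMALLOC	(1U << 1)
/* Hypothetical new bit, chosen for illustration only. */
#define MOCK_FLAGS_META_UNSUPPORTED	(1U << 2)

/* Simplified stand-in: metasize stays a plain u32, no bitfield. */
struct mock_xdp_frame {
	uint32_t metasize;
	uint32_t flags;
};

/* Set or clear the hypothetical bit without touching other flags. */
static void mock_frame_set_meta_unsupported(struct mock_xdp_frame *frame,
					    bool unsupported)
{
	assert(frame != NULL);
	if (unsupported)
		frame->flags |= MOCK_FLAGS_META_UNSUPPORTED;
	else
		frame->flags &= ~MOCK_FLAGS_META_UNSUPPORTED;
}

static bool mock_frame_meta_unsupported(const struct mock_xdp_frame *frame)
{
	return !!(frame->flags & MOCK_FLAGS_META_UNSUPPORTED);
}
```

Whether a spare flags bit is actually available, and how it would interact with flags being copied between xdp_buff and xdp_frame, would need checking against the real structures.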

I was going to write "you can use the fact that metasize is always a
multiple of 4 or that it's never > 252, for example, you could reuse LSB
as a flag indicating that meta is not supported", but first of all

Do we still have drivers which don't support metadata?
Why don't they do that? It's not HW-specific or even driver-specific.
They don't reserve headroom? Then they're invalid, at least XDP_REDIRECT
won't work.
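
For reference, the LSB idea mentioned above (metasize is always a multiple of 4 and never > 252, so the low bit is free) could be sketched as below. This is a userspace illustration with made-up mock_* helper names, not code from the series or the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical marker: bit 0 of the stored metasize word. */
#define MOCK_META_UNSUPPORTED	0x1U

/* Pack "unsupported" into the low bit of an otherwise 4-aligned size. */
static uint32_t mock_metasize_encode(uint32_t metasize, bool unsupported)
{
	/* Relies on the invariants quoted above: multiple of 4, <= 252. */
	assert((metasize & 3) == 0 && metasize <= 252);
	return unsupported ? MOCK_META_UNSUPPORTED : metasize;
}

static bool mock_metasize_unsupported(uint32_t encoded)
{
	return encoded & MOCK_META_UNSUPPORTED;
}

/* Real length: mask the marker off, so "unsupported" reads as 0 bytes. */
static uint32_t mock_metasize_len(uint32_t encoded)
{
	return encoded & ~MOCK_META_UNSUPPORTED;
}
```

The field stays a full u32, so loads and stores remain whole-word accesses; the cost is that every reader of metasize has to go through the masking helper.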

I'm fairly sure that all drivers support XDP_REDIRECT.
Except didn't Lorenzo add a feature bit for this?
(so, some drivers might explicitly not-support this)

So maybe we need to fix those drivers first, if there are any.

Most drivers don't support metadata unfortunately:

rg -U "xdp_prepare_buff\([^)]*false\);" drivers/net/
1712:		xdp_prepare_buff(&xdp, buf, pad, len, false);

94:	xdp_prepare_buff(xdp, buf_va, XDP_PACKET_HEADROOM, pkt_len, false);

2344:	xdp_prepare_buff(xdp, data, pp->rx_offset_correction + MVNETA_MH_SIZE,
2345:			 data_len, false);

1436:	xdp_prepare_buff(&xdp, hard_start, OTX2_HEAD_ROOM,
1437:			 cqe->sg.seg_size, false);

1021:		xdp_prepare_buff(&xdp, desc->addr, NETSEC_RXBUF_HEADROOM,
1022:				 pkt_len, false);

740:	xdp_prepare_buff(&new, frame, headroom, len, false);
859:		xdp_prepare_buff(&xdp, page_info->page_address +
860:				 page_info->page_offset, GVE_RX_PAD,
861:				 len, false);

3984:			xdp_prepare_buff(&xdp, data,
3986:					 rx_bytes, false);

794:		xdp_prepare_buff(&xdp, hard_start, rx_ring->page_offset,
795:				 buff->len, false);

554:	xdp_prepare_buff(&xdp, hard_start, data - hard_start, len, false);

348:		xdp_prepare_buff(&xdp, pa, headroom, size, false);

1710:	xdp_prepare_buff(xdp_buff, hard_start - rx_ring->buffer_offset,
1711:			 rx_ring->buffer_offset, size, false);

1335:		xdp_prepare_buff(&xdp, page_addr, AM65_CPSW_HEADROOM,
1336:				 pkt_len, false);

403:		xdp_prepare_buff(&xdp, pa, headroom, size, false);

289:	xdp_prepare_buff(&xdp, *ehp - EFX_XDP_HEADROOM, EFX_XDP_HEADROOM,
290:			 rx_buf->len, false);

2097:			xdp_prepare_buff(&xdp, data, MTK_PP_HEADROOM, pktlen,
2098:					 false);

291:	xdp_prepare_buff(&xdp, *ehp - EFX_XDP_HEADROOM, EFX_XDP_HEADROOM,
292:			 rx_buf->len, false);

I don't know if it's just because no one has added calls to
skb_metadata_set() yet, or if there's a more fundamental reason.

I simply think driver developers have been lazy.

If someone wants some easy kernel commits, these drivers should be fairly
easy to fix...

I think they all reserve some amount of headroom, but not always the

The Intel drivers use 192 (AFAIK; not sure if that is still true). The API ended
up supporting non-standard XDP_PACKET_HEADROOM, due to the Intel
drivers, when XDP support was added to those (which is a long time ago now).

/* Non-standard XDP_PACKET_HEADROOM and tailroom to satisfy XDP_REDIRECT and
  * still fit two standard MTU size packets into a single 4K page.
  */
#define EFX_XDP_HEADROOM	128

This is smaller than in most drivers, but still leaves enough headroom for xdp_frame + traits.

If it's just because skb_metadata_set() is missing, I can take the
patches from this series that add a "generic" XDP -> skb hook ("trait:
Propagate presence of traits to sk_buff"), have it call
skb_metadata_set(), and try to add it to all the drivers in a separate

I think someone should clean up those drivers and add support.


  	/* Lifetime of xdp_rxq_info is limited to NAPI/enqueue time,
  	 * while mem_type is valid on remote CPU.
@@ -369,6 +374,8 @@ void xdp_convert_frame_to_buff(const struct xdp_frame *frame,
  	xdp->data = frame->data;
  	xdp->data_end = frame->data + frame->len;
  	xdp->data_meta = frame->data - frame->metasize;
+	if (frame->meta_unsupported)
+		xdp_set_data_meta_invalid(xdp);
  	xdp->frame_sz = frame->frame_sz;
  	xdp->flags = frame->flags;
@@ -396,6 +403,7 @@ int xdp_update_frame_from_buff(const struct xdp_buff *xdp,
  	xdp_frame->len  = xdp->data_end - xdp->data;
  	xdp_frame->headroom = headroom - sizeof(*xdp_frame);
  	xdp_frame->metasize = metasize;
+	xdp_frame->meta_unsupported = xdp_data_meta_unsupported(xdp);
  	xdp_frame->frame_sz = xdp->frame_sz;
  	xdp_frame->flags = xdp->flags;

