sdf@xxxxxxxxxx writes: > On 11/23, Jakub Kicinski wrote: >> On Wed, 23 Nov 2022 10:26:41 -0800 Stanislav Fomichev wrote: >> > > This embedding trick works for drivers that put xdp_buff on the stack, >> > > but mlx5 supports XSK zerocopy, which uses the xsk_buff_pool for >> > > allocating them. This makes it a bit awkward to do the same thing >> there; >> > > and since it's probably going to be fairly common to do something like >> > > this, how about we just add a 'void *drv_priv' pointer to struct >> > > xdp_buff that the drivers can use? The xdp_buff already takes up a >> full >> > > cache line anyway, so any data stuffed after it will spill over to a >> new >> > > one; so I don't think there's much difference performance-wise. >> > >> > I guess the alternative is to extend xsk_buff_pool with some new >> > argument for xdp_buff tailroom? (so it can kmalloc(sizeof(xdp_buff) + >> > xdp_buff_tailroom)) >> > But it seems messy because there is no way of knowing what the target >> > device's tailroom is, so it has to be a user setting :-/ >> > I've started with a priv pointer in xdp_buff initially, it seems fine >> > to go back. I'll probably convert veth/mlx4 to the same mode as well >> > to avoid having different approaches in different places.. > >> Can we not do this please? Add 16B of "private driver space" after >> the xdp_buff in xdp_buff_xsk (we have 16B to full cacheline), the >> drivers decide how they use it. Drivers can do BUILD_BUG_ON() for their >> expected size and cast that to whatever struct they want. This is how >> various offloads work, the variable size tailroom would be an over >> design IMO. > >> And this way non XSK paths can keep its normal typing. > > Good idea, prototyped below, lmk if it that's not what you had in mind. > > struct xdp_buff_xsk { > struct xdp_buff xdp; /* 0 56 */ > u8 cb[16]; /* 56 16 */ > /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */ As pahole helpfully says here, xdp_buff is actually only 8 bytes from being a full cache line. I thought about adding a 'cb' field like this to xdp_buff itself, but figured that since there's only room for a single pointer, why not just add that and let the driver point it to where it wants to store the extra context data? I am not suggesting to make anything variable-size; the 'void *drv_priv' is just a normal pointer. There's no changes to any typing; not sure where you got that from, Jakub? Also, the priv pointer approach works for both XSK and on-stack allocations, unlike this approach (see below). > dma_addr_t dma; /* 72 8 */ > dma_addr_t frame_dma; /* 80 8 */ > struct xsk_buff_pool * pool; /* 88 8 */ > u64 orig_addr; /* 96 8 */ > struct list_head free_list_node; /* 104 16 */ > > /* size: 120, cachelines: 2, members: 7 */ > /* last cacheline: 56 bytes */ > }; > > Toke, I can try to merge this into your patch + keep your SoB (or feel free > to try this and retest yourself, whatever works). > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h > b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h > index bc2d9034af5b..837bf103b871 100644 > --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h > @@ -44,6 +44,11 @@ > (MLX5E_XDP_INLINE_WQE_MAX_DS_CNT * MLX5_SEND_WQE_DS - \ > sizeof(struct mlx5_wqe_inline_seg)) > > +struct mlx5_xdp_cb { > + struct mlx5_cqe64 *cqe; > + struct mlx5e_rq *rq; > +}; > + > struct mlx5e_xsk_param; > int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param > *xsk); > bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page, > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c > b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c > index c91b54d9ff27..84d23b2da7ce 100644 > --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c > @@ -5,6 +5,7 @@ > #include "en/xdp.h" > #include <net/xdp_sock_drv.h> > #include <linux/filter.h> > +#include <linux/build_bug.h> > > /* RX data path */ > > @@ -286,8 +287,14 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct > mlx5e_rq *rq, > u32 cqe_bcnt) > { > struct xdp_buff *xdp = wi->au->xsk; > + struct mlx5_xdp_cb *cb; > struct bpf_prog *prog; > > + BUILD_BUG_ON(sizeof(struct mlx5_xdp_cb) > XSKB_CB_SIZE); > + cb = xp_get_cb(xdp); > + cb->cqe = NULL /*cqe*/; > + cb->rq = rq; So this works fine for the XSK path, but for the regular XDP path, mlx5 *does* indeed put the xdp_buff on the stack. So to re-use code there would be an implicit assumption that both memory layout and size matches between the two paths. I'm not sure that's better than just having a pointer inside the xdp_buff and pointing it wherever makes sense for that driver (as my patch did)? -Toke