RE: [PATCH net-next v5 12/19] xdp: add generic xdp_build_skb_from_buff()

> -----Original Message-----
> From: Alexander Lobakin <aleksander.lobakin@xxxxxxxxx>
> Sent: Friday, 15 November 2024 16:35
> To: Ido Schimmel <idosch@xxxxxxxxxx>
> Cc: David S. Miller <davem@xxxxxxxxxxxxx>; Eric Dumazet <edumazet@xxxxxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>; Paolo Abeni
> <pabeni@xxxxxxxxxx>; Toke Høiland-Jørgensen <toke@xxxxxxxxxx>; Alexei Starovoitov <ast@xxxxxxxxxx>; Daniel Borkmann
> <daniel@xxxxxxxxxxxxx>; John Fastabend <john.fastabend@xxxxxxxxx>; Andrii Nakryiko <andrii@xxxxxxxxxx>; Maciej Fijalkowski
> <maciej.fijalkowski@xxxxxxxxx>; Stanislav Fomichev <sdf@xxxxxxxxxxx>; Magnus Karlsson <magnus.karlsson@xxxxxxxxx>;
> nex.sw.ncis.osdt.itp.upstreaming@xxxxxxxxx; bpf@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH net-next v5 12/19] xdp: add generic xdp_build_skb_from_buff()
> 
> From: Ido Schimmel <idosch@xxxxxxxxxx>
> Date: Thu, 14 Nov 2024 17:16:44 +0200
> 
> > On Thu, Nov 14, 2024 at 05:06:06PM +0200, Ido Schimmel wrote:
> >> Looks good (no objections to the patch), but I have a question. See
> >> below.
> >>
> >> On Wed, Nov 13, 2024 at 04:24:35PM +0100, Alexander Lobakin wrote:
> >>> The code which builds an skb from an &xdp_buff keeps multiplying itself
> >>> around the drivers with almost no changes. Let's try to stop that by
> >>> adding a generic function.
> >>> Unlike __xdp_build_skb_from_frame(), always allocate an skbuff head
> >>> using napi_build_skb() and make use of the available xdp_rxq pointer to
> >>> assign the Rx queue index. In case of PP-backed buffer, mark the skb to
> >>> be recycled, as every PP user's been switched to recycle skbs.
> >>>
> >>> Reviewed-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
> >>> Signed-off-by: Alexander Lobakin <aleksander.lobakin@xxxxxxxxx>
> >>
> >> Reviewed-by: Ido Schimmel <idosch@xxxxxxxxxx>
> >>
> >>> ---
> >>>  include/net/xdp.h |  1 +
> >>>  net/core/xdp.c    | 55 +++++++++++++++++++++++++++++++++++++++++++++++
> >>>  2 files changed, 56 insertions(+)
> >>>
> >>> diff --git a/include/net/xdp.h b/include/net/xdp.h
> >>> index 4c19042adf80..b0a25b7060ff 100644
> >>> --- a/include/net/xdp.h
> >>> +++ b/include/net/xdp.h
> >>> @@ -330,6 +330,7 @@ xdp_update_skb_shared_info(struct sk_buff *skb, u8 nr_frags,
> >>>  void xdp_warn(const char *msg, const char *func, const int line);
> >>>  #define XDP_WARN(msg) xdp_warn(msg, __func__, __LINE__)
> >>>
> >>> +struct sk_buff *xdp_build_skb_from_buff(const struct xdp_buff *xdp);
> >>>  struct xdp_frame *xdp_convert_zc_to_xdp_frame(struct xdp_buff *xdp);
> >>>  struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
> >>>  					   struct sk_buff *skb,
> >>> diff --git a/net/core/xdp.c b/net/core/xdp.c
> >>> index b1b426a9b146..3a9a3c14b080 100644
> >>> --- a/net/core/xdp.c
> >>> +++ b/net/core/xdp.c
> >>> @@ -624,6 +624,61 @@ int xdp_alloc_skb_bulk(void **skbs, int n_skb, gfp_t gfp)
> >>>  }
> >>>  EXPORT_SYMBOL_GPL(xdp_alloc_skb_bulk);
> >>>
> >>> +/**
> >>> + * xdp_build_skb_from_buff - create an skb from an &xdp_buff
> >>> + * @xdp: &xdp_buff to convert to an skb
> >>> + *
> >>> + * Perform common operations to create a new skb to pass up the stack from
> >>> + * an &xdp_buff: allocate an skb head from the NAPI percpu cache, initialize
> >>> + * skb data pointers and offsets, set the recycle bit if the buff is PP-backed,
> >>> + * Rx queue index, protocol and update frags info.
> >>> + *
> >>> + * Return: new &sk_buff on success, %NULL on error.
> >>> + */
> >>> +struct sk_buff *xdp_build_skb_from_buff(const struct xdp_buff *xdp)
> >>> +{
> >>> +	const struct xdp_rxq_info *rxq = xdp->rxq;
> >>> +	const struct skb_shared_info *sinfo;
> >>> +	struct sk_buff *skb;
> >>> +	u32 nr_frags = 0;
> >>> +	int metalen;
> >>> +
> >>> +	if (unlikely(xdp_buff_has_frags(xdp))) {
> >>> +		sinfo = xdp_get_shared_info_from_buff(xdp);
> >>> +		nr_frags = sinfo->nr_frags;
> >>> +	}
> >>> +
> >>> +	skb = napi_build_skb(xdp->data_hard_start, xdp->frame_sz);
> >>> +	if (unlikely(!skb))
> >>> +		return NULL;
> >>> +
> >>> +	skb_reserve(skb, xdp->data - xdp->data_hard_start);
> >>> +	__skb_put(skb, xdp->data_end - xdp->data);
> >>> +
> >>> +	metalen = xdp->data - xdp->data_meta;
> >>> +	if (metalen > 0)
> >>> +		skb_metadata_set(skb, metalen);
> >>> +
> >>> +	if (is_page_pool_compiled_in() && rxq->mem.type == MEM_TYPE_PAGE_POOL)
> >>> +		skb_mark_for_recycle(skb);
> >>> +
> >>> +	skb_record_rx_queue(skb, rxq->queue_index);
> >>> +
> >>> +	if (unlikely(nr_frags)) {
> >>> +		u32 tsize;
> >>> +
> >>> +		tsize = sinfo->xdp_frags_truesize ? : nr_frags * xdp->frame_sz;
> >>> +		xdp_update_skb_shared_info(skb, nr_frags,
> >>> +					   sinfo->xdp_frags_size, tsize,
> >>> +					   xdp_buff_is_frag_pfmemalloc(xdp));
> >>> +	}
> >>> +
> >>> +	skb->protocol = eth_type_trans(skb, rxq->dev);
> >>
> >> The device we are working with has more ports (net devices) than Rx
> >> queues, so each queue can receive packets from different net devices.
> >> Currently, each Rx queue has its own NAPI instance and its own page
> >> pool. All the Rx NAPI instances are initialized using the same dummy net
> >> device which is allocated using alloc_netdev_dummy().
> >>
> >> What are our options with regards to the XDP Rx queue info structure? As
> >> evident by this patch, it does not seem valid to register one such
> >> structure per Rx queue and pass the dummy net device. Would it be valid
> >> to register one such structure per port (net device) and pass zero for
> >> the queue index and NAPI ID?
> >
> > Actually, this does not seem to be valid either as we need to associate
> > an XDP Rx queue info with the correct page pool :/
> 
> Right.
> But I'd say, this assoc slowly becomes redundant. For example, idpf has
> up to 4 page_pools per queue and I only pass 1 of them to rxq_info as
> there are no other options. Regardless, its frames get processed
> correctly thanks to that we have struct page::pp pointer + patch 9 from
> this series which teaches put_page_bulk() to handle mixed bulks.
> 
> Regarding your usecase -- after calling this function, you are free to
> overwrite any skb fields as this helper doesn't pass it up the stack.
> For example, in ice driver we have port reps and sometimes we need to
> pass a different net_device, not the one saved in rxq_info. So when
> switching to this function, we'll do eth_type_trans() once again (it's
> either way under unlikely() in our code as it's switchdev slowpath).
> Same for the queue number in rxq_info.

With this series, maintaining 'struct xdp_mem_allocator' entries in the
hash table looks unnecessary. If so, xdp_reg_mem_model() no longer needs
an 'allocator' when the memory type is MEM_TYPE_PAGE_POOL.

Is there a reason for not removing 'mem_id_ht'? With this patch, its
nodes are no longer used.
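
To illustrate what I mean, here is a small userspace C model. It is not
kernel code, and every model_* name is made up for illustration. It
assumes, as described above, that each page carries a pointer to its
page_pool, and contrasts returning a page through an id-to-pool table
with returning it through that pointer:

```c
/* Userspace model only: none of these types are the kernel's. */
#include <assert.h>
#include <stddef.h>

#define MODEL_HT_SIZE 8

struct model_pool {
	int id;
	unsigned int returned;	/* pages given back to this pool */
};

struct model_page {
	struct model_pool *pp;	/* back-pointer, in the spirit of page::pp */
};

/* Stand-in for an id -> allocator table filled at registration time. */
struct model_ht {
	struct model_pool *slot[MODEL_HT_SIZE];
	unsigned int lookups;	/* how often the return path hit the table */
};

/* Register a pool under its id; fails if the id does not fit the table. */
int model_ht_register(struct model_ht *ht, struct model_pool *pool)
{
	assert(ht != NULL && pool != NULL);

	if (pool->id < 0 || pool->id >= MODEL_HT_SIZE)
		return -1;

	ht->slot[pool->id] = pool;
	return 0;
}

/* Return path A: find the pool by memory id, which needs the table. */
int model_return_by_id(struct model_ht *ht, int mem_id)
{
	struct model_pool *pool;

	assert(ht != NULL);

	if (mem_id < 0 || mem_id >= MODEL_HT_SIZE)
		return -1;

	ht->lookups++;
	pool = ht->slot[mem_id];
	if (!pool)
		return -1;	/* pool was never registered */

	pool->returned++;
	return 0;
}

/* Return path B: use the back-pointer in the page, no table involved. */
int model_return_by_page(struct model_page *page)
{
	assert(page != NULL);

	if (!page->pp)
		return -1;

	page->pp->returned++;
	return 0;
}
```

In this model, path B works even for a pool that was never registered in
the table, which is why I suspect both the registration and the table
could go for the Page-Pool case.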

> 
> >
> >>
> >> To be clear, I understand it is not a common use case.
> >>
> >> Thanks
> 
> Thanks,
> Olek




