> On 19/12/2023 16.23, Paolo Abeni wrote:
> > On Thu, 2023-12-14 at 15:29 +0100, Lorenzo Bianconi wrote:
> > > Allocate percpu page_pools in softnet_data.
> > > Moreover, add a cpuid field in the page_pool struct in order to
> > > recycle the page in the page_pool "hot" cache if napi_pp_put_page()
> > > is running on the same cpu.
> > > This is a preliminary patch to add xdp multi-buff support for xdp
> > > running in generic mode.
> > >
> > > Signed-off-by: Lorenzo Bianconi <lorenzo@xxxxxxxxxx>
> > > ---
> > >  include/linux/netdevice.h       |  1 +
> > >  include/net/page_pool/helpers.h |  5 +++++
> > >  include/net/page_pool/types.h   |  1 +
> > >  net/core/dev.c                  | 39 ++++++++++++++++++++++++++++++++-
> > >  net/core/page_pool.c            |  5 +++++
> > >  net/core/skbuff.c               |  5 +++--
> > >  6 files changed, 53 insertions(+), 3 deletions(-)
> >
> > @Jesper, @Ilias: could you please have a look at the pp bits?
>
> I have some concerns... I'm still entertaining the idea, but we need
> to be aware of the tradeoffs we are making.
>
> (1)
> Adding PP to softnet_data means per-CPU caching of 256 pages in the
> ptr_ring (plus likely 64 in the alloc cache). Fortunately, PP starts
> out empty, so as long as this PP isn't used, no pages get cached. But
> once it is used, PP doesn't have an MM shrinker that removes these
> cached pages when the system is under MM pressure. I guess you can
> argue that keeping this per netdev rx-queue would make memory usage
> even higher. This is a tradeoff: we are trading memory (waste) for
> speed.
>
> (2) (Question to Jakub I guess)
> How does this connect with Jakub's PP netlink stats interface?
> E.g. I find it very practical that it allows us to get PP stats per
> netdev, but in this case there isn't a netdev.
>
> (3) (Implicit locking)
> PP has a lockless "alloc" side because it relies on the driver's NAPI
> context. The places where the netstack accesses softnet_data provide
> similar protection that we can rely on for PP, so this is likely
> correct implementation-wise. But it will give people like Sebastian
> (Cc) more gray hair when figuring out how PREEMPT_RT handles these
> cases.
>
> (4)
> The optimization is needed for the case where we need to re-allocate
> and copy SKB fragments. I think we should focus on avoiding this code
> path instead of optimizing it. For UDP it should be fairly easy, but
> for TCP this is harder.

Hi all,

I would like to resume this activity. It seems to me there is no clear
direction yet about where we should add the page_pool (in a per_cpu
pointer or in the netdev_rx_queue struct), or whether we can rely on a
page_frag_cache instead.

@Jakub: what do you think? Should we add a page_pool in a per_cpu
pointer?

Regards,
Lorenzo

>
> --Jesper
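
Just to make the per_cpu option a bit more concrete, below is a rough
sketch of what I have in mind (illustrative only, not the actual patch:
the page_pool member in struct softnet_data and the init helper name
are placeholders I am assuming here):

#include <linux/err.h>
#include <linux/init.h>
#include <linux/topology.h>
#include <linux/netdevice.h>
#include <net/page_pool/types.h>

/* Sketch only: assumes a new 'struct page_pool *page_pool' member has
 * been added to struct softnet_data. Error unwind of already-created
 * pools is omitted for brevity.
 */
static int __init net_percpu_page_pool_init(void)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		struct softnet_data *sd = &per_cpu(softnet_data, cpu);
		struct page_pool_params pp_params = {
			.order		= 0,
			.pool_size	= 256,	/* the 256 cached pages from point (1) */
			.nid		= cpu_to_mem(cpu),
		};

		sd->page_pool = page_pool_create(&pp_params);
		if (IS_ERR(sd->page_pool)) {
			sd->page_pool = NULL;
			return -ENOMEM;
		}
	}

	return 0;
}

The other half would be storing the owning cpu in the new cpuid field
at create time, so that napi_pp_put_page() can recycle straight into
the per-CPU "hot" cache when it runs on that same cpu and fall back to
the normal ptr_ring path otherwise.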