RE: [EXTERNAL] Re: [PATCH] mm: page_frag: Fix refill handling in __page_frag_alloc_align()

> -----Original Message-----
> From: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
> Sent: Saturday, March 1, 2025 12:00 PM
> To: Yunsheng Lin <yunshenglin0825@xxxxxxxxx>; linux-
> hyperv@xxxxxxxxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx
> Cc: Dexuan Cui <decui@xxxxxxxxxxxxx>; KY Srinivasan <kys@xxxxxxxxxxxxx>;
> Paul Rosswurm <paulros@xxxxxxxxxxxxx>; olaf@xxxxxxxxx; vkuznets
> <vkuznets@xxxxxxxxxx>; davem@xxxxxxxxxxxxx; wei.liu@xxxxxxxxxx; Long Li
> <longli@xxxxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
> linyunsheng@xxxxxxxxxx; stable@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx;
> Alexander Duyck <alexander.duyck@xxxxxxxxx>
> Subject: RE: [EXTERNAL] Re: [PATCH] mm: page_frag: Fix refill handling in
> __page_frag_alloc_align()
>
>
>
> > -----Original Message-----
> > From: Yunsheng Lin <yunshenglin0825@xxxxxxxxx>
> > Sent: Saturday, March 1, 2025 8:50 AM
> > To: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>; linux-
> hyperv@xxxxxxxxxxxxxxx;
> > akpm@xxxxxxxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx
> > Cc: Dexuan Cui <decui@xxxxxxxxxxxxx>; KY Srinivasan <kys@xxxxxxxxxxxxx>;
> > Paul Rosswurm <paulros@xxxxxxxxxxxxx>; olaf@xxxxxxxxx; vkuznets
> > <vkuznets@xxxxxxxxxx>; davem@xxxxxxxxxxxxx; wei.liu@xxxxxxxxxx; Long Li
> > <longli@xxxxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
> > linyunsheng@xxxxxxxxxx; stable@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx;
> > Alexander Duyck <alexander.duyck@xxxxxxxxx>
> > Subject: [EXTERNAL] Re: [PATCH] mm: page_frag: Fix refill handling in
> > __page_frag_alloc_align()
> >
> > +cc netdev ML & Alexander
> >
> > On 3/1/2025 10:03 AM, Haiyang Zhang wrote:
> > > In commit 8218f62c9c9b ("mm: page_frag: use initial zero offset for
> > > page_frag_alloc_align()"), the check for fragsz is moved earlier.
> > > So when the cache is used up, and if the fragsz > PAGE_SIZE, it won't
> > > try to refill, and just return NULL.
> > > I tested it with fragsz:8192, cache-size:32768. After the initial four
> > > successful allocations, it failed, even though there is plenty of free
> > > memory in the system.
> >
> > Hi, Haiyang
> > It seems the PAGE_SIZE is 4K for the tested system?
> Yes.
>
> > Which drivers or subsystems are passing a fragsz bigger than
> > PAGE_SIZE to the page_frag_alloc_align() related APIs?
> For example, our MANA driver when using jumbo frames.
> https://web.git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/ethernet/microsoft/mana
>
> > > To fix, revert the refill logic like before: the refill is attempted
> > > before the check & return NULL.
> >
> > The page_frag API is not really meant for allocations bigger than
> > PAGE_SIZE, as __page_frag_cache_refill() will not try hard enough to
> > allocate an order-3 compound page when calling __alloc_pages() and will
> > fall back to allocating a base page, as discussed below:
> >
> > https://lore.kernel.org/all/ead00fb7-8538-45b3-8322-8a41386e7381@huawei.com/
> We are already aware of this, and have error checking in place for the
> fallback case to a "base page".
>
> From the discussion thread above, there are other drivers using
> page_frag_alloc_align() for sizes over PAGE_SIZE too. If making the
> page_frag API support only fragsz <= PAGE_SIZE is desired, can we create
> another API? One keeps the existing API semantics (allowing > PAGE_SIZE),
> the other uses your new code. By the way, the new code should add an
> explicit check and fail ALL requests for fragsz > PAGE_SIZE. Currently
> your code successfully allocates big frags a few times, then fails. This
> is not desired behavior. It's also a breaking change for our MANA driver,
> which can no longer run jumbo frames.
>
> @Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> And other maintainers, could you please also evaluate the idea above?
>

Also, quoting the current documentation (6.14.0-rc4):
"A page fragment is an arbitrary-length arbitrary-offset area of memory
which resides within a 0 or higher order compound page."
https://web.git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/mm/page_frags.rst

So, it is designed to be *arbitrary-length* within a 0 or higher order
compound page.

If commit 8218f62c9c9b ("mm: page_frag: use initial zero offset for
page_frag_alloc_align()") was intended to change the existing API semantics
to require page frag length <= PAGE_SIZE, the documentation and all
affected drivers need to be updated.
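
To make the difference in behavior concrete, here is a toy userspace model
in Python. It is only a sketch, not the kernel code: the ToyFragCache class,
its check_before_refill flag and the run() helper are hypothetical names
made up for illustration, and the model ignores page refcounting, alignment
and the fallback to an order-0 page. It only mimics the order of the
"fragsz > PAGE_SIZE" check relative to the refill, using the same numbers as
my test (fragsz 8192, cache size 32768, PAGE_SIZE 4096).

```python
PAGE_SIZE = 4096  # assumed 4K pages, as on the tested system


class ToyFragCache:
    """Toy stand-in for a page_frag cache; NOT the kernel implementation."""

    def __init__(self, cache_size, check_before_refill):
        self.size = cache_size
        self.offset = 0
        self.refills = 0  # how many times the cache was replaced
        self.check_before_refill = check_before_refill

    def alloc(self, fragsz):
        """Return (refill_generation, offset) on success, None on failure."""
        if self.offset + fragsz > self.size:
            # Cache is used up. With the early check, a large fragment
            # fails here without any refill being attempted.
            if self.check_before_refill and fragsz > PAGE_SIZE:
                return None
            # Otherwise model a refill: fresh cache, offset back to zero.
            self.refills += 1
            self.offset = 0
            # Only fail if the fragment cannot fit in a fresh cache at all.
            if fragsz > self.size:
                return None
        start = self.offset
        self.offset += fragsz
        return (self.refills, start)


def run(check_before_refill, count=6, fragsz=8192, cache_size=32768):
    """Perform `count` allocations and collect the results."""
    cache = ToyFragCache(cache_size, check_before_refill)
    return [cache.alloc(fragsz) for _ in range(count)]


if __name__ == "__main__":
    print("check before refill:", run(True))
    print("refill before check:", run(False))
```

In this model, the early check yields four successful allocations followed
by failures, while attempting the refill first lets the fifth allocation
start again at offset 0 of a fresh cache.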

Thanks,
- Haiyang





