> -----Original Message-----
> From: Yunsheng Lin <yunshenglin0825@xxxxxxxxx>
> Sent: Saturday, March 1, 2025 8:50 AM
> To: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>; linux-hyperv@xxxxxxxxxxxxxxx;
> akpm@xxxxxxxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx
> Cc: Dexuan Cui <decui@xxxxxxxxxxxxx>; KY Srinivasan <kys@xxxxxxxxxxxxx>;
> Paul Rosswurm <paulros@xxxxxxxxxxxxx>; olaf@xxxxxxxxx; vkuznets
> <vkuznets@xxxxxxxxxx>; davem@xxxxxxxxxxxxx; wei.liu@xxxxxxxxxx; Long Li
> <longli@xxxxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
> linyunsheng@xxxxxxxxxx; stable@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx;
> Alexander Duyck <alexander.duyck@xxxxxxxxx>
> Subject: [EXTERNAL] Re: [PATCH] mm: page_frag: Fix refill handling in
> __page_frag_alloc_align()
>
> +cc netdev ML & Alexander
>
> On 3/1/2025 10:03 AM, Haiyang Zhang wrote:
> > In commit 8218f62c9c9b ("mm: page_frag: use initial zero offset for
> > page_frag_alloc_align()"), the check for fragsz is moved earlier.
> > So when the cache is used up, and if the fragsz > PAGE_SIZE, it won't
> > try to refill, and just return NULL.
> > I tested it with fragsz:8192, cache-size:32768. After the initial four
> > successful allocations, it failed, even there is plenty of free memory
> > in the system.
>
> Hi, Haiyang
> It seems the PAGE_SIZE is 4K for the tested system?

Yes.

> Which drivers or subsystems are passing the fragsz being bigger than
> PAGE_SIZE to page_frag_alloc_align() related API?

For example, our MANA driver when using jumbo frame.
https://web.git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/ethernet/microsoft/mana

> > To fix, revert the refill logic like before: the refill is attempted
> > before the check & return NULL.
>
> page_frag API is not really for allocating memory being bigger than
> PAGE_SIZE as __page_frag_cache_refill() will not try hard enough to
> allocate order 3 compound page when calling __alloc_pages() and will
> fail back to allocate base page as the discussed in below:
> https://lore.kernel.org/all/ead00fb7-8538-45b3-8322-8a41386e7381@huawei.com/

We are already aware of this, and have error checking in place for the
failover case to "base page".
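
In case it helps the discussion, here is a rough userspace model of the
ordering issue. It is only a sketch and not the kernel code: the struct,
the function names and the high_order_ok switch are hypothetical, and the
refill is reduced to "order-3 block if available, otherwise one base page".
It assumes 4K pages and a 32768-byte cache, as in the test described above.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SIZE  4096u   /* assumed 4K base page */
#define MODEL_CACHE_SIZE 32768u  /* assumed order-3 compound page */

/* Hypothetical stand-in for struct page_frag_cache. */
struct frag_cache_model {
	unsigned int size;     /* bytes in the current block, 0 = empty */
	unsigned int offset;   /* next free byte in the block */
	bool high_order_ok;    /* does the order-3 allocation succeed? */
	unsigned int refills;  /* number of refills performed */
};

/* Model of a refill: order-3 block if possible, else one base page. */
static void model_refill(struct frag_cache_model *nc)
{
	nc->size = nc->high_order_ok ? MODEL_CACHE_SIZE : MODEL_PAGE_SIZE;
	nc->offset = 0;
	nc->refills++;
}

/*
 * Returns the offset of the new fragment, or -1 to model a NULL return.
 * refill_first == false: fragsz is checked before any refill is tried.
 * refill_first == true:  the refill is attempted first, then checked.
 */
static long model_alloc(struct frag_cache_model *nc, unsigned int fragsz,
			bool refill_first)
{
	assert(fragsz > 0);

	if (nc->size == 0 || nc->offset + fragsz > nc->size) {
		/* Early check: a used-up cache is never refilled. */
		if (nc->size != 0 && !refill_first && fragsz > MODEL_PAGE_SIZE)
			return -1;

		model_refill(nc);

		/* The refill fell back to a base page that is too small. */
		if (fragsz > nc->size)
			return -1;
	}

	nc->offset += fragsz;
	return (long)(nc->offset - fragsz);
}

/* Count how many of `attempts` back-to-back allocations succeed. */
static unsigned int model_count_successes(bool refill_first,
					  bool high_order_ok,
					  unsigned int fragsz,
					  unsigned int attempts)
{
	struct frag_cache_model nc = { .high_order_ok = high_order_ok };
	unsigned int ok = 0;

	for (unsigned int i = 0; i < attempts; i++)
		if (model_alloc(&nc, fragsz, refill_first) >= 0)
			ok++;
	return ok;
}
```

In this model, with fragsz 8192 and the early check, only the four
allocations that fit in the first 32768-byte block succeed, and every later
one fails. With the refill attempted first, allocations keep succeeding as
long as the order-3 block is available. If the refill falls back to a base
page, the allocation still returns NULL, which is the case the caller has
to check for.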