On Sun, Dec 1, 2024 at 11:40 AM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Sun, 1 Dec 2024 01:12:34 +0900 Seiji Nishikawa <snishika@xxxxxxxxxx> wrote:
>
> > The kernel hangs due to a task stuck in throttle_direct_reclaim(),
> > caused by a node being incorrectly deemed balanced despite pressure in
> > certain zones, such as ZONE_NORMAL. This issue arises from
> > zone_reclaimable_pages() returning 0 for zones without reclaimable file-
> > backed or anonymous pages, causing zones like ZONE_DMA32 with sufficient
> > free pages to be skipped.
> >
> > The lack of swap or reclaimable pages results in ZONE_DMA32 being
> > ignored during reclaim, masking pressure in other zones. Consequently,
> > pgdat->kswapd_failures remains 0 in balance_pgdat(), preventing fallback
> > mechanisms in allow_direct_reclaim() from being triggered, leading to an
> > infinite loop in throttle_direct_reclaim().
> >
> > This patch modifies zone_reclaimable_pages() to account for free pages
> > (NR_FREE_PAGES) when no other reclaimable pages exist. This ensures
> > zones with sufficient free pages are not skipped, enabling proper
> > balancing and reclaim behavior.
>
> We'll want to backport a fix for this into -stable kernels. For that
> it's best to be able to identify a suitable Fixes: target, to tell
> others whether their kernel needs the fix. Are you able to help
> identify that commit?

Based on my analysis, the issue is rooted in the original design of
zone_reclaimable_pages(). The later change in a2a36488a61c ("mm/vmscan:
Consider anonymous pages without swap") does not fundamentally alter
that behavior; it only refines the handling of anonymous pages. Neither
version accounts for zones that have sufficient free pages but no
reclaimable file-backed or anonymous pages.
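
To make the behavior described above concrete, here is a minimal
userspace sketch. It is not kernel code: struct zone_model and the
helpers reclaimable_before(), reclaimable_after() and model_demo() are
hypothetical stand-ins for the per-zone counters and for
zone_reclaimable_pages() without and with the patch, and the page counts
are made-up numbers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-zone counters; not the kernel's struct zone. */
struct zone_model {
	unsigned long inactive_file;
	unsigned long active_file;
	unsigned long inactive_anon;
	unsigned long active_anon;
	unsigned long free_pages;
};

/* Without the patch: file pages, plus anon pages only if they can be reclaimed. */
static unsigned long reclaimable_before(const struct zone_model *z,
					bool can_reclaim_anon)
{
	unsigned long nr = z->inactive_file + z->active_file;

	if (can_reclaim_anon)
		nr += z->inactive_anon + z->active_anon;
	return nr;
}

/* With the patch: fall back to the free page count when nothing else is counted. */
static unsigned long reclaimable_after(const struct zone_model *z,
				       bool can_reclaim_anon)
{
	unsigned long nr = reclaimable_before(z, can_reclaim_anon);

	if (nr == 0)
		nr = z->free_pages;
	return nr;
}

/* Returns 0 when the model behaves as described in the changelog. */
static int model_demo(void)
{
	/* DMA32-like zone: no file pages, some anon pages, plenty of free pages. */
	struct zone_model dma32 = {
		.inactive_anon = 1000,
		.active_anon = 2000,
		.free_pages = 5000,
	};

	/* No swap: anon pages are not counted, so the zone looks unreclaimable. */
	assert(reclaimable_before(&dma32, false) == 0);
	/* With the fallback, the same zone reports its free pages instead. */
	assert(reclaimable_after(&dma32, false) == 5000);
	/* When anon pages can be reclaimed, both versions agree. */
	assert(reclaimable_after(&dma32, true) == reclaimable_before(&dma32, true));
	return 0;
}
```

In this model the fallback only takes effect when the file and anon
counts sum to zero, so zones that already report reclaimable pages are
unaffected.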

The relevant commit that introduced this function is:

Fixes: 5a1c84b404a7 ("mm: remove reclaim and compaction retry approximations")

This commit seems to be the most appropriate target for the Fixes: tag,
as it introduced the logic that my patch modifies to address the
observed kernel hang.

>
> Thanks.
>
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -374,7 +374,14 @@ unsigned long zone_reclaimable_pages(struct zone *zone)
> >  	if (can_reclaim_anon_pages(NULL, zone_to_nid(zone), NULL))
> >  		nr += zone_page_state_snapshot(zone, NR_ZONE_INACTIVE_ANON) +
> >  			zone_page_state_snapshot(zone, NR_ZONE_ACTIVE_ANON);
> > -
> > +	/*
> > +	 * If there are no reclaimable file-backed or anonymous pages,
> > +	 * ensure zones with sufficient free pages are not skipped.
> > +	 * This prevents zones like DMA32 from being ignored in reclaim
> > +	 * scenarios where they can still help alleviate memory pressure.
> > +	 */
> > +	if (nr == 0)
> > +		nr = zone_page_state_snapshot(zone, NR_FREE_PAGES);
> >  	return nr;
> >  }
> >
> > --
> > 2.47.0
>