On Sun, 1 Dec 2024 01:12:34 +0900 Seiji Nishikawa <snishika@xxxxxxxxxx> wrote:

> The kernel hangs due to a task stuck in throttle_direct_reclaim(),
> caused by a node being incorrectly deemed balanced despite pressure in
> certain zones, such as ZONE_NORMAL. This issue arises from
> zone_reclaimable_pages() returning 0 for zones without reclaimable
> file-backed or anonymous pages, causing zones like ZONE_DMA32 with
> sufficient free pages to be skipped.
>
> The lack of swap or reclaimable pages results in ZONE_DMA32 being
> ignored during reclaim, masking pressure in other zones. Consequently,
> pgdat->kswapd_failures remains 0 in balance_pgdat(), preventing fallback
> mechanisms in allow_direct_reclaim() from being triggered, leading to an
> infinite loop in throttle_direct_reclaim().
>
> This patch modifies zone_reclaimable_pages() to account for free pages
> (NR_FREE_PAGES) when no other reclaimable pages exist. This ensures
> zones with sufficient free pages are not skipped, enabling proper
> balancing and reclaim behavior.

We'll want to backport a fix for this into -stable kernels.  For that
it's best to be able to identify a suitable Fixes: target, to tell
others whether their kernel needs the fix.  Are you able to help
identify that commit?

Thanks.

> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -374,7 +374,14 @@ unsigned long zone_reclaimable_pages(struct zone *zone)
>  	if (can_reclaim_anon_pages(NULL, zone_to_nid(zone), NULL))
>  		nr += zone_page_state_snapshot(zone, NR_ZONE_INACTIVE_ANON) +
>  			zone_page_state_snapshot(zone, NR_ZONE_ACTIVE_ANON);
> -
> +	/*
> +	 * If there are no reclaimable file-backed or anonymous pages,
> +	 * ensure zones with sufficient free pages are not skipped.
> +	 * This prevents zones like DMA32 from being ignored in reclaim
> +	 * scenarios where they can still help alleviate memory pressure.
> +	 */
> +	if (nr == 0)
> +		nr = zone_page_state_snapshot(zone, NR_FREE_PAGES);
>  	return nr;
>  }
>
> --
> 2.47.0
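
For anyone following the thread, here is a rough userspace sketch of
what the hunk above changes.  It is not kernel code: struct fake_zone
and both helper functions are hypothetical stand-ins for the per-zone
counters read through zone_page_state_snapshot(), and the file-page
term is assumed from the changelog's mention of file-backed pages
rather than shown in the quoted hunk.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the per-zone counters; this is not the
 * kernel's struct zone.
 */
struct fake_zone {
	unsigned long inactive_file;
	unsigned long active_file;
	unsigned long inactive_anon;
	unsigned long active_anon;
	unsigned long free_pages;
};

/*
 * Assumed behaviour before the patch: only file pages and, when anon
 * reclaim is possible (e.g. swap is available), anon pages are counted.
 */
static unsigned long reclaimable_before(const struct fake_zone *z,
					bool can_reclaim_anon)
{
	unsigned long nr = z->inactive_file + z->active_file;

	if (can_reclaim_anon)
		nr += z->inactive_anon + z->active_anon;
	return nr;
}

/*
 * Behaviour with the patch: if nothing above was counted, report the
 * zone's free pages instead so the zone is not treated as empty.
 */
static unsigned long reclaimable_after(const struct fake_zone *z,
				       bool can_reclaim_anon)
{
	unsigned long nr = reclaimable_before(z, can_reclaim_anon);

	if (nr == 0)
		nr = z->free_pages;
	return nr;
}
```

With a zone that has only free pages and no swap, reclaimable_before()
returns 0 while reclaimable_after() returns the free page count, which
is the case the changelog describes for ZONE_DMA32.  A zone that has
any file or reclaimable anon pages gives the same result either way.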