Hi,
I have just noticed that pages allocated for demotion targets include
__GFP_KSWAPD_RECLAIM (through GFP_NOWAIT). This has been the case since
the code was introduced by 26aa2d199d6f ("mm/migrate: demote pages
during reclaim"). I suspect the intention is to trigger aging on the
fallback node and either drop or further demote the oldest pages.

This makes sense, but I suspect that it wasn't intended for memcg
triggered reclaim as well. It would mean that memory pressure in one
hierarchy could trigger paging out pages of a different hierarchy if the
demotion target is close to full.

I haven't really checked the current kswapd wake up checks, but I
suspect that kswapd would back off in most cases, so this shouldn't
really cause any big problems. But I guess it would be better to simply
not wake kswapd up for the memcg reclaim. What do you think?
---
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8fcc5fa768c0..1f3161173b85 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1568,7 +1568,7 @@ static struct page *alloc_demote_page(struct page *page, unsigned long private)
  * Folios which are not demoted are left on @demote_folios.
  */
 static unsigned int demote_folio_list(struct list_head *demote_folios,
-				     struct pglist_data *pgdat)
+				     struct pglist_data *pgdat, bool cgroup_reclaim)
 {
 	int target_nid = next_demotion_node(pgdat->node_id);
 	unsigned int nr_succeeded;
@@ -1589,6 +1589,10 @@ static unsigned int demote_folio_list(struct list_head *demote_folios,
 	if (list_empty(demote_folios))
 		return 0;
 
+	/* local memcg reclaim shouldn't directly reclaim from other memcgs */
+	if (cgroup_reclaim)
+		mtc->gfp_mask &= ~__GFP_RECLAIM;
+
 	if (target_nid == NUMA_NO_NODE)
 		return 0;
 
@@ -2066,7 +2070,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 	/* 'folio_list' is always empty here */
 
 	/* Migrate folios selected for demotion */
-	nr_reclaimed += demote_folio_list(&demote_folios, pgdat);
+	nr_reclaimed += demote_folio_list(&demote_folios, pgdat, cgroup_reclaim(sc));
 	/* Folios that could not be demoted are still in @demote_folios */
 	if (!list_empty(&demote_folios)) {
 		/* Folios which weren't demoted go back on @folio_list for retry: */
-- 
Michal Hocko
SUSE Labs