On Thu, Mar 07, 2024 at 04:24:43PM +0000, Ryan Roberts wrote:
> > But if I run only with the deferred split fix and DO NOT revert the other
> > change, everything grinds to a halt when swapping 2M pages.  Sometimes with
> > RCU stalls where I can't even interact on the serial port.  Sometimes (more
> > usually) everything just gets stuck trying to reclaim and allocate memory.
> > And when I kill the jobs, I still have barely any memory in the system -
> > about 10% what I would expect.

(for the benefit of anyone trying to follow along, this is now understood;
it was my missing folio_put() in the 'folio_trylock failed' path)

> I notice that before the commit, large folios are uncharged with
> __mem_cgroup_uncharge() and now they use __mem_cgroup_uncharge_folios().
>
> The former has an upfront check:
>
>     if (!folio_memcg(folio))
>         return;
>
> I'm not exactly sure what that's checking, but could the fact that this is
> missing after the change cause things to go wonky?

Honestly, I think that check is stale.  uncharge_folio() checks the same
thing very early on, so all the upfront check actually saves is a test of
the LRU flag.  It looks like the need for it went away in 2017 with commit
a9d5adeeb4b2c73c, which stopped using page->lru to gather the single page
onto a degenerate list.  I'll try to remember to submit a patch to delete
that check.

By the way, something we could try, to see if the problem goes away, is to
re-narrow the window that I widened.  i.e. something like this:

+++ b/mm/swap.c
@@ -1012,6 +1012,8 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
 			free_huge_folio(folio);
 			continue;
 		}
+		if (folio_test_large(folio) && folio_test_large_rmappable(folio))
+			folio_undo_large_rmappable(folio);
 		__page_cache_release(folio, &lruvec, &flags);
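For anyone following along, the failure mode of that missing folio_put() can be sketched in miniature.  This is plain userspace C with invented names (struct obj, obj_get, obj_put, obj_trylock), not the kernel code itself: a reference taken before a trylock must be dropped on the failure path, or every contended call leaks one reference and the object is never freed.

```c
#include <stdbool.h>

/* Hypothetical miniature of a refcounted object; an analogy for the
 * folio_get()/folio_trylock()/folio_put() pattern, not kernel code. */
struct obj {
	int refcount;
	bool locked;
};

static void obj_get(struct obj *o) { o->refcount++; }
static void obj_put(struct obj *o) { o->refcount--; }

static bool obj_trylock(struct obj *o)
{
	if (o->locked)
		return false;	/* contended: someone else holds the lock */
	o->locked = true;
	return true;
}

/* Buggy: on trylock failure we return without dropping the reference
 * taken at the top, so the refcount never comes back down. */
static void process_buggy(struct obj *o)
{
	obj_get(o);
	if (!obj_trylock(o))
		return;		/* missing obj_put(o) here: leaks a ref */
	/* ... do work under the lock ... */
	o->locked = false;
	obj_put(o);
}

/* Fixed: every exit path pairs the get with a put. */
static void process_fixed(struct obj *o)
{
	obj_get(o);
	if (!obj_trylock(o)) {
		obj_put(o);	/* the fix: drop the ref on the failure path */
		return;
	}
	/* ... do work under the lock ... */
	o->locked = false;
	obj_put(o);
}
```

Under memory pressure each leaked reference pins the folio, which matches the symptom above: kill the jobs and most of the memory still doesn't come back.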
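To illustrate why I call the upfront check stale, here is a toy sketch (invented names, not the actual memcg code) of a wrapper whose early bail-out merely duplicates a check the callee already performs first thing, so removing it cannot change behaviour:

```c
/* Running count of completed uncharges, for demonstration only. */
static int uncharged;

static void inner_uncharge(int memcg)
{
	if (!memcg)		/* the callee already bails out early */
		return;
	uncharged++;
}

/* Wrapper with the now-redundant upfront check ... */
static void outer_uncharge_checked(int memcg)
{
	if (!memcg)		/* stale: duplicates inner_uncharge()'s check */
		return;
	inner_uncharge(memcg);
}

/* ... behaves identically to the wrapper without it. */
static void outer_uncharge_plain(int memcg)
{
	inner_uncharge(memcg);
}
```

The only thing the duplicated check buys is skipping the call itself, which in the real code amounts to a test of the LRU flag.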