On Thu, Mar 07, 2024 at 04:24:43PM +0000, Ryan Roberts wrote:
> > But if I run only with the deferred split fix and DO NOT revert the other
> > change, everything grinds to a halt when swapping 2M pages.  Sometimes with
> > RCU stalls where I can't even interact on the serial port.  Sometimes (more
> > usually) everything just gets stuck trying to reclaim and allocate memory.
> > And when I kill the jobs, I still have barely any memory in the system -
> > about 10% what I would expect.

(for the benefit of anyone trying to follow along, this is now understood;
it was my missing folio_put() in the 'folio_trylock failed' path)

> I notice that before the commit, large folios are uncharged with
> __mem_cgroup_uncharge() and now they use __mem_cgroup_uncharge_folios().
>
> The former has an upfront check:
>
>     if (!folio_memcg(folio))
>         return;
>
> I'm not exactly sure what that's checking, but could the fact that this is
> missing after the change cause things to go wonky?

Honestly, I think that check is stale.  uncharge_folio() checks the same
thing very early on, so all the upfront check actually saves is a test of
the LRU flag.  It looks like the need for it went away in 2017 with commit
a9d5adeeb4b2c73c, which stopped using page->lru to gather the single page
onto a degenerate list.  I'll try to remember to submit a patch to delete
that check.

By the way, something we could try, to see if the problem goes away, is to
re-narrow the window that I widened.  i.e. something like this:

+++ b/mm/swap.c
@@ -1012,6 +1012,8 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
 			free_huge_folio(folio);
 			continue;
 		}
+		if (folio_test_large(folio) && folio_test_large_rmappable(folio))
+			folio_undo_large_rmappable(folio);
 		__page_cache_release(folio, &lruvec, &flags);
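For anyone following along, the failure mode of that missing folio_put() can be sketched in miniature.  This is plain userspace C with invented names (struct obj, obj_get, obj_put, obj_trylock), not the kernel code itself: a reference taken before a trylock must be dropped on the failure path, or every contended call leaks one reference and the object is never freed.

```c
#include <stdbool.h>

/* Hypothetical miniature of a refcounted object; an analogy for the
 * folio_get()/folio_trylock()/folio_put() pattern, not kernel code. */
struct obj {
	int refcount;
	bool locked;
};

static void obj_get(struct obj *o) { o->refcount++; }
static void obj_put(struct obj *o) { o->refcount--; }

static bool obj_trylock(struct obj *o)
{
	if (o->locked)
		return false;	/* contended: someone else holds the lock */
	o->locked = true;
	return true;
}

/* Buggy: on trylock failure we return without dropping the reference
 * taken at the top, so the refcount never comes back down. */
static void process_buggy(struct obj *o)
{
	obj_get(o);
	if (!obj_trylock(o))
		return;		/* missing obj_put(o) here: leaks a ref */
	/* ... do work under the lock ... */
	o->locked = false;
	obj_put(o);
}

/* Fixed: every exit path pairs the get with a put. */
static void process_fixed(struct obj *o)
{
	obj_get(o);
	if (!obj_trylock(o)) {
		obj_put(o);	/* the fix: drop the ref on the failure path */
		return;
	}
	/* ... do work under the lock ... */
	o->locked = false;
	obj_put(o);
}
```

Under memory pressure each leaked reference pins the folio, which matches the symptom above: kill the jobs and most of the memory still doesn't come back.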
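To illustrate why I call the upfront check stale, here is a toy sketch (invented names, not the actual memcg code) of a wrapper whose early bail-out merely duplicates a check the callee already performs first thing, so removing it cannot change behaviour:

```c
/* Running count of completed uncharges, for demonstration only. */
static int uncharged;

static void inner_uncharge(int memcg)
{
	if (!memcg)		/* the callee already bails out early */
		return;
	uncharged++;
}

/* Wrapper with the now-redundant upfront check ... */
static void outer_uncharge_checked(int memcg)
{
	if (!memcg)		/* stale: duplicates inner_uncharge()'s check */
		return;
	inner_uncharge(memcg);
}

/* ... behaves identically to the wrapper without it. */
static void outer_uncharge_plain(int memcg)
{
	inner_uncharge(memcg);
}
```

The only thing the duplicated check buys is skipping the call itself, which in the real code amounts to a test of the LRU flag.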