On Thu, Mar 07, 2024 at 06:35:16PM +0000, Ryan Roberts wrote:
> I noticed commit dfa3df509576 ("mm: fix list corruption in put_pages_list")
> turned up in mm-unstable today (after I sent the above). Although I haven't done
> much of the exact testing that was previously causing oopses, I also haven't
> seen any since I rebased onto today's mm-unstable. Could that fix be helping us?

I wish.  Wrong list (lru vs deferred), and the symptom of that crash was an
immediate crash, not a deferred one.  Although maybe with the right/wrong
debugging options ...

> > The thought occurs that we don't need to take the folios off the list.
> > I don't know that will fix anything, but this will fix your "running out
> > of memory" problem -- I forgot to drop the reference if folio_trylock()
> > failed.
>
> Ugh, how did I not spot that! So I guess that fits the hypothesis that the
> original change is just increasing the race window and therefore we are leaking
> more folios due to the failed trylock.

It doesn't _confirm_ it, but it certainly fits the theory!

Thanks for testing.
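
For anyone following along, the leak pattern discussed above (a reference
taken before folio_trylock() and never dropped when the trylock fails) can
be sketched in userspace.  This is only a toy model under stated
assumptions, not the kernel code or the actual patch: struct toy_folio,
toy_tryget(), toy_put(), toy_trylock(), toy_unlock() and toy_scan() are all
hypothetical names invented for illustration, and the model is
single-threaded with a plain integer refcount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a folio: only a reference count and a lock bit. */
struct toy_folio {
	int refcount;
	bool locked;
};

/* Take a reference unless the object is already on its way to being freed. */
bool toy_tryget(struct toy_folio *f)
{
	if (f->refcount == 0)
		return false;
	f->refcount++;
	return true;
}

/* Drop a reference previously taken with toy_tryget(). */
void toy_put(struct toy_folio *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/* Non-blocking lock attempt: fails if somebody else holds the lock. */
bool toy_trylock(struct toy_folio *f)
{
	if (f->locked)
		return false;
	f->locked = true;
	return true;
}

void toy_unlock(struct toy_folio *f)
{
	f->locked = false;
}

/*
 * Walk the array in place (nothing is moved to a private list), taking a
 * temporary reference on each entry.  With put_on_trylock_failure == false
 * the temporary reference is leaked whenever the trylock fails; with true
 * it is dropped on that path as well.  Returns the number of entries that
 * were successfully locked and processed.
 */
size_t toy_scan(struct toy_folio *folios, size_t n, bool put_on_trylock_failure)
{
	size_t processed = 0;

	for (size_t i = 0; i < n; i++) {
		struct toy_folio *f = &folios[i];

		if (!toy_tryget(f))
			continue;
		if (!toy_trylock(f)) {
			if (put_on_trylock_failure)
				toy_put(f);	/* the easily forgotten put */
			continue;
		}
		processed++;		/* real work would happen here */
		toy_unlock(f);
		toy_put(f);
	}
	return processed;
}
```

In this model, every scan that hits an already-locked entry without the
extra put leaves that entry's refcount one higher than before, so it can
never reach zero.  A wider race window would mean more failed trylocks and
therefore more leaked references, which is consistent with (but does not
prove) the hypothesis in the thread.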