On 9/18/24 9:12 PM, Linus Torvalds wrote: > On Thu, 19 Sept 2024 at 05:03, Linus Torvalds > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: >> >> I think we should just do the simple one-liner of adding a >> "xas_reset()" to after doing xas_split_alloc() (or do it inside the >> xas_split_alloc()). > > .. and obviously that should be actually *verified* to fix the issue > not just with the test-case that Chris and Jens have been using, but > on Christian's real PostgreSQL load. > > Christian? > > Note that the xas_reset() needs to be done after the check for errors > - or like Willy suggested, xas_split_alloc() needs to be re-organized. > > So the simplest fix is probably to just add a > > if (xas_error(&xas)) > goto error; > } > + xas_reset(&xas); > xas_lock_irq(&xas); > xas_for_each_conflict(&xas, entry) { > old = entry; > > in __filemap_add_folio() in mm/filemap.c > > (The above is obviously a whitespace-damaged pseudo-patch for the > pre-6758c1128ceb state. I don't actually carry a stable tree around on > my laptop, but I hope it's clear enough what I'm rambling about) I kicked off a quick run with this on 6.9 with my debug patch as well, and it still fails for me... I'll double check everything is sane. For reference, below is the 6.9 filemap patch. diff --git a/mm/filemap.c b/mm/filemap.c index 30de18c4fd28..88093e2b7256 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -883,6 +883,7 @@ noinline int __filemap_add_folio(struct address_space *mapping, if (order > folio_order(folio)) xas_split_alloc(&xas, xa_load(xas.xa, xas.xa_index), order, gfp); + xas_reset(&xas); xas_lock_irq(&xas); xas_for_each_conflict(&xas, entry) { old = entry; -- Jens Axboe