On Mon, Jan 06, 2025 at 04:17:08PM +1300, Barry Song wrote:
> From: Barry Song <v-songbaohua@xxxxxxxx>
>
> Commit 735ecdfaf4e80 ("mm/vmscan: avoid splitting lazyfree THP during
> shrink_folio_list()") prevents the splitting of MADV_FREE'd THP in madvise.c.
> However, those folios are still added to the deferred_split list in
> try_to_unmap_one() because we are unmapping PTEs and removing rmap entries
> one by one. This approach is not only slow but also increases the risk of a
> race condition where lazyfree folios are incorrectly set back to swapbacked,
> as a speculative folio_get may occur in the shrinker's callback.
>
> This patchset addresses the issue by only marking truly dirty folios as
> swapbacked as suggested by David and shifting to batched unmapping of the
> entire folio in try_to_unmap_one(). As a result, we've observed
> deferred_split dropping to zero and significant performance improvements
> in memory reclamation.

You've not provided any numbers? What performance improvements? Under what
workloads?

You're adding a bunch of complexity here, so I feel like we need to see some
numbers, background, etc.?

Thanks!

>
> Barry Song (3):
>   mm: set folio swapbacked iff folios are dirty in try_to_unmap_one
>   mm: Support tlbbatch flush for a range of PTEs
>   mm: Support batched unmap for lazyfree large folios during reclamation
>
>  arch/arm64/include/asm/tlbflush.h |  26 ++++----
>  arch/arm64/mm/contpte.c           |   2 +-
>  arch/x86/include/asm/tlbflush.h   |   3 +-
>  mm/rmap.c                         | 103 ++++++++++++++++++++----------
>  4 files changed, 85 insertions(+), 49 deletions(-)
>
> --
> 2.39.3 (Apple Git-146)
>