On Tue, Jan 7, 2025 at 6:28 AM Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> wrote:
>
> On Mon, Jan 06, 2025 at 04:17:08PM +1300, Barry Song wrote:
> > From: Barry Song <v-songbaohua@xxxxxxxx>
> >
> > Commit 735ecdfaf4e80 ("mm/vmscan: avoid splitting lazyfree THP during
> > shrink_folio_list()") prevents the splitting of MADV_FREE'd THP in madvise.c.
> > However, those folios are still added to the deferred_split list in
> > try_to_unmap_one() because we are unmapping PTEs and removing rmap entries
> > one by one. This approach is not only slow but also increases the risk of a
> > race condition where lazyfree folios are incorrectly set back to swapbacked,
> > as a speculative folio_get may occur in the shrinker's callback.
> >
> > This patchset addresses the issue by only marking truly dirty folios as
> > swapbacked, as suggested by David, and by shifting to batched unmapping of
> > the entire folio in try_to_unmap_one(). As a result, we've observed
> > deferred_split dropping to zero and significant performance improvements
> > in memory reclamation.
>
> You've not provided any numbers? What performance improvements? Under what
> workloads?

The numbers can be found in patch 3/3 at the following link:
https://lore.kernel.org/linux-mm/20250106031711.82855-4-21cnbao@xxxxxxxxx/

Reclaiming lazyfree mTHP will now be significantly faster. Additionally,
this patchset fixes the misleading split_deferred counter. That counter was
intended to track operations like unaligned unmap/madvise, but in practice
the majority of split_deferred events result from memory reclamation of
aligned lazyfree mTHP, which renders the counter highly misleading.

> You're adding a bunch of complexity here, so I feel like we need to see
> some numbers, background, etc.?

I agree; I can provide more details in v2.
In the meantime, you can find additional background information here:
https://lore.kernel.org/linux-mm/CAGsJ_4wOL6TLa3FKQASdrGfuqqu=14EuxAtpKmnebiGLm0dnfA@xxxxxxxxxxxxxx/

>
> Thanks!
>
> > Barry Song (3):
> >   mm: set folio swapbacked iff folios are dirty in try_to_unmap_one
> >   mm: Support tlbbatch flush for a range of PTEs
> >   mm: Support batched unmap for lazyfree large folios during reclamation
> >
> >  arch/arm64/include/asm/tlbflush.h |  26 ++++----
> >  arch/arm64/mm/contpte.c           |   2 +-
> >  arch/x86/include/asm/tlbflush.h   |   3 +-
> >  mm/rmap.c                         | 103 ++++++++++++++++++++----------
> >  4 files changed, 85 insertions(+), 49 deletions(-)
> >
> > --
> > 2.39.3 (Apple Git-146)

Thanks
Barry