Hi All,

This is a small series in support of my work to enable the use of large
folios for anonymous memory (currently called "FLEXIBLE_THP") [1]. It first
makes it possible to add large, non-pmd-mappable folios to the deferred
split queue. Then it modifies zap_pte_range() to batch-remove spans of
physically contiguous pages from the rmap, which means that in the common
case we never need to put the folio on the deferred split queue, thus
reducing lock contention and improving performance.

This becomes more visible once we have lots of large anonymous folios in
the system, and Huang Ying has suggested that solving this should be a
prerequisite for merging the main FLEXIBLE_THP work.

The series applies on top of v6.5-rc2 and a branch is available at [2].

I don't have a full test run with the latest versions of all the patches on
top of the latest baseline, so I am not posting results formally. I can get
these if people feel they are necessary though. But anecdotally, for the
kernel compilation workload, this series reduces kernel time by ~4% and
reduces real time by ~0.4%, compared with [1].

[1] https://lore.kernel.org/linux-mm/20230714160407.4142030-1-ryan.roberts@xxxxxxx/
[2] https://gitlab.arm.com/linux-arm/linux-rr/-/tree/features/granule_perf/deferredsplit-lkml_v1

Thanks,
Ryan

Ryan Roberts (3):
  mm: Allow deferred splitting of arbitrary large anon folios
  mm: Implement folio_remove_rmap_range()
  mm: Batch-zap large anonymous folio PTE mappings

 include/linux/rmap.h |   2 +
 mm/memory.c          | 119 +++++++++++++++++++++++++++++++++++++++++++
 mm/rmap.c            |  67 +++++++++++++++++++++++-
 3 files changed, 187 insertions(+), 1 deletion(-)

--
2.25.1