Here is the third series of patches to mm (and a few architectures), based
on v6.4-rc3 with the preceding two series applied: in which khugepaged
takes advantage of pte_offset_map[_lock]() allowing for pmd transitions.

This follows on from the "arch: allow pte_offset_map[_lock]() to fail"
https://lore.kernel.org/linux-mm/77a5d8c-406b-7068-4f17-23b7ac53bc83@xxxxxxxxxx/
series of 23 posted on 2023-05-09,
and the "mm: allow pte_offset_map[_lock]() to fail"
https://lore.kernel.org/linux-mm/68a97fbe-5c1e-7ac6-72c-7b9c6290b370@xxxxxxxxxx/
series of 31 posted on 2023-05-21.

Those two series were "independent": neither depending for build or
correctness on the other, but both series needed before this third one
can safely make the effective changes.  I'll send v2 of those two series
in a couple of days, incorporating Acks and Revieweds and the minor fixes.

What is it all about?  Some mmap_lock avoidance i.e. latency reduction.
Initially just for the case of collapsing shmem or file pages to THPs:
the usefulness of MADV_COLLAPSE on shmem is being limited by that
mmap_write_lock it currently requires.

Likely to be relied upon later in other contexts e.g. freeing of empty
page tables (but that's not work I'm doing).  mmap_write_lock avoidance
when collapsing to anon THPs?  Perhaps, but again that's not work I've
done: a quick attempt was not as easy as the shmem/file case.

These changes (though of course not these exact patches) have been in
Google's data centre kernel for three years now: we do rely upon them.

Based on the preceding two series over v6.4-rc3, but good over
v6.4-rc[1-4], current mm-everything or current linux-next.

01/12 mm/pgtable: add rcu_read_lock() and rcu_read_unlock()s
02/12 mm/pgtable: add PAE safety to __pte_offset_map()
03/12 arm: adjust_pte() use pte_offset_map_nolock()
04/12 powerpc: assert_pte_locked() use pte_offset_map_nolock()
05/12 powerpc: add pte_free_defer() for pgtables sharing page
06/12 sparc: add pte_free_defer() for pgtables sharing page
07/12 s390: add pte_free_defer(), with use of mmdrop_async()
08/12 mm/pgtable: add pte_free_defer() for pgtable as page
09/12 mm/khugepaged: retract_page_tables() without mmap or vma lock
10/12 mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()
11/12 mm/khugepaged: delete khugepaged_collapse_pte_mapped_thps()
12/12 mm: delete mmap_write_trylock() and vma_try_start_write()

 arch/arm/mm/fault-armv.c            |   3 +-
 arch/powerpc/include/asm/pgalloc.h  |   4 +
 arch/powerpc/mm/pgtable-frag.c      |  18 ++
 arch/powerpc/mm/pgtable.c           |  16 +-
 arch/s390/include/asm/pgalloc.h     |   4 +
 arch/s390/mm/pgalloc.c              |  34 +++
 arch/sparc/include/asm/pgalloc_64.h |   4 +
 arch/sparc/mm/init_64.c             |  16 ++
 include/linux/mm.h                  |  17 --
 include/linux/mm_types.h            |   2 +-
 include/linux/mmap_lock.h           |  10 -
 include/linux/pgtable.h             |   6 +-
 include/linux/sched/mm.h            |   1 +
 kernel/fork.c                       |   2 +-
 mm/khugepaged.c                     | 425 ++++++++----------------------
 mm/pgtable-generic.c                |  44 +++-
 16 files changed, 253 insertions(+), 353 deletions(-)

Hugh