I wonder if we could make some improvements to zapping pages to reduce
TLB flushes under PTL, and to single threaded pte updates to reduce
atomic operations.

This might require some changes to arch code, particularly the last
patch. I'd just like to see if I've missed something fundamental with
the mm or with pte/tlb behaviour.

Thanks,
Nick

Nicholas Piggin (4):
  mm: munmap optimise single threaded page freeing
  mm: zap_pte_range only flush under ptl if a dirty shared page was
    unmapped
  mm: zap_pte_range optimise fullmm handling for dirty shared pages
  mm: optimise flushing and pte manipulation for single threaded access

 include/asm-generic/tlb.h |  3 +++
 mm/huge_memory.c          |  4 ++--
 mm/madvise.c              |  4 ++--
 mm/memory.c               | 40 ++++++++++++++++++++++++++++++++-------
 4 files changed, 40 insertions(+), 11 deletions(-)

--
2.17.0