Add a new callback 'ptep_prepare_range' to allow the architecture code
to optimize the modification of multiple page table entries.

The background for the callback is an instruction found on s390.
The IPTE-range instruction can be used to invalidate up to 256 ptes
with a single IPI, including the flush of the TLB entries associated
with the address range. This is similar to the
arch_[enter|leave]_lazy_mmu_mode mechanism, but for a more specific
situation.

ptep_prepare_range is called for the update of a block of ptes.
It is called optimistically; the callback may choose to do nothing.
In that case the individual single-pte operations and the
arch_[enter|leave]_lazy_mmu_mode mechanics need to deal with the
invalidation and the associated TLB flush.

Signed-off-by: Martin Schwidefsky <schwidefsky@xxxxxxxxxx>
---
 include/asm-generic/pgtable.h | 4 ++++
 mm/memory.c                   | 2 ++
 mm/mprotect.c                 | 1 +
 mm/mremap.c                   | 1 +
 4 files changed, 8 insertions(+)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 9401f48..b29f360 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -192,6 +192,10 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres
 }
 #endif
 
+#ifndef ptep_prepare_range
+#define ptep_prepare_range(mm, start, end, ptep, full) do {} while (0)
+#endif
+
 #ifndef __HAVE_ARCH_PMDP_SET_WRPROTECT
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static inline void pmdp_set_wrprotect(struct mm_struct *mm,
diff --git a/mm/memory.c b/mm/memory.c
index 07493e3..eeecb92 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -934,6 +934,7 @@ again:
 	orig_src_pte = src_pte;
 	orig_dst_pte = dst_pte;
 	arch_enter_lazy_mmu_mode();
+	ptep_prepare_range(src_mm, addr, end, src_pte, 0);
 
 	do {
 		/*
@@ -1114,6 +1115,7 @@ again:
 	start_pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
 	pte = start_pte;
 	arch_enter_lazy_mmu_mode();
+	ptep_prepare_range(mm, addr, end, pte, tlb->fullmm);
 	do {
 		pte_t ptent = *pte;
 		if (pte_none(ptent)) {
diff --git a/mm/mprotect.c b/mm/mprotect.c
index b650c54..3fa15b5 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -74,6 +74,7 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 		return 0;
 
 	arch_enter_lazy_mmu_mode();
+	ptep_prepare_range(mm, addr, end, pte, 0);
 	do {
 		oldpte = *pte;
 		if (pte_present(oldpte)) {
diff --git a/mm/mremap.c b/mm/mremap.c
index 3fa0a467..5f4d0af 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -135,6 +135,7 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd,
 	if (new_ptl != old_ptl)
 		spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING);
 	arch_enter_lazy_mmu_mode();
+	ptep_prepare_range(mm, old_addr, old_end, old_pte, 0);
 
 	for (; old_addr < old_end; old_pte++, old_addr += PAGE_SIZE,
 				   new_pte++, new_addr += PAGE_SIZE) {
-- 
2.6.6
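
For illustration only, and not part of the patch: the calling pattern
described in the commit message can be modelled in userspace C roughly
as follows. Everything prefixed with model_ or MODEL_ is hypothetical,
including the invalid bit and the flush counters. The sketch drops the
mm and full arguments of the real macro, and the behaviour of the
architecture override is an assumption, since the patch itself only
adds the generic no-op.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE   4096UL
#define MODEL_PTE_INVALID 0x400UL /* hypothetical invalid bit */
#define MODEL_RANGE_MAX   256     /* "up to 256 ptes" per the commit message */

typedef unsigned long model_pte_t;

static unsigned long model_range_flushes;  /* batched invalidations issued */
static unsigned long model_single_flushes; /* per-pte invalidations issued */

/*
 * Hypothetical architecture override. It is called optimistically and
 * may do nothing, e.g. when the block is empty or too large for one
 * batched operation.
 */
static void model_ptep_prepare_range(unsigned long start, unsigned long end,
				     model_pte_t *ptep)
{
	size_t i, nr;

	assert(end >= start);
	nr = (end - start) / MODEL_PAGE_SIZE;
	if (nr == 0 || nr > MODEL_RANGE_MAX)
		return;
	for (i = 0; i < nr; i++)
		ptep[i] |= MODEL_PTE_INVALID;
	model_range_flushes++; /* one flush covers the whole block */
}

/* Single-pte operation: only flushes if the entry is still valid. */
static void model_ptep_invalidate(model_pte_t *ptep)
{
	if (*ptep & MODEL_PTE_INVALID)
		return;
	*ptep |= MODEL_PTE_INVALID;
	model_single_flushes++;
}

/* Mirrors the shape of the patched loops: prepare once, then walk. */
static void model_update_range(unsigned long start, unsigned long end,
			       model_pte_t *ptep, int use_range)
{
	unsigned long addr;

	if (use_range)
		model_ptep_prepare_range(start, end, ptep);
	for (addr = start; addr < end; addr += MODEL_PAGE_SIZE, ptep++)
		model_ptep_invalidate(ptep);
}
```

With the range callback active, the per-pte path finds every entry
already invalid and issues no further flushes; without it, each entry
is flushed individually, which is the fallback the commit message
describes.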