From: Jérôme Glisse <jglisse@xxxxxxxxxx>

When notifying a change for a range, use the MMU_NOTIFIER_USE_CHANGE_PTE
flag for page table updates that use set_pte_at_notify() and where we are
going either from read and write to read only with the same pfn, or from
read only to read and write with a new pfn.

Note that set_pte_at_notify() itself should only be used in rare cases,
i.e. we do not want to use it when we are updating a significant range of
virtual addresses and thus a significant number of ptes. Instead, for
those cases the event provided to the mmu notifier
invalidate_range_start() callback should be used for optimization.

Signed-off-by: Jérôme Glisse <jglisse@xxxxxxxxxx>
Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: Peter Xu <peterx@xxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
Cc: kvm@xxxxxxxxxxxxxxx
---
 include/linux/mmu_notifier.h | 13 +++++++++++++
 mm/ksm.c                     |  6 ++++--
 mm/memory.c                  |  3 ++-
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index d7a35975c2bd..0885bf33dc9c 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -43,6 +43,19 @@ enum mmu_notifier_event {
 };
 
 #define MMU_NOTIFIER_EVENT_BITS order_base_2(MMU_NOTIFY_EVENT_MAX)
+/*
+ * Set MMU_NOTIFIER_USE_CHANGE_PTE only when the page table is updated with
+ * set_pte_at_notify() and when the pte is updated from read and write to read
+ * only with the same pfn or from read only to read and write with a different
+ * pfn. It is illegal to set it in any other circumstances.
+ *
+ * Note that set_pte_at_notify() should not be used outside of the above cases.
+ * When updating a range in batch (like write protecting a range) it is better
+ * to rely on invalidate_range_start() and struct mmu_notifier_range to infer
+ * the kind of update that is happening (as an example you can look at the
+ * mmu_notifier_range_update_to_read_only() function).
+ */
+#define MMU_NOTIFIER_USE_CHANGE_PTE (1 << MMU_NOTIFIER_EVENT_BITS)
 
 #ifdef CONFIG_MMU_NOTIFIER
 
diff --git a/mm/ksm.c b/mm/ksm.c
index 97757c5fa15f..b7fb7b560cc0 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1051,7 +1051,8 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page,
 
 	BUG_ON(PageTransCompound(page));
 
-	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, vma, mm,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR |
+				MMU_NOTIFIER_USE_CHANGE_PTE, vma, mm,
 				pvmw.address,
 				pvmw.address + PAGE_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
@@ -1140,7 +1141,8 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
 	if (!pmd)
 		goto out;
 
-	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, vma, mm, addr,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR |
+				MMU_NOTIFIER_USE_CHANGE_PTE, vma, mm, addr,
 				addr + PAGE_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 
diff --git a/mm/memory.c b/mm/memory.c
index a8c6922526f6..daf4b0f92af8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2275,7 +2275,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 
 	__SetPageUptodate(new_page);
 
-	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, vma, mm,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR |
+				MMU_NOTIFIER_USE_CHANGE_PTE, vma, mm,
 				vmf->address & PAGE_MASK,
 				(vmf->address & PAGE_MASK) + PAGE_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
-- 
2.17.1
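
Illustration only, not part of the patch: a minimal userspace sketch of how
a flag defined as (1 << MMU_NOTIFIER_EVENT_BITS) sits just above the event
bits, and how a listener might split the two apart again. Everything named
mock_* or MOCK_* is hypothetical; the event list is an assumption and does
not mirror the real enum mmu_notifier_event, and mock_order_base_2() is only
a stand-in for the kernel's order_base_2().

```c
#include <assert.h>
#include <stdbool.h>

/* ASSUMPTION: illustrative event list, not the kernel's real enum. */
enum mock_event {
	MOCK_NOTIFY_UNMAP = 0,
	MOCK_NOTIFY_CLEAR,
	MOCK_NOTIFY_PROTECTION,
	MOCK_NOTIFY_EVENT_MAX
};

/* Stand-in for order_base_2(): smallest order with (1 << order) >= n. */
static inline unsigned int mock_order_base_2(unsigned int n)
{
	unsigned int order = 0;

	while ((1u << order) < n)
		order++;
	return order;
}

/* The flag occupies the first bit above those needed to encode an event. */
static inline unsigned int mock_use_change_pte_flag(void)
{
	return 1u << mock_order_base_2(MOCK_NOTIFY_EVENT_MAX);
}

/* Combine an event type with the optional flag, as the call sites do with |. */
static inline unsigned int mock_range_event(enum mock_event type,
					    bool use_change_pte)
{
	assert(type < MOCK_NOTIFY_EVENT_MAX);
	return (unsigned int)type |
	       (use_change_pte ? mock_use_change_pte_flag() : 0u);
}

/* True if the caller promised to update the pte via set_pte_at_notify(). */
static inline bool mock_uses_change_pte(unsigned int event)
{
	return (event & mock_use_change_pte_flag()) != 0;
}

/* Strip the flag and return only the event type stored in the low bits. */
static inline unsigned int mock_event_type(unsigned int event)
{
	return event & (mock_use_change_pte_flag() - 1u);
}
```

With three mock events the type needs two bits, so the flag lands on bit 2.
A listener could then, hypothetically, check mock_uses_change_pte() in its
invalidate_range_start() handling and use mock_event_type() to recover the
event; whether any real listener behaves this way is not stated in this patch.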