+ mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2.patch added to -mm tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The patch titled
     Subject: mm/mmu_notifier: contextual information for event triggering invalidation v2
has been added to the -mm tree.  Its filename is
     mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Jérôme Glisse <jglisse@xxxxxxxxxx>
Subject: mm/mmu_notifier: contextual information for event triggering invalidation v2

CPU page table update can happens for many reasons, not only as a result
of a syscall (munmap(), mprotect(), mremap(), madvise(), ...) but also as
a result of kernel activities (memory compression, reclaim, migration,
...).

Users of mmu notifier API track changes to the CPU page table and take
specific action for them.  While current API only provide range of virtual
address affected by the change, not why the changes is happening.

This patchset adds event information so that users of mmu notifier can
differentiate among broad category:
    - UNMAP: munmap() or mremap()
    - CLEAR: page table is cleared (migration, compaction, reclaim, ...)
    - PROTECTION_VMA: change in access protections for the range
    - PROTECTION_PAGE: change in access protections for page in the range
    - SOFT_DIRTY: soft dirtyness tracking

Being able to identify munmap() and mremap() from other reasons why the
page table is cleared is important to allow user of mmu notifier to
update their own internal tracking structure accordingly (on munmap or
mremap it is not longer needed to track range of virtual address as it
becomes invalid).

Link: http://lkml.kernel.org/r/20181205053628.3210-4-jglisse@xxxxxxxxxx
Signed-off-by: Jérôme Glisse <jglisse@xxxxxxxxxx>
Acked-by: Christian König <christian.koenig@xxxxxxx>
Cc: Matthew Wilcox <mawilcox@xxxxxxxxxxxxx>
Cc: Ross Zwisler <zwisler@xxxxxxxxxx>
Cc: Jan Kara <jack@xxxxxxx>
Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Cc: Radim Krcmar <rkrcmar@xxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: Felix Kuehling <felix.kuehling@xxxxxxx>
Cc: Ralph Campbell <rcampbell@xxxxxxxxxx>
Cc: John Hubbard <jhubbard@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 fs/dax.c                     |    7 ++++++
 fs/proc/task_mmu.c           |    3 +-
 include/linux/mmu_notifier.h |   35 +++++++++++++++++++++++++++++++--
 kernel/events/uprobes.c      |    3 +-
 mm/huge_memory.c             |   12 +++++++----
 mm/hugetlb.c                 |   10 +++++----
 mm/khugepaged.c              |    3 +-
 mm/ksm.c                     |    6 +++--
 mm/madvise.c                 |    3 +-
 mm/memory.c                  |   18 +++++++++++-----
 mm/migrate.c                 |    5 ++--
 mm/mprotect.c                |    3 +-
 mm/mremap.c                  |    3 +-
 mm/oom_kill.c                |    2 -
 mm/rmap.c                    |    6 +++--
 15 files changed, 90 insertions(+), 29 deletions(-)

--- a/fs/dax.c~mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2
+++ a/fs/dax.c
@@ -769,6 +769,13 @@ static void dax_entry_mkclean(struct add
 		address = pgoff_address(index, vma);
 
 		/*
+		 * All the field are populated by follow_pte_pmd() except
+		 * the event field.
+		 */
+		mmu_notifier_range_init(&range, NULL, 0, -1UL,
+					MMU_NOTIFY_PROTECTION_PAGE);
+
+		/*
 		 * Note because we provide start/end to follow_pte_pmd it will
 		 * call mmu_notifier_invalidate_range_start() on our behalf
 		 * before taking any lock.
--- a/fs/proc/task_mmu.c~mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2
+++ a/fs/proc/task_mmu.c
@@ -1153,7 +1153,8 @@ static ssize_t clear_refs_write(struct f
 				break;
 			}
 
-			mmu_notifier_range_init(&range, mm, 0, -1UL);
+			mmu_notifier_range_init(&range, mm, 0, -1UL,
+						MMU_NOTIFY_SOFT_DIRTY);
 			mmu_notifier_invalidate_range_start(&range);
 		}
 		walk_page_range(0, mm->highest_vm_end, &clear_refs_walk);
--- a/include/linux/mmu_notifier.h~mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2
+++ a/include/linux/mmu_notifier.h
@@ -23,10 +23,39 @@ struct mmu_notifier_mm {
 	spinlock_t lock;
 };
 
+/**
+ * enum mmu_notifier_event - reason for the mmu notifier callback
+ * @MMU_NOTIFY_UNMAP: either munmap() that unmap the range or a mremap() that
+ * move the range
+ *
+ * @MMU_NOTIFY_CLEAR: clear page table entry (many reasons for this like
+ * madvise() or replacing a page by another one, ...).
+ *
+ * @MMU_NOTIFY_PROTECTION_VMA: update is due to protection change for the range
+ * ie using the vma access permission (vm_page_prot) to update the whole range
+ * is enough no need to inspect changes to the CPU page table (mprotect()
+ * syscall)
+ *
+ * @MMU_NOTIFY_PROTECTION_PAGE: update is due to change in read/write flag for
+ * pages in the range so to mirror those changes the user must inspect the CPU
+ * page table (from the end callback).
+ *
+ * @MMU_NOTIFY_SOFT_DIRTY: soft dirty accounting (still same page and same
+ * access flags)
+ */
+enum mmu_notifier_event {
+	MMU_NOTIFY_UNMAP = 0,
+	MMU_NOTIFY_CLEAR,
+	MMU_NOTIFY_PROTECTION_VMA,
+	MMU_NOTIFY_PROTECTION_PAGE,
+	MMU_NOTIFY_SOFT_DIRTY,
+};
+
 struct mmu_notifier_range {
 	struct mm_struct *mm;
 	unsigned long start;
 	unsigned long end;
+	enum mmu_notifier_event event;
 	bool blockable;
 };
 
@@ -320,11 +349,13 @@ static inline void mmu_notifier_mm_destr
 static inline void mmu_notifier_range_init(struct mmu_notifier_range *range,
 					   struct mm_struct *mm,
 					   unsigned long start,
-					   unsigned long end)
+					   unsigned long end,
+					   enum mmu_notifier_event event)
 {
 	range->mm = mm;
 	range->start = start;
 	range->end = end;
+	range->event = event;
 }
 
 #define ptep_clear_flush_young_notify(__vma, __address, __ptep)		\
@@ -452,7 +483,7 @@ static inline void _mmu_notifier_range_i
 	range->end = end;
 }
 
-#define mmu_notifier_range_init(range, mm, start, end) \
+#define mmu_notifier_range_init(range, mm, start, end, event) \
 	_mmu_notifier_range_init(range, start, end)
 
 
--- a/kernel/events/uprobes.c~mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2
+++ a/kernel/events/uprobes.c
@@ -174,7 +174,8 @@ static int __replace_page(struct vm_area
 	struct mmu_notifier_range range;
 	struct mem_cgroup *memcg;
 
-	mmu_notifier_range_init(&range, mm, addr, addr + PAGE_SIZE);
+	mmu_notifier_range_init(&range, mm, addr, addr + PAGE_SIZE,
+				MMU_NOTIFY_CLEAR);
 
 	VM_BUG_ON_PAGE(PageTransHuge(old_page), old_page);
 
--- a/mm/huge_memory.c~mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2
+++ a/mm/huge_memory.c
@@ -1173,7 +1173,8 @@ static vm_fault_t do_huge_pmd_wp_page_fa
 	}
 
 	mmu_notifier_range_init(&range, vma->vm_mm, haddr,
-				haddr + HPAGE_PMD_SIZE);
+				haddr + HPAGE_PMD_SIZE,
+				MMU_NOTIFY_CLEAR);
 	mmu_notifier_invalidate_range_start(&range);
 
 	vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
@@ -1337,7 +1338,8 @@ alloc:
 	__SetPageUptodate(new_page);
 
 	mmu_notifier_range_init(&range, vma->vm_mm, haddr,
-				haddr + HPAGE_PMD_SIZE);
+				haddr + HPAGE_PMD_SIZE,
+				MMU_NOTIFY_CLEAR);
 	mmu_notifier_invalidate_range_start(&range);
 
 	spin_lock(vmf->ptl);
@@ -2015,7 +2017,8 @@ void __split_huge_pud(struct vm_area_str
 	struct mmu_notifier_range range;
 
 	mmu_notifier_range_init(&range, vma->vm_mm, address & HPAGE_PUD_MASK,
-				(address & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE);
+				(address & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE,
+				MMU_NOTIFY_CLEAR);
 	mmu_notifier_invalidate_range_start(&range);
 	ptl = pud_lock(vma->vm_mm, pud);
 	if (unlikely(!pud_trans_huge(*pud) && !pud_devmap(*pud)))
@@ -2234,7 +2237,8 @@ void __split_huge_pmd(struct vm_area_str
 	struct mmu_notifier_range range;
 
 	mmu_notifier_range_init(&range, vma->vm_mm, address & HPAGE_PMD_MASK,
-				(address & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE);
+				(address & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE,
+				MMU_NOTIFY_CLEAR);
 	mmu_notifier_invalidate_range_start(&range);
 	ptl = pmd_lock(vma->vm_mm, pmd);
 
--- a/mm/hugetlb.c~mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2
+++ a/mm/hugetlb.c
@@ -3248,7 +3248,7 @@ int copy_hugetlb_page_range(struct mm_st
 
 	if (cow) {
 		mmu_notifier_range_init(&range, src, vma->vm_start,
-					vma->vm_end);
+					vma->vm_end, MMU_NOTIFY_CLEAR);
 		mmu_notifier_invalidate_range_start(&range);
 	}
 
@@ -3375,7 +3375,7 @@ void __unmap_hugepage_range(struct mmu_g
 	/*
 	 * If sharing possible, alert mmu notifiers of worst case.
 	 */
-	mmu_notifier_range_init(&range, mm, start, end);
+	mmu_notifier_range_init(&range, mm, start, end, MMU_NOTIFY_CLEAR);
 	adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end);
 	mmu_notifier_invalidate_range_start(&range);
 	address = start;
@@ -3643,7 +3643,8 @@ retry_avoidcopy:
 	__SetPageUptodate(new_page);
 	set_page_huge_active(new_page);
 
-	mmu_notifier_range_init(&range, mm, haddr, haddr + huge_page_size(h));
+	mmu_notifier_range_init(&range, mm, haddr, haddr + huge_page_size(h),
+				MMU_NOTIFY_CLEAR);
 	mmu_notifier_invalidate_range_start(&range);
 
 	/*
@@ -4383,7 +4384,8 @@ unsigned long hugetlb_change_protection(
 	 * start/end.  Set range.start/range.end to cover the maximum possible
 	 * range if PMD sharing is possible.
 	 */
-	mmu_notifier_range_init(&range, mm, start, end);
+	mmu_notifier_range_init(&range, mm, start, end,
+				MMU_NOTIFY_PROTECTION_VMA);
 	adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end);
 
 	BUG_ON(address >= end);
--- a/mm/khugepaged.c~mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2
+++ a/mm/khugepaged.c
@@ -1016,7 +1016,8 @@ static void collapse_huge_page(struct mm
 	pte = pte_offset_map(pmd, address);
 	pte_ptl = pte_lockptr(mm, pmd);
 
-	mmu_notifier_range_init(&range, mm, address, address + HPAGE_PMD_SIZE);
+	mmu_notifier_range_init(&range, mm, address, address + HPAGE_PMD_SIZE,
+				MMU_NOTIFY_CLEAR);
 	mmu_notifier_invalidate_range_start(&range);
 	pmd_ptl = pmd_lock(mm, pmd); /* probably unnecessary */
 	/*
--- a/mm/ksm.c~mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2
+++ a/mm/ksm.c
@@ -1065,7 +1065,8 @@ static int write_protect_page(struct vm_
 	BUG_ON(PageTransCompound(page));
 
 	mmu_notifier_range_init(&range, mm, pvmw.address,
-				pvmw.address + PAGE_SIZE);
+				pvmw.address + PAGE_SIZE,
+				MMU_NOTIFY_CLEAR);
 	mmu_notifier_invalidate_range_start(&range);
 
 	if (!page_vma_mapped_walk(&pvmw))
@@ -1152,7 +1153,8 @@ static int replace_page(struct vm_area_s
 	if (!pmd)
 		goto out;
 
-	mmu_notifier_range_init(&range, mm, addr, addr + PAGE_SIZE);
+	mmu_notifier_range_init(&range, mm, addr, addr + PAGE_SIZE,
+				MMU_NOTIFY_CLEAR);
 	mmu_notifier_invalidate_range_start(&range);
 
 	ptep = pte_offset_map_lock(mm, pmd, addr, &ptl);
--- a/mm/madvise.c~mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2
+++ a/mm/madvise.c
@@ -472,7 +472,8 @@ static int madvise_free_single_vma(struc
 	range.end = min(vma->vm_end, end_addr);
 	if (range.end <= vma->vm_start)
 		return -EINVAL;
-	mmu_notifier_range_init(&range, mm, range.start, range.end);
+	mmu_notifier_range_init(&range, mm, range.start, range.end,
+				MMU_NOTIFY_CLEAR);
 
 	lru_add_drain();
 	tlb_gather_mmu(&tlb, mm, range.start, range.end);
--- a/mm/memory.c~mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2
+++ a/mm/memory.c
@@ -1009,7 +1009,8 @@ int copy_page_range(struct mm_struct *ds
 	is_cow = is_cow_mapping(vma->vm_flags);
 
 	if (is_cow) {
-		mmu_notifier_range_init(&range, src_mm, addr, end);
+		mmu_notifier_range_init(&range, src_mm, addr, end,
+					MMU_NOTIFY_PROTECTION_PAGE);
 		mmu_notifier_invalidate_range_start(&range);
 	}
 
@@ -1333,7 +1334,8 @@ void unmap_vmas(struct mmu_gather *tlb,
 {
 	struct mmu_notifier_range range;
 
-	mmu_notifier_range_init(&range, vma->vm_mm, start_addr, end_addr);
+	mmu_notifier_range_init(&range, vma->vm_mm, start_addr,
+				end_addr, MMU_NOTIFY_UNMAP);
 	mmu_notifier_invalidate_range_start(&range);
 	for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next)
 		unmap_single_vma(tlb, vma, start_addr, end_addr, NULL);
@@ -1355,7 +1357,8 @@ void zap_page_range(struct vm_area_struc
 	struct mmu_gather tlb;
 
 	lru_add_drain();
-	mmu_notifier_range_init(&range, vma->vm_mm, start, start + size);
+	mmu_notifier_range_init(&range, vma->vm_mm, start,
+				start + size, MMU_NOTIFY_CLEAR);
 	tlb_gather_mmu(&tlb, vma->vm_mm, start, range.end);
 	update_hiwater_rss(vma->vm_mm);
 	mmu_notifier_invalidate_range_start(&range);
@@ -1381,7 +1384,8 @@ static void zap_page_range_single(struct
 	struct mmu_gather tlb;
 
 	lru_add_drain();
-	mmu_notifier_range_init(&range, vma->vm_mm, address, address + size);
+	mmu_notifier_range_init(&range, vma->vm_mm, address,
+				address + size, MMU_NOTIFY_CLEAR);
 	tlb_gather_mmu(&tlb, vma->vm_mm, address, range.end);
 	update_hiwater_rss(vma->vm_mm);
 	mmu_notifier_invalidate_range_start(&range);
@@ -2272,7 +2276,8 @@ static vm_fault_t wp_page_copy(struct vm
 	__SetPageUptodate(new_page);
 
 	mmu_notifier_range_init(&range, mm, vmf->address & PAGE_MASK,
-				(vmf->address & PAGE_MASK) + PAGE_SIZE);
+				(vmf->address & PAGE_MASK) + PAGE_SIZE,
+				MMU_NOTIFY_CLEAR);
 	mmu_notifier_invalidate_range_start(&range);
 
 	/*
@@ -4061,7 +4066,8 @@ static int __follow_pte_pmd(struct mm_st
 
 		if (range) {
 			mmu_notifier_range_init(range, mm, address & PMD_MASK,
-					     (address & PMD_MASK) + PMD_SIZE);
+					     (address & PMD_MASK) + PMD_SIZE,
+					     range->event);
 			mmu_notifier_invalidate_range_start(range);
 		}
 		*ptlp = pmd_lock(mm, pmd);
--- a/mm/migrate.c~mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2
+++ a/mm/migrate.c
@@ -2323,7 +2323,7 @@ static void migrate_vma_collect(struct m
 	mm_walk.private = migrate;
 
 	mmu_notifier_range_init(&range, mm_walk.mm, migrate->start,
-				migrate->end);
+				migrate->end, MMU_NOTIFY_CLEAR);
 	mmu_notifier_invalidate_range_start(&range);
 	walk_page_range(migrate->start, migrate->end, &mm_walk);
 	mmu_notifier_invalidate_range_end(&range);
@@ -2732,7 +2732,8 @@ static void migrate_vma_pages(struct mig
 				notified = true;
 
 				mmu_notifier_range_init(&range, mm, addr,
-							migrate->end);
+							migrate->end,
+							MMU_NOTIFY_CLEAR);
 				mmu_notifier_invalidate_range_start(&range);
 			}
 			migrate_vma_insert_page(migrate, addr, newpage,
--- a/mm/mprotect.c~mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2
+++ a/mm/mprotect.c
@@ -185,7 +185,8 @@ static inline unsigned long change_pmd_r
 
 		/* invoke the mmu notifier if the pmd is populated */
 		if (!range.start) {
-			mmu_notifier_range_init(&range, vma->vm_mm, addr, end);
+			mmu_notifier_range_init(&range, vma->vm_mm, addr, end,
+						MMU_NOTIFY_PROTECTION_VMA);
 			mmu_notifier_invalidate_range_start(&range);
 		}
 
--- a/mm/mremap.c~mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2
+++ a/mm/mremap.c
@@ -203,7 +203,8 @@ unsigned long move_page_tables(struct vm
 	old_end = old_addr + len;
 	flush_cache_range(vma, old_addr, old_end);
 
-	mmu_notifier_range_init(&range, vma->vm_mm, old_addr, old_end);
+	mmu_notifier_range_init(&range, vma->vm_mm, old_addr,
+				old_end, MMU_NOTIFY_UNMAP);
 	mmu_notifier_invalidate_range_start(&range);
 
 	for (; old_addr < old_end; old_addr += extent, new_addr += extent) {
--- a/mm/oom_kill.c~mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2
+++ a/mm/oom_kill.c
@@ -532,7 +532,7 @@ bool __oom_reap_task_mm(struct mm_struct
 			struct mmu_gather tlb;
 
 			mmu_notifier_range_init(&range, mm, vma->vm_start,
-						vma->vm_end);
+						vma->vm_end, MMU_NOTIFY_CLEAR);
 			tlb_gather_mmu(&tlb, mm, range.start, range.end);
 			if (mmu_notifier_invalidate_range_start_nonblock(&range)) {
 				tlb_finish_mmu(&tlb, range.start, range.end);
--- a/mm/rmap.c~mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2
+++ a/mm/rmap.c
@@ -898,7 +898,8 @@ static bool page_mkclean_one(struct page
 	 */
 	mmu_notifier_range_init(&range, vma->vm_mm, address,
 				min(vma->vm_end, address +
-				    (PAGE_SIZE << compound_order(page))));
+				    (PAGE_SIZE << compound_order(page))),
+				MMU_NOTIFY_PROTECTION_PAGE);
 	mmu_notifier_invalidate_range_start(&range);
 
 	while (page_vma_mapped_walk(&pvmw)) {
@@ -1373,7 +1374,8 @@ static bool try_to_unmap_one(struct page
 	 */
 	mmu_notifier_range_init(&range, vma->vm_mm, vma->vm_start,
 				min(vma->vm_end, vma->vm_start +
-				    (PAGE_SIZE << compound_order(page))));
+				    (PAGE_SIZE << compound_order(page))),
+				MMU_NOTIFY_CLEAR);
 	if (PageHuge(page)) {
 		/*
 		 * If sharing is possible, start and end will be adjusted
_

Patches currently in -mm which might be from jglisse@xxxxxxxxxx are

mm-mmu_notifier-use-structure-for-invalidate_range_start-end-callback.patch
mm-mmu_notifier-use-structure-for-invalidate_range_start-end-calls-v2.patch
mm-mmu_notifier-contextual-information-for-event-triggering-invalidation-v2.patch




[Index of Archives]     [Kernel Archive]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]

  Powered by Linux