[PATCH] mm/khugepaged: Detecting uffd-wp vma more efficiently

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



We forbid merging thps for uffd-wp enabled regions, by breaking the khugepaged
scanning right after we detected a uffd-wp armed pte (either present, or swap).

It works, but it's less efficient, because those ptes only exist for VM_UFFD_WP
enabled VMAs.  Checking against the vma flag would be more efficient, and good
enough.  To be explicit, we could still be able to merge some thps for
VM_UFFD_WP regions before this patch as long as they have zero uffd-wp armed
ptes, however that's not a major target for thp collapse anyways.

This mostly reverts commit e1e267c7928fe387e5e1cffeafb0de2d0473663a, but
instead we do the same check at vma level, so it's not a bugfix.

This also paves the way for file-backed uffd-wp support, as the VM_UFFD_WP flag
will work for file-backed too.

After this patch, the error for khugepaged for these regions will switch from
SCAN_PTE_UFFD_WP to SCAN_VMA_CHECK.

Since uffd minor mode should not allow thp as well, do the same thing for minor
mode to stop early on trying to collapse pages in khugepaged.

Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: Axel Rasmussen <axelrasmussen@xxxxxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Nadav Amit <nadav.amit@xxxxxxxxx>
Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
---

Axel: as I asked in the other thread, please help check whether minor mode will
work properly with shmem thp enabled.  If not, I feel like this patch could be
part of that effort at last, but it's also possible that I missed something.

Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
---
 include/trace/events/huge_memory.h |  1 -
 mm/khugepaged.c                    | 26 +++-----------------------
 2 files changed, 3 insertions(+), 24 deletions(-)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index 4fdb14a81108..53532f5925c3 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -15,7 +15,6 @@
 	EM( SCAN_EXCEED_SWAP_PTE,	"exceed_swap_pte")		\
 	EM( SCAN_EXCEED_SHARED_PTE,	"exceed_shared_pte")		\
 	EM( SCAN_PTE_NON_PRESENT,	"pte_non_present")		\
-	EM( SCAN_PTE_UFFD_WP,		"pte_uffd_wp")			\
 	EM( SCAN_PAGE_RO,		"no_writable_page")		\
 	EM( SCAN_LACK_REFERENCED_PAGE,	"lack_referenced_page")		\
 	EM( SCAN_PAGE_NULL,		"page_null")			\
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 045cc579f724..3afe66d48db0 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -31,7 +31,6 @@ enum scan_result {
 	SCAN_EXCEED_SWAP_PTE,
 	SCAN_EXCEED_SHARED_PTE,
 	SCAN_PTE_NON_PRESENT,
-	SCAN_PTE_UFFD_WP,
 	SCAN_PAGE_RO,
 	SCAN_LACK_REFERENCED_PAGE,
 	SCAN_PAGE_NULL,
@@ -467,6 +466,9 @@ static bool hugepage_vma_check(struct vm_area_struct *vma,
 		return false;
 	if (vma_is_temporary_stack(vma))
 		return false;
+	/* Don't allow thp merging for wp/minor enabled uffd regions */
+	if (userfaultfd_wp(vma) || userfaultfd_minor(vma))
+		return false;
 	return !(vm_flags & VM_NO_KHUGEPAGED);
 }
 
@@ -1246,15 +1248,6 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 		pte_t pteval = *_pte;
 		if (is_swap_pte(pteval)) {
 			if (++unmapped <= khugepaged_max_ptes_swap) {
-				/*
-				 * Always be strict with uffd-wp
-				 * enabled swap entries.  Please see
-				 * comment below for pte_uffd_wp().
-				 */
-				if (pte_swp_uffd_wp(pteval)) {
-					result = SCAN_PTE_UFFD_WP;
-					goto out_unmap;
-				}
 				continue;
 			} else {
 				result = SCAN_EXCEED_SWAP_PTE;
@@ -1270,19 +1263,6 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 				goto out_unmap;
 			}
 		}
-		if (pte_uffd_wp(pteval)) {
-			/*
-			 * Don't collapse the page if any of the small
-			 * PTEs are armed with uffd write protection.
-			 * Here we can also mark the new huge pmd as
-			 * write protected if any of the small ones is
-			 * marked but that could bring unknown
-			 * userfault messages that falls outside of
-			 * the registered range.  So, just be simple.
-			 */
-			result = SCAN_PTE_UFFD_WP;
-			goto out_unmap;
-		}
 		if (pte_write(pteval))
 			writable = true;
 
-- 
2.31.1






[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux