Re: [PATCH 15/33] userfaultfd: hugetlbfs: add __mcopy_atomic_hugetlb for huge page UFFDIO_COPY

On 11/04/2016 12:36 PM, Andrea Arcangeli wrote:
> This is the current status, I'm sending a full diff against the
> previous submit for review of the latest updates. It's easier to
> review incrementally I think.
> 
> Please test it, I updated the aa.git tree userfault branch in sync
> with this.
> 

Hello Andrea,

I found a couple more issues with hugetlbfs support.  The below patch
describes and addresses the issues.  It is against your aa tree. Do note
that there is a patch going into the -mm tree that is a prerequisite for this
patch.  The patch is "mm/hugetlb: fix huge page reservation leak in
private mapping error paths".
http://marc.info/?l=linux-mm&m=147693310409312&w=2

-- 
Mike Kravetz

From: Mike Kravetz <mike.kravetz@xxxxxxxxxx>

userfaultfd: hugetlbfs: fix __mcopy_atomic_hugetlb retry/error processing

The new routine copy_huge_page_from_user() uses kmap_atomic() to map
PAGE_SIZE pages.  However, this prevents page faults in the subsequent
call to copy_from_user().  This is OK when the routine is called with
mmap_sem held.  However, in other cases we want to allow page faults.
So, add a new argument, allow_pagefault, to indicate whether the
routine should allow page faults.
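To illustrate the difference, here is a small userspace model of the copy
loop (illustrative only, not kernel code; model_copy_huge_page, src_resident,
and the fixed 4K PAGE_SIZE are stand-ins).  With faults suppressed, a
non-resident source page produces a short copy and the loop bails out with a
nonzero byte count, matching the retry/fallback path; with faults allowed,
the copy completes:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096UL

/*
 * In the kernel, kmap_atomic() runs with page faults disabled, so
 * copy_from_user() returns a nonzero "bytes not copied" count if the
 * user source page is not resident; kmap() permits sleeping, so the
 * fault handler can run and fill the page.  Here "src_resident"
 * stands in for a source page that is already faulted in.
 */
static long model_copy_huge_page(char *dst, const char *src,
				 unsigned int pages_per_huge_page,
				 int allow_pagefault, int src_resident)
{
	unsigned long ret_val = pages_per_huge_page * PAGE_SIZE;
	unsigned long rc;
	unsigned int i;

	for (i = 0; i < pages_per_huge_page; i++) {
		if (!allow_pagefault && !src_resident) {
			rc = PAGE_SIZE;	/* short copy: fault suppressed */
		} else {
			memcpy(dst + i * PAGE_SIZE, src + i * PAGE_SIZE,
			       PAGE_SIZE);
			rc = 0;		/* full page copied */
		}

		ret_val -= (PAGE_SIZE - rc);	/* subtract bytes copied */
		if (rc)
			break;	/* caller retries outside mmap_sem */
	}
	return (long)ret_val;	/* 0 only if everything was copied */
}
```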

A patch (mm/hugetlb: fix huge page reservation leak in private mapping
error paths) was recently submitted and is being added to the -mm tree.
It addresses an issue with huge page reservations when a huge page is
allocated and freed before being instantiated in an address space.  This
would typically happen in error paths.  The routine
__mcopy_atomic_hugetlb has such an error path, so it will need to call
restore_reserve_on_error() before freeing the huge page.
restore_reserve_on_error() is currently only visible in mm/hugetlb.c.
So, add it to a header file so that it can be used in mm/userfaultfd.c.
Another option would be to move __mcopy_atomic_hugetlb into mm/hugetlb.c.
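The leak can be shown with a toy userspace model of the reservation
accounting (illustrative only, not kernel code; model_pool and the model_*
helpers are made-up stand-ins).  Allocation consumes a reservation, and
free_huge_page() only restores it once the page has been instantiated, so an
error path that frees a never-instantiated page must put the reservation
back by hand:

```c
#include <assert.h>

/*
 * Toy model of huge page reservation accounting.  Allocating a page
 * consumes one reservation; freeing restores it automatically only if
 * the page was instantiated in an address space.  An error path that
 * frees a never-instantiated page must call the restore helper first,
 * or the reservation is leaked.
 */
struct model_pool {
	long resv;	/* outstanding huge page reservations */
};

static void model_alloc_huge_page(struct model_pool *p)
{
	p->resv--;	/* allocation consumes a reservation */
}

static void model_restore_reserve_on_error(struct model_pool *p)
{
	p->resv++;	/* give the unused reservation back */
}

/* Error path shaped like __mcopy_atomic_hugetlb's "out:" label. */
static void model_error_path(struct model_pool *p, int call_restore)
{
	model_alloc_huge_page(p);
	/* ... the copy fails before the page is instantiated ... */
	if (call_restore)
		model_restore_reserve_on_error(p);
	/* put_page() frees the never-instantiated page here */
}
```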

Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
---
 include/linux/hugetlb.h |  2 ++
 include/linux/mm.h      |  3 ++-
 mm/hugetlb.c            |  7 +++----
 mm/memory.c             | 13 ++++++++++---
 mm/userfaultfd.c        |  6 ++++--
 5 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index fc27b66..bf02b7e 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -101,6 +101,8 @@ u32 hugetlb_fault_mutex_hash(struct hstate *h, struct mm_struct *mm,
 				struct vm_area_struct *vma,
 				struct address_space *mapping,
 				pgoff_t idx, unsigned long address);
+void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
+				unsigned long address, struct page *page);

 pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud);

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 39157f5..7c73a05 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2417,7 +2417,8 @@ extern void copy_user_huge_page(struct page *dst, struct page *src,
 				unsigned int pages_per_huge_page);
 extern long copy_huge_page_from_user(struct page *dst_page,
 				const void __user *usr_src,
-				unsigned int pages_per_huge_page);
+				unsigned int pages_per_huge_page,
+				bool allow_pagefault);
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_HUGETLBFS */

 extern struct page_ext_operations debug_guardpage_ops;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 7bfeee3..9ce8ecb 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1935,9 +1935,8 @@ static long vma_add_reservation(struct hstate *h,
  * reserve map here to be consistent with global reserve count adjustments
  * to be made by free_huge_page.
  */
-static void restore_reserve_on_error(struct hstate *h,
-			struct vm_area_struct *vma, unsigned long address,
-			struct page *page)
+void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
+				unsigned long address, struct page *page)
 {
 	if (unlikely(PagePrivate(page))) {
 		long rc = vma_needs_reservation(h, vma, address);
@@ -3981,7 +3980,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,

 		ret = copy_huge_page_from_user(page,
 						(const void __user *) src_addr,
-						pages_per_huge_page(h));
+						pages_per_huge_page(h), false);

 		/* fallback to copy_from_user outside mmap_sem */
 		if (unlikely(ret)) {
diff --git a/mm/memory.c b/mm/memory.c
index b911110..0137c4a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4106,7 +4106,8 @@ void copy_user_huge_page(struct page *dst, struct page *src,

 long copy_huge_page_from_user(struct page *dst_page,
 				const void __user *usr_src,
-				unsigned int pages_per_huge_page)
+				unsigned int pages_per_huge_page,
+				bool allow_pagefault)
 {
 	void *src = (void *)usr_src;
 	void *page_kaddr;
@@ -4114,11 +4115,17 @@ long copy_huge_page_from_user(struct page *dst_page,
 	unsigned long ret_val = pages_per_huge_page * PAGE_SIZE;

 	for (i = 0; i < pages_per_huge_page; i++) {
-		page_kaddr = kmap_atomic(dst_page + i);
+		if (allow_pagefault)
+			page_kaddr = kmap(dst_page + i);
+		else
+			page_kaddr = kmap_atomic(dst_page + i);
 		rc = copy_from_user(page_kaddr,
 				(const void __user *)(src + i * PAGE_SIZE),
 				PAGE_SIZE);
-		kunmap_atomic(page_kaddr);
+		if (allow_pagefault)
+			kunmap(page_kaddr);
+		else
+			kunmap_atomic(page_kaddr);

 		ret_val -= (PAGE_SIZE - rc);
 		if (rc)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index e8d7a89..c8588aa 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -275,7 +275,7 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,

 			err = copy_huge_page_from_user(page,
 						(const void __user *)src_addr,
-						pages_per_huge_page(h));
+						pages_per_huge_page(h), true);
 			if (unlikely(err)) {
 				err = -EFAULT;
 				goto out;
@@ -302,8 +302,10 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 out_unlock:
 	up_read(&dst_mm->mmap_sem);
 out:
-	if (page)
+	if (page) {
+		restore_reserve_on_error(h, dst_vma, dst_addr, page);
 		put_page(page);
+	}
 	BUG_ON(copied < 0);
 	BUG_ON(err > 0);
 	BUG_ON(!copied && !err);
-- 
2.7.4



