[PATCH] x86/mm/vmfault: Make vmalloc_fault() handle large pages

Since 4.1, ioremap() supports large page (pud/pmd) mappings on
x86_64 and PAE.  vmalloc_fault(), however, assumes that the vmalloc
range is limited to pte mappings.

pgd_ctor() copies the kernel's pgd entries into the user's pgd
during fork(), so user processes share the same page tables for
the kernel ranges.  When a run-time call to ioremap() allocates a
new 2nd-level table (pud in 64-bit and pmd in PAE), a user process
needs to re-sync with the updated kernel pgd entry via
vmalloc_fault().

The following changes are made to vmalloc_fault():

64-bit:
- No change for the sync operation as set_pgd() takes care of
  huge pages as well.
- Add pud_huge() and pmd_huge() to the validation code to
  handle huge pages.
- Change pud_page_vaddr() to pud_pfn() since an ioremap range
  is not directly mapped (although the if-statement still works
  with a bogus addr).
- Change pmd_page() to pmd_pfn() since an ioremap range is not
  backed by struct page table (although the if-statement still
  works with a bogus addr).

PAE:
- No change for the sync operation since the index3 pgd entry
  covers the entire vmalloc range, which is always valid.
  (A separate change will be needed if this assumption gets
  changed regardless of the page size.)
- Add pmd_huge() to the validation code to handle huge pages.
  This is only for completeness since vmalloc_fault() won't
  happen for ioremap'd ranges as its pgd entry is always valid.
  (I was unable to test this part of the changes as a result.)

Reported-by: Henning Schild <henning.schild@xxxxxxxxxxx>
Signed-off-by: Toshi Kani <toshi.kani@xxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
---
Once this patch is accepted, please also copy it to stable back to 4.1.
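
For reviewers, the order of the validation checks described in the
changelog can be modelled in user space.  This is only a rough sketch,
not kernel code: struct entry, struct level, enum walk_result and
validate_walk() are hypothetical names invented for illustration, the
pte level is assumed to follow the same pattern as the pud and pmd
levels, and WALK_BUG stands in for the place where the real code calls
BUG().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one page-table entry. */
struct entry {
	bool present;	/* entry is populated */
	bool huge;	/* entry maps a large page, no lower-level table */
	uint64_t pfn;	/* frame of the next-level table or mapped page */
};

/* One level of the walk: the process copy and the reference copy. */
struct level {
	struct entry proc;
	struct entry ref;
};

enum walk_result { WALK_OK = 0, WALK_FAIL = -1, WALK_BUG = -2 };

/*
 * Validate the levels in order (pud, pmd, pte in this model).
 * A huge entry ends the walk early, because there is no
 * lower-level table left to look at.
 */
static enum walk_result validate_walk(const struct level *lv, int nlevels)
{
	assert(lv && nlevels > 0);

	for (int i = 0; i < nlevels; i++) {
		/* Reference entry missing: not a valid vmalloc fault. */
		if (!lv[i].ref.present)
			return WALK_FAIL;

		/* Compare by pfn, which is meaningful for ioremap ranges too. */
		if (!lv[i].proc.present || lv[i].proc.pfn != lv[i].ref.pfn)
			return WALK_BUG;

		/* Large page: stop here instead of walking a bogus table. */
		if (lv[i].proc.huge)
			return WALK_OK;
	}
	return WALK_OK;
}
```

In this model, a walk whose second level is marked huge returns
WALK_OK without ever reading the third level.  Without the early
return it would go on to treat the large-page entry as a pointer to a
lower-level table, which is the case the patch is meant to avoid.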
---
 arch/x86/mm/fault.c |   15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index eef44d9..e830c71 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -287,6 +287,9 @@ static noinline int vmalloc_fault(unsigned long address)
 	if (!pmd_k)
 		return -1;
 
+	if (pmd_huge(*pmd_k))
+		return 0;
+
 	pte_k = pte_offset_kernel(pmd_k, address);
 	if (!pte_present(*pte_k))
 		return -1;
@@ -360,8 +363,6 @@ void vmalloc_sync_all(void)
  * 64-bit:
  *
  *   Handle a fault on the vmalloc area
- *
- * This assumes no large pages in there.
  */
 static noinline int vmalloc_fault(unsigned long address)
 {
@@ -403,17 +404,23 @@ static noinline int vmalloc_fault(unsigned long address)
 	if (pud_none(*pud_ref))
 		return -1;
 
-	if (pud_none(*pud) || pud_page_vaddr(*pud) != pud_page_vaddr(*pud_ref))
+	if (pud_none(*pud) || pud_pfn(*pud) != pud_pfn(*pud_ref))
 		BUG();
 
+	if (pud_huge(*pud))
+		return 0;
+
 	pmd = pmd_offset(pud, address);
 	pmd_ref = pmd_offset(pud_ref, address);
 	if (pmd_none(*pmd_ref))
 		return -1;
 
-	if (pmd_none(*pmd) || pmd_page(*pmd) != pmd_page(*pmd_ref))
+	if (pmd_none(*pmd) || pmd_pfn(*pmd) != pmd_pfn(*pmd_ref))
 		BUG();
 
+	if (pmd_huge(*pmd))
+		return 0;
+
 	pte_ref = pte_offset_kernel(pmd_ref, address);
 	if (!pte_present(*pte_ref))
 		return -1;
