Since 4.1, ioremap() supports large page (pud/pmd) mappings on x86_64
and PAE.  vmalloc_fault(), however, assumes that the vmalloc range is
limited to pte mappings.

pgd_ctor() copies the kernel's pgd entries into the user's pgd during
fork(), which makes user processes share the same page tables for the
kernel ranges.  When a call to ioremap() at run-time leads to the
allocation of a new 2nd-level table (pud in 64-bit and pmd in PAE), a
user process needs to re-sync with the updated kernel pgd entry through
vmalloc_fault().

The following changes are made to vmalloc_fault().

64-bit:
- No change for the sync operation as set_pgd() takes care of
  huge pages as well.
- Add pud_huge() and pmd_huge() to the validation code to
  handle huge pages.
- Change pud_page_vaddr() to pud_pfn() since an ioremap range
  is not directly mapped (although the if-statement still works
  with a bogus addr).
- Change pmd_page() to pmd_pfn() since an ioremap range is not
  backed by struct page table (although the if-statement still
  works with a bogus addr).

PAE:
- No change for the sync operation since the index3 pgd entry
  covers the entire vmalloc range, which is always valid.
  (A separate change will be needed if this assumption gets
  changed regardless of the page size.)
- Add pmd_huge() to the validation code to handle huge pages.
  This is only for completeness since vmalloc_fault() won't
  happen for ioremap'd ranges as its pgd entry is always valid.
  (I was unable to test this part of the changes as a result.)

Reported-by: Henning Schild <henning.schild@xxxxxxxxxxx>
Signed-off-by: Toshi Kani <toshi.kani@xxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
---
When this patch is accepted, please copy to stable up to 4.1.
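
For review purposes, the validation order on the 64-bit path can be
modelled in user space.  The sketch below is a toy under stated
assumptions, not kernel code: struct toy_entry, struct toy_walk,
toy_check(), toy_validate() and toy_selfcheck() are hypothetical names,
each level is reduced to a present flag, a huge flag and a pfn, and
TOY_BUG stands in for BUG().  It only illustrates why the walk has to
stop at a huge pud/pmd instead of descending to a lower table that does
not exist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy entry; NOT the kernel's pud_t/pmd_t/pte_t layout. */
struct toy_entry {
	bool present;	/* stands in for !pud_none()/!pmd_none()/pte_present() */
	bool huge;	/* stands in for pud_huge()/pmd_huge() */
	uint64_t pfn;	/* stands in for pud_pfn()/pmd_pfn()/pte_pfn() */
};

/* One entry per level for a single address, as seen through one pgd. */
struct toy_walk {
	struct toy_entry pud, pmd, pte;
};

/* 0 and -1 mirror the return values in the patch; TOY_BUG models BUG(). */
enum { TOY_OK = 0, TOY_FAULT = -1, TOY_BUG = -2 };

/* Compare the process entry (cur) with the reference entry (ref). */
static inline int toy_check(const struct toy_entry *cur,
			    const struct toy_entry *ref)
{
	if (!ref->present)
		return TOY_FAULT;
	if (!cur->present || cur->pfn != ref->pfn)
		return TOY_BUG;
	return TOY_OK;
}

/* Walk pud -> pmd -> pte, stopping early at a huge entry. */
static inline int toy_validate(const struct toy_walk *cur,
			       const struct toy_walk *ref)
{
	int ret;

	ret = toy_check(&cur->pud, &ref->pud);
	if (ret != TOY_OK || cur->pud.huge)
		return ret;	/* mismatch, or huge pud: no pmd table below */

	ret = toy_check(&cur->pmd, &ref->pmd);
	if (ret != TOY_OK || cur->pmd.huge)
		return ret;	/* mismatch, or huge pmd: no pte table below */

	return toy_check(&cur->pte, &ref->pte);
}

/* Small usage example for the model above. */
static inline void toy_selfcheck(void)
{
	struct toy_walk w = {
		.pud = { .present = true, .huge = false, .pfn = 0x100 },
		.pmd = { .present = true, .huge = true,  .pfn = 0x200 },
		/* .pte stays zeroed: a huge pmd has no pte table */
	};

	/* Huge pmd: the walk stops at the pmd level and succeeds. */
	assert(toy_validate(&w, &w) == TOY_OK);

	/* Without the huge flag the walk descends to the absent pte. */
	w.pmd.huge = false;
	assert(toy_validate(&w, &w) == TOY_FAULT);
}
```

In this model the second assertion corresponds to the pre-patch
behaviour, where a valid large mapping is reported as a failed fault
because the code looks for a pte that was never allocated.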
---
 arch/x86/mm/fault.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index eef44d9..e830c71 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -287,6 +287,9 @@ static noinline int vmalloc_fault(unsigned long address)
 	if (!pmd_k)
 		return -1;
 
+	if (pmd_huge(*pmd_k))
+		return 0;
+
 	pte_k = pte_offset_kernel(pmd_k, address);
 	if (!pte_present(*pte_k))
 		return -1;
@@ -360,8 +363,6 @@ void vmalloc_sync_all(void)
  * 64-bit:
  *
  * Handle a fault on the vmalloc area
- *
- * This assumes no large pages in there.
  */
 static noinline int vmalloc_fault(unsigned long address)
 {
@@ -403,17 +404,23 @@ static noinline int vmalloc_fault(unsigned long address)
 	if (pud_none(*pud_ref))
 		return -1;
 
-	if (pud_none(*pud) || pud_page_vaddr(*pud) != pud_page_vaddr(*pud_ref))
+	if (pud_none(*pud) || pud_pfn(*pud) != pud_pfn(*pud_ref))
 		BUG();
 
+	if (pud_huge(*pud))
+		return 0;
+
 	pmd = pmd_offset(pud, address);
 	pmd_ref = pmd_offset(pud_ref, address);
 	if (pmd_none(*pmd_ref))
 		return -1;
 
-	if (pmd_none(*pmd) || pmd_page(*pmd) != pmd_page(*pmd_ref))
+	if (pmd_none(*pmd) || pmd_pfn(*pmd) != pmd_pfn(*pmd_ref))
 		BUG();
 
+	if (pmd_huge(*pmd))
+		return 0;
+
 	pte_ref = pte_offset_kernel(pmd_ref, address);
 	if (!pte_present(*pte_ref))
 		return -1;