This is a note to let you know that I've just added the patch titled mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas to the 3.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: mm-pagewalk.c-walk_page_range-should-avoid-vm_pfnmap-areas.patch and it can be found in the queue-3.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From a9ff785e4437c83d2179161e012f5bdfbd6381f0 Mon Sep 17 00:00:00 2001 From: Cliff Wickman <cpw@xxxxxxx> Date: Fri, 24 May 2013 15:55:36 -0700 Subject: mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas From: Cliff Wickman <cpw@xxxxxxx> commit a9ff785e4437c83d2179161e012f5bdfbd6381f0 upstream. A panic can be caused by simply cat'ing /proc/<pid>/smaps while an application has a VM_PFNMAP range. It happened in-house when a benchmarker was trying to decipher the memory layout of his program. /proc/<pid>/smaps and similar walks through a user page table should not be looking at VM_PFNMAP areas. Certain tests in walk_page_range() (specifically split_huge_page_pmd()) assume that all the mapped PFN's are backed with page structures. And this is not usually true for VM_PFNMAP areas. This can result in panics on kernel page faults when attempting to address those page structures. There are a half dozen callers of walk_page_range() that walk through a task's entire page table (as N. Horiguchi pointed out). So rather than change all of them, this patch changes just walk_page_range() to ignore VM_PFNMAP areas. The logic of hugetlb_vma() is moved back into walk_page_range(), as we want to test any vma in the range. VM_PFNMAP areas are used by: - graphics memory manager gpu/drm/drm_gem.c - global reference unit sgi-gru/grufile.c - sgi special memory char/mspec.c - and probably several out-of-tree modules [akpm@xxxxxxxxxxxxxxxxxxxx: remove now-unused hugetlb_vma() stub] Signed-off-by: Cliff Wickman <cpw@xxxxxxx> Reviewed-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> Cc: Mel Gorman <mel@xxxxxxxxx> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx> Cc: Dave Hansen <dave.hansen@xxxxxxxxx> Cc: David Sterba <dsterba@xxxxxxx> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> Cc: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxx> Cc: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- mm/pagewalk.c | 70 +++++++++++++++++++++++++++++----------------------------- 1 file changed, 36 insertions(+), 34 deletions(-) --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -127,28 +127,7 @@ static int walk_hugetlb_range(struct vm_ return 0; } -static struct vm_area_struct* hugetlb_vma(unsigned long addr, struct mm_walk *walk) -{ - struct vm_area_struct *vma; - - /* We don't need vma lookup at all. */ - if (!walk->hugetlb_entry) - return NULL; - - VM_BUG_ON(!rwsem_is_locked(&walk->mm->mmap_sem)); - vma = find_vma(walk->mm, addr); - if (vma && vma->vm_start <= addr && is_vm_hugetlb_page(vma)) - return vma; - - return NULL; -} - #else /* CONFIG_HUGETLB_PAGE */ -static struct vm_area_struct* hugetlb_vma(unsigned long addr, struct mm_walk *walk) -{ - return NULL; -} - static int walk_hugetlb_range(struct vm_area_struct *vma, unsigned long addr, unsigned long end, struct mm_walk *walk) @@ -198,30 +177,53 @@ int walk_page_range(unsigned long addr, if (!walk->mm) return -EINVAL; + VM_BUG_ON(!rwsem_is_locked(&walk->mm->mmap_sem)); + pgd = pgd_offset(walk->mm, addr); do { - struct vm_area_struct *vma; + struct vm_area_struct *vma = NULL; next = pgd_addr_end(addr, end); /* - * handle hugetlb vma individually because pagetable walk for - * the hugetlb page is dependent on the architecture and - * we can't handled it in the same manner as non-huge pages. + * This function was not intended to be vma based. + * But there are vma special cases to be handled: + * - hugetlb vma's + * - VM_PFNMAP vma's */ - vma = hugetlb_vma(addr, walk); + vma = find_vma(walk->mm, addr); if (vma) { - if (vma->vm_end < next) + /* + * There are no page structures backing a VM_PFNMAP + * range, so do not allow split_huge_page_pmd(). + */ + if ((vma->vm_start <= addr) && + (vma->vm_flags & VM_PFNMAP)) { next = vma->vm_end; + pgd = pgd_offset(walk->mm, next); + continue; + } /* - * Hugepage is very tightly coupled with vma, so - * walk through hugetlb entries within a given vma. + * Handle hugetlb vma individually because pagetable + * walk for the hugetlb page is dependent on the + * architecture and we can't handled it in the same + * manner as non-huge pages. */ - err = walk_hugetlb_range(vma, addr, next, walk); - if (err) - break; - pgd = pgd_offset(walk->mm, next); - continue; + if (walk->hugetlb_entry && (vma->vm_start <= addr) && + is_vm_hugetlb_page(vma)) { + if (vma->vm_end < next) + next = vma->vm_end; + /* + * Hugepage is very tightly coupled with vma, + * so walk through hugetlb entries within a + * given vma. + */ + err = walk_hugetlb_range(vma, addr, next, walk); + if (err) + break; + pgd = pgd_offset(walk->mm, next); + continue; + } } if (pgd_none_or_clear_bad(pgd)) { Patches currently in stable-queue which might be from cpw@xxxxxxx are queue-3.9/mm-pagewalk.c-walk_page_range-should-avoid-vm_pfnmap-areas.patch -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html