On Thu, 11 Feb 2016, Andrew Morton wrote:
> 
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Thu, 11 Feb 2016 07:09:04 +0000 bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
> 
> > https://bugzilla.kernel.org/show_bug.cgi?id=112301
> > 
> >             Bug ID: 112301
> >            Summary: [bisected] NULL pointer dereference when starting a
> >                     kvm based VM
> >            Product: Memory Management
> >            Version: 2.5
> >     Kernel Version: 4.5-rcX
> >           Hardware: All
> >                 OS: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: normal
> >           Priority: P1
> >          Component: Other
> >           Assignee: akpm@xxxxxxxxxxxxxxxxxxxx
> >           Reporter: harn-solo@xxxxxx
> >         Regression: No
> > 
> > Created attachment 203451
> >   --> https://bugzilla.kernel.org/attachment.cgi?id=203451&action=edit
> > Call Trace of a NULL pointer dereference at gup_pte_range
> > 
> > Starting a qemu-kvm based VM configured to use hugepages I'm getting the
> > following NULL pointer dereference, see attached dmesg section.
> > 
> > The issue was introduced with commit 7d2eba0557c18f7522b98befed98799990dd4fdb
> > Author: Ebru Akagunduz <ebru.akagunduz@xxxxxxxxx>
> > Date:   Thu Jan 14 15:22:19 2016 -0800
> >     mm: add tracepoint for scanning pages
> 
> Thanks for the detailed report.  Can you please verify that your tree
> has 629d9d1cafbd49cb374 ("mm: avoid uninitialized variable in
> tracepoint")?
> 
> vfio_pin_pages() doesn't seem to be doing anything crazy.  Hugh, Ebru:
> could you please take a look?

I very much doubt that the uninitialized variable in collapse_huge_page()
had anything to do with the crash in gup_pte_range().  Far more likely
is that the bisection hit a point in between the introduction of that
uninitialized variable and its subsequent fix, the test crashed, and
the bisector didn't notice that it was crashing for a different reason.

Comparing the "Code:" of the gup_pte_range() crash with disassembly of
gup_pte_range() here, it looks as if it's crashing in pte_page().
And, yes, that pte_page() looks broken in 4.5-rc: please try this patch.

[PATCH] mm, x86: fix pte_page() crash in gup_pte_range()

Commit 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings") has
moved up the pte_page(pte) in x86's fast gup_pte_range(), for no
discernible reason: put it back where it belongs, after the pte_flags
check and the pfn_valid cross-check.

That may be the cause of the NULL pointer dereference in gup_pte_range(),
seen when vfio called vaddr_get_pfn() when starting a qemu-kvm based VM.

Reported-by: Michael Long <Harn-Solo@xxxxxx>
Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
---

 arch/x86/mm/gup.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- 4.5-rc3/arch/x86/mm/gup.c	2016-01-24 14:54:51.359500642 -0800
+++ linux/arch/x86/mm/gup.c	2016-02-12 12:15:36.460501324 -0800
@@ -102,7 +102,6 @@ static noinline int gup_pte_range(pmd_t
 			return 0;
 		}
 
-		page = pte_page(pte);
 		if (pte_devmap(pte)) {
 			pgmap = get_dev_pagemap(pte_pfn(pte), pgmap);
 			if (unlikely(!pgmap)) {
@@ -115,6 +114,7 @@ static noinline int gup_pte_range(pmd_t
 			return 0;
 		}
 		VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
+		page = pte_page(pte);
 		get_page(page);
 		put_dev_pagemap(pgmap);
 		SetPageReferenced(page);
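
For illustration only, here is a minimal userspace sketch of the ordering
the patch restores: flags check, then pfn cross-check, and only then the
pfn-to-page conversion.  It is not kernel code.  Every toy_* name and
TOY_* constant is hypothetical, and the section table is just an assumed
stand-in for a memory model in which converting an unbacked pfn has to
look at data that may not be there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names are kernel APIs. */
#define TOY_PAGES_PER_SECTION 4u
#define TOY_NR_SECTIONS       4u
#define TOY_PTE_SPECIAL       0x1u

struct toy_page { int refcount; };
struct toy_pte  { uint32_t pfn; uint32_t flags; };

static struct toy_page toy_section0[TOY_PAGES_PER_SECTION];

/* Only section 0 has page structs behind it; the rest are holes (NULL). */
static struct toy_page *toy_sections[TOY_NR_SECTIONS] = { toy_section0 };

static int toy_pfn_valid(uint32_t pfn)
{
	uint32_t sec = pfn / TOY_PAGES_PER_SECTION;

	return sec < TOY_NR_SECTIONS && toy_sections[sec] != NULL;
}

/* Precondition: toy_pfn_valid(pfn).  Otherwise the lookup is meaningless. */
static struct toy_page *toy_pfn_to_page(uint32_t pfn)
{
	assert(toy_pfn_valid(pfn));
	return &toy_sections[pfn / TOY_PAGES_PER_SECTION]
			    [pfn % TOY_PAGES_PER_SECTION];
}

/* Returns 1 and takes a reference if pte maps a normal page, else 0. */
static int toy_gup_one(struct toy_pte pte, struct toy_page **out)
{
	struct toy_page *page;

	/* 1. Flags check: a special mapping has no page struct to take. */
	if (pte.flags & TOY_PTE_SPECIAL)
		return 0;

	/* 2. Cross-check the pfn (the toy bails out instead of BUGging). */
	if (!toy_pfn_valid(pte.pfn))
		return 0;

	/* 3. Only now is the conversion known to be safe. */
	page = toy_pfn_to_page(pte.pfn);
	page->refcount++;
	*out = page;
	return 1;
}
```

In this toy, a pte carrying TOY_PTE_SPECIAL and an out-of-range pfn is
rejected at step 1 without ever reaching toy_pfn_to_page().  Hoisting the
conversion above step 1 would run it on exactly the ptes the checks are
there to filter out.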