On Fri, Feb 12, 2016 at 1:10 PM, Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> On Thu, 11 Feb 2016, Andrew Morton wrote:
>>
>> (switched to email.  Please respond via emailed reply-to-all, not via the
>> bugzilla web interface).
>>
>> On Thu, 11 Feb 2016 07:09:04 +0000 bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
>>
>> > https://bugzilla.kernel.org/show_bug.cgi?id=112301
>> >
>> >             Bug ID: 112301
>> >            Summary: [bisected] NULL pointer dereference when starting a
>> >                     kvm based VM
>> >            Product: Memory Management
>> >            Version: 2.5
>> >     Kernel Version: 4.5-rcX
>> >           Hardware: All
>> >                 OS: Linux
>> >               Tree: Mainline
>> >             Status: NEW
>> >           Severity: normal
>> >           Priority: P1
>> >          Component: Other
>> >           Assignee: akpm@xxxxxxxxxxxxxxxxxxxx
>> >           Reporter: harn-solo@xxxxxx
>> >         Regression: No
>> >
>> > Created attachment 203451
>> >   --> https://bugzilla.kernel.org/attachment.cgi?id=203451&action=edit
>> > Call Trace of a NULL pointer dereference at gup_pte_range
>> >
>> > Starting a qemu-kvm based VM configured to use hugepages I'm getting the
>> > following NULL pointer dereference, see attached dmesg section.
>> >
>> > The issue was introduced with commit 7d2eba0557c18f7522b98befed98799990dd4fdb
>> > Author: Ebru Akagunduz <ebru.akagunduz@xxxxxxxxx>
>> > Date:   Thu Jan 14 15:22:19 2016 -0800
>> >     mm: add tracepoint for scanning pages
>>
>> Thanks for the detailed report.  Can you please verify that your tree
>> has 629d9d1cafbd49cb374 ("mm: avoid uninitialized variable in
>> tracepoint")?
>>
>> vfio_pin_pages() doesn't seem to be doing anything crazy.  Hugh, Ebru:
>> could you please take a look?
>
> I very much doubt that the uninitialized variable in collapse_huge_page()
> had anything to do with the crash in gup_pte_range().  Far more likely
> is that the bisection hit a point in between the introduction of that
> uninitialized variable and its subsequent fix, the test crashed, and
> the bisector didn't notice that it was crashing for a different reason.
>
> Comparing the "Code:" of the gup_pte_range() crash with disassembly of
> gup_pte_range() here, it looks as if it's crashing in pte_page().  And,
> yes, that pte_page() looks broken in 4.5-rc: please try this patch.
>
> [PATCH] mm, x86: fix pte_page() crash in gup_pte_range()
>
> Commit 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")
> has moved up the pte_page(pte) in x86's fast gup_pte_range(), for no
> discernible reason: put it back where it belongs, after the pte_flags
> check and the pfn_valid cross-check.
>
> That may be the cause of the NULL pointer dereference in gup_pte_range(),
> seen when vfio called vaddr_get_pfn() when starting a qemu-kvm based VM.
>
> Reported-by: Michael Long <Harn-Solo@xxxxxx>
> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>

Acked-by: Dan Williams <dan.j.williams@xxxxxxxxx>

That must have been a merge/rebase error on my part when forward
porting the patch to a new -mm baseline because the pte_devmap() check
is done before we know that the pfn actually has a corresponding
struct page.
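
For anyone following the ordering argument, here is a rough userspace
sketch of why the position of the pte_page() lookup matters.  This is
not the kernel code and not the patch itself: every model_* name, the
flat model_mem_map[] array and the MODEL_PTE_SPECIAL flag are
assumptions made up for illustration.  The unsafe lookup is only
counted here instead of actually faulting, and the devmap case is left
out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_NR_PFNS     8u
#define MODEL_PTE_SPECIAL 0x1u  /* stands in for a "no struct page" pte flag */

struct model_page { int refcount; };
struct model_pte  { unsigned long pfn; unsigned int flags; };

static struct model_page model_mem_map[MODEL_NR_PFNS];
static int model_unsafe_lookups;  /* lookups attempted on an invalid pfn */

static bool model_pfn_valid(unsigned long pfn)
{
	return pfn < MODEL_NR_PFNS;
}

/*
 * Stand-in for pte_page().  A real lookup on a pfn with no struct page
 * may dereference bad memory; the model only counts the attempt.
 */
static struct model_page *model_pte_page(struct model_pte pte)
{
	if (!model_pfn_valid(pte.pfn)) {
		model_unsafe_lookups++;
		return NULL;
	}
	return &model_mem_map[pte.pfn];
}

/* Ordering described as broken: page lookup happens before any check. */
static int model_gup_early_lookup(struct model_pte pte)
{
	struct model_page *page = model_pte_page(pte);

	if (pte.flags & MODEL_PTE_SPECIAL)
		return 0;
	if (!model_pfn_valid(pte.pfn))
		return 0;
	page->refcount++;
	return 1;
}

/* Ordering the patch asks for: flags check, pfn check, then lookup. */
static int model_gup_late_lookup(struct model_pte pte)
{
	struct model_page *page;

	if (pte.flags & MODEL_PTE_SPECIAL)
		return 0;
	if (!model_pfn_valid(pte.pfn))
		return 0;
	page = model_pte_page(pte);
	assert(page != NULL);  /* guaranteed by the checks above */
	page->refcount++;
	return 1;
}
```

In this model both functions decline to pin a special pte whose pfn
lies outside model_mem_map[], but only model_gup_early_lookup() bumps
model_unsafe_lookups on the way there.  That counter marks the spot
where the real lookup could go wrong.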