On Thu, Apr 23, 2015 at 06:23:11PM +0200, Andrea Arcangeli wrote: > On Wed, Apr 22, 2015 at 12:26:55PM -0700, Linus Torvalds wrote: > > On Wed, Apr 22, 2015 at 11:33 AM, Kirill A. Shutemov > > <kirill@xxxxxxxxxxxxx> wrote: > > > > > > Could you try patch below instead? This can give a clue what's going on. > > > > Just FYI, I've done the revert in my tree. > > > > Trying to figure out what is going on despite that is obviously a good > > idea, but I'm hoping that my merge window is winding down, so I am > > trying to make sure it's all "good to go".. > > Sounds safer to defer it, agreed. > > Unfortunately I also can only reproduce it only on a workstation where > it wasn't very handy to debug it as it'd disrupt my workflow and it > isn't equipped with reliable logging either (and the KMS mode didn't > switch to console to show me the oops either). It just got it logged > once in syslog before freezing. > > The problem has to be that there's some get_page/put_page activity > before and after a PageAnon transition and it looks like a tail page > got mapped by hand in userland by some driver using 4k ptes which > isn't normal Compound pages mapped with PTEs predates THP. See f3d48f0373c1. > but apparently safe before the patch was applied. Before > the patch, the tail page accounting would be symmetric regardless of > the PageAnon transition. > > page:ffffea0010226040 count:0 mapcount:1 mapping: (null) index:0x0 > flags: 0x8000000000008010(dirty|tail) > page dumped because: VM_BUG_ON_PAGE(page_mapcount(page) != 0) > ------------[ cut here ]------------ > kernel BUG at mm/swap.c:134! I looked into code a bit more. And the VM_BUG_ON_PAGE() is bogus. See explanation in commit message below. Tail page refcounting is mess. Please consider reviewing my patchset which drops it [1]. ;) Linus, how should we proceed with reverted patch? Should I re-submit it to Andrew? Or you'll re-revert it? [1] lkml.kernel.org/g/1429823043-157133-1-git-send-email-kirill.shutemov@xxxxxxxxxxxxxxx >From 854cdc961b7f83f04a83144ab4f7459ae46b0f3d Mon Sep 17 00:00:00 2001 From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx> Date: Fri, 24 Apr 2015 23:49:04 +0300 Subject: [PATCH] mm: drop bogus VM_BUG_ON_PAGE assert in put_page() codepath My patch 8d63d99a5dfb which was merged during 4.1 merge window caused regression: page:ffffea0010a15040 count:0 mapcount:1 mapping: (null) index:0x0 flags: 0x8000000000008014(referenced|dirty|tail) page dumped because: VM_BUG_ON_PAGE(page_mapcount(page) != 0) ------------[ cut here ]------------ kernel BUG at mm/swap.c:134! The problem can be reproduced by playing *two* audio files at the same time and then stopping one of players. I used two mplayers to trigger this. The VM_BUG_ON_PAGE() which triggers the bug is bogus: Sound subsystem uses compound pages for its buffers, but unlike most __GFP_COMP users sound maps compound pages to userspace with PTEs. In our case with two players map the buffer twice and therefore elevates page_mapcount() on tail pages by two. When one of players exits it unmaps the VMA and drops page_mapcount() to one and try to release reference on the page with put_page(). My commit changes which path it takes under put_compound_page(). It hits put_unrefcounted_compound_page() where VM_BUG_ON_PAGE() is. It sees page_mapcount() == 1. The function wrongly assumes that subpages of compound page cannot be be mapped by itself with PTEs. The solution is simply drop the VM_BUG_ON_PAGE(). Note: there's no need to move the check under put_page_testzero(). Allocator will check the mapcount before putting on free list. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> Reported-by: Andrea Arcangeli <aarcange@xxxxxxxxxx> Reported-by: Borislav Petkov <bp@xxxxxxxxx> Cc: Hugh Dickins <hughd@xxxxxxxxxx> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- mm/swap.c | 1 - 1 file changed, 1 deletion(-) diff --git a/mm/swap.c b/mm/swap.c index a7251a8ed532..a3a0a2f1f7c3 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -131,7 +131,6 @@ void put_unrefcounted_compound_page(struct page *page_head, struct page *page) * here, see the comment above this function. */ VM_BUG_ON_PAGE(!PageHead(page_head), page_head); - VM_BUG_ON_PAGE(page_mapcount(page) != 0, page); if (put_page_testzero(page_head)) { /* * If this is the tail of a slab THP page, -- Kirill A. Shutemov -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>