Re: [PATCH] mm: fix unevictable page reclaim when calling madvise_pageout

On 28.10.19 16:08, zhong jiang wrote:
> Recently, I hit the following issue when running the upstream kernel.
> 
> kernel BUG at mm/vmscan.c:1521!
> invalid opcode: 0000 [#1] SMP KASAN PTI
> CPU: 0 PID: 23385 Comm: syz-executor.6 Not tainted 5.4.0-rc4+ #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
> RIP: 0010:shrink_page_list+0x12b6/0x3530 mm/vmscan.c:1521
> Code: de f5 ff ff e8 ab 79 eb ff 4c 89 f7 e8 43 33 0d 00 e9 cc f5 ff ff e8 99 79 eb ff 48 c7 c6 a0 34 2b a0 4c 89 f7 e8 1a 4d 05 00 <0f> 0b e8 83 79 eb ff 48 89 d8 48 c1 e8 03 42 80 3c 38 00 0f 85 74
> RSP: 0018:ffff88819a3df5a0 EFLAGS: 00010286
> RAX: 0000000000040000 RBX: ffffea00061c3980 RCX: ffffffff814fba36
> RDX: 00000000000056f7 RSI: ffffc9000c02c000 RDI: ffff8881f70268cc
> RBP: ffff88819a3df898 R08: ffffed103ee05de0 R09: ffffed103ee05de0
> R10: 0000000000000001 R11: ffffed103ee05ddf R12: ffff88819a3df6f0
> R13: ffff88819a3df6f0 R14: ffffea00061c3980 R15: dffffc0000000000
> FS:  00007f21b9d8e700(0000) GS:ffff8881f7000000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000001b2d621000 CR3: 00000001c8c46004 CR4: 00000000007606f0
> DR0: 0000000020000140 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
> PKRU: 55555554
> Call Trace:
>   reclaim_pages+0x499/0x800 mm/vmscan.c:2188
>   madvise_cold_or_pageout_pte_range+0x58a/0x710 mm/madvise.c:453
>   walk_pmd_range mm/pagewalk.c:53 [inline]
>   walk_pud_range mm/pagewalk.c:112 [inline]
>   walk_p4d_range mm/pagewalk.c:139 [inline]
>   walk_pgd_range mm/pagewalk.c:166 [inline]
>   __walk_page_range+0x45a/0xc20 mm/pagewalk.c:261
>   walk_page_range+0x179/0x310 mm/pagewalk.c:349
>   madvise_pageout_page_range mm/madvise.c:506 [inline]
>   madvise_pageout+0x1f0/0x330 mm/madvise.c:542
>   madvise_vma mm/madvise.c:931 [inline]
>   __do_sys_madvise+0x7d2/0x1600 mm/madvise.c:1113
>   do_syscall_64+0x9f/0x4c0 arch/x86/entry/common.c:290
>   entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> madvise_pageout walks the specified range of the vma, isolates the
> pages in it, and then runs shrink_page_list to reclaim the memory. But
> it also isolates unevictable pages for reclaim. Hence, we hit the
> VM_BUG_ON_PAGE check in shrink_page_list.
> 
> We can fix it by preventing unevictable pages from being isolated.
> Another way to fix the issue is to remove the
> BUG_ON(PageUnevictable(page)) condition in shrink_page_list. I think it
> is better to use the latter. Because We has taken the unevictable
> page and skip it into account in shrink_page_list.

I really don't understand the last sentence. Looks like
something got messed up :)


> 
> Signed-off-by: zhong jiang <zhongjiang@xxxxxxxxxx>
> ---
>   mm/vmscan.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index f7d1301..1c6e959 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1524,7 +1524,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>   		unlock_page(page);
>   keep:
>   		list_add(&page->lru, &ret_pages);
> -		VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page), page);
> +		VM_BUG_ON_PAGE(PageLRU(page), page);

So, this comes from

commit b291f000393f5a0b679012b39d79fbc85c018233
Author: Nick Piggin <npiggin@xxxxxxx>
Date:   Sat Oct 18 20:26:44 2008 -0700

    mlock: mlocked pages are unevictable
    
    Make sure that mlocked pages also live on the unevictable LRU, so kswapd
    will not scan them over and over again.


That patch is fairly old. How come we can suddenly trigger this?
Which commit is responsible for that? Was it always broken?

I can see that

commit ad6b67041a45497261617d7a28b15159b202cb5a
Author: Minchan Kim <minchan@xxxxxxxxxx>
Date:   Wed May 3 14:54:13 2017 -0700

    mm: remove SWAP_MLOCK in ttu

performed some changes in that area, but that was also some time ago.

-- 

Thanks,

David / dhildenb





