On 2019/10/29 0:07, David Hildenbrand wrote:
> On 28.10.19 16:45, zhong jiang wrote:
>> On 2019/10/28 23:27, David Hildenbrand wrote:
>>> On 28.10.19 16:08, zhong jiang wrote:
>>>> Recently, I hit the following issue when running in the upstream.
>>>>
>>>> kernel BUG at mm/vmscan.c:1521!
>>>> invalid opcode: 0000 [#1] SMP KASAN PTI
>>>> CPU: 0 PID: 23385 Comm: syz-executor.6 Not tainted 5.4.0-rc4+ #1
>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
>>>> RIP: 0010:shrink_page_list+0x12b6/0x3530 mm/vmscan.c:1521
>>>> Code: de f5 ff ff e8 ab 79 eb ff 4c 89 f7 e8 43 33 0d 00 e9 cc f5 ff ff e8 99 79 eb ff 48 c7 c6 a0 34 2b a0 4c 89 f7 e8 1a 4d 05 00 <0f> 0b e8 83 79 eb ff 48 89 d8 48 c1 e8 03 42 80 3c 38 00 0f 85 74
>>>> RSP: 0018:ffff88819a3df5a0 EFLAGS: 00010286
>>>> RAX: 0000000000040000 RBX: ffffea00061c3980 RCX: ffffffff814fba36
>>>> RDX: 00000000000056f7 RSI: ffffc9000c02c000 RDI: ffff8881f70268cc
>>>> RBP: ffff88819a3df898 R08: ffffed103ee05de0 R09: ffffed103ee05de0
>>>> R10: 0000000000000001 R11: ffffed103ee05ddf R12: ffff88819a3df6f0
>>>> R13: ffff88819a3df6f0 R14: ffffea00061c3980 R15: dffffc0000000000
>>>> FS: 00007f21b9d8e700(0000) GS:ffff8881f7000000(0000) knlGS:0000000000000000
>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> CR2: 0000001b2d621000 CR3: 00000001c8c46004 CR4: 00000000007606f0
>>>> DR0: 0000000020000140 DR1: 0000000000000000 DR2: 0000000000000000
>>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
>>>> PKRU: 55555554
>>>> Call Trace:
>>>> reclaim_pages+0x499/0x800 mm/vmscan.c:2188
>>>> madvise_cold_or_pageout_pte_range+0x58a/0x710 mm/madvise.c:453
>>>> walk_pmd_range mm/pagewalk.c:53 [inline]
>>>> walk_pud_range mm/pagewalk.c:112 [inline]
>>>> walk_p4d_range mm/pagewalk.c:139 [inline]
>>>> walk_pgd_range mm/pagewalk.c:166 [inline]
>>>> __walk_page_range+0x45a/0xc20 mm/pagewalk.c:261
>>>> walk_page_range+0x179/0x310 mm/pagewalk.c:349
>>>> madvise_pageout_page_range mm/madvise.c:506 [inline]
>>>> madvise_pageout+0x1f0/0x330 mm/madvise.c:542
>>>> madvise_vma mm/madvise.c:931 [inline]
>>>> __do_sys_madvise+0x7d2/0x1600 mm/madvise.c:1113
>>>> do_syscall_64+0x9f/0x4c0 arch/x86/entry/common.c:290
>>>> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>>
>>>> madvise_pageout access the specified range of the vma and isolate
>>>> them, then run shrink_page_list to reclaim the memory. But It also
>>>> isolate the unevictable page to reclaim. Hence, we can catch the
>>>> cases in shrink_page_list.
>>>>
>>>> We can fix it by preventing unevictable page from isolating.
>>>> Another way to fix the issue by removing the condition of
>>>> BUG_ON(PageUnevictable(page)) in shrink_page_list. I think it
>>>> is better to use the latter. Because We has taken the unevictable
>>>> page and skip it into account in shrink_page_list.
>>> I really don't understand the last sentence. Looks like
>>> something got messed up :)
>> I mean that we will check the page_evictable(page) in shrink_page_list,
>> if it is unevictable page, we will put the page back to correct lru.
>>
>> Based on the condition, I make the choice. It seems to more simpler.:-)
>>
>> Thanks,
>> zhong jiang
>>>
>>>> Signed-off-by: zhong jiang <zhongjiang@xxxxxxxxxx>
>>>> ---
>>>>  mm/vmscan.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>> index f7d1301..1c6e959 100644
>>>> --- a/mm/vmscan.c
>>>> +++ b/mm/vmscan.c
>>>> @@ -1524,7 +1524,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>>>>  		unlock_page(page);
>>>>  keep:
>>>>  		list_add(&page->lru, &ret_pages);
>>>> -		VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page), page);
>>>> +		VM_BUG_ON_PAGE(PageLRU(page), page);
>>> So, this comes from
>>>
>>> commit b291f000393f5a0b679012b39d79fbc85c018233
>>> Author: Nick Piggin <npiggin@xxxxxxx>
>>> Date: Sat Oct 18 20:26:44 2008 -0700
>>>
>>>     mlock: mlocked pages are unevictable
>>>     Make sure that mlocked pages also live on the unevictable LRU, so kswapd
>>>     will not scan them over and over again.
>>>
>>>
>>> That patch is fairly old. How come we can suddenly trigger this?
>>> Which commit is responsible for that? Was it always broken?
>>>
>>> I can see that
>>>
>>> commit ad6b67041a45497261617d7a28b15159b202cb5a
>>> Author: Minchan Kim <minchan@xxxxxxxxxx>
>>> Date: Wed May 3 14:54:13 2017 -0700
>>>
>>>     mm: remove SWAP_MLOCK in ttu
>>>
>>> Performed some changes in that area. But also some time ago.
>> I think the following patch introduce the issue.
>>
>> commit 1a4e58cce84ee88129d5d49c064bd2852b481357
>> Author: Minchan Kim <minchan@xxxxxxxxxx>
>> Date: Wed Sep 25 16:49:15 2019 -0700
>>
>>     mm: introduce MADV_PAGEOUT
>>
>>     When a process expects no accesses to a certain memory range for a long
>>
>
> CCing Minchan Kim then.
>
> If this is indeed the introducing patch, you probably reference that patch in your cover mail somehow. (Fixes: does not apply until upstream)
>
> I am absolutely no expert on vmscan.c, so I'm afraid I can't really comment on the details.
>
Yep, but still thanks for your concerns and reply.

Sincerely,
zhong jiang