On Fri, Aug 04, 2023 at 11:14:45AM +0800, Yikebaer Aizezi wrote:
> Just patched it, then I rerun the reproduce program, and I got this
> output from console:
>
> BUG: Bad page state in process POC pfn:0eb8d
> page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000
> index:0x0 pfn:0xeb8d
> flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff)
> page_type: 0xffffffff()
> raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000
> raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> page_owner info is not present (never set?)
> Modules linked in:
> CPU: 0 PID: 7959 Comm: POC Not tainted 6.5.0-rc2 #2
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0xd4/0xf0 lib/dump_stack.c:106
> bad_page+0x71/0x1a0 mm/page_alloc.c:533
> free_page_is_bad_report mm/page_alloc.c:974 [inline]
> free_page_is_bad mm/page_alloc.c:984 [inline]
> free_pages_prepare mm/page_alloc.c:1153 [inline]
> free_unref_page_prepare+0x5f3/0xb50 mm/page_alloc.c:2348
> free_unref_page+0x2f/0x3c0 mm/page_alloc.c:2443
> __folio_put_small mm/swap.c:106 [inline]
> __folio_put+0xa2/0x110 mm/swap.c:129
> folio_put include/linux/mm.h:1423 [inline]
> put_page include/linux/mm.h:1492 [inline]
> extract_user_to_sg lib/scatterlist.c:1151 [inline]

Ohh.  I think this is something Dave Howells has a patch for.

> extract_iter_to_sg lib/scatterlist.c:1349 [inline] > extract_iter_to_sg+0x11ec/0x1570 lib/scatterlist.c:1339 > hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119 > sock_sendmsg_nosec net/socket.c:725 [inline] > sock_sendmsg+0xcf/0x170 net/socket.c:748 > ____sys_sendmsg+0x676/0x860 net/socket.c:2494 > ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548 > __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577 > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > RIP: 0033:0x7fbd79539f29 > Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 > 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d > 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48 > RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29 > RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004 > RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0 > R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0 > R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > </TASK> > > page:ffffea00003ae340 refcount:0 mapcount:0 mapping:0000000000000000 > index:0x0 pfn:0xeb8d > flags: 0xfff00000001000(reserved|node=0|zone=1|lastcpupid=0x7ff) > page_type: 0xffffffff() > raw: 00fff00000001000 ffffea00003ae348 ffffea00003ae348 0000000000000000 > raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 > page dumped because: VM_WARN_ON_ONCE_FOLIO(folio_ref_count(folio) <= 0) > page_owner info is not present (never set?) 
> ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 7962 at mm/gup.c:229 try_grab_page+0x307/0x3c0 mm/gup.c:229 > Modules linked in: > CPU: 0 PID: 7962 Comm: POC Tainted: G B 6.5.0-rc2 #2 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > RIP: 0010:try_grab_page+0x307/0x3c0 mm/gup.c:229 > Code: 80 3d 61 0e 82 0b 00 41 bc f4 ff ff ff 75 b4 e8 3f 96 cb ff 48 > c7 c6 40 83 57 89 48 89 ef e8 60 a7 ff ff c6 05 3e 0e 82 0b 01 <0f> 0b > eb 95 e8 20 96 cb ff be 04 00 00 00 4c 89 e7 e8 93 fa 13 00 > RSP: 0018:ffffc90002927178 EFLAGS: 00010293 > RAX: 0000000000000000 RBX: ffffea00003ae340 RCX: 0000000000000000 > RDX: ffff88801ab18000 RSI: ffffffff81ad81e0 RDI: ffffffff8af7ea00 > RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffffbfff1a8a74a > R10: ffffffff8d453a57 R11: 6e776f5f65676170 R12: 00000000fffffff4 > R13: 0000000000290000 R14: ffffea00003ae340 R15: ffffea00003ae340 > FS: 00007fbd7961a540(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007fbd794d03d0 CR3: 0000000019855000 CR4: 0000000000750ef0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > PKRU: 55555554 > Call Trace: > <TASK> > follow_page_pte+0x18c/0x1610 mm/gup.c:651 > follow_pmd_mask mm/gup.c:727 [inline] > follow_pud_mask mm/gup.c:765 [inline] > follow_p4d_mask mm/gup.c:782 [inline] > follow_page_mask+0x2e4/0xbd0 mm/gup.c:839 > __get_user_pages+0x3fa/0xcf0 mm/gup.c:1256 > __get_user_pages_locked mm/gup.c:1487 [inline] > __gup_longterm_locked+0x5fa/0x1ec0 mm/gup.c:2181 > internal_get_user_pages_fast+0x119b/0x2690 mm/gup.c:3179 > pin_user_pages_fast+0x95/0xe0 mm/gup.c:3285 > iov_iter_extract_user_pages lib/iov_iter.c:1768 [inline] > iov_iter_extract_pages+0x24c/0x1600 lib/iov_iter.c:1831 > extract_user_to_sg lib/scatterlist.c:1123 [inline] > extract_iter_to_sg 
lib/scatterlist.c:1349 [inline] > extract_iter_to_sg+0x21a/0x1570 lib/scatterlist.c:1339 > hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119 > sock_sendmsg_nosec net/socket.c:725 [inline] > sock_sendmsg+0xcf/0x170 net/socket.c:748 > ____sys_sendmsg+0x676/0x860 net/socket.c:2494 > ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548 > __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577 > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > RIP: 0033:0x7fbd79539f29 > Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 > 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d > 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48 > RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29 > RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004 > RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0 > R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0 > R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > </TASK> > > Modules linked in: > CPU: 0 PID: 7962 Comm: POC Tainted: G B 6.5.0-rc2 #2 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > RIP: 0010:try_grab_page+0x307/0x3c0 mm/gup.c:229 > Code: 80 3d 61 0e 82 0b 00 41 bc f4 ff ff ff 75 b4 e8 3f 96 cb ff 48 > c7 c6 40 83 57 89 48 89 ef e8 60 a7 ff ff c6 05 3e 0e 82 0b 01 <0f> 0b > eb 95 e8 20 96 cb ff be 04 00 00 00 4c 89 e7 e8 93 fa 13 00 > RSP: 0018:ffffc90002927178 EFLAGS: 00010293 > RAX: 0000000000000000 RBX: ffffea00003ae340 RCX: 0000000000000000 > RDX: ffff88801ab18000 RSI: ffffffff81ad81e0 RDI: ffffffff8af7ea00 > RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffffbfff1a8a74a > R10: ffffffff8d453a57 R11: 6e776f5f65676170 R12: 00000000fffffff4 > R13: 0000000000290000 R14: ffffea00003ae340 R15: 
ffffea00003ae340 > FS: 00007fbd7961a540(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007fbd794d03d0 CR3: 0000000019855000 CR4: 0000000000750ef0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > PKRU: 55555554 > Call Trace: > <TASK> > follow_page_pte+0x18c/0x1610 mm/gup.c:651 > follow_pmd_mask mm/gup.c:727 [inline] > follow_pud_mask mm/gup.c:765 [inline] > follow_p4d_mask mm/gup.c:782 [inline] > follow_page_mask+0x2e4/0xbd0 mm/gup.c:839 > __get_user_pages+0x3fa/0xcf0 mm/gup.c:1256 > __get_user_pages_locked mm/gup.c:1487 [inline] > __gup_longterm_locked+0x5fa/0x1ec0 mm/gup.c:2181 > internal_get_user_pages_fast+0x119b/0x2690 mm/gup.c:3179 > pin_user_pages_fast+0x95/0xe0 mm/gup.c:3285 > iov_iter_extract_user_pages lib/iov_iter.c:1768 [inline] > iov_iter_extract_pages+0x24c/0x1600 lib/iov_iter.c:1831 > extract_user_to_sg lib/scatterlist.c:1123 [inline] > extract_iter_to_sg lib/scatterlist.c:1349 [inline] > extract_iter_to_sg+0x21a/0x1570 lib/scatterlist.c:1339 > hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119 > sock_sendmsg_nosec net/socket.c:725 [inline] > sock_sendmsg+0xcf/0x170 net/socket.c:748 > ____sys_sendmsg+0x676/0x860 net/socket.c:2494 > ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548 > __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577 > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > RIP: 0033:0x7fbd79539f29 > Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 > 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d > 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48 > RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29 > RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004 
> RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0 > R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0 > R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > </TASK> > Kernel panic - not syncing: kernel: panic_on_warn set ... > CPU: 0 PID: 7962 Comm: POC Tainted: G B 6.5.0-rc2 #2 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 > Call Trace: > <TASK> > __dump_stack lib/dump_stack.c:88 [inline] > dump_stack_lvl+0x92/0xf0 lib/dump_stack.c:106 > panic+0x570/0x620 kernel/panic.c:340 > check_panic_on_warn+0x8e/0x90 kernel/panic.c:236 > __warn+0xee/0x340 kernel/panic.c:673 > __report_bug lib/bug.c:199 [inline] > report_bug+0x25d/0x460 lib/bug.c:219 > handle_bug+0x3c/0x70 arch/x86/kernel/traps.c:324 > exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:345 > asm_exc_invalid_op+0x16/0x20 arch/x86/include/asm/idtentry.h:568 > RIP: 0010:try_grab_page+0x307/0x3c0 mm/gup.c:229 > Code: 80 3d 61 0e 82 0b 00 41 bc f4 ff ff ff 75 b4 e8 3f 96 cb ff 48 > c7 c6 40 83 57 89 48 89 ef e8 60 a7 ff ff c6 05 3e 0e 82 0b 01 <0f> 0b > eb 95 e8 20 96 cb ff be 04 00 00 00 4c 89 e7 e8 93 fa 13 00 > RSP: 0018:ffffc90002927178 EFLAGS: 00010293 > RAX: 0000000000000000 RBX: ffffea00003ae340 RCX: 0000000000000000 > RDX: ffff88801ab18000 RSI: ffffffff81ad81e0 RDI: ffffffff8af7ea00 > RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffffbfff1a8a74a > R10: ffffffff8d453a57 R11: 6e776f5f65676170 R12: 00000000fffffff4 > R13: 0000000000290000 R14: ffffea00003ae340 R15: ffffea00003ae340 > follow_page_pte+0x18c/0x1610 mm/gup.c:651 > follow_pmd_mask mm/gup.c:727 [inline] > follow_pud_mask mm/gup.c:765 [inline] > follow_p4d_mask mm/gup.c:782 [inline] > follow_page_mask+0x2e4/0xbd0 mm/gup.c:839 > __get_user_pages+0x3fa/0xcf0 mm/gup.c:1256 > __get_user_pages_locked mm/gup.c:1487 [inline] > __gup_longterm_locked+0x5fa/0x1ec0 mm/gup.c:2181 > internal_get_user_pages_fast+0x119b/0x2690 mm/gup.c:3179 
> pin_user_pages_fast+0x95/0xe0 mm/gup.c:3285
> iov_iter_extract_user_pages lib/iov_iter.c:1768 [inline]
> iov_iter_extract_pages+0x24c/0x1600 lib/iov_iter.c:1831
> extract_user_to_sg lib/scatterlist.c:1123 [inline]
> extract_iter_to_sg lib/scatterlist.c:1349 [inline]
> extract_iter_to_sg+0x21a/0x1570 lib/scatterlist.c:1339
> hash_sendmsg+0x487/0xf50 crypto/algif_hash.c:119
> sock_sendmsg_nosec net/socket.c:725 [inline]
> sock_sendmsg+0xcf/0x170 net/socket.c:748
> ____sys_sendmsg+0x676/0x860 net/socket.c:2494
> ___sys_sendmsg+0x109/0x1a0 net/socket.c:2548
> __sys_sendmsg+0xe4/0x1b0 net/socket.c:2577
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7fbd79539f29
> Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48
> 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
> 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48
> RSP: 002b:00007ffeed5b63d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbd79539f29
> RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000004
> RBP: 00007ffeed5b63f0 R08: 00007ffeed5b63f0 R09: 00007ffeed5b63f0
> R10: 00007ffeed5b63f0 R11: 0000000000000246 R12: 000055d8a44b91a0
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> </TASK>
> Dumping ftrace buffer:
> (ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 1 seconds..
>
> ---------------------------------------------------------------------------------------------
>
> I think the previous question you mentioned about ioctl() was triggered
> by another crash, a WARNING in kvm_arch_vcpu_ioctl_run. Somehow these
> two crashes triggered at the same time, but I cannot figure out why.
>
> After I tried to fix that problem and reran the C reproducer for this
> issue, I got the different console output shown above.
>
>
> Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote on Thu, Aug 3, 2023 at 21:19:
> >
> >
> > On Thu, Aug 03, 2023 at 04:56:03PM +0800, Yikebaer Aizezi wrote:
> > > console output:
> > > https://drive.google.com/file/d/1Lq71bFwtEDix82PEf_193CLG6uh1Pjj9/view?usp=drive_link
> >
> > I dug through this, and what I found troubles me.
> >
> > ------------[ cut here ]------------
> > WARNING: CPU: 0 PID: 13067 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0
> > Modules linked in:
> > CPU: 0 PID: 13067 Comm: syz-executor Tainted: G B 6.5.0-rc2 #1
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> > RIP: 0010:try_grab_page+0x2dd/0x3a0
> > Code: ff be 04 00 00 00 4c 89 e7 e8 cf fa 13 00 f0 41 ff 04 24 e8 65 96 cb ff 45 31 e4 5b 44 89 e0 5d 41 5c 41 5d c3 e8 53 96 cb ff <0f> 0b e8 4c 96 cb ff 41 bc f4 ff ff ff 5b 44 89 e0 5d 41 5c 41 5d
> > RSP: 0018:ffffc9000c2777e0 EFLAGS: 00010212
> > RAX: 0000000000000247 RBX: ffffea00003ae340 RCX: ffffc90002bb1000
> > RDX: 0000000000040000 RSI: ffffffff81ad81ed RDI: ffffea00003ae374
> > RBP: ffffea00003ae340 R08: 0000000000000000 R09: fffff94000075c6e
> > R10: ffffea00003ae377 R11: 0000000000084001 R12: ffffea00003ae374
> > R13: 0000000000210002 R14: ffffea00003ae340 R15: 000000000eb8d225
> > FS: 00007f5841a13640(0000) GS:ffff888063e00000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000000500310 CR3: 0000000018d0c000 CR4: 0000000000750ef0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > PKRU: 55555554
> > Call Trace:
> > <TASK>
> > ? __warn+0xe2/0x340
> > ? try_grab_page+0x2dd/0x3a0
> > ? report_bug+0x25d/0x460
> > ? handle_bug+0x3c/0x70
> > ? exc_invalid_op+0x14/0x40
> > ? asm_exc_invalid_op+0x16/0x20
> > ? try_grab_page+0x2dd/0x3a0
> > ? try_grab_page+0x2dd/0x3a0
> > follow_page_pte+0x18c/0x1610
> > ? try_grab_page+0x3a0/0x3a0
> > ?
rcu_is_watching+0xe/0xb0 > > follow_page_mask+0x2e4/0xbd0 > > __get_user_pages+0x3fa/0xcf0 > > ? follow_page_mask+0xbd0/0xbd0 > > ? down_read_killable+0x146/0x4f0 > > ? down_read_interruptible+0x4f0/0x4f0 > > ? rcu_is_watching+0xe/0xb0 > > __gup_longterm_locked+0x5fa/0x1ec0 > > ? io_schedule_timeout+0x150/0x150 > > ? rcu_is_watching+0xe/0xb0 > > ? get_user_pages_unlocked+0x580/0x580 > > ? lock_release+0x4f7/0x670 > > ? internal_get_user_pages_fast+0xe27/0x2690 > > ? lock_downgrade+0x690/0x690 > > ? preempt_schedule_common+0x45/0xb0 > > ? pud_huge+0x9c/0xe0 > > ? pmd_huge+0xe0/0xe0 > > internal_get_user_pages_fast+0x119b/0x2690 > > ? mtree_load+0x1df/0x980 > > ? __gup_device_huge+0x530/0x530 > > ? rcu_is_watching+0xe/0xb0 > > ? lock_release+0x4f7/0x670 > > get_user_pages_fast+0x95/0xe0 > > ? get_user_pages_fast_only+0xe0/0xe0 > > do_get_mempolicy+0x50c/0xd20 > > ? sp_delete+0xf0/0xf0 > > ? seccomp_notify_ioctl+0xd80/0xd80 > > __x64_sys_get_mempolicy+0x187/0x2a0 > > ? __ia32_sys_migrate_pages+0xf0/0xf0 > > ? __secure_computing+0x1ff/0x360 > > do_syscall_64+0x35/0xb0 > > entry_SYSCALL_64_after_hwframe+0x63/0xcd > > RIP: 0033:0x47959d > > Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b4 ff ff ff f7 d8 64 89 01 48 > > RSP: 002b:00007f5841a13068 EFLAGS: 00000246 ORIG_RAX: 00000000000000ef > > RAX: ffffffffffffffda RBX: 000000000059c0a0 RCX: 000000000047959d > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > > RBP: 000000000059c0a0 R08: 0000000000000003 R09: 0000000000000000 > > R10: 0000000020ff9000 R11: 0000000000000246 R12: 000000000059c0ac > > R13: 000000000000000b R14: 0000000000437250 R15: 00007f58419f3000 > > </TASK> > > Kernel panic - not syncing: kernel: panic_on_warn set ... 
> >
> > > WARNING: CPU: 0 PID: 13067 at mm/gup.c:229 try_grab_page+0x2dd/0x3a0
> >
> > That's this line:
> >         if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
> > Called from:
> > follow_page_pte+0x18c/0x1610
> >
> > That did:
> >         ptep = pte_offset_map_lock(mm, pmd, address, &ptl);
> >         pte = ptep_get(ptep);
> >         page = vm_normal_page(vma, address, pte);
> >         ret = try_grab_page(page, flags);
> >
> > So we grabbed the PTE lock, looked up the PTE, translated that into
> > a page ... and found a page with a zero (or negative) refcount.
> > That's Really Bad.  I think it was a zero refcount because r08 is 0
> > and I don't see any other registers which have a plausible negative
> > 32-bit number in them.
> >
> > Yikebaer, could I trouble you to add this:
> >
> > +++ b/mm/gup.c
> > @@ -226,7 +226,7 @@ int __must_check try_grab_page(struct page *page, unsigned int flags)
> >  {
> >         struct folio *folio = page_folio(page);
> >
> > -       if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
> > +       if (VM_WARN_ON_ONCE_FOLIO(folio_ref_count(folio) <= 0, folio))
> >                 return -ENOMEM;
> >
> >         if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page)))
> >
> > and rerun the syzkaller?  That'll give us some more information about
> > what has happened, although it won't tell us why it happened.
> >
> > We might need to catch someone decrementing the refcount to lower than
> > the mapcount to catch this ... which will be tricky, given the other
> > things we reuse the mapcount for.
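
The register argument quoted above (r08 is 0 and nothing else looks like a
negative 32-bit number) can be sketched as a small user-space check. This is
only an illustration, not kernel code. The register values are copied from
the first try_grab_page() warning in this thread. The helper names and the
heuristic are assumptions made for this sketch: a value is treated as a
"plausible negative 32-bit number" when its upper 32 bits are zero and bit 31
is set, which is how a negative int looks after a 32-bit register write on
x86-64.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One general-purpose register as printed in the oops. */
struct reg {
	const char *name;
	uint64_t value;
};

/* Values copied from the first try_grab_page() warning quoted above. */
static const struct reg regs[] = {
	{ "RAX", 0x0000000000000247ULL }, { "RBX", 0xffffea00003ae340ULL },
	{ "RCX", 0xffffc90002bb1000ULL }, { "RDX", 0x0000000000040000ULL },
	{ "RSI", 0xffffffff81ad81edULL }, { "RDI", 0xffffea00003ae374ULL },
	{ "RBP", 0xffffea00003ae340ULL }, { "R08", 0x0000000000000000ULL },
	{ "R09", 0xfffff94000075c6eULL }, { "R10", 0xffffea00003ae377ULL },
	{ "R11", 0x0000000000084001ULL }, { "R12", 0xffffea00003ae374ULL },
	{ "R13", 0x0000000000210002ULL }, { "R14", 0xffffea00003ae340ULL },
	{ "R15", 0x000000000eb8d225ULL },
};

#define NUM_REGS (sizeof(regs) / sizeof(regs[0]))

/*
 * Hypothetical heuristic: a negative int written through a 32-bit
 * register is zero-extended, so the upper half is 0 and bit 31 is set.
 */
bool looks_like_negative_int32(uint64_t v)
{
	return (v >> 32) == 0 && (v & 0x80000000ULL) != 0;
}

/* Count registers that match the heuristic above. */
size_t count_negative_int32_candidates(void)
{
	size_t n = 0;

	for (size_t i = 0; i < NUM_REGS; i++)
		if (looks_like_negative_int32(regs[i].value))
			n++;
	return n;
}

/* Count registers that hold exactly zero. */
size_t count_zero_registers(void)
{
	size_t n = 0;

	for (size_t i = 0; i < NUM_REGS; i++)
		if (regs[i].value == 0)
			n++;
	return n;
}

/* Index into regs[] of the first zero register, or -1 if none. */
int first_zero_register_index(void)
{
	for (size_t i = 0; i < NUM_REGS; i++)
		if (regs[i].value == 0)
			return (int)i;
	return -1;
}
```

Under that assumption, no register in the first dump matches the negative
pattern and R08 is the only one holding zero, which fits the zero-refcount
reading. The same heuristic would flag R12 = 00000000fffffff4 in the rerun
dump, but as a 32-bit int that is -12 (-ENOMEM), the error try_grab_page()
returns, not a refcount. The rerun dump reports refcount:0 directly.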