On 23.11.24 08:31, syzbot wrote:
Hello, syzbot found the following issue on: HEAD commit: 9fb2cfa4635a Merge tag 'pull-ufs' of git://git.kernel.org/.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=10042930580000 kernel config: https://syzkaller.appspot.com/x/.config?x=c4515f1b6a4e50b7 dashboard link: https://syzkaller.appspot.com/bug?extid=9f9a7f73fb079b2387a6 compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=105ff2e8580000 Downloadable assets: disk image: https://storage.googleapis.com/syzbot-assets/7c0c61a15f60/disk-9fb2cfa4.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/3363d84eeb74/vmlinux-9fb2cfa4.xz kernel image: https://storage.googleapis.com/syzbot-assets/2b1a270af550/bzImage-9fb2cfa4.xz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+9f9a7f73fb079b2387a6@xxxxxxxxxxxxxxxxxxxxxxxxx
Staring at the console output: [ 520.222112][ T7269] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x1403 pfn:0x125be [ 520.362213][ T7269] head: order:9 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 520.411963][ T7269] memcg:ffff88807c73c000 [ 520.492069][ T7269] flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff) [ 520.499844][ T7269] raw: 00fff00000000000 ffffea0000490001 dead000000000122 dead000000000400 [ 520.551982][ T7269] raw: 00000000000014d0 0000000000000000 00000000ffffffff 0000000000000000 [ 520.560912][ T7269] head: 00fff00000000040 0000000000000000 dead000000000122 0000000000000000 [ 520.672020][ T7269] head: 0000000000001245 0000000000000000 00000001ffffffff ffff88807c73c000 [ 520.735699][ T7269] head: 00fff00000000209 ffffea0000490001 ffffffffffffffff 0000000000000000 [ 520.901989][ T7269] head: 0000000000000200 0000000000000000 00000000ffffffff 0000000000000000 [ 520.991952][ T7269] page dumped because: VM_BUG_ON_PAGE(PageTail(page)) [ 521.086487][ T7269] page_owner tracks the page as allocated [ 521.132208][ T7269] page last allocated via order 0, migratetype Movable, gfp_mask 0x3d24ca(GFP_TRANSHUGE|__GFP_NORETRY| ^order 0 looks wrong, but let;s not get distracted. __GFP_THISNODE), pid 7321, tgid 7321 (syz.1.194), ts 520201520231, free_ts 520193076092 [ 521.272012][ T7269] post_alloc_hook+0x2d1/0x350 [ 521.276977][ T7269] __alloc_pages_direct_compact+0x20e/0x590 [ 521.314428][ T7269] __alloc_pages_noprof+0x182b/0x25a0 [ 521.319975][ T7269] alloc_pages_mpol_noprof+0x282/0x610 [ 521.420092][ T7269] folio_alloc_mpol_noprof+0x36/0xd0 [ 521.483167][ T7269] vma_alloc_folio_noprof+0xee/0x1b0 [ 521.539677][ T7269] do_huge_pmd_anonymous_page+0x258/0x2ae0 ... [ 521.851719][ T7269] page last free pid 7323 tgid 7321 stack trace: [ 521.972611][ T7269] free_unref_folios+0xa87/0x14f0 [ 521.977735][ T7269] folios_put_refs+0x587/0x7b0 [ 522.072508][ T7269] folio_batch_move_lru+0x2c4/0x3b0 [ 522.077794][ T7269] __folio_batch_add_and_move+0x35b/0xc60 [ 522.191992][ T7269] reclaim_folio_list+0x205/0x3a0 [ 522.197131][ T7269] reclaim_pages+0x481/0x650 [ 522.201760][ T7269] madvise_cold_or_pageout_pte_range+0x163b/0x20d0 ... So we allocated a order-9 anonymous folio, but suddenly find it via shmem in the pagecache? Is this some crazy use-after-free / double-free, where we end up freeing a shmem folio that is still in the pagecache? Once freed, it gets merged in the buddy, and we then re-allocate it as part of a PMD THP; but shmem still finds it in the pagecache, and as the it's now suddenly a tail page, the folio checks trigger. Maybe the MADV_COLD / MADV_PAGEOUT is a valid hint. But I'm not able to spot obvious refcount handling issues there.
madvise_pageout_page_range mm/madvise.c:609 [inline] madvise_pageout+0x326/0x820 mm/madvise.c:636 madvise_vma_behavior+0x58c/0x19e0 mm/madvise.c:1045 madvise_walk_vmas+0x1cf/0x2c0 mm/madvise.c:1274 do_madvise+0x29d/0x700 mm/madvise.c:1461 __do_sys_madvise mm/madvise.c:1477 [inline] __se_sys_madvise mm/madvise.c:1475 [inline] __x64_sys_madvise+0xa9/0x110 mm/madvise.c:1475 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:309! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 0 UID: 0 PID: 7269 Comm: syz.1.183 Not tainted 6.12.0-syzkaller-00233-g9fb2cfa4635a #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024 RIP: 0010:const_folio_flags.constprop.0+0x12e/0x150 include/linux/page-flags.h:309 Code: 86 cb ff e8 f4 86 cb ff 48 8d 45 ff 48 39 c3 0f 84 38 ff ff ff e8 e2 86 cb ff 48 c7 c6 00 19 58 8b 48 89 df e8 e3 4b 11 00 90 <0f> 0b e8 6b 0d 2d 00 e9 f1 fe ff ff e8 61 0d 2d 00 eb a3 48 89 df RSP: 0018:ffffc9000c55ee30 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffffea0000496f80 RCX: ffffc9000c55ecd8 RDX: ffff88805f401e00 RSI: ffffffff81c1362d RDI: ffff88805f402244 RBP: 0000000000000001 R08: 0000000000000000 R09: fffffbfff203a591 R10: ffffffff901d2c8f R11: 0000000000000001 R12: 00000000000014df R13: 0000000000000000 R14: dffffc0000000000 R15: 1ffff920018abdf4 FS: 00007f08b31bc6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c0025ff000 CR3: 00000000341ce000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> folio_test_locked include/linux/page-flags.h:509 [inline] next_uptodate_folio+0xac/0x4b0 mm/filemap.c:3505 filemap_map_pages+0x1c6/0x16a0 mm/filemap.c:3647 do_fault_around mm/memory.c:5255 [inline] do_read_fault mm/memory.c:5288 [inline] do_fault mm/memory.c:5431 [inline] do_pte_missing+0xdae/0x3e70 mm/memory.c:3965 handle_pte_fault mm/memory.c:5766 [inline] __handle_mm_fault+0x100a/0x2a10 mm/memory.c:5909 handle_mm_fault+0x3fa/0xaa0 mm/memory.c:6077 faultin_page mm/gup.c:1187 [inline] __get_user_pages+0x8d9/0x3b50 mm/gup.c:1485 __get_user_pages_locked mm/gup.c:1751 [inline] get_dump_page+0xfb/0x220 mm/gup.c:2269 dump_user_range+0x135/0x8c0 fs/coredump.c:943 elf_core_dump+0x2766/0x3840 fs/binfmt_elf.c:2121 do_coredump+0x2c42/0x4160 fs/coredump.c:758 get_signal+0x237c/0x26d0 kernel/signal.c:2903 arch_do_signal_or_restart+0x90/0x7e0 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline] irqentry_exit_to_user_mode+0x13f/0x280 kernel/entry/common.c:231 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623 RIP: 0033:0x1000 Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 RSP: 002b:000000000000010c EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00007f08b41363b8 RCX: 00007f08b3f7e759 RDX: ffffffffff600000 RSI: 0000000000000104 RDI: 8000000000000000 RBP: 00007f08b3ff175e R08: 0000000100000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007f08b41363b8 R15: 00007fff7656a008 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- RIP: 0010:const_folio_flags.constprop.0+0x12e/0x150 include/linux/page-flags.h:309 Code: 86 cb ff e8 f4 86 cb ff 48 8d 45 ff 48 39 c3 0f 84 38 ff ff ff e8 e2 86 cb ff 48 c7 c6 00 19 58 8b 48 89 df e8 e3 4b 11 00 90 <0f> 0b e8 6b 0d 2d 00 e9 f1 fe ff ff e8 61 0d 2d 00 eb a3 48 89 df RSP: 0018:ffffc9000c55ee30 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffffea0000496f80 RCX: ffffc9000c55ecd8 RDX: ffff88805f401e00 RSI: ffffffff81c1362d RDI: ffff88805f402244 RBP: 0000000000000001 R08: 0000000000000000 R09: fffffbfff203a591 R10: ffffffff901d2c8f R11: 0000000000000001 R12: 00000000000014df R13: 0000000000000000 R14: dffffc0000000000 R15: 1ffff920018abdf4 FS: 00007f08b31bc6c0(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fff76568ff8 CR3: 00000000341ce000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@xxxxxxxxxxxxxxxx. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. If the report is already addressed, let syzbot know by replying with: #syz fix: exact-commit-title If you want syzbot to run the reproducer, reply with: #syz test: git://repo/address.git branch-or-commit-hash If you attach or paste a git patch, syzbot will apply it before testing. If you want to overwrite report's subsystems, reply with: #syz set subsystems: new-subsystem (See the list of subsystem names on the web dashboard) If the report is a duplicate of another one, reply with: #syz dup: exact-subject-of-another-report If you want to undo deduplication, reply with: #syz undup
-- Cheers, David / dhildenb