On Thu, 28 Nov 2024 11:52:42 +0100 David Hildenbrand <david@xxxxxxxxxx>
> On 23.11.24 08:31, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 9fb2cfa4635a Merge tag 'pull-ufs' of git://git.kernel.org/..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=10042930580000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=c4515f1b6a4e50b7
> > dashboard link: https://syzkaller.appspot.com/bug?extid=9f9a7f73fb079b2387a6
> > compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=105ff2e8580000
> >
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/7c0c61a15f60/disk-9fb2cfa4.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/3363d84eeb74/vmlinux-9fb2cfa4.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/2b1a270af550/bzImage-9fb2cfa4.xz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+9f9a7f73fb079b2387a6@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
>
> Staring at the console output:
>
> [ 520.222112][ T7269] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x1403 pfn:0x125be

->mapping is cleared for an order-9 page

> [ 520.362213][ T7269] head: order:9 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> [ 520.411963][ T7269] memcg:ffff88807c73c000
> [ 520.492069][ T7269] flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff)
> [ 520.499844][ T7269] raw: 00fff00000000000 ffffea0000490001 dead000000000122 dead000000000400
> [ 520.551982][ T7269] raw: 00000000000014d0 0000000000000000 00000000ffffffff 0000000000000000
> [ 520.560912][ T7269] head: 00fff00000000040 0000000000000000 dead000000000122 0000000000000000
> [ 520.672020][ T7269] head: 0000000000001245 0000000000000000 00000001ffffffff ffff88807c73c000
> [ 520.735699][ T7269] head: 00fff00000000209 ffffea0000490001 ffffffffffffffff 0000000000000000
> [ 520.901989][ T7269] head: 0000000000000200 0000000000000000 00000000ffffffff 0000000000000000
> [ 520.991952][ T7269] page dumped because: VM_BUG_ON_PAGE(PageTail(page))
> [ 521.086487][ T7269] page_owner tracks the page as allocated
> [ 521.132208][ T7269] page last allocated via order 0, migratetype Movable, gfp_mask 0x3d24ca(GFP_TRANSHUGE|__GFP_NORETRY|
>
> ^order 0 looks wrong, but let's not get distracted.
>
> __GFP_THISNODE), pid 7321, tgid 7321 (syz.1.194), ts 520201520231, free_ts 520193076092
> [ 521.272012][ T7269] post_alloc_hook+0x2d1/0x350
> [ 521.276977][ T7269] __alloc_pages_direct_compact+0x20e/0x590
> [ 521.314428][ T7269] __alloc_pages_noprof+0x182b/0x25a0
> [ 521.319975][ T7269] alloc_pages_mpol_noprof+0x282/0x610
> [ 521.420092][ T7269] folio_alloc_mpol_noprof+0x36/0xd0
> [ 521.483167][ T7269] vma_alloc_folio_noprof+0xee/0x1b0
> [ 521.539677][ T7269] do_huge_pmd_anonymous_page+0x258/0x2ae0
> ...
> [ 521.851719][ T7269] page last free pid 7323 tgid 7321 stack trace:
> [ 521.972611][ T7269] free_unref_folios+0xa87/0x14f0
> [ 521.977735][ T7269] folios_put_refs+0x587/0x7b0
> [ 522.072508][ T7269] folio_batch_move_lru+0x2c4/0x3b0
> [ 522.077794][ T7269] __folio_batch_add_and_move+0x35b/0xc60
> [ 522.191992][ T7269] reclaim_folio_list+0x205/0x3a0
> [ 522.197131][ T7269] reclaim_pages+0x481/0x650
> [ 522.201760][ T7269] madvise_cold_or_pageout_pte_range+0x163b/0x20d0
> ...
>
>
> So we allocated an order-9 anonymous folio, but suddenly find it via shmem in the pagecache?
>
> Is this some crazy use-after-free / double-free, where we end up freeing a shmem folio
> that is still in the pagecache? Once freed, it gets merged in the buddy, and we then re-allocate
> it as part of a PMD THP; but shmem still finds it in the pagecache, and as it's now suddenly

It is not in the pagecache.

> a tail page, the folio checks trigger.
>
>
> Maybe the MADV_COLD / MADV_PAGEOUT is a valid hint. But I'm not able to
> spot obvious refcount handling issues there.
>
> > madvise_pageout_page_range mm/madvise.c:609 [inline]
> > madvise_pageout+0x326/0x820 mm/madvise.c:636
> > madvise_vma_behavior+0x58c/0x19e0 mm/madvise.c:1045
> > madvise_walk_vmas+0x1cf/0x2c0 mm/madvise.c:1274
> > do_madvise+0x29d/0x700 mm/madvise.c:1461
> > __do_sys_madvise mm/madvise.c:1477 [inline]
> > __se_sys_madvise mm/madvise.c:1475 [inline]
> > __x64_sys_madvise+0xa9/0x110 mm/madvise.c:1475
> > do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
> > ------------[ cut here ]------------
> > kernel BUG at include/linux/page-flags.h:309!
> > Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
> > CPU: 0 UID: 0 PID: 7269 Comm: syz.1.183 Not tainted 6.12.0-syzkaller-00233-g9fb2cfa4635a #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024
> > RIP: 0010:const_folio_flags.constprop.0+0x12e/0x150 include/linux/page-flags.h:309
> > Code: 86 cb ff e8 f4 86 cb ff 48 8d 45 ff 48 39 c3 0f 84 38 ff ff ff e8 e2 86 cb ff 48 c7 c6 00 19 58 8b 48 89 df e8 e3 4b 11 00 90 <0f> 0b e8 6b 0d 2d 00 e9 f1 fe ff ff e8 61 0d 2d 00 eb a3 48 89 df
> > RSP: 0018:ffffc9000c55ee30 EFLAGS: 00010293
> > RAX: 0000000000000000 RBX: ffffea0000496f80 RCX: ffffc9000c55ecd8
> > RDX: ffff88805f401e00 RSI: ffffffff81c1362d RDI: ffff88805f402244
> > RBP: 0000000000000001 R08: 0000000000000000 R09: fffffbfff203a591
> > R10: ffffffff901d2c8f R11: 0000000000000001 R12: 00000000000014df
> > R13: 0000000000000000 R14: dffffc0000000000 R15: 1ffff920018abdf4
> > FS: 00007f08b31bc6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 000000c0025ff000 CR3: 00000000341ce000 CR4: 00000000003526f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> > <TASK>
> > folio_test_locked include/linux/page-flags.h:509 [inline]
> > next_uptodate_folio+0xac/0x4b0 mm/filemap.c:3505
> > filemap_map_pages+0x1c6/0x16a0 mm/filemap.c:3647
> > do_fault_around mm/memory.c:5255 [inline]
> > do_read_fault mm/memory.c:5288 [inline]
> > do_fault mm/memory.c:5431 [inline]
> > do_pte_missing+0xdae/0x3e70 mm/memory.c:3965
> > handle_pte_fault mm/memory.c:5766 [inline]
> > __handle_mm_fault+0x100a/0x2a10 mm/memory.c:5909
> > handle_mm_fault+0x3fa/0xaa0 mm/memory.c:6077
> > faultin_page mm/gup.c:1187 [inline]
> > __get_user_pages+0x8d9/0x3b50 mm/gup.c:1485
> > __get_user_pages_locked mm/gup.c:1751 [inline]
> > get_dump_page+0xfb/0x220 mm/gup.c:2269
> > dump_user_range+0x135/0x8c0 fs/coredump.c:943
> > elf_core_dump+0x2766/0x3840 fs/binfmt_elf.c:2121
> > do_coredump+0x2c42/0x4160 fs/coredump.c:758
> > get_signal+0x237c/0x26d0 kernel/signal.c:2903
> > arch_do_signal_or_restart+0x90/0x7e0 arch/x86/kernel/signal.c:337
> > exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
> > exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
> > irqentry_exit_to_user_mode+0x13f/0x280 kernel/entry/common.c:231
> > asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
> > RIP: 0033:0x1000
> > Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > RSP: 002b:000000000000010c EFLAGS: 00010246
> > RAX: 0000000000000000 RBX: 00007f08b41363b8 RCX: 00007f08b3f7e759
> > RDX: ffffffffff600000 RSI: 0000000000000104 RDI: 8000000000000000
> > RBP: 00007f08b3ff175e R08: 0000000100000000 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> > R13: 0000000000000000 R14: 00007f08b41363b8 R15: 00007fff7656a008
> > </TASK>
> > Modules linked in:
> > ---[ end trace 0000000000000000 ]---
> > RIP: 0010:const_folio_flags.constprop.0+0x12e/0x150 include/linux/page-flags.h:309
> > Code: 86 cb ff e8 f4 86 cb ff 48 8d 45 ff 48 39 c3 0f 84 38 ff ff ff e8 e2 86 cb ff 48 c7 c6 00 19 58 8b 48 89 df e8 e3 4b 11 00 90 <0f> 0b e8 6b 0d 2d 00 e9 f1 fe ff ff e8 61 0d 2d 00 eb a3 48 89 df
> > RSP: 0018:ffffc9000c55ee30 EFLAGS: 00010293
> > RAX: 0000000000000000 RBX: ffffea0000496f80 RCX: ffffc9000c55ecd8
> > RDX: ffff88805f401e00 RSI: ffffffff81c1362d RDI: ffff88805f402244
> > RBP: 0000000000000001 R08: 0000000000000000 R09: fffffbfff203a591
> > R10: ffffffff901d2c8f R11: 0000000000000001 R12: 00000000000014df
> > R13: 0000000000000000 R14: dffffc0000000000 R15: 1ffff920018abdf4
> > FS: 00007f08b31bc6c0(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007fff76568ff8 CR3: 00000000341ce000 CR4: 00000000003526f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >
> >
> > ---
> > This report is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkaller@xxxxxxxxxxxxxxxx.
> >
> > syzbot will keep track of this issue. See:
> > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> >
> > If the report is already addressed, let syzbot know by replying with:
> > #syz fix: exact-commit-title
> >
> > If you want syzbot to run the reproducer, reply with:
> > #syz test: git://repo/address.git branch-or-commit-hash
> > If you attach or paste a git patch, syzbot will apply it before testing.
> >
> > If you want to overwrite report's subsystems, reply with:
> > #syz set subsystems: new-subsystem
> > (See the list of subsystem names on the web dashboard)
> >
> > If the report is a duplicate of another one, reply with:
> > #syz dup: exact-subject-of-another-report
> >
> > If you want to undo deduplication, reply with:
> > #syz undup
> >
>
>
> --
> Cheers,
>
> David / dhildenb
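
For anyone cross-checking the page dump quoted above, here is a rough sketch of the arithmetic that ties the faulting page to its order-9 head. It is not taken from the kernel sources. It assumes the default x86-64 vmemmap base of 0xffffea0000000000 and a 64-byte struct page, neither of which is stated in the report, and it assumes RBX holds the struct page pointer passed to the VM_BUG_ON_PAGE check. The names below are illustrative only.

```python
# Assumptions (not stated in the report): default x86-64 vmemmap base
# and sizeof(struct page) == 64.
VMEMMAP_BASE = 0xFFFFEA0000000000
STRUCT_PAGE_SIZE = 64


def page_to_pfn(page_addr: int) -> int:
    """Convert a struct page address in the vmemmap to a page frame number."""
    return (page_addr - VMEMMAP_BASE) // STRUCT_PAGE_SIZE


def head_from_compound_word(word: int):
    """Bit 0 set marks a tail page; the rest is the head page address."""
    return word - 1 if word & 1 else None


# Values copied from the dump: RBX, and the second word of the first "raw:" line.
faulting_page = 0xFFFFEA0000496F80
compound_word = 0xFFFFEA0000490001
order = 9

pfn = page_to_pfn(faulting_page)
head_page = head_from_compound_word(compound_word)
head_pfn = page_to_pfn(head_page)
aligned_pfn = pfn & ~((1 << order) - 1)  # start of the naturally aligned order-9 block
tail_offset = pfn - head_pfn

print(f"pfn={pfn:#x} head_pfn={head_pfn:#x} tail_offset={tail_offset}")
```

Under those assumptions RBX decodes to the dumped pfn 0x125be, and the head computed from the compound word coincides with the start of the aligned order-9 block, which is consistent with the PageTail(page) check firing on this page.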