Hello, syzbot found the following issue on: HEAD commit: 1716a175592a Add linux-next specific files for 20230301 git tree: linux-next console+strace: https://syzkaller.appspot.com/x/log.txt?x=1566c97f480000 kernel config: https://syzkaller.appspot.com/x/.config?x=e4da7f0aef5d2eb8 dashboard link: https://syzkaller.appspot.com/bug?extid=534d1c3c0c08473dc853 compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10f1717f480000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=130f6874c80000 Downloadable assets: disk image: https://storage.googleapis.com/syzbot-assets/0745b94b7a1b/disk-1716a175.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/9a0be79f3fd5/vmlinux-1716a175.xz kernel image: https://storage.googleapis.com/syzbot-assets/438e9e5cf49a/bzImage-1716a175.xz The issue was bisected to: commit 3d7cb67369a08d4933713290acf458990a50b6f9 Author: Suren Baghdasaryan <surenb@xxxxxxxxxx> Date: Mon Feb 27 17:36:28 2023 +0000 x86/mm: try VMA lock-based page fault handling first bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=10265502c80000 final oops: https://syzkaller.appspot.com/x/report.txt?x=12265502c80000 console output: https://syzkaller.appspot.com/x/log.txt?x=14265502c80000 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+534d1c3c0c08473dc853@xxxxxxxxxxxxxxxxxxxxxxxxx Fixes: 3d7cb67369a0 ("x86/mm: try VMA lock-based page fault handling first") ====================================================== WARNING: possible circular locking dependency detected 6.2.0-next-20230301-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor115/5084 is trying to acquire lock: ffff888078307a90 (&vma->vm_lock->lock){++++}-{3:3}, at: vma_start_write include/linux/mm.h:678 [inline] ffff888078307a90 (&vma->vm_lock->lock){++++}-{3:3}, at: retract_page_tables mm/khugepaged.c:1826 [inline] ffff888078307a90 (&vma->vm_lock->lock){++++}-{3:3}, at: collapse_file+0x4fa5/0x5980 mm/khugepaged.c:2204 but task is already holding lock: ffff88801f93efa8 (&mapping->i_mmap_rwsem){++++}-{3:3}, at: i_mmap_lock_write include/linux/fs.h:468 [inline] ffff88801f93efa8 (&mapping->i_mmap_rwsem){++++}-{3:3}, at: retract_page_tables mm/khugepaged.c:1745 [inline] ffff88801f93efa8 (&mapping->i_mmap_rwsem){++++}-{3:3}, at: collapse_file+0x3da6/0x5980 mm/khugepaged.c:2204 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&mapping->i_mmap_rwsem){++++}-{3:3}: down_write+0x92/0x200 kernel/locking/rwsem.c:1573 i_mmap_lock_write include/linux/fs.h:468 [inline] dma_resv_lockdep+0x26f/0x5f0 drivers/dma-buf/dma-resv.c:760 do_one_initcall+0x141/0x7d0 init/main.c:1306 do_initcall_level init/main.c:1379 [inline] do_initcalls init/main.c:1395 [inline] do_basic_setup init/main.c:1414 [inline] kernel_init_freeable+0x5ec/0x900 init/main.c:1634 kernel_init+0x1e/0x2c0 init/main.c:1522 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308 -> #1 (fs_reclaim){+.+.}-{0:0}: __fs_reclaim_acquire mm/page_alloc.c:4647 [inline] fs_reclaim_acquire+0x11d/0x160 mm/page_alloc.c:4661 might_alloc include/linux/sched/mm.h:299 [inline] prepare_alloc_pages+0x159/0x570 mm/page_alloc.c:5293 __alloc_pages+0x149/0x5c0 mm/page_alloc.c:5511 __folio_alloc+0x16/0x40 mm/page_alloc.c:5554 vma_alloc_folio+0x155/0x850 mm/mempolicy.c:2244 do_anonymous_page mm/memory.c:4062 [inline] handle_pte_fault mm/memory.c:4917 [inline] __handle_mm_fault+0x1857/0x3e70 mm/memory.c:5061 handle_mm_fault+0x2c0/0x9c0 mm/memory.c:5207 do_user_addr_fault+0x2c1/0x1210 arch/x86/mm/fault.c:1349 handle_page_fault arch/x86/mm/fault.c:1534 [inline] exc_page_fault+0x98/0x170 arch/x86/mm/fault.c:1590 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:570 -> #0 (&vma->vm_lock->lock){++++}-{3:3}: check_prev_add kernel/locking/lockdep.c:3098 [inline] check_prevs_add kernel/locking/lockdep.c:3217 [inline] validate_chain kernel/locking/lockdep.c:3832 [inline] __lock_acquire+0x2ec7/0x5d40 kernel/locking/lockdep.c:5056 lock_acquire.part.0+0x11a/0x370 kernel/locking/lockdep.c:5669 down_write+0x92/0x200 kernel/locking/rwsem.c:1573 vma_start_write include/linux/mm.h:678 [inline] retract_page_tables mm/khugepaged.c:1826 [inline] collapse_file+0x4fa5/0x5980 mm/khugepaged.c:2204 hpage_collapse_scan_file+0xcd3/0x1680 mm/khugepaged.c:2358 madvise_collapse+0x53b/0xca0 mm/khugepaged.c:2818 madvise_vma_behavior+0x649/0x20e0 mm/madvise.c:1086 madvise_walk_vmas+0x1c7/0x2b0 mm/madvise.c:1260 do_madvise.part.0+0x31c/0x470 mm/madvise.c:1439 do_madvise mm/madvise.c:1452 [inline] __do_sys_madvise mm/madvise.c:1452 [inline] __se_sys_madvise mm/madvise.c:1450 [inline] __x64_sys_madvise+0x117/0x150 mm/madvise.c:1450 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd other info that might help us debug this: Chain exists of: &vma->vm_lock->lock --> fs_reclaim --> &mapping->i_mmap_rwsem Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&mapping->i_mmap_rwsem); lock(fs_reclaim); lock(&mapping->i_mmap_rwsem); lock(&vma->vm_lock->lock); *** DEADLOCK *** 2 locks held by syz-executor115/5084: #0: ffff88801f93efa8 (&mapping->i_mmap_rwsem){++++}-{3:3}, at: i_mmap_lock_write include/linux/fs.h:468 [inline] #0: ffff88801f93efa8 (&mapping->i_mmap_rwsem){++++}-{3:3}, at: retract_page_tables mm/khugepaged.c:1745 [inline] #0: ffff88801f93efa8 (&mapping->i_mmap_rwsem){++++}-{3:3}, at: collapse_file+0x3da6/0x5980 mm/khugepaged.c:2204 #1: ffff88807b06f098 (&mm->mmap_lock){++++}-{3:3}, at: mmap_write_trylock include/linux/mmap_lock.h:120 [inline] #1: ffff88807b06f098 (&mm->mmap_lock){++++}-{3:3}, at: retract_page_tables mm/khugepaged.c:1797 [inline] #1: ffff88807b06f098 (&mm->mmap_lock){++++}-{3:3}, at: collapse_file+0x4667/0x5980 mm/khugepaged.c:2204 stack backtrace: CPU: 0 PID: 5084 Comm: syz-executor115 Not tainted 6.2.0-next-20230301-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/16/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106 check_noncircular+0x25f/0x2e0 kernel/locking/lockdep.c:2178 check_prev_add kernel/locking/lockdep.c:3098 [inline] check_prevs_add kernel/locking/lockdep.c:3217 [inline] validate_chain kernel/locking/lockdep.c:3832 [inline] __lock_acquire+0x2ec7/0x5d40 kernel/locking/lockdep.c:5056 lock_acquire.part.0+0x11a/0x370 kernel/locking/lockdep.c:5669 down_write+0x92/0x200 kernel/locking/rwsem.c:1573 vma_start_write include/linux/mm.h:678 [inline] retract_page_tables mm/khugepaged.c:1826 [inline] collapse_file+0x4fa5/0x5980 mm/khugepaged.c:2204 hpage_collapse_scan_file+0xcd3/0x1680 mm/khugepaged.c:2358 madvise_collapse+0x53b/0xca0 mm/khugepaged.c:2818 madvise_vma_behavior+0x649/0x20e0 mm/madvise.c:1086 madvise_walk_vmas+0x1c7/0x2b0 mm/madvise.c:1260 do_madvise.part.0+0x31c/0x470 mm/madvise.c:1439 do_madvise mm/madvise.c:1452 [inline] __do_sys_madvise mm/madvise.c:1452 [inline] __se_sys_madvise mm/madvise.c:1450 [inline] __x64_sys_madvise+0x117/0x150 mm/madvise.c:1450 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fcffa4a4b29 Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffe20f24e68 EFLAGS: 00000246 ORIG_RAX: 000000000000001c RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcffa4a4b29 RDX: 0000000000000019 RSI: 0000000000600003 RDI: 0000000020000000 RBP: 00007fcffa468cd0 R08: 0000000000000000 R09: 0000000000000000 --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@xxxxxxxxxxxxxxxx. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. For information about bisection process see: https://goo.gl/tpsmEJ#bisection syzbot can test patches for this issue, for details see: https://goo.gl/tpsmEJ#testing-patches