On Sun, Mar 12, 2023 at 11:21:04AM +0800, Pengfei Xu wrote: > Hi Dave Chinner and kernel expert, > > Greeting! <cough> Please cc the xfs list with xfs bug reports. Have you tried 6.3-rc2? --D > Platform: x86 platforms > > There is BUG: soft lockup in __cleanup_mnt in v6.3-rc1 kernel. > All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230311_022102___cleanup_mnt > Reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230311_022102___cleanup_mnt/repro.c > v6.3-rc1 issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/230311_022102___cleanup_mnt/fe15c26ee26efa11741a7b632e9f23b01aca4cc6_dmesg.log > Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230311_022102___cleanup_mnt/kconfig_origin > Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230311_022102___cleanup_mnt/bisect_info.log > > " > [ 28.803497] XFS (loop0): Corruption of in-memory data (0x8) detected at _xfs_buf_ioapply+0x67d/0x690 (fs/xfs/xfs_buf.c:1553). Shutting down filesystem. > [ 28.804766] XFS (loop0): Please unmount the filesystem and rectify the problem(s) > [ 56.372468] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [repro:523] > [ 56.372847] Modules linked in: > [ 56.372985] irq event stamp: 22822 > [ 56.373141] hardirqs last enabled at (22821): [<ffffffff8300086e>] irqentry_exit+0x3e/0xa0 > [ 56.373512] hardirqs last disabled at (22822): [<ffffffff82ffe853>] sysvec_apic_timer_interrupt+0x13/0xe0 > [ 56.373923] softirqs last enabled at (9032): [<ffffffff83023a3c>] __do_softirq+0x31c/0x49c > [ 56.374276] softirqs last disabled at (8975): [<ffffffff8112b5a4>] irq_exit_rcu+0xc4/0x100 > [ 56.374623] CPU: 1 PID: 523 Comm: repro Not tainted 6.3.0-rc1-fe15c26ee26e+ #1 > [ 56.374924] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 > [ 56.375386] RIP: 0010:rcu_read_lock_sched_held+0x4b/0x80 > [ 56.375621] Code: 8d 7d f7 e8 c7 fe ff ff 84 c0 74 19 0f b6 45 f7 48 8b 4d f8 65 48 33 0c 25 28 00 00 00 75 38 c9 c3 cc cc cc cc be ff ff ff ff <48> c7 c7 20 3e f6 83 e8 89 ec de 01 89 c2 b8 01 00 00 00 85 d2 75 > [ 56.376376] RSP: 0018:ffffc9000102fbe0 EFLAGS: 00000246 > [ 56.376597] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 > [ 56.376891] RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 0000000000000000 > [ 56.377192] RBP: ffffc9000102fbf0 R08: 0000000000000001 R09: 0000000000000000 > [ 56.377486] R10: ffff88800d01d418 R11: 0000000000000000 R12: ffff88800b0d1158 > [ 56.377780] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [ 56.378082] FS: 00007fc887d10740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 > [ 56.378414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 56.378655] CR2: 00007faaae519018 CR3: 000000000d3e2004 CR4: 0000000000770ee0 > [ 56.378950] PKRU: 55555554 > [ 56.379069] Call Trace: > [ 56.379178] <TASK> > [ 56.379276] lock_acquire+0x2ad/0x320 > [ 56.379444] ? lock_release+0x141/0x2e0 > [ 56.379615] ? write_comp_data+0x2f/0x90 > [ 56.379799] _raw_spin_lock+0x36/0x50 > [ 56.379962] ? xfs_ail_push_all_sync+0x73/0x110 > [ 56.380169] xfs_ail_push_all_sync+0x73/0x110 > [ 56.380360] ? __pfx_autoremove_wake_function+0x10/0x10 > [ 56.380591] xfs_reclaim_inodes+0xb8/0x140 > [ 56.380779] xfs_unmount_flush_inodes+0x63/0x80 > [ 56.380987] xfs_unmountfs+0x69/0x1f0 > [ 56.381152] xfs_fs_put_super+0x5a/0x120 > [ 56.381331] ? __pfx_xfs_fs_put_super+0x10/0x10 > [ 56.381534] generic_shutdown_super+0xac/0x240 > [ 56.381740] kill_block_super+0x43/0x80 > [ 56.381917] deactivate_locked_super+0x52/0xb0 > [ 56.382112] deactivate_super+0xb3/0xd0 > [ 56.382282] cleanup_mnt+0x15e/0x1e0 > [ 56.382451] __cleanup_mnt+0x1f/0x30 > [ 56.382613] task_work_run+0xb6/0x120 > [ 56.382783] exit_to_user_mode_prepare+0x200/0x210 > [ 56.383000] syscall_exit_to_user_mode+0x2d/0x60 > [ 56.383205] do_syscall_64+0x4a/0x90 > [ 56.383368] entry_SYSCALL_64_after_hwframe+0x72/0xdc > [ 56.383593] RIP: 0033:0x7fc887e3c7db > [ 56.383751] Code: 96 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 95 96 0c 00 f7 d8 64 89 01 48 > [ 56.384500] RSP: 002b:00007ffc1a5bd1e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a6 > [ 56.384815] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fc887e3c7db > [ 56.385114] RDX: 0000000000000000 RSI: 000000000000000a RDI: 00007ffc1a5bd290 > [ 56.385407] RBP: 00007ffc1a5be2d0 R08: 0000000001d0a333 R09: 0000000000000009 > [ 56.385700] R10: 0000000000404071 R11: 0000000000000206 R12: 00000000004012c0 > [ 56.385994] R13: 00007ffc1a5be410 R14: 0000000000000000 R15: 0000000000000000 > [ 56.386299] </TASK> > [ 56.386400] Kernel panic - not syncing: softlockup: hung tasks > [ 56.386643] CPU: 1 PID: 523 Comm: repro Tainted: G L 6.3.0-rc1-fe15c26ee26e+ #1 > [ 56.387000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 > [ 56.387461] Call Trace: > [ 56.387569] <IRQ> > [ 56.387662] dump_stack_lvl+0xa9/0x110 > [ 56.387834] dump_stack+0x19/0x20 > [ 56.387980] panic+0x4fd/0x520 > [ 56.388119] ? watchdog_timer_fn+0x31e/0x390 > [ 56.388315] watchdog_timer_fn+0x337/0x390 > [ 56.388497] ? __pfx_watchdog_timer_fn+0x10/0x10 > [ 56.388710] __hrtimer_run_queues+0x2ed/0x820 > [ 56.388924] hrtimer_interrupt+0x134/0x2d0 > [ 56.389111] __sysvec_apic_timer_interrupt+0x9c/0x290 > [ 56.389331] sysvec_apic_timer_interrupt+0xab/0xe0 > [ 56.389535] </IRQ> > [ 56.389628] <TASK> > [ 56.389722] asm_sysvec_apic_timer_interrupt+0x1f/0x30 > [ 56.389940] RIP: 0010:rcu_read_lock_sched_held+0x4b/0x80 > [ 56.390166] Code: 8d 7d f7 e8 c7 fe ff ff 84 c0 74 19 0f b6 45 f7 48 8b 4d f8 65 48 33 0c 25 28 00 00 00 75 38 c9 c3 cc cc cc cc be ff ff ff ff <48> c7 c7 20 3e f6 83 e8 89 ec de 01 89 c2 b8 01 00 00 00 85 d2 75 > [ 56.390909] RSP: 0018:ffffc9000102fbe0 EFLAGS: 00000246 > [ 56.391129] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 > [ 56.391422] RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 0000000000000000 > [ 56.391715] RBP: ffffc9000102fbf0 R08: 0000000000000001 R09: 0000000000000000 > [ 56.392009] R10: ffff88800d01d418 R11: 0000000000000000 R12: ffff88800b0d1158 > [ 56.392303] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [ 56.392607] lock_acquire+0x2ad/0x320 > [ 56.392773] ? lock_release+0x141/0x2e0 > [ 56.392941] ? write_comp_data+0x2f/0x90 > [ 56.393122] _raw_spin_lock+0x36/0x50 > [ 56.393285] ? xfs_ail_push_all_sync+0x73/0x110 > [ 56.393483] xfs_ail_push_all_sync+0x73/0x110 > [ 56.393672] ? __pfx_autoremove_wake_function+0x10/0x10 > [ 56.393900] xfs_reclaim_inodes+0xb8/0x140 > [ 56.394087] xfs_unmount_flush_inodes+0x63/0x80 > [ 56.394285] xfs_unmountfs+0x69/0x1f0 > [ 56.394450] xfs_fs_put_super+0x5a/0x120 > [ 56.394627] ? __pfx_xfs_fs_put_super+0x10/0x10 > [ 56.394829] generic_shutdown_super+0xac/0x240 > [ 56.395030] kill_block_super+0x43/0x80 > [ 56.395207] deactivate_locked_super+0x52/0xb0 > [ 56.395403] deactivate_super+0xb3/0xd0 > [ 56.395574] cleanup_mnt+0x15e/0x1e0 > [ 56.395739] __cleanup_mnt+0x1f/0x30 > [ 56.395902] task_work_run+0xb6/0x120 > [ 56.396071] exit_to_user_mode_prepare+0x200/0x210 > [ 56.396282] syscall_exit_to_user_mode+0x2d/0x60 > [ 56.396487] do_syscall_64+0x4a/0x90 > [ 56.396648] entry_SYSCALL_64_after_hwframe+0x72/0xdc > [ 56.396865] RIP: 0033:0x7fc887e3c7db > [ 56.397020] Code: 96 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 95 96 0c 00 f7 d8 64 89 01 48 > [ 56.397761] RSP: 002b:00007ffc1a5bd1e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a6 > [ 56.398072] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fc887e3c7db > [ 56.398365] RDX: 0000000000000000 RSI: 000000000000000a RDI: 00007ffc1a5bd290 > [ 56.398658] RBP: 00007ffc1a5be2d0 R08: 0000000001d0a333 R09: 0000000000000009 > [ 56.398951] R10: 0000000000404071 R11: 0000000000000206 R12: 00000000004012c0 > [ 56.399244] R13: 00007ffc1a5be410 R14: 0000000000000000 R15: 0000000000000000 > [ 56.399546] </TASK> > [ 56.399792] Kernel Offset: disabled > [ 56.399972] ---[ end Kernel panic - not syncing: softlockup: hung tasks ]--- > " > > I used the key word "__cleanup_mnt" to bisect, I knew there was other BUG > info in dmesg with "bisect good" commit log. > > Anyway, just provided the bad commit and hope it's helpful: > " > bd4f5d09cc93c8ca51e4efea86ac90a4bb553d6e > xfs: refactor the filestreams allocator pick functions > " > It's just a suspected commit, reverted above commit on top of v6.3-rc1 and > no "__cleanup_mnt" related BUG but generated the other BUG info in dmesg. > > There was no this issue report yet in syzbot: > https://syzkaller.appspot.com/upstream > > I tried that, installed the xfs fixed patch for below issue on top of v6.3-rc1: > https://lore.kernel.org/lkml/20230226223403.GU360264@xxxxxxxxxxxxxxxxxxx/ > "__cleanup_mnt" BUG issue still existed. > Related dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/230311_022102___cleanup_mnt/v6.3-rc1_xfs_patch_another_cleanup_mnt_issue.log > So it should be different than previous xfs related issue. > > --- > > If you don't need the following environment to reproduce the problem or if you > already have one, please ignore the following information. > > How to reproduce: > git clone https://gitlab.com/xupengfe/repro_vm_env.git > cd repro_vm_env > tar -xvf repro_vm_env.tar.gz > cd repro_vm_env; ./start3.sh // it needs qemu-system-x86_64 and I used v7.1.0 > // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel > // You could change the bzImage_xxx as you want > You could use below command to log in, there is no password for root. > ssh -p 10023 root@localhost > > After login vm(virtual machine) successfully, you could transfer reproduced > binary to the vm by below way, and reproduce the problem in vm: > gcc -pthread -o repro repro.c > scp -P 10023 repro root@localhost:/root/ > > Get the bzImage for target kernel: > Please use target kconfig and copy it to kernel_src/.config > make olddefconfig > make -jx bzImage //x should equal or less than cpu num your pc has > > Fill the bzImage file into above start3.sh to load the target kernel in vm. > > > Tips: > If you already have qemu-system-x86_64, please ignore below info. > If you want to install qemu v7.1.0 version: > git clone https://github.com/qemu/qemu.git > cd qemu > git checkout -f v7.1.0 > mkdir build > cd build > yum install -y ninja-build.x86_64 > ../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl > make > make install > > Thanks! > BR.