Re: [Syzkaller & bisect] There is BUG: soft lockup in __cleanup_mnt in v6.3-rc1 kernel

On Sun, Mar 12, 2023 at 11:21:04AM +0800, Pengfei Xu wrote:
> Hi Dave Chinner and kernel experts,
> 
> Greetings!

<cough> Please cc the xfs list with xfs bug reports.

Have you tried 6.3-rc2?

--D

> Platform: x86 platforms
> 
> There is a "BUG: soft lockup" in __cleanup_mnt in the v6.3-rc1 kernel.
> All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230311_022102___cleanup_mnt
> Reproducer code: https://github.com/xupengfe/syzkaller_logs/blob/main/230311_022102___cleanup_mnt/repro.c
> v6.3-rc1 issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/230311_022102___cleanup_mnt/fe15c26ee26efa11741a7b632e9f23b01aca4cc6_dmesg.log
> Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230311_022102___cleanup_mnt/kconfig_origin
> Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230311_022102___cleanup_mnt/bisect_info.log
> 
> "
> [   28.803497] XFS (loop0): Corruption of in-memory data (0x8) detected at _xfs_buf_ioapply+0x67d/0x690 (fs/xfs/xfs_buf.c:1553).  Shutting down filesystem.
> [   28.804766] XFS (loop0): Please unmount the filesystem and rectify the problem(s)
> [   56.372468] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [repro:523]
> [   56.372847] Modules linked in:
> [   56.372985] irq event stamp: 22822
> [   56.373141] hardirqs last  enabled at (22821): [<ffffffff8300086e>] irqentry_exit+0x3e/0xa0
> [   56.373512] hardirqs last disabled at (22822): [<ffffffff82ffe853>] sysvec_apic_timer_interrupt+0x13/0xe0
> [   56.373923] softirqs last  enabled at (9032): [<ffffffff83023a3c>] __do_softirq+0x31c/0x49c
> [   56.374276] softirqs last disabled at (8975): [<ffffffff8112b5a4>] irq_exit_rcu+0xc4/0x100
> [   56.374623] CPU: 1 PID: 523 Comm: repro Not tainted 6.3.0-rc1-fe15c26ee26e+ #1
> [   56.374924] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [   56.375386] RIP: 0010:rcu_read_lock_sched_held+0x4b/0x80
> [   56.375621] Code: 8d 7d f7 e8 c7 fe ff ff 84 c0 74 19 0f b6 45 f7 48 8b 4d f8 65 48 33 0c 25 28 00 00 00 75 38 c9 c3 cc cc cc cc be ff ff ff ff <48> c7 c7 20 3e f6 83 e8 89 ec de 01 89 c2 b8 01 00 00 00 85 d2 75
> [   56.376376] RSP: 0018:ffffc9000102fbe0 EFLAGS: 00000246
> [   56.376597] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
> [   56.376891] RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 0000000000000000
> [   56.377192] RBP: ffffc9000102fbf0 R08: 0000000000000001 R09: 0000000000000000
> [   56.377486] R10: ffff88800d01d418 R11: 0000000000000000 R12: ffff88800b0d1158
> [   56.377780] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [   56.378082] FS:  00007fc887d10740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
> [   56.378414] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   56.378655] CR2: 00007faaae519018 CR3: 000000000d3e2004 CR4: 0000000000770ee0
> [   56.378950] PKRU: 55555554
> [   56.379069] Call Trace:
> [   56.379178]  <TASK>
> [   56.379276]  lock_acquire+0x2ad/0x320
> [   56.379444]  ? lock_release+0x141/0x2e0
> [   56.379615]  ? write_comp_data+0x2f/0x90
> [   56.379799]  _raw_spin_lock+0x36/0x50
> [   56.379962]  ? xfs_ail_push_all_sync+0x73/0x110
> [   56.380169]  xfs_ail_push_all_sync+0x73/0x110
> [   56.380360]  ? __pfx_autoremove_wake_function+0x10/0x10
> [   56.380591]  xfs_reclaim_inodes+0xb8/0x140
> [   56.380779]  xfs_unmount_flush_inodes+0x63/0x80
> [   56.380987]  xfs_unmountfs+0x69/0x1f0
> [   56.381152]  xfs_fs_put_super+0x5a/0x120
> [   56.381331]  ? __pfx_xfs_fs_put_super+0x10/0x10
> [   56.381534]  generic_shutdown_super+0xac/0x240
> [   56.381740]  kill_block_super+0x43/0x80
> [   56.381917]  deactivate_locked_super+0x52/0xb0
> [   56.382112]  deactivate_super+0xb3/0xd0
> [   56.382282]  cleanup_mnt+0x15e/0x1e0
> [   56.382451]  __cleanup_mnt+0x1f/0x30
> [   56.382613]  task_work_run+0xb6/0x120
> [   56.382783]  exit_to_user_mode_prepare+0x200/0x210
> [   56.383000]  syscall_exit_to_user_mode+0x2d/0x60
> [   56.383205]  do_syscall_64+0x4a/0x90
> [   56.383368]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [   56.383593] RIP: 0033:0x7fc887e3c7db
> [   56.383751] Code: 96 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 95 96 0c 00 f7 d8 64 89 01 48
> [   56.384500] RSP: 002b:00007ffc1a5bd1e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a6
> [   56.384815] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fc887e3c7db
> [   56.385114] RDX: 0000000000000000 RSI: 000000000000000a RDI: 00007ffc1a5bd290
> [   56.385407] RBP: 00007ffc1a5be2d0 R08: 0000000001d0a333 R09: 0000000000000009
> [   56.385700] R10: 0000000000404071 R11: 0000000000000206 R12: 00000000004012c0
> [   56.385994] R13: 00007ffc1a5be410 R14: 0000000000000000 R15: 0000000000000000
> [   56.386299]  </TASK>
> [   56.386400] Kernel panic - not syncing: softlockup: hung tasks
> [   56.386643] CPU: 1 PID: 523 Comm: repro Tainted: G             L     6.3.0-rc1-fe15c26ee26e+ #1
> [   56.387000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [   56.387461] Call Trace:
> [   56.387569]  <IRQ>
> [   56.387662]  dump_stack_lvl+0xa9/0x110
> [   56.387834]  dump_stack+0x19/0x20
> [   56.387980]  panic+0x4fd/0x520
> [   56.388119]  ? watchdog_timer_fn+0x31e/0x390
> [   56.388315]  watchdog_timer_fn+0x337/0x390
> [   56.388497]  ? __pfx_watchdog_timer_fn+0x10/0x10
> [   56.388710]  __hrtimer_run_queues+0x2ed/0x820
> [   56.388924]  hrtimer_interrupt+0x134/0x2d0
> [   56.389111]  __sysvec_apic_timer_interrupt+0x9c/0x290
> [   56.389331]  sysvec_apic_timer_interrupt+0xab/0xe0
> [   56.389535]  </IRQ>
> [   56.389628]  <TASK>
> [   56.389722]  asm_sysvec_apic_timer_interrupt+0x1f/0x30
> [   56.389940] RIP: 0010:rcu_read_lock_sched_held+0x4b/0x80
> [   56.390166] Code: 8d 7d f7 e8 c7 fe ff ff 84 c0 74 19 0f b6 45 f7 48 8b 4d f8 65 48 33 0c 25 28 00 00 00 75 38 c9 c3 cc cc cc cc be ff ff ff ff <48> c7 c7 20 3e f6 83 e8 89 ec de 01 89 c2 b8 01 00 00 00 85 d2 75
> [   56.390909] RSP: 0018:ffffc9000102fbe0 EFLAGS: 00000246
> [   56.391129] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
> [   56.391422] RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 0000000000000000
> [   56.391715] RBP: ffffc9000102fbf0 R08: 0000000000000001 R09: 0000000000000000
> [   56.392009] R10: ffff88800d01d418 R11: 0000000000000000 R12: ffff88800b0d1158
> [   56.392303] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [   56.392607]  lock_acquire+0x2ad/0x320
> [   56.392773]  ? lock_release+0x141/0x2e0
> [   56.392941]  ? write_comp_data+0x2f/0x90
> [   56.393122]  _raw_spin_lock+0x36/0x50
> [   56.393285]  ? xfs_ail_push_all_sync+0x73/0x110
> [   56.393483]  xfs_ail_push_all_sync+0x73/0x110
> [   56.393672]  ? __pfx_autoremove_wake_function+0x10/0x10
> [   56.393900]  xfs_reclaim_inodes+0xb8/0x140
> [   56.394087]  xfs_unmount_flush_inodes+0x63/0x80
> [   56.394285]  xfs_unmountfs+0x69/0x1f0
> [   56.394450]  xfs_fs_put_super+0x5a/0x120
> [   56.394627]  ? __pfx_xfs_fs_put_super+0x10/0x10
> [   56.394829]  generic_shutdown_super+0xac/0x240
> [   56.395030]  kill_block_super+0x43/0x80
> [   56.395207]  deactivate_locked_super+0x52/0xb0
> [   56.395403]  deactivate_super+0xb3/0xd0
> [   56.395574]  cleanup_mnt+0x15e/0x1e0
> [   56.395739]  __cleanup_mnt+0x1f/0x30
> [   56.395902]  task_work_run+0xb6/0x120
> [   56.396071]  exit_to_user_mode_prepare+0x200/0x210
> [   56.396282]  syscall_exit_to_user_mode+0x2d/0x60
> [   56.396487]  do_syscall_64+0x4a/0x90
> [   56.396648]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [   56.396865] RIP: 0033:0x7fc887e3c7db
> [   56.397020] Code: 96 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 95 96 0c 00 f7 d8 64 89 01 48
> [   56.397761] RSP: 002b:00007ffc1a5bd1e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a6
> [   56.398072] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fc887e3c7db
> [   56.398365] RDX: 0000000000000000 RSI: 000000000000000a RDI: 00007ffc1a5bd290
> [   56.398658] RBP: 00007ffc1a5be2d0 R08: 0000000001d0a333 R09: 0000000000000009
> [   56.398951] R10: 0000000000404071 R11: 0000000000000206 R12: 00000000004012c0
> [   56.399244] R13: 00007ffc1a5be410 R14: 0000000000000000 R15: 0000000000000000
> [   56.399546]  </TASK>
> [   56.399792] Kernel Offset: disabled
> [   56.399972] ---[ end Kernel panic - not syncing: softlockup: hung tasks ]---
> "
> 
> I bisected using the keyword "__cleanup_mnt". Note that the dmesg logs of
> commits marked "bisect good" still contained other BUG info.
> 
> Anyway, here is the bad commit found by the bisect, in case it's helpful:
> "
> bd4f5d09cc93c8ca51e4efea86ac90a4bb553d6e
> xfs: refactor the filestreams allocator pick functions
> "
> It's only a suspected commit: with it reverted on top of v6.3-rc1, the
> "__cleanup_mnt" BUG no longer appeared, but dmesg showed other BUG info.
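
The keyword-based bisect described above can be sketched as a small log classifier. This is a minimal sketch under stated assumptions: the function name classify_dmesg and the choice to skip logs that show some other BUG are hypothetical, not taken from the report. The exit codes 0, 1 and 125 are the ones `git bisect run` treats as good, bad and skip. Skipping commits that fail in a different way is one possible way to avoid marking them "good" when dmesg still shows another BUG.

```shell
#!/bin/sh
# Classify a saved dmesg log for `git bisect run`.
# Returns 1 (bad) when the soft lockup in __cleanup_mnt is present,
# 125 (skip) when the log is missing or shows an unrelated BUG/panic
# (an assumption of this sketch), and 0 (good) otherwise.
classify_dmesg() {
    cd_log="$1"

    # No readable log means this commit could not be tested.
    [ -r "$cd_log" ] || return 125

    # The signature: watchdog soft lockup plus the unmount path.
    if grep -qF 'BUG: soft lockup' "$cd_log" &&
       grep -qF '__cleanup_mnt' "$cd_log"; then
        return 1
    fi

    # Some other failure: do not call this commit good.
    if grep -qE 'BUG:|Kernel panic' "$cd_log"; then
        return 125
    fi

    return 0
}
```

A wrapper script that boots the test kernel, runs repro and saves dmesg to a file could end with `classify_dmesg dmesg.log; exit $?` and be handed to `git bisect run`.
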
> 
> This issue has not been reported in syzbot yet:
> https://syzkaller.appspot.com/upstream
> 
> I also applied the xfs fix patch for the issue below on top of v6.3-rc1:
> https://lore.kernel.org/lkml/20230226223403.GU360264@xxxxxxxxxxxxxxxxxxx/
> The "__cleanup_mnt" BUG still occurred.
> Related dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/230311_022102___cleanup_mnt/v6.3-rc1_xfs_patch_another_cleanup_mnt_issue.log
> So it should be a different issue from the previous xfs-related one.
> 
> ---
> 
> If you don't need the following environment to reproduce the problem or if you
> already have one, please ignore the following information.
> 
> How to reproduce:
> git clone https://gitlab.com/xupengfe/repro_vm_env.git
> cd repro_vm_env
> tar -xvf repro_vm_env.tar.gz
> cd repro_vm_env; ./start3.sh  // it needs qemu-system-x86_64 and I used v7.1.0
>    // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
>    // You could change the bzImage_xxx as you want
> You can use the command below to log in; there is no password for root.
> ssh -p 10023 root@localhost
> 
> After logging in to the VM (virtual machine) successfully, you can build the
> reproducer binary, transfer it to the VM as follows, and reproduce the problem there:
> gcc -pthread -o repro repro.c
> scp -P 10023 repro root@localhost:/root/
> 
> Get the bzImage for the target kernel:
> Please use the target kconfig and copy it to kernel_src/.config
> make olddefconfig
> make -jx bzImage           // x should be less than or equal to the number of CPUs your PC has
> 
> Set the bzImage file in start3.sh above to load the target kernel in the VM.
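
For the `-jx` value above, a minimal sketch of picking a job count that never exceeds the online CPUs. The helper name build_jobs is hypothetical; `nproc` (coreutils) and `getconf _NPROCESSORS_ONLN` are assumed to be available on the build host.

```shell
#!/bin/sh
# Print a job count for `make -j`: the requested value (default: all CPUs)
# clamped to the range 1..number of online CPUs.
build_jobs() {
    bj_cpus=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
    bj_want=${1:-$bj_cpus}

    # Never ask for more jobs than there are CPUs.
    if [ "$bj_want" -gt "$bj_cpus" ]; then
        bj_want=$bj_cpus
    fi
    # Always run at least one job.
    if [ "$bj_want" -lt 1 ]; then
        bj_want=1
    fi
    echo "$bj_want"
}
```

With this in place the build step could be written as `make -j"$(build_jobs)" bzImage`.
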
> 
> 
> Tips:
> If you already have qemu-system-x86_64, please ignore the info below.
> To install qemu v7.1.0:
> git clone https://github.com/qemu/qemu.git
> cd qemu
> git checkout -f v7.1.0
> mkdir build
> cd build
> yum install -y ninja-build.x86_64
> ../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl
> make
> make install
> 
> Thanks!
> BR.
