Re: [Syzkaller & bisect] There is "soft lockup in __cleanup_mnt" in v6.4-rc3 kernel

Eric Sandeen <sandeen@xxxxxxxxxxx> · Thu, 25 May 2023 09:17:57 -0500

On 5/25/23 12:44 AM, Pengfei Xu wrote:
On 2023-05-24 at 22:51:27 -0500, Eric Sandeen wrote:
On 5/24/23 9:59 PM, Pengfei Xu wrote:
Hi Dave,

Greetings!

Platform: Alder lake
There is "soft lockup in __cleanup_mnt" in v6.4-rc3 kernel.

Syzkaller analysis repro.report and bisect detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230524_140757___cleanup_mnt
Guest machine info: https://github.com/xupengfe/syzkaller_logs/blob/main/230524_140757___cleanup_mnt/machineInfo0
Reproducer code: https://github.com/xupengfe/syzkaller_logs/blob/main/230524_140757___cleanup_mnt/repro.c
Reproducer syscalls: https://github.com/xupengfe/syzkaller_logs/blob/main/230524_140757___cleanup_mnt/repro.prog
Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230524_140757___cleanup_mnt/bisect_info.log
Kconfig origin: https://github.com/xupengfe/syzkaller_logs/blob/main/230524_140757___cleanup_mnt/kconfig_origin

There was a lot of discussion yesterday about how turning the crank on
syzkaller and throwing un-triaged bug reports over the wall at stressed-out
xfs developers isn't particularly helpful.

There was also a very specific concern raised in that discussion:

IOWs, the bug report is deficient and not complete, and so I'm
forced to spend unnecessary time trying to work out how to extract
the filesystem image from a weird syzkaller report that is basically
just a bunch of undocumented blobs in a github tree.

but here we are again, with another undocumented blob in a github tree, and
no meaningful attempt at triage.

Syzbot at least is now providing filesystem images[1], which relieves some
of the burden on the filesystem developers you're expecting to fix these
bugs.

Perhaps before you send the /next/ filesystem-related syzkaller report, you
can at least work out how to provide a standard filesystem image as part of
the reproducer, one that can be examined with normal filesystem development
and debugging tools?

A standard filesystem image is available after running:

git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz

The image is named centos8_3.img and is booted by start3.sh.

Honestly, this suggests to me that you don't really have much 
understanding at all about the bugs you're reporting.

A v6.4-rc3 bzImage is available at: https://github.com/xupengfe/syzkaller_logs/blob/main/230524_140757___cleanup_mnt/bzImage_v64rc3
You can use it to boot the v6.4-rc3 kernel.

./start3.sh  // needs qemu-system-x86_64; I used v7.1.0
   // start3.sh loads bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 (the v6.2-rc5 kernel)
   // Change bzImage_xxx to whichever kernel image you want
You can log in with the command below; root has no password.
ssh -p 10023 root@localhost

Once you can log in to the VM (virtual machine), build the reproducer and
copy it to the VM as follows, then run it there to reproduce the problem:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/

You can then reproduce this issue easily in the environment above.
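
For reference, the commands quoted above can be collected into one shell
sketch. This is only an illustration and not a script from the thread: the
function names (run, fetch_env, boot_vm, push_repro, login_vm) and the DRY_RUN
switch are hypothetical, it assumes repro.c has already been downloaded into
the current directory, and the thread does not say whether start3.sh returns
or stays in the foreground, so each step is kept as a separate function. With
the default DRY_RUN=1 the functions only print the commands.

```shell
#!/bin/sh
# Sketch only: wraps the commands quoted in the thread.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
SSH_PORT=10023   # port used by the ssh/scp commands in the thread

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Fetch and unpack the VM environment (contains centos8_3.img and start3.sh).
fetch_env() {
    run git clone https://gitlab.com/xupengfe/repro_vm_env.git
    run cd repro_vm_env
    run tar -xvf repro_vm_env.tar.gz
}

# Boot the guest; needs qemu-system-x86_64 on the host.
boot_vm() {
    run ./start3.sh
}

# Build the reproducer on the host and copy it into the guest.
# Assumption: repro.c is present in the current directory.
push_repro() {
    run gcc -pthread -o repro repro.c
    run scp -P "$SSH_PORT" repro root@localhost:/root/
}

# Log in to the guest as root (no password, per the thread).
login_vm() {
    run ssh -p "$SSH_PORT" root@localhost
}
```

Calling fetch_env, boot_vm, push_repro and login_vm in that order mirrors the
sequence described above; set DRY_RUN=0 to actually execute the commands.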

You seem to be suggesting that the xfs developers should go do /more 
work/ to get to the bare minimum of a decent fuzzed filesystem bug 
report, instead of you doing a little bit of prep work yourself by 
providing the fuzzed filesystem image itself?

Your github account says you are "looking to collaborate on Linux kernel 
learning" - tossing auto-generated and difficult-to-triage bug reports 
at other developers is not collaboration. Wouldn't it be more 
interesting to take the time to understand the reports you're 
generating, find ways to make them more accessible/debuggable, and/or 
take some time to look into the problems yourself, in order to learn 
about the code you're turning the crank on?

Thanks!
BR.

[1]
https://lore.kernel.org/lkml/0000000000001f239205fb969174@xxxxxxxxxx/T/