[Bug 217224] New: [Syzkaller & bisect] There is "xfs_btree_lookup_get_block" general protection BUG in v6.3-rc3 kernel

https://bugzilla.kernel.org/show_bug.cgi?id=217224

            Bug ID: 217224
           Summary: [Syzkaller & bisect] There is
                    "xfs_btree_lookup_get_block" general protection BUG in
                    v6.3-rc3 kernel
           Product: File System
           Version: 2.5
    Kernel Version: v6.3-rc3
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: XFS
          Assignee: filesystem_xfs@xxxxxxxxxxxxxxxxxxxxxx
          Reporter: pengfei.xu@xxxxxxxxx
        Regression: No

Platform: x86 platforms
There is "xfs_btree_lookup_get_block" general protection BUG in v6.3-rc3
kernel.

All detailed info:
https://github.com/xupengfe/syzkaller_logs/tree/main/230321_082343_xfs_btree_lookup_get_block
Reproducer code:
https://github.com/xupengfe/syzkaller_logs/blob/main/230321_082343_xfs_btree_lookup_get_block/repro.c
Kconfig:
https://github.com/xupengfe/syzkaller_logs/blob/main/230321_082343_xfs_btree_lookup_get_block/kconfig_origin
v6.3-rc3 issue log:
https://github.com/xupengfe/syzkaller_logs/blob/main/230321_082343_xfs_btree_lookup_get_block/e8d018dd0257f744ca50a729e3d042cf2ec9da65_dmesg.log
Bisect info:
https://github.com/xupengfe/syzkaller_logs/blob/main/230321_082343_xfs_btree_lookup_get_block/bisect_info.log

Bisected and found the bad commit:
"
7993f1a431bc5271369d359941485a9340658ac3
xfs: only run COW extent recovery when there are no live extents
"
This is only a suspected commit: reverting it on top of v6.3-rc3 produced a
kernel that failed, so the issue could not be double-confirmed with a reverted
kernel.
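For reference, the revert-based verification amounts to something like the
following (a sketch using the commit id and base tag above):
git checkout v6.3-rc3
git revert 7993f1a431bc5271369d359941485a9340658ac3
   // then rebuild the kernel and re-run the reproducer against it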

"
[   29.020016] XFS (loop3): Error -5 reserving per-AG metadata reserve pool.
[   29.022919] BUG: kernel NULL pointer dereference, address: 000000000000022b
[   29.023777] #PF: supervisor read access in kernel mode
[   29.024413] #PF: error_code(0x0000) - not-present page
[   29.025081] PGD 12947067 P4D 12947067 PUD 12976067 PMD 0
[   29.025825] Oops: 0000 [#1] PREEMPT SMP NOPTI
[   29.026465] CPU: 0 PID: 544 Comm: repro Not tainted 6.3.0-rc3-e8d018dd0257+
#1
[   29.026826] XFS (loop5): Starting recovery (logdev: internal)
[   29.027468] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   29.029009] XFS (loop5): Metadata corruption detected at
xfs_btree_lookup_get_block+0x27a/0x300, xfs_refcountbt block 0x28
[   29.029843] RIP: 0010:xfs_btree_lookup_get_block+0xc4/0x300
[   29.031365] XFS (loop5): Unmount and run xfs_repair
[   29.032089] Code: ff ff 31 ff 41 89 c7 89 c6 e8 48 3d 8a ff 45 85 ff 0f 85
1d 01 00 00 e8 5a 3b 8a ff 4c 8b 75 c0 4d 85 f6 74 37 e8 4c 3b 8a ff <49> 8b 96
28 02 00 00 48 8b 4d c8 48 8b 12 48 89 cf 48 89 4d b0 48
[   29.032779] XFS (loop5): Failed to recover leftover CoW staging extents, err
-117.
[   29.035256] RSP: 0018:ffffc9000108b910 EFLAGS: 00010246
[   29.035267] RAX: 0000000000000000 RBX: ffff888013f10000 RCX:
ffffffff81a32768
[   29.036298] XFS (loop5): Filesystem has been shut down due to log error
(0x2).
[   29.037030] RDX: 0000000000000000 RSI: ffff888013eba340 RDI:
0000000000000002
[   29.037994] XFS (loop5): Please unmount the filesystem and rectify the
problem(s).
[   29.038973] RBP: ffffc9000108b968 R08: ffffc9000108bb88 R09:
ffff888013f10000
[   29.038982] R10: 0000000000000000 R11: 0000000000000001 R12:
0000000000000007
[   29.038989] R13: ffffc9000108b9a8 R14: 0000000000000003 R15:
0000000000000000
[   29.038997] FS:  00007f44ba73a740(0000) GS:ffff88807dc00000(0000)
knlGS:0000000000000000
[   29.041082] XFS (loop7): Starting recovery (logdev: internal)
[   29.041985] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   29.043907] XFS (loop7): Metadata corruption detected at
xfs_btree_lookup_get_block+0x27a/0x300, xfs_refcountbt block 0x28
[   29.043955] CR2: 000000000000022b CR3: 000000000e56e003 CR4:
0000000000770ef0
[   29.045085] XFS (loop7): Unmount and run xfs_repair
[   29.045870] PKRU: 55555554
[   29.046689] XFS (loop7): Failed to recover leftover CoW staging extents, err
-117.
[   29.048110] Call Trace:
[   29.049079] XFS (loop7): Filesystem has been shut down due to log error
(0x2).
[   29.049753]  <TASK>
[   29.050160] XFS (loop7): Please unmount the filesystem and rectify the
problem(s).
[   29.051195]  xfs_btree_lookup+0xfe/0x800
[   29.053556] XFS (loop1): Metadata corruption detected at
xfs_btree_lookup_get_block+0x27a/0x300, xfs_refcountbt block 0x28
[   29.053882]  ? __this_cpu_preempt_check+0x20/0x30
[   29.054487] XFS (loop1): Unmount and run xfs_repair
[   29.055921]  ? __pfx_xfs_refcount_recover_extent+0x10/0x10
[   29.056575] XFS (loop1): Failed to recover leftover CoW staging extents, err
-117.
[   29.057249]  ? __pfx_xfs_refcount_recover_extent+0x10/0x10
[   29.057993] XFS (loop1): Filesystem has been shut down due to log error
(0x2).
[   29.059020]  xfs_btree_simple_query_range+0x54/0x280
[   29.059038]  ? write_comp_data+0x2f/0x90
[   29.059784] XFS (loop1): Please unmount the filesystem and rectify the
problem(s).
[   29.060778]  ? __pfx_xfs_refcount_recover_extent+0x10/0x10
[   29.062080] XFS (loop4): Metadata corruption detected at
xfs_btree_lookup_get_block+0x27a/0x300, xfs_refcountbt block 0x28
[   29.063021]  xfs_btree_query_range+0x18a/0x1a0
[   29.063773] XFS (loop4): Unmount and run xfs_repair
[   29.065266]  ? xfs_refcountbt_init_common+0x3b/0x90
[   29.065891] XFS (loop4): Failed to recover leftover CoW staging extents, err
-117.
[   29.066560]  xfs_refcount_recover_cow_leftovers+0x18c/0x4a0
[   29.066583]  ? xfs_perag_grab+0x143/0x340
[   29.067266] XFS (loop4): Filesystem has been shut down due to log error
(0x2).
[   29.068299]  xfs_reflink_recover_cow+0x79/0xf0
[   29.069063] XFS (loop4): Please unmount the filesystem and rectify the
problem(s).
[   29.069623]  xlog_recover_finish+0x136/0x420
[   29.071179] XFS (loop7): Ending recovery (logdev: internal)
[   29.071227]  ? queue_delayed_work_on+0x9f/0xf0
[   29.072328] XFS (loop7): Error -5 reserving per-AG metadata reserve pool.
[   29.072876]  xfs_log_mount_finish+0x187/0x1d0
[   29.073692] XFS (loop1): Ending recovery (logdev: internal)
[   29.074252]  xfs_mountfs+0x76e/0xce0
[   29.074271]  xfs_fs_fill_super+0x7aa/0xdc0
[   29.075574] XFS (loop1): Error -5 reserving per-AG metadata reserve pool.
[   29.075809]  get_tree_bdev+0x24b/0x350
[   29.076634] XFS (loop5): Ending recovery (logdev: internal)
[   29.077084]  ? __pfx_xfs_fs_fill_super+0x10/0x10
[   29.077676] XFS (loop5): Error -5 reserving per-AG metadata reserve pool.
[   29.078560]  xfs_fs_get_tree+0x25/0x30
[   29.078581]  vfs_get_tree+0x3b/0x140
[   29.079464] XFS (loop4): Ending recovery (logdev: internal)
[   29.079871]  path_mount+0x769/0x10f0
[   29.080533] XFS (loop4): Error -5 reserving per-AG metadata reserve pool.
[   29.081442]  ? write_comp_data+0x2f/0x90
[   29.085330]  do_mount+0xaf/0xd0
[   29.085811] XFS (loop2): Starting recovery (logdev: internal)
[   29.085825]  __x64_sys_mount+0x14b/0x160
[   29.087213]  do_syscall_64+0x3b/0x90
[   29.087534] XFS (loop2): Metadata corruption detected at
xfs_btree_lookup_get_block+0x27a/0x300, xfs_refcountbt block 0x28
[   29.087758]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[   29.089249] XFS (loop2): Unmount and run xfs_repair
[   29.089936] RIP: 0033:0x7f44ba8673ae
[   29.090639] XFS (loop2): Failed to recover leftover CoW staging extents, err
-117.
[   29.091112] Code: 48 8b 0d f5 8a 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e
0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01
f0 ff ff 73 01 c3 48 8b 0d c2 8a 0c 00 f7 d8 64 89 01 48
[   29.092123] XFS (loop2): Filesystem has been shut down due to log error
(0x2).
[   29.094617] RSP: 002b:00007fffeaff1b78 EFLAGS: 00000206 ORIG_RAX:
00000000000000a5
[   29.094632] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
00007f44ba8673ae
[   29.094640] RDX: 0000000020009580 RSI: 00000000200095c0 RDI:
00007fffeaff1cb0
[   29.095630] XFS (loop2): Please unmount the filesystem and rectify the
problem(s).
[   29.096664] RBP: 00007fffeaff1d40 R08: 00007fffeaff1bb0 R09:
0000000000000000
[   29.099193] XFS (loop2): Ending recovery (logdev: internal)
[   29.099660] R10: 0000000000000800 R11: 0000000000000206 R12:
0000000000401260
[   29.100702] XFS (loop2): Error -5 reserving per-AG metadata reserve pool.
[   29.101426] R13: 00007fffeaff1e80 R14: 0000000000000000 R15:
0000000000000000
[   29.104290]  </TASK>
[   29.104623] Modules linked in:
[   29.105081] CR2: 000000000000022b
[   29.105560] ---[ end trace 0000000000000000 ]---
[   29.106207] RIP: 0010:xfs_btree_lookup_get_block+0xc4/0x300
[   29.106985] Code: ff ff 31 ff 41 89 c7 89 c6 e8 48 3d 8a ff 45 85 ff 0f 85
1d 01 00 00 e8 5a 3b 8a ff 4c 8b 75 c0 4d 85 f6 74 37 e8 4c 3b 8a ff <49> 8b 96
28 02 00 00 48 8b 4d c8 48 8b 12 48 89 cf 48 89 4d b0 48
[   29.109472] RSP: 0018:ffffc9000108b910 EFLAGS: 00010246
[   29.110190] RAX: 0000000000000000 RBX: ffff888013f10000 RCX:
ffffffff81a32768
[   29.111153] RDX: 0000000000000000 RSI: ffff888013eba340 RDI:
0000000000000002
[   29.112114] RBP: ffffc9000108b968 R08: ffffc9000108bb88 R09:
ffff888013f10000
[   29.113072] R10: 0000000000000000 R11: 0000000000000001 R12:
0000000000000007
[   29.114030] R13: ffffc9000108b9a8 R14: 0000000000000003 R15:
0000000000000000
[   29.114989] FS:  00007f44ba73a740(0000) GS:ffff88807dc00000(0000)
knlGS:0000000000000000
[   29.116069] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   29.116860] CR2: 000000000000022b CR3: 000000000e56e003 CR4:
0000000000770ef0
[   29.117831] PKRU: 55555554
[   29.118220] note: repro[544] exited with irqs disabled
[   29.452491] loop0: detected capacity change from 0 to 32768
"

I see a similar issue in this syzbot link:
https://syzkaller.appspot.com/bug?id=e2907149c69cbccae0842eb502b8af4f6fac52a0
but it did not provide bisected commit info because its bisection failed.

I hope the above info is helpful.

Thanks!

---

If you don't need the following environment to reproduce the problem, or if you
already have one, please ignore the information below.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh   // it needs qemu-system-x86_64, and I used v7.1.0
   // start3.sh will load the bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65
   // v6.2-rc5 kernel; you could change the bzImage_xxx as you want.
You could use the command below to log in; there is no password for root:
ssh -p 10023 root@localhost

After logging in to the VM (virtual machine) successfully, you can build the
reproducer, transfer the binary to the VM as below, and reproduce the problem
in the VM:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/
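After copying, the reproducer can then be run inside the VM, for example (a
sketch; the /root/ destination matches the scp command above):
ssh -p 10023 root@localhost
cd /root && ./repro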

Get the bzImage for the target kernel:
Copy the target kconfig to kernel_src/.config, then:
make olddefconfig
make -jx bzImage           // x should be equal to or less than the number of CPUs your PC has

Point start3.sh at the new bzImage file to load the target kernel in the VM.
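One way to do this (a sketch; it assumes the kernel tree is kernel_src and
simply overwrites the image name that start3.sh already references):
cp kernel_src/arch/x86/boot/bzImage ./bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65
   // run this from the directory containing start3.sh, or edit start3.sh to
   // point at your new bzImage path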


Tips:
If you already have qemu-system-x86_64, please ignore the info below.
If you want to install qemu v7.1.0:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl
make
make install

Thanks!

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.


