Hi, On 2023-09-18 18:07:54 -0400, Theodore Ts'o wrote: > On Mon, Sep 18, 2023 at 08:13:30PM +0530, Shreeya Patel wrote: > > When I tried to reproduce this issue on mainline linux kernel using the > > reproducer provided by syzbot, I see an endless loop of following errors :- > > > > [ 89.689751][ T3922] loop1: detected capacity change from 0 to 512 > > [ 89.690514][ T3916] EXT4-fs error (device loop4): ext4_map_blocks:577: > > inode #3: block 9: comm poc: lblock 0 mapped to illegal pblock 9 (length 1) > > [ 89.694709][ T3890] EXT4-fs error (device loop0): ext4_map_blocks:577: > > Please note that maliciously corrupted file system is considered low > priority by ext4 developers. FWIW, I am seeing occasional hangs in ext4_fallocate with 6.6-rc4 as well, just doing database development on my laptop. Oct 03 06:48:30 alap5 kernel: INFO: task postgres:132485 blocked for more than 122 seconds. Oct 03 06:48:30 alap5 kernel: Not tainted 6.6.0-rc4-andres-00009-ge38f6d0c8360 #74 Oct 03 06:48:30 alap5 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Oct 03 06:48:30 alap5 kernel: task:postgres state:D stack:0 pid:132485 ppid:132017 flags:0x00020006 Oct 03 06:48:30 alap5 kernel: Call Trace: Oct 03 06:48:30 alap5 kernel: <TASK> Oct 03 06:48:30 alap5 kernel: __schedule+0x388/0x13e0 Oct 03 06:48:30 alap5 kernel: ? nvme_queue_rqs+0x1e6/0x290 Oct 03 06:48:30 alap5 kernel: schedule+0x5f/0xe0 Oct 03 06:48:30 alap5 kernel: inode_dio_wait+0xd5/0x100 Oct 03 06:48:30 alap5 kernel: ? membarrier_register_private_expedited+0xa0/0xa0 Oct 03 06:48:30 alap5 kernel: ext4_fallocate+0x12f/0x1040 Oct 03 06:48:30 alap5 kernel: ? io_submit_sqes+0x392/0x660 Oct 03 06:48:30 alap5 kernel: vfs_fallocate+0x135/0x360 Oct 03 06:48:30 alap5 kernel: __x64_sys_fallocate+0x42/0x70 Oct 03 06:48:30 alap5 kernel: do_syscall_64+0x38/0x80 Oct 03 06:48:30 alap5 kernel: entry_SYSCALL_64_after_hwframe+0x46/0xb0 Oct 03 06:48:30 alap5 kernel: RIP: 0033:0x7fda32415f82 Oct 03 06:48:30 alap5 kernel: RSP: 002b:00007ffedbea0828 EFLAGS: 00000246 ORIG_RAX: 000000000000011d Oct 03 06:48:30 alap5 kernel: RAX: ffffffffffffffda RBX: 00000000002a6000 RCX: 00007fda32415f82 Oct 03 06:48:30 alap5 kernel: RDX: 0000000014b22000 RSI: 0000000000000000 RDI: 0000000000000016 Oct 03 06:48:30 alap5 kernel: RBP: 0000000014b22000 R08: 0000000014b22000 R09: 0000558f70eabed0 Oct 03 06:48:30 alap5 kernel: R10: 00000000002a6000 R11: 0000000000000246 R12: 00000000000000e0 Oct 03 06:48:30 alap5 kernel: R13: 000000000a000013 R14: 0000000000000004 R15: 0000558f709d4e70 Oct 03 06:48:30 alap5 kernel: </TASK> cat /proc/132485/stack [<0>] inode_dio_wait+0xd5/0x100 [<0>] ext4_fallocate+0x12f/0x1040 [<0>] vfs_fallocate+0x135/0x360 [<0>] __x64_sys_fallocate+0x42/0x70 [<0>] do_syscall_64+0x38/0x80 [<0>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 cat /sys/kernel/debug/block/nvme1n1/hctx*/busy doesn't show any active IO. Nor does the issue resolve when resetting the controller - so I don't think this is a disk/block level issue. This is on 8f1b4600373f, with one local, unrelated, commit ontop. Unfortunately, so far, I've only reproduced this every couple hours of interactive work, so bisecting isn't really feasible. I've not hit it on 6.5 over a considerably longer time, so I am reasonably confident this isn't an older issue. Greetings, Andres Freund