I tried the latest linux-block/for-next branch tip (git hash b43fadb6631f) and
observed a process hang during a blktests block/005 run on an NVMe device. The
kernel reported "INFO: task check:1224 blocked for more than 122 seconds."
together with the call trace in [1]. So far, the hang is 100% reproducible on
my system. It is not observed with HDDs or null_blk devices. I bisected and
found that commit 4f5022453acd ("nvme: wire up completion batching for the IRQ
path") triggers the hang. When I revert this commit from the for-next branch
tip, the hang disappears. The block/005 test case switches the IO scheduler
during IO, and the completion path change made by that commit appears to
affect the scheduler switch. Comments toward a solution would be appreciated.
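For context, block/005 keeps the device busy with fio while repeatedly
rewriting the scheduler attribute in sysfs. A rough sketch of that pattern
(not the actual blktests script; the device name is a placeholder for the
test device) looks like this:

  #!/bin/bash
  # Rough approximation of the block/005 pattern: keep I/O in flight
  # with fio while cycling through the schedulers the device advertises.
  dev=/dev/nvme0n1                                  # placeholder test device
  sched_attr=/sys/block/$(basename "$dev")/queue/scheduler

  fio --name=io --filename="$dev" --rw=randread --ioengine=libaio \
      --iodepth=32 --direct=1 --time_based --runtime=30 &

  for sched in $(tr -d '[]' < "$sched_attr"); do    # strip [] around the active one
      echo "$sched" > "$sched_attr"
      sleep 1
  done
  wait

The write to the scheduler attribute is what hangs: as the trace in [1] shows,
queue_attr_store -> elv_iosched_store -> elevator_switch freezes the queue and
then waits forever in blk_mq_freeze_queue_wait().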
[1]

[ 121.677521] run blktests block/005 at 2021-11-01 16:24:21
[ 371.819391] INFO: task check:1224 blocked for more than 122 seconds.
[ 371.826660]       Not tainted 5.15.0-rc6+ #41
[ 371.831786] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 371.840349] task:check           state:D stack:    0 pid: 1224 ppid:  1223 flags:0x00000000
[ 371.849471] Call Trace:
[ 371.852680]  __schedule+0x9e2/0x2240
[ 371.857052]  ? io_schedule_timeout+0x190/0x190
[ 371.862330]  ? _raw_spin_unlock_irqrestore+0x37/0x40
[ 371.868070]  schedule+0xdd/0x280
[ 371.872062]  blk_mq_freeze_queue_wait+0xc0/0xf0
[ 371.877351]  ? blk_mq_queue_inflight+0x70/0x70
[ 371.882554]  ? blk_mq_run_hw_queues+0x13b/0x410
[ 371.887856]  ? finish_wait+0x270/0x270
[ 371.892363]  elevator_switch+0x4a/0xa0
[ 371.896857]  elv_iosched_store+0x16e/0x3c0
[ 371.901698]  ? elevator_init_mq+0x3f0/0x3f0
[ 371.906643]  ? lock_is_held_type+0xe0/0x110
[ 371.911556]  ? lock_is_held_type+0xe0/0x110
[ 371.916473]  queue_attr_store+0x8b/0xd0
[ 371.921050]  ? sysfs_file_ops+0x170/0x170
[ 371.925789]  kernfs_fop_write_iter+0x2c7/0x460
[ 371.930965]  new_sync_write+0x359/0x5e0
[ 371.935533]  ? new_sync_read+0x5d0/0x5d0
[ 371.940177]  ? ksys_write+0xe9/0x1b0
[ 371.944483]  ? lock_release+0x690/0x690
[ 371.949046]  ? __cond_resched+0x15/0x30
[ 371.953604]  ? inode_security+0x56/0xf0
[ 371.958171]  ? lock_is_held_type+0xe0/0x110
[ 371.963075]  vfs_write+0x5e4/0x8e0
[ 371.967196]  ksys_write+0xe9/0x1b0
[ 371.971321]  ? __ia32_sys_read+0xa0/0xa0
[ 371.975962]  ? syscall_enter_from_user_mode+0x21/0x70
[ 371.981733]  do_syscall_64+0x3b/0x90
[ 371.986028]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 371.991799] RIP: 0033:0x7f24adda7387
[ 371.996085] RSP: 002b:00007ffd9c80f2e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 372.004366] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007f24adda7387
[ 372.012213] RDX: 0000000000000005 RSI: 0000562156826120 RDI: 0000000000000001
[ 372.020060] RBP: 0000562156826120 R08: 000000000000000a R09: 00007f24ade3d4e0
[ 372.027901] R10: 00007f24ade3d3e0 R11: 0000000000000246 R12: 0000000000000005
[ 372.035746] R13: 00007f24ade7a520 R14: 0000000000000005 R15: 00007f24ade7a700
[ 372.043629]
              Showing all locks held in the system:
[ 372.051235] 1 lock held by khungtaskd/62:
[ 372.055960]  #0: ffffffffa64483c0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x269
[ 372.065649] 1 lock held by systemd-journal/584:
[ 372.070902] 4 locks held by check/1224:
[ 372.075452]  #0: ffff888118fc4460 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0xe9/0x1b0
[ 372.084177]  #1: ffff888120717088 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x216/0x460
[ 372.093771]  #2: ffff8881480a0a00 (kn->active#131){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x239/0x460
[ 372.103707]  #3: ffff888148d66858 (&q->sysfs_lock){+.+.}-{3:3}, at: queue_attr_store+0x5d/0xd0
[ 372.115258] =============================================

--
Best Regards,
Shin'ichiro Kawasaki