> Hi Paul,
>
> I've done some more digging into this. I would still really appreciate any advice that folks have on how to root-cause and fix this bug.
>
> I have access to a core file that was taken while the system was suffering from this issue. In that core dump, we can see that the thread in __transport_wait_for_tasks is waiting for the LUN_RESET command to complete. This led me to realize that, in the syslog output, the LUN_RESET message that occurred when the issue first happened is different from the other LUN_RESET commands I see: we never get the "LUN_RESET: TMR for [iblock] Complete" message. That led me to look for the thread that is blocked in processing the LUN_RESET command. That thread's stack trace looks like this:
>
> 0xffff9416b0fa2080 UNINTERRUPTIBLE 4
> __schedule+0x2bd
> ...
> target_put_cmd_and_wait+0x5a
> core_tmr_drain_state_list
> core_tmr_lun_reset+0x4e3
> target_tmr_work+0xd1
> ...
>
> The command *that* thread is waiting for has a t_state of TRANSPORT_WRITE_PENDING, and its transport_state is CMD_T_ABORTED. However, it still has a cmd_kref value of 2, which is why the LUN_RESET command can't proceed. It looks like it's a write command (execute_cmd is sbc_execute_rw and data_direction is DMA_TO_DEVICE). I'm still investigating further to try to understand how this state of affairs could occur. Any insight or information anyone could provide would be greatly appreciated.

5.15 is too old a kernel for iSCSI; plenty of patches have fixed hanging commands there. To begin with, you definitely need this patchset:

https://lore.kernel.org/all/20230319015620.96006-1-michael.christie@xxxxxxxxxx/

BR,
Dmitry