On 2023/9/22 23:23, Bart Van Assche wrote:
> On 9/22/23 02:36, Wenchao Hao wrote:
>> SDEV_CANCEL is set when removing a device, and scsi_device_online()
>> should return false if sdev_state is SDEV_CANCEL.
>>
>> An IO hang is caused if it returns true while the state is SDEV_CANCEL,
>> with the following order:
>>
>> T1:                                       T2:scsi_error_handler
>> __scsi_remove_device()
>>   scsi_device_set_state(sdev, SDEV_CANCEL)
>>                                           scsi_eh_flush_done_q()
>>                                             if (scsi_device_online(sdev))
>>                                               scsi_queue_insert(scmd,...)
>>
>> The command added by scsi_queue_insert() would never be handled any
>> more.
>
> Why not? I think the blk_mq_destroy_queue() call in __scsi_remove_device() will cause it to fail.
>
> Thanks,
>
> Bart.

Sorry, I did not describe this in detail. __scsi_remove_device() would be
blocked in blk_mq_freeze_queue_wait(), waiting for all block requests to
finish, so blk_mq_destroy_queue() would never be called, and the task that
tries to remove the scsi_device would hang.
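
To make the ordering above concrete, here is a minimal userspace sketch of
the check under discussion. It is a model only, not the kernel code: the
MODEL_* enum and the model_* helpers are hypothetical names, the set of
states tested by the old check (OFFLINE, TRANSPORT_OFFLINE, DEL) is my
assumption about the current scsi_device_online(), and the real condition in
scsi_eh_flush_done_q() has further retry checks that are left out here.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a few device states (not the kernel enum). */
enum sdev_state_model {
	MODEL_SDEV_RUNNING,
	MODEL_SDEV_CANCEL,
	MODEL_SDEV_DEL,
	MODEL_SDEV_OFFLINE,
	MODEL_SDEV_TRANSPORT_OFFLINE,
};

/* Assumed behaviour before the change: CANCEL still counts as online. */
static bool model_online_old(enum sdev_state_model s)
{
	return s != MODEL_SDEV_OFFLINE &&
	       s != MODEL_SDEV_TRANSPORT_OFFLINE &&
	       s != MODEL_SDEV_DEL;
}

/* Proposed behaviour: a device in CANCEL is also reported as not online. */
static bool model_online_new(enum sdev_state_model s)
{
	return model_online_old(s) && s != MODEL_SDEV_CANCEL;
}

/*
 * Simplified stand-in for the decision T2 makes when flushing the done
 * queue: the command is requeued only if the device looks online.
 */
static bool model_eh_would_requeue(bool online)
{
	return online;
}
```

With the old check, a command finished by the error handler while the device
is in CANCEL is requeued; with the new check it is not, so nothing new enters
the queue while T1 waits for outstanding requests to finish.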