On 11/19/21 7:35 AM, James Bottomley wrote: > On Fri, 2021-11-05 at 17:10 -0500, Mike Christie wrote: >> This fixes a regression added with: >> >> commit f0f82e2476f6 ("scsi: core: Fix capacity set to zero after >> offlinining device") >> >> The problem is that after iSCSI recovery, iscsid will call into the >> kernel >> to set the dev's state to running, and with that patch we now call >> scsi_rescan_device with the state_mutex held. If the scsi error >> handler >> thread is just starting to test the device in scsi_send_eh_cmnd then >> it's >> going to try to grab the state_mutex. >> >> We are then stuck, because when scsi_rescan_device tries to send its >> IO >> scsi_queue_rq calls -> scsi_host_queue_ready -> scsi_host_in_recovery >> which will return true (the host state is still in recovery) and IO >> will >> just be requeued. scsi_send_eh_cmnd will then never be able to grab >> the >> state_mutex to finish error handling. >> >> To prevent the deadlock this moves the rescan related code to after >> we >> drop the state_mutex. >> >> This also adds a check for if we are already in the running state. >> This >> prevents extra scans and helps the iscsid case where if the transport >> class has already onlined the device during it's recovery process >> then we >> don't need userspace to do it again plus possibly block that daemon. >> >> Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after >> offlinining device") >> Cc: Bart Van Assche <bvanassche@xxxxxxx> >> Cc: lijinlin <lijinlin3@xxxxxxxxxx> >> Cc: Wu Bo <wubo40@xxxxxxxxxx> >> Signed-off-by: Mike Christie <michael.christie@xxxxxxxxxx> >> --- >> drivers/scsi/scsi_sysfs.c | 30 +++++++++++++++++++----------- >> 1 file changed, 19 insertions(+), 11 deletions(-) >> >> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c >> index a35841b34bfd..53e23a7bc0d3 100644 >> --- a/drivers/scsi/scsi_sysfs.c >> +++ b/drivers/scsi/scsi_sysfs.c >> @@ -797,6 +797,7 @@ store_state_field(struct device *dev, struct >> device_attribute *attr, >> int i, ret; >> struct scsi_device *sdev = to_scsi_device(dev); >> enum scsi_device_state state = 0; >> + bool rescan_dev = false; >> >> for (i = 0; i < ARRAY_SIZE(sdev_states); i++) { >> const int len = strlen(sdev_states[i].name); >> @@ -815,20 +816,27 @@ store_state_field(struct device *dev, struct >> device_attribute *attr, >> } >> >> mutex_lock(&sdev->state_mutex); >> - ret = scsi_device_set_state(sdev, state); >> - /* >> - * If the device state changes to SDEV_RUNNING, we need to >> - * run the queue to avoid I/O hang, and rescan the device >> - * to revalidate it. Running the queue first is necessary >> - * because another thread may be waiting inside >> - * blk_mq_freeze_queue_wait() and because that call may be >> - * waiting for pending I/O to finish. >> - */ >> - if (ret == 0 && state == SDEV_RUNNING) { >> + if (sdev->sdev_state == SDEV_RUNNING && state == SDEV_RUNNING) >> { >> + ret = count; > > This looks wrong because of this > > [...] >> return ret == 0 ? count : -EINVAL; > > Don't we now return EINVAL on idempotent set state running because the > count is always non-zero? > > I think the first statement should be 'ret = 0;' instead to cause > idempotent state setting to succeed as a nop. > You're right. I'll resend this patchset with James's comment handled.