On Fri, 2021-11-05 at 17:10 -0500, Mike Christie wrote: > This fixes a regression added with: > > commit f0f82e2476f6 ("scsi: core: Fix capacity set to zero after > offlinining device") > > The problem is that after iSCSI recovery, iscsid will call into the > kernel > to set the dev's state to running, and with that patch we now call > scsi_rescan_device with the state_mutex held. If the scsi error > handler > thread is just starting to test the device in scsi_send_eh_cmnd then > it's > going to try to grab the state_mutex. > > We are then stuck, because when scsi_rescan_device tries to send its > IO > scsi_queue_rq calls -> scsi_host_queue_ready -> scsi_host_in_recovery > which will return true (the host state is still in recovery) and IO > will > just be requeued. scsi_send_eh_cmnd will then never be able to grab > the > state_mutex to finish error handling. > > To prevent the deadlock this moves the rescan related code to after > we > drop the state_mutex. > > This also adds a check for if we are already in the running state. > This > prevents extra scans and helps the iscsid case where if the transport > class has already onlined the device during it's recovery process > then we > don't need userspace to do it again plus possibly block that daemon. > > Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after > offlinining device") > Cc: Bart Van Assche <bvanassche@xxxxxxx> > Cc: lijinlin <lijinlin3@xxxxxxxxxx> > Cc: Wu Bo <wubo40@xxxxxxxxxx> > Signed-off-by: Mike Christie <michael.christie@xxxxxxxxxx> > --- > drivers/scsi/scsi_sysfs.c | 30 +++++++++++++++++++----------- > 1 file changed, 19 insertions(+), 11 deletions(-) > > diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c > index a35841b34bfd..53e23a7bc0d3 100644 > --- a/drivers/scsi/scsi_sysfs.c > +++ b/drivers/scsi/scsi_sysfs.c > @@ -797,6 +797,7 @@ store_state_field(struct device *dev, struct > device_attribute *attr, > int i, ret; > struct scsi_device *sdev = to_scsi_device(dev); > enum scsi_device_state state = 0; > + bool rescan_dev = false; > > for (i = 0; i < ARRAY_SIZE(sdev_states); i++) { > const int len = strlen(sdev_states[i].name); > @@ -815,20 +816,27 @@ store_state_field(struct device *dev, struct > device_attribute *attr, > } > > mutex_lock(&sdev->state_mutex); > - ret = scsi_device_set_state(sdev, state); > - /* > - * If the device state changes to SDEV_RUNNING, we need to > - * run the queue to avoid I/O hang, and rescan the device > - * to revalidate it. Running the queue first is necessary > - * because another thread may be waiting inside > - * blk_mq_freeze_queue_wait() and because that call may be > - * waiting for pending I/O to finish. > - */ > - if (ret == 0 && state == SDEV_RUNNING) { > + if (sdev->sdev_state == SDEV_RUNNING && state == SDEV_RUNNING) > { > + ret = count; This looks wrong because of this [...] > return ret == 0 ? count : -EINVAL; Don't we now return EINVAL on idempotent set state running because the count is always non-zero? I think the first statement should be 'ret = 0;' instead to cause idempotent state setting to succeed as a nop. James