Re: [PATCH 2/2] scsi: Fix hang when device state is set via sysfs

Lee Duncan <lduncan@xxxxxxxx> · Sat, 6 Nov 2021 11:17:39 -0700



On 11/5/21 3:10 PM, Mike Christie wrote:
> This fixes a regression added with:
> 
> commit f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
> offlinining device")
> 
> The problem is that after iSCSI recovery, iscsid will call into the kernel
> to set the dev's state to running, and with that patch we now call
> scsi_rescan_device with the state_mutex held. If the scsi error handler
> thread is just starting to test the device in scsi_send_eh_cmnd then it's
> going to try to grab the state_mutex.
> 
> We are then stuck, because when scsi_rescan_device tries to send its IO
> scsi_queue_rq calls -> scsi_host_queue_ready -> scsi_host_in_recovery
> which will return true (the host state is still in recovery) and IO will
> just be requeued. scsi_send_eh_cmnd will then never be able to grab the
> state_mutex to finish error handling.
> 
> To prevent the deadlock this moves the rescan related code to after we
> drop the state_mutex.
> 
> This also adds a check for if we are already in the running state. This
> prevents extra scans and helps the iscsid case where if the transport
> class has already onlined the device during it's recovery process then we
> don't need userspace to do it again plus possibly block that daemon.
> 
> Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after
> offlinining device")
> Cc: Bart Van Assche <bvanassche@xxxxxxx>
> Cc: lijinlin <lijinlin3@xxxxxxxxxx>
> Cc: Wu Bo <wubo40@xxxxxxxxxx>
> Signed-off-by: Mike Christie <michael.christie@xxxxxxxxxx>
> ---
>  drivers/scsi/scsi_sysfs.c | 30 +++++++++++++++++++-----------
>  1 file changed, 19 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index a35841b34bfd..53e23a7bc0d3 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -797,6 +797,7 @@ store_state_field(struct device *dev, struct device_attribute *attr,
>  	int i, ret;
>  	struct scsi_device *sdev = to_scsi_device(dev);
>  	enum scsi_device_state state = 0;
> +	bool rescan_dev = false;
>  
>  	for (i = 0; i < ARRAY_SIZE(sdev_states); i++) {
>  		const int len = strlen(sdev_states[i].name);
> @@ -815,20 +816,27 @@ store_state_field(struct device *dev, struct device_attribute *attr,
>  	}
>  
>  	mutex_lock(&sdev->state_mutex);
> -	ret = scsi_device_set_state(sdev, state);
> -	/*
> -	 * If the device state changes to SDEV_RUNNING, we need to
> -	 * run the queue to avoid I/O hang, and rescan the device
> -	 * to revalidate it. Running the queue first is necessary
> -	 * because another thread may be waiting inside
> -	 * blk_mq_freeze_queue_wait() and because that call may be
> -	 * waiting for pending I/O to finish.
> -	 */
> -	if (ret == 0 && state == SDEV_RUNNING) {
> +	if (sdev->sdev_state == SDEV_RUNNING && state == SDEV_RUNNING) {
> +		ret = count;
> +	} else {
> +		ret = scsi_device_set_state(sdev, state);
> +		if (ret == 0 && state == SDEV_RUNNING)
> +			rescan_dev = true;
> +	}
> +	mutex_unlock(&sdev->state_mutex);
> +
> +	if (rescan_dev) {
> +		/*
> +		 * If the device state changes to SDEV_RUNNING, we need to
> +		 * run the queue to avoid I/O hang, and rescan the device
> +		 * to revalidate it. Running the queue first is necessary
> +		 * because another thread may be waiting inside
> +		 * blk_mq_freeze_queue_wait() and because that call may be
> +		 * waiting for pending I/O to finish.
> +		 */
>  		blk_mq_run_hw_queues(sdev->request_queue, true);
>  		scsi_rescan_device(dev);
>  	}
> -	mutex_unlock(&sdev->state_mutex);
>  
>  	return ret == 0 ? count : -EINVAL;
>  }
> 

Reviewed-by: Lee Duncan <lduncan@xxxxxxxx>