Re: [PATCH] block: Fix a race in the runtime power management code

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2020-08-28 08:37, Alan Stern wrote:
> On Thu, Aug 27, 2020 at 08:27:49PM -0700, Bart Van Assche wrote:
>> On 2020-08-27 13:33, Alan Stern wrote:
>>> It may not need to be that complicated.  what about something like this?
> 
>> I think this patch will break SCSI domain validation. The SCSI domain
>> validation code calls scsi_device_quiesce() and that function in turn calls
>> blk_set_pm_only(). The SCSI domain validation code submits SCSI commands with
>> the BLK_MQ_REQ_PREEMPT flag. Since the above code postpones such requests
>> while blk_set_pm_only() is in effect, I think the above patch will cause the
>> SCSI domain validation code to deadlock.
> 
> Yes, you're right.
> 
> There may be an even simpler solution: Ensure that SCSI domain 
> validation is mutually exclusive with runtime PM.  It's already mutually 
> exclusive with system PM, so this makes sense.
> 
> What do you think of the patch below?
> 
> Alan Stern
> 
> 
> Index: usb-devel/drivers/scsi/scsi_transport_spi.c
> ===================================================================
> --- usb-devel.orig/drivers/scsi/scsi_transport_spi.c
> +++ usb-devel/drivers/scsi/scsi_transport_spi.c
> @@ -1001,7 +1001,7 @@ spi_dv_device(struct scsi_device *sdev)
>  	 * Because this function and the power management code both call
>  	 * scsi_device_quiesce(), it is not safe to perform domain validation
>  	 * while suspend or resume is in progress. Hence the
> -	 * lock/unlock_system_sleep() calls.
> +	 * lock/unlock_system_sleep() and scsi_autopm_get/put_device() calls.
>  	 */
>  	lock_system_sleep();
>  
> @@ -1018,10 +1018,13 @@ spi_dv_device(struct scsi_device *sdev)
>  	if (unlikely(!buffer))
>  		goto out_put;
>  
> +	if (scsi_autopm_get_device(sdev))
> +		goto out_free;
> +
>  	/* We need to verify that the actual device will quiesce; the
>  	 * later target quiesce is just a nice to have */
>  	if (unlikely(scsi_device_quiesce(sdev)))
> -		goto out_free;
> +		goto out_autopm_put;
>  
>  	scsi_target_quiesce(starget);
>  
> @@ -1041,6 +1044,8 @@ spi_dv_device(struct scsi_device *sdev)
>  
>  	spi_initial_dv(starget) = 1;
>  
> + out_autopm_put:
> +	scsi_autopm_put_device(sdev);
>   out_free:
>  	kfree(buffer);
>   out_put:

Hi Alan,

I think this is only a part of the solution. scsi_target_quiesce() invokes
scsi_device_quiesce() and that function in turn calls blk_set_pm_only(). So
I think that the above is not sufficient to fix the deadlock mentioned in
my previous email.

Maybe it is possible to fix this by creating a new request queue and by
submitting the DV SCSI commands to that new request queue. There may be
better solutions.

Bart.



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux