Re: [PATCH] scsi: sd: Move sd_read_cpr() out of the q->limits_lock region

On Thu, 2024-08-01 at 14:42 +0900, Shin'ichiro Kawasaki wrote:
> Commit 804e498e0496 ("sd: convert to the atomic queue limits API")
> introduced pairs of function calls to queue_limits_start_update() and
> queue_limits_commit_update(). These two functions lock and unlock
> q->limits_lock. In sd_revalidate_disk(), sd_read_cpr() is called after
> the queue_limits_start_update() call and before the
> queue_limits_commit_update() call. sd_read_cpr() locks q->sysfs_dir_lock
> and q->sysfs_lock. This created new lock dependencies between
> q->limits_lock, q->sysfs_dir_lock and q->sysfs_lock, as follows:
> 
> sd_revalidate_disk
>   queue_limits_start_update
>     mutex_lock(&q->limits_lock)
>   sd_read_cpr
>     disk_set_independent_access_ranges
>       mutex_lock(&q->sysfs_dir_lock)
>       mutex_lock(&q->sysfs_lock)
>       mutex_unlock(&q->sysfs_lock)
>       mutex_unlock(&q->sysfs_dir_lock)
>   queue_limits_commit_update
>     mutex_unlock(&q->limits_lock)
> 
> However, the three locks already had reversed dependencies in other
> places. Then the new dependencies triggered the lockdep WARN "possible
> circular locking dependency detected" [1]. This WARN was observed by
> running the blktests test case srp/002.
> 
> To avoid the WARN, move the sd_read_cpr() call in sd_revalidate_disk()
> after the queue_limits_commit_update() call. In other words, move the
> sd_read_cpr() call out of the q->limits_lock region.
> 
> [1] https://lore.kernel.org/linux-scsi/vlmv53ni3ltwxplig5qnw4xsl2h6ccxijfbqzekx76vxoim5a5@dekv7q3es3tx/
> 
> Fixes: 804e498e0496 ("sd: convert to the atomic queue limits API")
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx>
> ---
>  drivers/scsi/sd.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index adeaa8ab9951..08cbe3815006 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3753,7 +3753,6 @@ static int sd_revalidate_disk(struct gendisk *disk)
>  			sd_read_block_limits_ext(sdkp);
>  			sd_read_block_characteristics(sdkp, &lim);
>  			sd_zbc_read_zones(sdkp, &lim, buffer);
> -			sd_read_cpr(sdkp);
>  		}
>  
>  		sd_print_capacity(sdkp, old_capacity);
> @@ -3808,6 +3807,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
>  	if (err)
>  		return err;
>  
> +	/*
> +	 * Query concurrent positioning ranges after
> +	 * queue_limits_commit_update() unlocked q->limits_lock to avoid
> +	 * deadlock with q->sysfs_dir_lock and q->sysfs_lock.
> +	 */
> +	if (sdkp->media_present && scsi_device_supports_vpd(sdp))
> +		sd_read_cpr(sdkp);
> +
>  	/*
>  	 * For a zoned drive, revalidating the zones can be done only once
>  	 * the gendisk capacity is set. So if this fails, set back the gendisk

This seems to do the trick! At least on our setups we're not seeing the
deadlock issue anymore.
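
For anyone reading along, the reordering can be illustrated in userspace.
The sketch below is not kernel code: struct fake_queue and all fake_*
functions are hypothetical stand-ins, with pthread mutexes in place of the
kernel mutexes. It only shows that the old call order takes the sysfs
locks while limits_lock is held, and that the new order does not.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three request_queue mutexes. */
struct fake_queue {
	pthread_mutex_t limits_lock;
	pthread_mutex_t sysfs_dir_lock;
	pthread_mutex_t sysfs_lock;
	/* Set by fake_read_cpr(): was limits_lock held on entry? */
	bool limits_held_in_cpr;
};

static void fake_queue_init(struct fake_queue *q)
{
	/* Default (non-recursive) mutexes, so trylock on a held one fails. */
	pthread_mutex_init(&q->limits_lock, NULL);
	pthread_mutex_init(&q->sysfs_dir_lock, NULL);
	pthread_mutex_init(&q->sysfs_lock, NULL);
	q->limits_held_in_cpr = false;
}

/* Mimics the locking done under sd_read_cpr() in the call chain above. */
static void fake_read_cpr(struct fake_queue *q)
{
	/* Probe limits_lock: EBUSY means some caller already holds it. */
	int ret = pthread_mutex_trylock(&q->limits_lock);

	if (ret == 0)
		pthread_mutex_unlock(&q->limits_lock);
	q->limits_held_in_cpr = (ret == EBUSY);

	pthread_mutex_lock(&q->sysfs_dir_lock);
	pthread_mutex_lock(&q->sysfs_lock);
	pthread_mutex_unlock(&q->sysfs_lock);
	pthread_mutex_unlock(&q->sysfs_dir_lock);
}

/* Old order: sysfs locks nest inside the limits_lock region. */
static void fake_revalidate_old(struct fake_queue *q)
{
	pthread_mutex_lock(&q->limits_lock);	/* start_update */
	fake_read_cpr(q);
	pthread_mutex_unlock(&q->limits_lock);	/* commit_update */
}

/* New order: limits_lock is dropped before the sysfs locks are taken. */
static void fake_revalidate_new(struct fake_queue *q)
{
	pthread_mutex_lock(&q->limits_lock);	/* start_update */
	pthread_mutex_unlock(&q->limits_lock);	/* commit_update */
	fake_read_cpr(q);
}

/* Runs both orders and checks which one nests the locks. */
static void fake_demo(void)
{
	struct fake_queue q;

	fake_queue_init(&q);
	fake_revalidate_old(&q);
	assert(q.limits_held_in_cpr);
	fake_revalidate_new(&q);
	assert(!q.limits_held_in_cpr);
}
```

With the old order the limits_lock -> sysfs_dir_lock -> sysfs_lock
dependency is recorded; with the new order that dependency is never
created, which is the effect the patch is after.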

Thanks, Shinichiro!

Tested-by: Luca Coelho <luciano.coelho@xxxxxxxxx>

--
Cheers,
Luca.



