Re: [PATCH] block: loop: fix deadlock between open and remove

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 05/06/2021 15:09, Christoph Hellwig wrote:
> Commit c76f48eb5c08 ("block: take bd_mutex around delete_partitions in
> del_gendisk") adds disk->part0->bd_mutex in del_gendisk(), this way
> causes the following AB/BA deadlock between removing loop and opening
> loop:
> 
>  1) loop_control_ioctl(LOOP_CTL_REMOVE)
>      -> mutex_lock(&loop_ctl_mutex)
>      -> del_gendisk
>          -> mutex_lock(&disk->part0->bd_mutex)
> 
>  2) blkdev_get_by_dev
>      -> mutex_lock(&disk->part0->bd_mutex)
>      -> lo_open
>          -> mutex_lock(&loop_ctl_mutex)
> 
> Add a new Lo_deleting state to remove the need for clearing
> ->private_data and thus holding loop_ctl_mutex in the ioctl
> LOOP_CTL_REMOVE path.
> 
> Based on an analysis and earlier patch from
> Ming Lei <ming.lei@xxxxxxxxxx>.
> 
> Reported-by: Colin Ian King <colin.king@xxxxxxxxxxxxx>
> Fixes: c76f48eb5c08 ("block: take bd_mutex around delete_partitions in del_gendisk")
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> ---
>  drivers/block/loop.c | 25 +++++++------------------
>  drivers/block/loop.h |  1 +
>  2 files changed, 8 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index d58d68f3c7cd..76e12f3482a9 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1879,29 +1879,18 @@ static int lo_compat_ioctl(struct block_device *bdev, fmode_t mode,
>  
>  static int lo_open(struct block_device *bdev, fmode_t mode)
>  {
> -	struct loop_device *lo;
> +	struct loop_device *lo = bdev->bd_disk->private_data;
>  	int err;
>  
> -	/*
> -	 * take loop_ctl_mutex to protect lo pointer from race with
> -	 * loop_control_ioctl(LOOP_CTL_REMOVE), however, to reduce contention
> -	 * release it prior to updating lo->lo_refcnt.
> -	 */
> -	err = mutex_lock_killable(&loop_ctl_mutex);
> -	if (err)
> -		return err;
> -	lo = bdev->bd_disk->private_data;
> -	if (!lo) {
> -		mutex_unlock(&loop_ctl_mutex);
> -		return -ENXIO;
> -	}
>  	err = mutex_lock_killable(&lo->lo_mutex);
> -	mutex_unlock(&loop_ctl_mutex);
>  	if (err)
>  		return err;
> -	atomic_inc(&lo->lo_refcnt);
> +	if (lo->lo_state == Lo_deleting)
> +		err = -ENXIO;
> +	else
> +		atomic_inc(&lo->lo_refcnt);
>  	mutex_unlock(&lo->lo_mutex);
> -	return 0;
> +	return err;
>  }
>  
>  static void lo_release(struct gendisk *disk, fmode_t mode)
> @@ -2285,7 +2274,7 @@ static long loop_control_ioctl(struct file *file, unsigned int cmd,
>  			mutex_unlock(&lo->lo_mutex);
>  			break;
>  		}
> -		lo->lo_disk->private_data = NULL;
> +		lo->lo_state = Lo_deleting;
>  		mutex_unlock(&lo->lo_mutex);
>  		idr_remove(&loop_index_idr, lo->lo_number);
>  		loop_remove(lo);
> diff --git a/drivers/block/loop.h b/drivers/block/loop.h
> index a3c04f310672..5beb959b94d3 100644
> --- a/drivers/block/loop.h
> +++ b/drivers/block/loop.h
> @@ -22,6 +22,7 @@ enum {
>  	Lo_unbound,
>  	Lo_bound,
>  	Lo_rundown,
> +	Lo_deleting,
>  };
>  
>  struct loop_func_table;
> 

Tested and no longer hangs with the stress-ng loop test. Thank you for
the speedy turnaround on the fix.

Tested-by: Colin Ian King <colin.king@xxxxxxxxxxxxx>

Colin



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux