On Wed 16-03-22 09:45:18, Christoph Hellwig wrote: > lo_refcount counts how many openers a loop device has, but that count > is already provided by the block layer in the bd_openers field of the > whole-disk block_device. Remove lo_refcount and allow opens to > succeed even on devices beeing deleted - now that ->free_disk is > implemented we can handle that race gracefull and all I/O on it will > just fail. Similarly there is a small race window now where > loop_control_remove does not synchronize the delete vs the remove > due do bd_openers not being under lo_mutex protection, but we can > handle that just as gracefully. Honestly, I'm a bit uneasy about these races and ability to have open removed loop device (given use of loop devices in container environments potential problems could have security implications). But I guess it is no different to having open hot-unplugged scsi disk. There may be just an expectation from userspace that your open either blocks loop device removal or open fails. But I guess we can deal with that if some real breakage happens - it does not seem that hard to solve - we just need loop_control_remove() to grab disk->open_mutex to make transition to Lo_deleting state safe and keep Lo_deleting check in lo_open(). Plus we'd need to use READ_ONCE / WRITE_ONCE for lo_state accesses. > Signed-off-by: Christoph Hellwig <hch@xxxxxx> > --- > drivers/block/loop.c | 36 +++++++----------------------------- > drivers/block/loop.h | 1 - > 2 files changed, 7 insertions(+), 30 deletions(-) > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > index cbaa18bcad1fe..c270f3715d829 100644 > --- a/drivers/block/loop.c > +++ b/drivers/block/loop.c > @@ -1244,7 +1244,7 @@ static int loop_clr_fd(struct loop_device *lo) > * <dev>/do something like mkfs/losetup -d <dev> causing the losetup -d > * command to fail with EBUSY. > */ > - if (atomic_read(&lo->lo_refcnt) > 1) { > + if (lo->lo_disk->part0->bd_openers > 1) { But bd_openers can be read safely only under disk->open_mutex. So for this to be safe against compiler playing nasty tricks with optimizations, we need to either make bd_openers atomic_t or use READ_ONCE / WRITE_ONCE when accessing it. > @@ -1724,33 +1724,15 @@ static int lo_compat_ioctl(struct block_device *bdev, fmode_t mode, > } > #endif > > -static int lo_open(struct block_device *bdev, fmode_t mode) > -{ > - struct loop_device *lo = bdev->bd_disk->private_data; > - int err; > - > - err = mutex_lock_killable(&lo->lo_mutex); > - if (err) > - return err; > - if (lo->lo_state == Lo_deleting) > - err = -ENXIO; > - else > - atomic_inc(&lo->lo_refcnt); > - mutex_unlock(&lo->lo_mutex); > - return err; > -} > - Tetsuo has observed [1] that not grabbing lo_mutex when opening loop device tends to break systemd-udevd because in loop_configure() we send DISK_EVENT_MEDIA_CHANGE event before the device is fully configured (but the configuration gets finished before we release the lo_mutex) and so systemd-udev gets spurious IO errors when probing new loop devices and is unhappy. So I think this is the right way to go but we need to reshuffle loop_configure() a bit first. [1] https://lore.kernel.org/all/a72c59c6-298b-e4ba-b1f5-2275afab49a1@xxxxxxxxxxxxxxxxxxx/T/#u Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR