RE: [PATCH V2 for-6.10/block 1/2] loop: Fix a race between loop detach and loop open

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Kuai,

> -----Original Message-----
> From: Yu Kuai <yukuai1@xxxxxxxxxxxxxxx>
> Sent: Wednesday, May 22, 2024 8:01 AM
> To: Jens Axboe <axboe@xxxxxxxxx>; Gulam Mohamed
> <gulam.mohamed@xxxxxxxxxx>; linux-block@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx
> Cc: shinichiro.kawasaki@xxxxxxx; chaitanyak@xxxxxxxxxx; hch@xxxxxx;
> yukuai (C) <yukuai3@xxxxxxxxxx>
> Subject: Re: [PATCH V2 for-6.10/block 1/2] loop: Fix a race between loop
> detach and loop open
> 
> Hi,
> 
> 在 2024/05/22 9:39, Jens Axboe 写道:
> > On 5/21/24 4:42 PM, Gulam Mohamed wrote:
> >> Description
> >> ===========
> >>
> >> 1. Userspace sends the command "losetup -d" which uses the open() call
> >>     to open the device
> >> 2. Kernel receives the ioctl command "LOOP_CLR_FD" which calls the
> >>     function loop_clr_fd()
> >> 3. If LOOP_CLR_FD is the first command received at the time, then the
> >>     AUTOCLEAR flag is not set and deletion of the
> >>     loop device proceeds ahead and scans the partitions (drop/add
> >>     partitions)
> >>
> >> 	if (disk_openers(lo->lo_disk) > 1) {
> >> 		lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
> >> 		loop_global_unlock(lo, true);
> >> 		return 0;
> >> 	}
> >>
> >>   4. Before scanning partitions, it will check to see if any partition of
> >>      the loop device is currently opened
> >>   5. If any partition is opened, then it will return EBUSY:
> >>
> >>      if (disk->open_partitions)
> >> 		return -EBUSY;
> >>   6. So, after receiving the "LOOP_CLR_FD" command and just before the
> above
> >>      check for open_partitions, if any other command
> >>      (like blkid) opens any partition of the loop device, then the partition
> >>      scan will not proceed and EBUSY is returned as shown in above code
> >>   7. But in "__loop_clr_fd()", this EBUSY error is not propagated
> >>   8. We have noticed that this is causing the partitions of the loop to
> >>      remain stale even after the loop device is detached resulting in the
> >>      IO errors on the partitions
> >>
> >> Fix
> >> ---
> >> Re-introduce the lo_open() call to restrict any process to open the
> >> loop device when its being detached
> >>
> >> Test case
> >> =========
> >> Test case involves the following two scripts:
> >>
> >> script1.sh
> >> ----------
> >> while [ 1 ];
> >> do
> >> 	losetup -P -f /home/opt/looptest/test10.img
> >> 	blkid /dev/loop0p1
> >> done
> >>
> >> script2.sh
> >> ----------
> >> while [ 1 ];
> >> do
> >> 	losetup -d /dev/loop0
> >> done
> >>
> >> Without fix, the following IO errors have been observed:
> >>
> >> kernel: __loop_clr_fd: partition scan of loop0 failed (rc=-16)
> >> kernel: I/O error, dev loop0, sector 20971392 op 0x0:(READ) flags 0x80700
> >>          phys_seg 1 prio class 0
> >> kernel: I/O error, dev loop0, sector 108868 op 0x0:(READ) flags 0x0
> >>          phys_seg 1 prio class 0
> >> kernel: Buffer I/O error on dev loop0p1, logical block 27201, async page
> >>          read
> >>
> >> V1->V2:
> >> 	Added a test case, 010, in blktests in tests/loop/
> >> Signed-off-by: Gulam Mohamed <gulam.mohamed@xxxxxxxxxx>
> >> ---
> >>   drivers/block/loop.c | 19 +++++++++++++++++++
> >>   1 file changed, 19 insertions(+)
> >>
> >> diff --git a/drivers/block/loop.c b/drivers/block/loop.c index
> >> 28a95fd366fe..9a235d8c062d 100644
> >> --- a/drivers/block/loop.c
> >> +++ b/drivers/block/loop.c
> >> @@ -1717,6 +1717,24 @@ static int lo_compat_ioctl(struct block_device
> *bdev, blk_mode_t mode,
> >>   }
> >>   #endif
> >>
> >> +static int lo_open(struct gendisk *disk, blk_mode_t mode) {
> >> +        struct loop_device *lo = disk->private_data;
> >> +        int err;
> >> +
> >> +        if (!lo)
> >> +                return -ENXIO;
> >> +
> >> +        err = mutex_lock_killable(&lo->lo_mutex);
> >> +        if (err)
> >> +                return err;
> >> +
> >> +        if (lo->lo_state == Lo_rundown)
> >> +                err = -ENXIO;
> >> +        mutex_unlock(&lo->lo_mutex);
> 
> This doesn't fix the problem completely, there is still a race window.
> 
> lo_release
>   if (disk_openers(disk) > 0)
>    reutrn
>    -> no openers now
> 		lo_open
> 		 if (lo->lo_state == Lo_rundown)
> 		 -> no set yet
> 		 open succeed
>   mutex_lock(lo_mutex)
>   lo->lo_state = Lo_rundown
>   mutex_unlock(lo_mutex)
>   __loop_clr_fd
We have noticed that, at block layer, both open() and release() are protected by gendisk->open_mutex.
So, this race may not happen. Can you please let me know if I understand correctly?
> 
> And with the respect, loop_clr_fd() has the same problem.
> 
> I think probably loop need a open counter for itself.
We are looking to see how to handle this case
> 
> Thanks,
> Kuai
> 
> >> +	return err;
> >> +}
> >
> > Most of this function uses spaces rather than tabs.
> >





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux