On 2021/06/12 11:35, Tetsuo Handa wrote: > On 2021/06/12 0:49, Tetsuo Handa wrote: >> On 2021/06/12 0:18, Pavel Tatashin wrote: >>>>> Well, I made commit 310ca162d779efee ("block/loop: Use global lock for ioctl() operation.") >>>>> because per device lock was not sufficient. Did commit 6cc8e7430801fa23 ("loop: scale loop >>>>> device by introducing per device lock") take this problem into account? >>>> >>>> This was my intention when I wrote 6cc8e7430801fa23 ("loop: scale loop >>>> device by introducing per device lock"). This is why this change does >>>> not simply revert 310ca162d779efee ("block/loop: Use global lock for >>>> ioctl() operation."), but keeps loop_ctl_mutex to protect the global >>>> accesses. loop_control_ioctl() is still locked by global >>>> loop_ctl_mutex. >> >> No, loop_control_ioctl() (i.e. /dev/loop-control) is irrelevant here. >> What 310ca162d779efee addressed but (I worry) 6cc8e7430801fa23 broke is >> lo_ioctl() (i.e. /dev/loop$num). >> >> syzbot was reporting NULL pointer dereference which is caused by >> race condition between ioctl(loop_fd, LOOP_CLR_FD, 0) versus >> ioctl(other_loop_fd, LOOP_SET_FD, loop_fd) due to traversing other >> loop devices at loop_validate_file() without holding corresponding >> lo->lo_mutex lock. > > Here is a reproducer and a race window widening patch. > Since loop_validate_file() traverses on other loop devices, > changing fd of loop device needs to be protected using a global lock. > At least LOOP_SET_FD, LOOP_CONFIGURE, LOOP_CHANGE_FD, LOOP_CLR_FD needs to be serialized using a global lock because loop_validate_file() traverses on other loop devices which can cause NULL pointer dereference like syzbot has reported in the past. And I think LOOP_SET_CAPACITY, LOOP_SET_DIRECT_IO, LOOP_SET_BLOCK_SIZE also needs to be serialized using a global lock because loop_change_fd() checks capacity and dio state of other loop device which is not serialized. I'd like to apply [PATCH] loop: drop loop_ctl_mutex around del_gendisk() in loop_remove() as soon as possible because it is current top crasher (crashing on every 39 seconds). I suggest simply reverting commit 6cc8e7430801fa23 ("loop: scale loop device by introducing per device lock") for now. Do you want to revert 6cc8e7430801fa23 before my patch? If yes, I'll update my patch. If no, I'll just ask Jens to send my patch to Linus.