On 2018/09/24 21:31, Jan Kara wrote:
> On Mon 24-09-18 19:29:10, Tetsuo Handa wrote:
>> On 2018/09/24 7:03, Ming Lei wrote:
>>> On Sat, Sep 22, 2018 at 09:39:02PM +0900, Tetsuo Handa wrote:
>>>> Hello, Ming Lei.
>>>>
>>>> I'd like to hear your comment on this patch regarding the ordering of
>>>> stopping kernel thread.
>>>>
>>>> > In order to enforce this strategy, this patch inversed
>>>> > loop_reread_partitions() and loop_unprepare_queue() in loop_clr_fd().
>>>> > I don't know whether it breaks something, but I don't have testcases.
>>>>
>>>> Until 3.19, kthread_stop(lo->lo_thread) was called before
>>>> ioctl_by_bdev(bdev, BLKRRPART, 0) is called.
>>>> During 4.0 to 4.3, the loop module was using "kloopd" workqueue.
>>>> But since 4.4, loop_reread_partitions(lo, bdev) is called before
>>>> loop_unprepare_queue(lo) is called. And this patch is trying to change to
>>>> call loop_unprepare_queue() before loop_reread_partitions() is called.
>>>> Is there some reason we need to preserve current ordering?
>>>
>>> IMO, both the two orders are fine, and what matters is that 'lo->lo_state'
>>> is updated before loop_reread_partitions(), then any IO from loop_reread_partitions
>>> will be failed, so it shouldn't be a big deal wrt. the order between
>>> loop_reread_partitions() and loop_unprepare_queue().
>>
>> OK. Thank you. Here is v4 patch (only changelog was updated).
>> Andrew, can we test this patch in the -mm tree?
>>
>> From 2278250ac8c5b912f7eb7af55e36ed40e2f7116b Mon Sep 17 00:00:00 2001
>> From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
>> Date: Mon, 24 Sep 2018 18:58:37 +0900
>> Subject: [PATCH v4] block/loop: Serialize ioctl operations.
>>
>> syzbot is reporting NULL pointer dereference [1] which is caused by
>> race condition between ioctl(loop_fd, LOOP_CLR_FD, 0) versus
>> ioctl(other_loop_fd, LOOP_SET_FD, loop_fd) due to traversing other
>> loop devices without holding corresponding locks.
>>
>> syzbot is also reporting circular locking dependency between bdev->bd_mutex
>> and lo->lo_ctl_mutex [2] which is caused by calling blkdev_reread_part()
>> with lock held.
>
> Thanks for looking into the loop crashes Tetsuo. I was looking into the
> loop code and trying to understand how your patch fixes them but I've
> failed. Can you please elaborate a bit on how exactly LOOP_CLR_FD and
> LOOP_SET_FD race to hit NULL pointer dereference? I don't really see the
> code traversing other loop devices as you mention in your changelog so I'm
> probably missing something. Thanks.
>
> Honza
>

That is explained in a discussion for [1] at
https://groups.google.com/forum/#!msg/syzkaller-bugs/c8KUcTAzTvA/3o_7g6-tAwAJ .
In the current code, the location of dangerous traversal is in loop_validate_file().
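
To make the shape of the race easier to see, here is a rough userspace
model of it. This is only a sketch and not the kernel code: struct
model_loop, model_ctl_mutex and the model_*() functions are hypothetical
names invented for illustration, and the single global mutex is an
assumption about what "serialize ioctl operations" means here. The point
it illustrates is that the SET_FD side walks the chain of other loop
devices, so the CLR_FD side must not be able to reset one of those
devices in the middle of the walk.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace model; NOT the kernel's struct loop_device. */
enum model_state { MODEL_UNBOUND, MODEL_BOUND };

struct model_loop {
    enum model_state state;
    struct model_loop *backing; /* non-NULL if backed by another loop device */
};

/* Assumption: one global lock covers the ioctl paths of every device. */
static pthread_mutex_t model_ctl_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Walk the chain of backing devices starting at 'backing'.
 * The caller must hold model_ctl_mutex: without it, a concurrent
 * model_clr_fd() could reset a device while this loop is reading it.
 */
int model_validate_locked(const struct model_loop *self,
                          const struct model_loop *backing)
{
    for (const struct model_loop *l = backing; l != NULL; l = l->backing) {
        if (l == self)
            return -EBADF;   /* the chain leads back to ourselves */
        if (l->state == MODEL_UNBOUND)
            return -EINVAL;  /* a device in the chain is not set up */
    }
    return 0;
}

/* Models the SET_FD side; backing == NULL stands for a regular file. */
int model_set_fd(struct model_loop *self, struct model_loop *backing)
{
    int ret;

    assert(self != NULL);
    pthread_mutex_lock(&model_ctl_mutex);
    if (self->state != MODEL_UNBOUND)
        ret = -EBUSY;
    else
        ret = model_validate_locked(self, backing);
    if (ret == 0) {
        self->backing = backing;
        self->state = MODEL_BOUND;
    }
    pthread_mutex_unlock(&model_ctl_mutex);
    return ret;
}

/* Models the CLR_FD side; takes the same lock as model_set_fd(). */
int model_clr_fd(struct model_loop *self)
{
    int ret = 0;

    assert(self != NULL);
    pthread_mutex_lock(&model_ctl_mutex);
    if (self->state != MODEL_BOUND) {
        ret = -ENXIO;
    } else {
        self->backing = NULL;
        self->state = MODEL_UNBOUND;
    }
    pthread_mutex_unlock(&model_ctl_mutex);
    return ret;
}
```

In this model, taking only a per-device lock in model_set_fd() would
protect 'self' but not the other devices visited by the walk, which is
the kind of unprotected traversal the changelog is referring to.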