Re: [PATCH v4] block/loop: Serialize ioctl operations.

Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> · Mon, 24 Sep 2018 22:05:20 +0900

On 2018/09/24 21:31, Jan Kara wrote:
> On Mon 24-09-18 19:29:10, Tetsuo Handa wrote:
>> On 2018/09/24 7:03, Ming Lei wrote:
>>> On Sat, Sep 22, 2018 at 09:39:02PM +0900, Tetsuo Handa wrote:
>>>> Hello, Ming Lei.
>>>>
>>>> I'd like to hear your comments on this patch regarding the order in which
>>>> the kernel thread is stopped.
>>>>
>>>>   > In order to enforce this strategy, this patch swapped the order of
>>>>   > loop_reread_partitions() and loop_unprepare_queue() in loop_clr_fd().
>>>>   > I don't know whether it breaks something, but I don't have test cases.
>>>>
>>>> Until 3.19, kthread_stop(lo->lo_thread) was called before
>>>> ioctl_by_bdev(bdev, BLKRRPART, 0).
>>>> From 4.0 to 4.3, the loop module used the "kloopd" workqueue.
>>>> But since 4.4, loop_reread_partitions(lo, bdev) has been called before
>>>> loop_unprepare_queue(lo). This patch changes that so that
>>>> loop_unprepare_queue() is called before loop_reread_partitions().
>>>> Is there some reason we need to preserve the current ordering?
>>>
>>> IMO, both orders are fine. What matters is that 'lo->lo_state' is
>>> updated before loop_reread_partitions(), so that any IO from
>>> loop_reread_partitions() will fail. So the order between
>>> loop_reread_partitions() and loop_unprepare_queue() shouldn't be a big deal.
>>
>> OK. Thank you. Here is the v4 patch (only the changelog was updated).
>> Andrew, can we test this patch in the -mm tree?
>>
>> From 2278250ac8c5b912f7eb7af55e36ed40e2f7116b Mon Sep 17 00:00:00 2001
>> From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
>> Date: Mon, 24 Sep 2018 18:58:37 +0900
>> Subject: [PATCH v4] block/loop: Serialize ioctl operations.
>>
>> syzbot is reporting a NULL pointer dereference [1] which is caused by a
>> race condition between ioctl(loop_fd, LOOP_CLR_FD, 0) and
>> ioctl(other_loop_fd, LOOP_SET_FD, loop_fd), due to traversing other
>> loop devices without holding the corresponding locks.
>>
>> syzbot is also reporting a circular locking dependency between bdev->bd_mutex
>> and lo->lo_ctl_mutex [2] which is caused by calling blkdev_reread_part()
>> with the lock held.
> 
> Thanks for looking into the loop crashes Tetsuo. I was looking into the
> loop code and trying to understand how your patch fixes them but I've
> failed. Can you please elaborate a bit on how exactly LOOP_CLR_FD and
> LOOP_SET_FD race to hit NULL pointer dereference? I don't really see the
> code traversing other loop devices as you mention in your changelog so I'm
> probably missing something. Thanks.
> 
> 								Honza
> 

That is explained in the discussion of [1] at
https://groups.google.com/forum/#!msg/syzkaller-bugs/c8KUcTAzTvA/3o_7g6-tAwAJ .
In the current code, the dangerous traversal happens in loop_validate_file().
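
To make the problem easier to see, here is a rough user-space model of that
traversal, not the kernel code itself. The names struct loop_dev,
loop_ctl_lock, model_set_fd() and model_clr_fd() are made up for this
illustration, and a NULL backing pointer stands in for "backed by a regular
file". The point is only that LOOP_SET_FD walks the chain of other loop
devices, so if LOOP_CLR_FD on one of those devices can run concurrently, the
walk can observe a device whose backing file has just been cleared. Taking one
global lock around both operations, as sketched here, is one way to rule that
out.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space model; not the kernel's struct loop_device. */
enum lo_state { LO_UNBOUND, LO_BOUND };

struct loop_dev {
	enum lo_state state;
	struct loop_dev *backing;	/* next loop device, or NULL for a regular file */
};

/* A single global lock serializing every "ioctl" in this model. */
static pthread_mutex_t loop_ctl_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Walk the chain of loop devices that 'backing' is stacked on.
 * The caller must hold loop_ctl_lock, so no device in the chain can be
 * cleared while we are looking at it.
 */
static int validate_chain_locked(const struct loop_dev *self,
				 const struct loop_dev *backing)
{
	const struct loop_dev *l;

	for (l = backing; l != NULL; l = l->backing) {
		if (l == self)
			return -EBADF;	/* would make the device back itself */
		if (l->state == LO_UNBOUND)
			return -EINVAL;	/* stacking on an unbound device */
	}
	return 0;
}

/* Model of LOOP_SET_FD: bind 'self' on top of 'backing'. */
int model_set_fd(struct loop_dev *self, struct loop_dev *backing)
{
	int err;

	pthread_mutex_lock(&loop_ctl_lock);
	if (self->state != LO_UNBOUND) {
		err = -EBUSY;
	} else {
		err = validate_chain_locked(self, backing);
		if (!err) {
			self->backing = backing;
			self->state = LO_BOUND;
		}
	}
	pthread_mutex_unlock(&loop_ctl_lock);
	return err;
}

/* Model of LOOP_CLR_FD: unbind 'self' under the same lock. */
int model_clr_fd(struct loop_dev *self)
{
	int err = 0;

	pthread_mutex_lock(&loop_ctl_lock);
	if (self->state != LO_BOUND) {
		err = -ENXIO;
	} else {
		self->backing = NULL;
		self->state = LO_UNBOUND;
	}
	pthread_mutex_unlock(&loop_ctl_lock);
	return err;
}
```

Without loop_ctl_lock, model_clr_fd() on one device could reset its state and
backing pointer in the middle of another thread's validate_chain_locked()
walk, which is the kind of window the changelog describes.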