Re: False lockdep completion splats with loop device

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 6, 2017 at 9:01 AM, Byungchul Park <byungchul.park@xxxxxxx> wrote:
> On 12/6/2017 3:31 PM, Amir Goldstein wrote:
>>
>> On Wed, Dec 6, 2017 at 7:01 AM, Byungchul Park <byungchul.park@xxxxxxx>
>> wrote:
>>>
>>> On 12/6/2017 6:09 AM, Dave Chinner wrote:
>>>>
>>>>
>>>> On Tue, Dec 05, 2017 at 10:07:41AM -0500, Theodore Ts'o wrote:
>>>>>
>>>>>
>>>>> On Tue, Dec 05, 2017 at 02:16:45PM +0900, Byungchul Park wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I believe that the commit e319e1fbd9d42 "block, locking/lockdep:
>>>>>> Assign
>>>>>> a lock_class per gendisk used for wait_for_completion()" solved the
>>>>>> false positive.
>>>>>>
>>>>>> Could you tell me if it doesn't handle it, with the report? Then, I
>>>>>> will follow up and try to solve it.
>>>>>
>>>>>
>>>>>
>>>>> No, it doesn't handle it.  And there was some discussion in the linked
>>>>> thread on the xfs mailing list that seemed to indicate that it was not
>>>>> a complete fix.
>>>>
>>>>
>>>>
>>>> Well, it uses a static key hidden inside the macro
>>>> alloc_disk_node(), so every disk allocated from the same callsite
>>>> points to the same static lockdep map. IOWs, every caller of
>>>> alloc_disk() (i.e. the vast majority of storage devices) are
>>>> configured to point at the same static lockdep map, regardless of
>>>> their location in the storage stack.
>>>>
>>>> The loop device uses alloc_disk().
>>>>
>>>> IOWs, it doesn't address the problem of false positives due to
>>>> layering in the IO stack at all because we can still have
>>>> filesystems both above and below the lockdep map that has been
>>>> attached to the devices...
>>>
>>>
>>>
>>> Hello,
>>>
>>> What if we assign different classes to different loop devices? Do
>>> you think it helps to avoid such false positives?
>>>
>>> I'm not sure since I'm unfamiliar with the stacking stuff. It would
>>> be appriciated if you let me know your opinion about it.
>>>
>
> Hello Amir,
>
> Thanks a lot for your opinion! See the below.
>
>> I believe the problem in the tests is that loop is mistaken for the same
>> class/key as another (underlying) blockdev, where the fs of the loop file
>> resides.
>>
>> For the first order of resolution, it would be enough to assign a
>> different
>> key to different types of block devices, similar to the struct
>> file_system_type
>> lock class keys.
>>
>> This would be enough to cover to common case of ext4 over loop inside
>> ext4 over real disk (at least for the completion lockdep warnings).
>
>
> This was already done by the commit e319e1fbd9d42 "block,
> locking/lockdep: Assign a lock_class per gendisk used for
> wait_for_completion()".
>

Right...

>> For the second order of resolution it may be desired to assign a different
>> key/class to an instance of loop devices and other stackable block devices
>> (e.g. dm, nbd)  to cover the case of a deeper nested blockdev stack,
>> such as: ext4 over loop inside ext4 over loop inside ext4 over real disk.
>
>
> This is what I should do.
>

But that won't help Ted. The test use case doesn't include doubly nested loop,
very few setups will do that. Maybe xfstests running within a container over
loop image. I doubt it that setup is common, so until there are nested container
implementations out there that are using loop image mount, this use case
is not very interesting.

>
>> That should be enough to cover the completion lockdep validation,
>> but it still leaves to problem of the file_system_type lock key, which
>> is still the same for the 2 instances of ext4 in the stack above.
>> For that I suggested a solution similar to
>> b1eaa950f7e9 ovl: lockdep annotate of nested stacked overlayfs inode lock
>> but that will require propagating a "bdev_stack_depth" throughout the
>> block and fs stack and increment it when creating loop or nbd device.
>>

So this in the problem Ted is reporting and it needs to be solved inside
ext4 and any other fs that can be loop mounted on a file inside same fs type.

The lockdep chain is: ext4/loop -> loop completion -> ext4/real
but all the internal ext4 locks (i.e. &meta_group_info[i]->alloc_sem) are
annotated statically for the ext4 fs type, so to fix that need:
1. ext4 to know it is mounted over loop/nbd
2. ext4 to annotate all internal locks that surround blockdev I/O as
    "loop nested"/"not loop nested" like overlayfs does with the vfs locks
3. if multi nested looped fs is important (not really) then loop/nbd will
    need to know if its file is on a looped fs and propagate nesting level
    to ext4

Cheers,
Amir.



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux