Re: False lockdep completion splats with loop device

Amir Goldstein <amir73il@xxxxxxxxx> · Wed, 6 Dec 2017 08:31:51 +0200

On Wed, Dec 6, 2017 at 7:01 AM, Byungchul Park <byungchul.park@xxxxxxx> wrote:
> On 12/6/2017 6:09 AM, Dave Chinner wrote:
>>
>> On Tue, Dec 05, 2017 at 10:07:41AM -0500, Theodore Ts'o wrote:
>>>
>>> On Tue, Dec 05, 2017 at 02:16:45PM +0900, Byungchul Park wrote:
>>>>
>>>>
>>>> Hello,
>>>>
>>>> I believe that the commit e319e1fbd9d42 "block, locking/lockdep: Assign
>>>> a lock_class per gendisk used for wait_for_completion()" solved the
>>>> false positive.
>>>>
>>>> Could you tell me if it doesn't handle it, with the report? Then, I
>>>> will follow up and try to solve it.
>>>
>>>
>>> No, it doesn't handle it.  And there was some discussion in the linked
>>> thread on the xfs mailing list that seemed to indicate that it was not
>>> a complete fix.
>>
>>
>> Well, it uses a static key hidden inside the macro
>> alloc_disk_node(), so every disk allocated from the same callsite
>> points to the same static lockdep map. IOWs, every caller of
>> alloc_disk() (i.e. the vast majority of storage devices) are
>> configured to point at the same static lockdep map, regardless of
>> their location in the storage stack.
>>
>> The loop device uses alloc_disk().
>>
>> IOWs, it doesn't address the problem of false positives due to
>> layering in the IO stack at all because we can still have
>> filesystems both above and below the lockdep map that has been
>> attached to the devices...
>
>
> Hello,
>
> What if we assign different classes to different loop devices? Do
> you think it helps to avoid such false positives?
>
> I'm not sure since I'm unfamiliar with the stacking stuff. It would
> be appriciated if you let me know your opinion about it.
>

I believe the problem in the tests is that loop is mistaken for the same
class/key as another (underlying) blockdev, where the fs of the loop file
resides.

For the first order of resolution, it would be enough to assign a different
key to different types of block devices, similar to the struct file_system_type
lock class keys.

This would be enough to cover to common case of ext4 over loop inside
ext4 over real disk (at least for the completion lockdep warnings).

For the second order of resolution it may be desired to assign a different
key/class to an instance of loop devices and other stackable block devices
(e.g. dm, nbd)  to cover the case of a deeper nested blockdev stack,
such as: ext4 over loop inside ext4 over loop inside ext4 over real disk.

That should be enough to cover the completion lockdep validation,
but it still leaves to problem of the file_system_type lock key, which
is still the same for the 2 instances of ext4 in the stack above.
For that I suggested a solution similar to
b1eaa950f7e9 ovl: lockdep annotate of nested stacked overlayfs inode lock
but that will require propagating a "bdev_stack_depth" throughout the
block and fs stack and increment it when creating loop or nbd device.

Cheers,
Amir.