Re: False lockdep completion splats with loop device

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/7/2017 1:18 PM, Amir Goldstein wrote:
On Thu, Dec 7, 2017 at 4:46 AM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
On Wed, Dec 6, 2017 at 9:01 AM, Byungchul Park <byungchul.park@xxxxxxx> wrote:
On 12/6/2017 3:31 PM, Amir Goldstein wrote:

On Wed, Dec 6, 2017 at 7:01 AM, Byungchul Park <byungchul.park@xxxxxxx>
wrote:

On 12/6/2017 6:09 AM, Dave Chinner wrote:


On Tue, Dec 05, 2017 at 10:07:41AM -0500, Theodore Ts'o wrote:


On Tue, Dec 05, 2017 at 02:16:45PM +0900, Byungchul Park wrote:



Hello,

I believe that the commit e319e1fbd9d42 "block, locking/lockdep:
Assign
a lock_class per gendisk used for wait_for_completion()" solved the
false positive.

Could you tell me if it doesn't handle it, with the report? Then, I
will follow up and try to solve it.



No, it doesn't handle it.  And there was some discussion in the linked
thread on the xfs mailing list that seemed to indicate that it was not
a complete fix.



Well, it uses a static key hidden inside the macro
alloc_disk_node(), so every disk allocated from the same callsite
points to the same static lockdep map. IOWs, every caller of
alloc_disk() (i.e. the vast majority of storage devices) are
configured to point at the same static lockdep map, regardless of
their location in the storage stack.

The loop device uses alloc_disk().

IOWs, it doesn't address the problem of false positives due to
layering in the IO stack at all because we can still have
filesystems both above and below the lockdep map that has been
attached to the devices...



Hello,

What if we assign different classes to different loop devices? Do
you think it helps to avoid such false positives?

I'm not sure since I'm unfamiliar with the stacking stuff. It would
be appriciated if you let me know your opinion about it.


Hello Amir,

Thanks a lot for your opinion! See the below.

I believe the problem in the tests is that loop is mistaken for the same
class/key as another (underlying) blockdev, where the fs of the loop file
resides.

For the first order of resolution, it would be enough to assign a
different
key to different types of block devices, similar to the struct
file_system_type
lock class keys.

This would be enough to cover to common case of ext4 over loop inside
ext4 over real disk (at least for the completion lockdep warnings).


This was already done by the commit e319e1fbd9d42 "block,
locking/lockdep: Assign a lock_class per gendisk used for
wait_for_completion()".


Right...

For the second order of resolution it may be desired to assign a different
key/class to an instance of loop devices and other stackable block devices
(e.g. dm, nbd)  to cover the case of a deeper nested blockdev stack,
such as: ext4 over loop inside ext4 over loop inside ext4 over real disk.


This is what I should do.


But that won't help Ted. The test use case doesn't include doubly nested loop,
very few setups will do that. Maybe xfstests running within a container over
loop image. I doubt it that setup is common, so until there are nested container
implementations out there that are using loop image mount, this use case
is not very interesting.


That should be enough to cover the completion lockdep validation,
but it still leaves to problem of the file_system_type lock key, which
is still the same for the 2 instances of ext4 in the stack above.
For that I suggested a solution similar to
b1eaa950f7e9 ovl: lockdep annotate of nested stacked overlayfs inode lock
but that will require propagating a "bdev_stack_depth" throughout the
block and fs stack and increment it when creating loop or nbd device.


So this in the problem Ted is reporting and it needs to be solved inside
ext4 and any other fs that can be loop mounted on a file inside same fs type.

The lockdep chain is: ext4/loop -> loop completion -> ext4/real
but all the internal ext4 locks (i.e. &meta_group_info[i]->alloc_sem) are
annotated statically for the ext4 fs type, so to fix that need:
1. ext4 to know it is mounted over loop/nbd
2. ext4 to annotate all internal locks that surround blockdev I/O as
     "loop nested"/"not loop nested" like overlayfs does with the vfs locks
3. if multi nested looped fs is important (not really) then loop/nbd will
     need to know if its file is on a looped fs and propagate nesting level
     to ext4


So maybe a macro for filesystems to annotate internal locks with
underlying blockdev's gendisk lockdep_map?

Does that make sense?

Hello,

I think it can be a way. But I'm thinking about a more general way..

Thanks for the proposal.

--
Thanks,
Byungchul



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux