On Wed, Jan 13, 2021 at 02:17:29PM +0200, Nikolay Borisov wrote: > > > On 13.01.21 г. 14:09 ч., Christoph Hellwig wrote: > > On Wed, Jan 13, 2021 at 01:41:09PM +0200, Nikolay Borisov wrote: > >> > >> > >> On 13.01.21 ??. 13:27 ??., Christoph Hellwig wrote: > >>> Pavel has looked into this before and got stuck on the allocator > >>> workqueue offloads: > >>> > >>> [PATCH v13 0/4] xfs: Remove wrappers for some semaphores > >> > >> I haven't looked into his series but I fail to see how lifting > >> rwsemaphore out of the nested structure can change the behavior ? It > >> just removes a level of indirection. My patches are semantically > >> identical to the original code. > > > > mrlocks have the mr_writer field that annotate that the is a writer > > locking the lock. The XFS asserts use it to assert that the lock that > > the current thread holds it for exclusive protection, which isn't > > actually what the field says, and this breaks when XFS uses synchronous > > execution of work_struct as basically an extension of the kernel stack. > > I'm still failing to see what's the problem of checking the last bit of > the rwsem ->count field. It is set when the sem is held for writing > (identical to what mr_write does). As I mention in the cover letter this > might be considered a bit hacky because it exposes an internal detail of > the rwsem i.e that the bit of interest is bit 0. I don't want to tear into the implementation details of rwsems if I can avoid it. Just because we all have one big happy address space doesn't mean shortcuts won't hose everyone. > But I believe the same > can be achieved using lockdep_is_held_type(xx, 0/1) and making XFS's > debug routines depend on lockdep being on. Pavel Reichl tried that two months ago in: https://lore.kernel.org/linux-xfs/20201102194135.174806-2-preichl@xxxxxxxxxx/ which resulted in fstests regressions: https://lore.kernel.org/linux-xfs/20201104005345.GC7115@magnolia/ TLDR: the ILOCK semaphore is data-centric, but lockdep's debugging chains are task-centric, which causes incorrect lock validation reports. The solutions as I see them are: (a) figure out if we really still need to defer bmbt splits to workers to avoid overflowing the kernel stack; or (b) making it possible to transfer rwsem ownership to shut up lockdep; or (c) fix the is_held predicate to ignore ownership. --D > > >