Re: [PATCH, RFC] xfs: remove i_iolock and use i_rwsem in the VFS inode instead

Dave Chinner <david@xxxxxxxxxxxxx> · Fri, 9 Sep 2016 11:06:01 +1000

On Thu, Sep 08, 2016 at 08:54:42AM +0200, Peter Zijlstra wrote:
> On Thu, Sep 08, 2016 at 07:45:36AM +1000, Dave Chinner wrote:
> > On Mon, Sep 05, 2016 at 05:15:29PM +0200, Christoph Hellwig wrote:
> > > Hi Dave,
> > > 
> > > I looked into killing the mrlock and ran into an unexpected problem.
> > > 
> > > Currently mr_writer tracks that there is someone holding a write lock,
> > > lockdep on the other hand checks if the calling thread has that lock.
> > > 
> > > While that generally is the right semantic, our hack to offload
> > > btree splits to a work item offends lockdep.  E.g. this callstack
> > > now asserts:
> > 
> > It's a semaphore, not a mutex. Semaphore locking is independent of
> > task context, the lock follows the object it protects, not the task
> > that took the lock. i.e. Lockdep is wrong to assume the "owner" of a
> > rw_sem will not change between lock and unlock.
> 
> We've added strict owner semantics to rwsem a long time ago.

/me sighs.

History repeats.

We went through all this crap about semaphores and strict ownership
a few years ago when someone tried to make the struct semaphore
require strict ownership and XFS went all explodey. :/

> If you want the actual semaphore semantics (which we greatly discourage,
> because you cannot do validation on it)

What you actually means is "you cannot use /lockdep/ on it", not
"cannot do validation" on it.  Indeed, this discussion started from
wanting to remove the wrappers that implement XFS context
specific locking validation on top of rwsems.

Yes, lockdep has it's uses, but let's not ignore the fact that it
also has significant limitations, introduces serious annotation
complexity for non-trivial locking models (e.g. XFS inodes), and
many people use it as a crutch to avoid thinking about locking
models (i.e. "lockdep doesn't complain, so it must be good!").

>From my perspective, lockdep is a very poor replacement for
architecting a robust locking model and then implementing code that
directly asserts that the correct locks are held at the correct
time. We trip over failed XFS locking asserts during development all
the time, but it's very, very rare to have lockdep tell us we have a
problem in XFS because we catch the problems long before lockdep
will notice them.

> you should use
> {down,up}_read_non_owner().
> I'm not sure we've got write_non_owner() variants for this.

For the case Christoph reported, it's the write side context of the
inode locks that is handed off to other threads. And no, we don't
have annotations for that.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs