Re: dcache locking question

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Mar 15, 2019 at 10:38:23AM -0700, Eric Biggers wrote:
> On Fri, Mar 15, 2019 at 01:50:21AM +0000, Al Viro wrote:
> > 
> > If it fails, we call __lock_parent().  Which
> > 	* grabs RCU lock
> > 	* drops ->d_lock (now we are not holding ->d_lock
> > on anything).
> > 	* fetches ->d_parent.  Note the READ_ONCE() there -
> > it's *NOT* stable (no ->d_lock held).  We can't expect
> > that ->d_parent won't change or that the reference it used
> > to contribute to parent's refcount is there anymore; as
> > the matter of fact, the only thing that prevents outright
> > _freeing_ of the object 'parent' points to is rcu_read_lock()
> > and RCU delay between dropping the last reference and
> > actual freeing of the sucker.  rcu_read_lock() is there,
> > though, which makes it safe to grab ->d_lock on 'parent'.
> > 
> > That 'parent' might very well have nothing to do with our
> > dentry by now.  We can check if it's equal to its
> > ->d_parent, though.  dentry->d_parent is *NOT* stable
> > at that point.  It might be changing right now.
> > 
> > However, the first store to dentry->d_parent making it
> > not equal to parent would have been done under parent->d_lock.
> > And since we are holding parent->d_lock, we won't miss that
> > store.  We might miss subsequent ones, but if we observe
> > dentry->d_parent == parent, we know that it's stable.  And
> > if we see dentry->d_parent != parent, we know that dentry
> > has moved around and we need to retry anyway.
> 
> Why isn't it necessary to use READ_ONCE(dentry->d_parent) here?
> 
> 	if (unlikely(parent != dentry->d_parent)) {
> 
> Suppose 'parent' is 0xAAAABBBB, and 'dentry->d_parent' is 0xAAAAAAAA and is
> concurrently changed to 0xBBBBBBBB.
> 
> d_parent could be read in two parts, 0xAAAA then 0xBBBB, resulting in it
> appearing that d_parent == 0xAAAABBBB == parent.
> 
> Yes it won't really be compiled as that in practice, but I thought the point of
> READ_ONCE() is to *guarantee* it's really done right...

READ_ONCE does not add any extra warranties of atomicity.  Fetches and stores
of pointers are atomic, period; if that ever breaks, we are in a very deep
trouble all over the place.

What's more, spin_lock acts as a compiler barrier and, on SMP, is an ACQUIRE
operation.  So that second fetch of ->d_parent will happen after we grab
parent->d_lock, from everyone's POV.  Critical areas for the same spinlock
are ordered wrt each other.  So we have observed
	FETCH dentry->d_parent => parent
	LOCK parent->d_lock
	FETCH dentry->d_parent => parent
All stores to dentry->d_parent are done with ->d_lock held on dentry,
old value of dentry->d_parent *and* new value of dentry->d_parent.  So
the second fetch is ordered wrt all stores making dentry->d_parent change
to parent and all stores making it change *from* parent.  We might miss
some stores changing it from one value other than parent to another such,
but the predicate itself is fine and will stay fine until we drop
parent->d_lock.

Paul, could you comment on that one?  The function in question is
this:
static struct dentry *__lock_parent(struct dentry *dentry)
{
        struct dentry *parent;
        rcu_read_lock();
        spin_unlock(&dentry->d_lock);
again:
        parent = READ_ONCE(dentry->d_parent);
        spin_lock(&parent->d_lock);
        /*
         * We can't blindly lock dentry until we are sure
         * that we won't violate the locking order.
         * Any changes of dentry->d_parent must have
         * been done with parent->d_lock held, so
         * spin_lock() above is enough of a barrier
         * for checking if it's still our child.
         */
        if (unlikely(parent != dentry->d_parent)) {
                spin_unlock(&parent->d_lock);
                goto again;
        }
        rcu_read_unlock();
        if (parent != dentry)
                spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
        else
                parent = NULL;
        return parent;
}
(in fs/dcache.c) and all stores to ->d_parent are guaranteed to be done
under ->d_lock on dentry itself and ->d_lock on both old and new values
of ->d_parent.



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux