Re: [RFC] simplifying fast_dput(), dentry_kill() et.al.

On Tue, Oct 31, 2023 at 12:18:48AM +0000, Al Viro wrote:
> On Mon, Oct 30, 2023 at 12:18:28PM -1000, Linus Torvalds wrote:
> > On Mon, 30 Oct 2023 at 11:53, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > After fixing a couple of brainos, it seems to work.
> > 
> > This all makes me unnaturally nervous, probably because it's overly
> > subtle, and I have lost the context for some of the rules.
> 
> A bit of context: I started to look at the possibility of refcount overflows.
> Writing the current rules for dentry refcounting and lifetime down was the
> obvious first step, and that immediately turned into an awful mess.
> 
> It is overly subtle.  Even more so when you throw the shrink lists into
> the mix - shrink_lock_dentry() got too smart for its own good, and that
> leads to really awful correctness proofs.  The next thing in the series
> is getting rid of the "it had been moved around, so somebody had clearly
> been taking/dropping references and we can just evict it from the
> shrink list and be done with that" crap - things get much simpler
> if the rules become
> 	* call it under rcu_read_lock, with dentry locked
> 	* if returned true
> 		dentry, parent, inode locked, refcount is zero.
> 	* if returned false
> 		dentry locked, refcount is non-zero.
> It used to be that way, but removal of trylock loops had turned that
> into something much more subtle.  Restoring the old semantics without
> trylocks on the slow path is doable and it makes analysis much simpler.
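
[ For illustration only - not the patch posted in this thread.  A minimal
  sketch of a slow path honouring those rules, assuming the usual dcache
  lock ordering (inode->i_lock, then parent->d_lock, then the child's
  d_lock nested) and a caller that already holds rcu_read_lock() and
  dentry->d_lock; the real thing would presumably keep a trylock fast
  path in front of this: ]

static bool shrink_lock_dentry(struct dentry *dentry)
{
	struct inode *inode;
	struct dentry *parent;

	if (dentry->d_lockref.count)
		return false;		/* somebody holds a reference */

	for (;;) {
		inode = dentry->d_inode;
		parent = IS_ROOT(dentry) ? NULL : dentry->d_parent;

		spin_unlock(&dentry->d_lock);
		if (inode)
			spin_lock(&inode->i_lock);
		if (parent) {
			spin_lock(&parent->d_lock);
			spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
		} else {
			spin_lock(&dentry->d_lock);
		}

		/* did anything change while dentry->d_lock was dropped? */
		if (likely(!dentry->d_lockref.count &&
			   dentry->d_inode == inode &&
			   dentry->d_parent == (parent ? parent : dentry)))
			return true;	/* dentry, parent, inode all locked */

		if (parent)
			spin_unlock(&parent->d_lock);
		if (inode)
			spin_unlock(&inode->i_lock);
		if (dentry->d_lockref.count)
			return false;	/* only dentry->d_lock is held */
		/* parent or inode changed under us - go again */
	}
}

[ RCU is what makes it safe to take the sampled parent's and inode's
  locks after dentry->d_lock has been dropped - the structures cannot
  be freed until the read-side critical section ends, so the worst
  case is that revalidation fails and we go around the loop again. ]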

That contract is also a perfect match for what we want in dentry_kill(),
actually.  And looking into that turned up another place too subtle for
its own good:
        if (!IS_ROOT(dentry)) {
                parent = dentry->d_parent;
                if (unlikely(!spin_trylock(&parent->d_lock))) {
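                        /* __lock_parent() drops and retakes dentry->d_lock */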
                        parent = __lock_parent(dentry);
                        if (likely(inode || !dentry->d_inode))
                                goto got_locks;
                        /* negative that became positive */
                        if (parent)
                                spin_unlock(&parent->d_lock);
                        inode = dentry->d_inode;
                        goto slow_positive;
                }
        }
        __dentry_kill(dentry);
        return parent;

slow_positive:
        spin_unlock(&dentry->d_lock);
        spin_lock(&inode->i_lock);
        spin_lock(&dentry->d_lock);
        parent = lock_parent(dentry);
got_locks:

That code (in dentry_kill()) relies upon the assumption that a
positive dentry couldn't have become negative under us while
__lock_parent() had it unlocked.  That is only true because we
hold a positive refcount here.

IOW, the patch is broken as posted upthread.  Fortunately, it's really
not hard to fix, and what we end up with in dentry_kill()
looks a lot better that way -

static struct dentry *dentry_kill(struct dentry *dentry)
        __releases(dentry->d_lock) __releases(rcu)
{
        struct dentry *parent = NULL;
	if (likely(shrink_lock_dentry(dentry))) {
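		/* dentry, parent and inode are locked; refcount is zero */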
		if (!IS_ROOT(dentry))
			parent = dentry->d_parent;
		rcu_read_unlock();
		__dentry_kill(dentry);
	} else {
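		/* refcount is non-zero; only dentry->d_lock is still held */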
		rcu_read_unlock();
		spin_unlock(&dentry->d_lock);
	}
	return parent;
}

Carving that series up will be interesting, though...



