Re: [PATCH 17/22] don't try to cut corners in shrink_lock_dentry()

On Thu, Nov 09, 2023 at 06:20:51AM +0000, Al Viro wrote:
> That is to say, do *not* treat the ->d_inode or ->d_parent changes
> as "it's hard, return false; somebody must have grabbed it, so
> even if has zero refcount, we don't need to bother killing it -
> final dput() from whoever grabbed it would've done everything".
> 
> First of all, that is not guaranteed.  It might have been dropped
> by shrink_kill() handling of victim's parent, which would've found
> it already on a shrink list (ours) and decided that they don't need
> to put it on their shrink list.
> 
> What's more, dentry_kill() is doing pretty much the same thing,
> cutting its own set of corners (it assumes that dentry can't
> go from positive to negative, so its inode can change but only once
> and only in one direction).
> 
> Doing that right allows us to get rid of that not-quite-duplication
> and removes the only reason for re-incrementing refcount before
> the call of dentry_kill().
> 
> Replacement is called lock_for_kill(); called under rcu_read_lock
> and with ->d_lock held.  If it returns false, dentry has non-zero
> refcount and the same locks are held.  If it returns true,
> dentry has zero refcount and all locks required by __dentry_kill()
> are taken.
> 
> Part of __lock_parent() had been lifted into lock_parent() to
> allow its reuse.  Now it's called with rcu_read_lock already
> held and dentry already unlocked.
> 
> Note that this is not the final change - locking requirements for
> __dentry_kill() are going to change later in the series and the
> set of locks taken by lock_for_kill() will be adjusted.
> 
> Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> ---

It's a bit unfortunate that __lock_parent() locks the parent *and* may
lock the child, which isn't really obvious from the name. It only
becomes clear from how the callers release the child's lock.
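To make that locking contract explicit: below is a user-space sketch of the pattern (pthread mutexes standing in for the kernel's spinlocks, no RCU, and the names `node`/`lock_parent` are illustrative, not the kernel's). The point is that the helper returns with *both* the parent's and the child's lock held, so the caller must release both:

```c
#include <pthread.h>
#include <stddef.h>

struct node {
	pthread_mutex_t lock;
	struct node *parent;		/* NULL for a root */
};

/*
 * Called with child->lock held.  On success, returns the parent with
 * BOTH parent->lock and child->lock held; the caller releases both.
 * In the kernel the slow path runs under rcu_read_lock() so that the
 * parent can't be freed while child->lock is dropped; this sketch
 * simply never frees nodes.
 */
static struct node *lock_parent(struct node *child)
{
	struct node *parent = child->parent;

	if (!parent)
		return NULL;
	if (pthread_mutex_trylock(&parent->lock) == 0)
		return parent;	/* fast path: child->lock never dropped */

	/* Slow path: drop the child's lock, take the parent's first
	 * (the safe ordering), retake the child's, then recheck. */
	pthread_mutex_unlock(&child->lock);
	for (;;) {
		parent = child->parent;
		pthread_mutex_lock(&parent->lock);
		pthread_mutex_lock(&child->lock);
		if (parent == child->parent)
			return parent;	/* both locks held */
		/* reparented while unlocked: back out and retry */
		pthread_mutex_unlock(&child->lock);
		pthread_mutex_unlock(&parent->lock);
	}
}
```

Nothing in the signature hints that the child's lock comes back held on the slow path too; that is exactly the subtlety being pointed out.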

>  fs/dcache.c | 159 ++++++++++++++++++++++------------------------------
>  1 file changed, 66 insertions(+), 93 deletions(-)
> 
> diff --git a/fs/dcache.c b/fs/dcache.c
> index 23afcd48c1a9..801502871671 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -625,8 +625,6 @@ static void __dentry_kill(struct dentry *dentry)
>  static struct dentry *__lock_parent(struct dentry *dentry)
>  {
>  	struct dentry *parent;
> -	rcu_read_lock();
> -	spin_unlock(&dentry->d_lock);
>  again:
>  	parent = READ_ONCE(dentry->d_parent);
>  	spin_lock(&parent->d_lock);
> @@ -642,7 +640,6 @@ static struct dentry *__lock_parent(struct dentry *dentry)
>  		spin_unlock(&parent->d_lock);
>  		goto again;
>  	}
> -	rcu_read_unlock();
>  	if (parent != dentry)
>  		spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
>  	else
> @@ -657,7 +654,64 @@ static inline struct dentry *lock_parent(struct dentry *dentry)
>  		return NULL;
>  	if (likely(spin_trylock(&parent->d_lock)))
>  		return parent;
> -	return __lock_parent(dentry);
> +	rcu_read_lock();
> +	spin_unlock(&dentry->d_lock);
> +	parent = __lock_parent(dentry);
> +	rcu_read_unlock();
> +	return parent;
> +}
> +
> +/*
> + * Lock a dentry for feeding it to __dentry_kill().
> + * Called under rcu_read_lock() and dentry->d_lock; the former
> + * guarantees that nothing we access will be freed under us.
> + * Note that dentry is *not* protected from concurrent dentry_kill(),
> + * d_delete(), etc.
> + *
> + * Return false if dentry is busy.  Otherwise, return true and have
> + * that dentry's inode and parent both locked.
> + */
> +
> +static bool lock_for_kill(struct dentry *dentry)
> +{
> +	struct inode *inode = dentry->d_inode;
> +	struct dentry *parent = dentry->d_parent;
> +
> +	if (unlikely(dentry->d_lockref.count))
> +		return false;
> +
> +	if (inode && unlikely(!spin_trylock(&inode->i_lock)))
> +		goto slow;
> +	if (dentry == parent)
> +		return true;
> +	if (likely(spin_trylock(&parent->d_lock)))
> +		return true;
> +
> +	if (inode)
> +		spin_unlock(&inode->i_lock);
> +slow:
> +	spin_unlock(&dentry->d_lock);
> +
> +	for (;;) {
> +		if (inode)
> +			spin_lock(&inode->i_lock);
> +		parent = __lock_parent(dentry);

We're under rcu_read_lock() here. Are we sure this can't trigger RCU
stall warnings because we're spinning? Maybe there's a reason that's
not an issue here.

That spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED) in
__lock_parent() is there for the sake of lockdep, to verify that the
parent lock is always acquired before the child lock?
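For reference, the overall shape of lock_for_kill() as quoted above can be sketched in user space like this (pthread mutexes instead of spinlocks, a hypothetical `fake_dentry` instead of struct dentry, and no RCU; the recheck-and-back-off details are my reading of the pattern, not a copy of the kernel code). True means every lock __dentry_kill() would need is held; false means only d_lock is held and the refcount turned out to be non-zero:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_dentry {
	pthread_mutex_t d_lock;
	pthread_mutex_t *inode_lock;	/* NULL if negative */
	struct fake_dentry *parent;	/* points to itself for a root */
	int count;			/* refcount, protected by d_lock */
};

/* Called with d->d_lock held.  Returns true with d_lock, the inode
 * lock (if any) and the parent's d_lock all held; returns false with
 * only d->d_lock held. */
static bool lock_for_kill(struct fake_dentry *d)
{
	pthread_mutex_t *inode = d->inode_lock;
	struct fake_dentry *parent = d->parent;

	if (d->count)
		return false;

	/* Fast path: try to grab everything without dropping d_lock. */
	if (inode && pthread_mutex_trylock(inode) != 0)
		goto slow;
	if (d == parent)
		return true;
	if (pthread_mutex_trylock(&parent->d_lock) == 0)
		return true;
	if (inode)
		pthread_mutex_unlock(inode);
slow:
	/* Slow path: drop d_lock, take locks in the safe order
	 * (inode, then parent, then child), and recheck. */
	pthread_mutex_unlock(&d->d_lock);
	for (;;) {
		if (inode)
			pthread_mutex_lock(inode);
		parent = d->parent;
		if (parent != d)
			pthread_mutex_lock(&parent->d_lock);
		pthread_mutex_lock(&d->d_lock);
		if (d->parent == parent && d->inode_lock == inode) {
			if (!d->count)
				return true;	/* everything held */
			/* somebody grabbed a reference: back off,
			 * keeping only d_lock */
			if (parent != d)
				pthread_mutex_unlock(&parent->d_lock);
			if (inode)
				pthread_mutex_unlock(inode);
			return false;
		}
		/* ->d_parent or ->d_inode changed under us: retry */
		if (parent != d)
			pthread_mutex_unlock(&parent->d_lock);
		pthread_mutex_unlock(&d->d_lock);
		if (inode)
			pthread_mutex_unlock(inode);
		/* re-reading these fields with no lock held is safe
		 * only because this sketch never frees anything; the
		 * kernel relies on rcu_read_lock() for that */
		inode = d->inode_lock;
	}
}
```

Note that the slow loop acquires d->d_lock a second time while the parent's d_lock (same lock class in the kernel) is already held, which is precisely why the real __lock_parent() needs the DENTRY_D_LOCK_NESTED annotation to keep lockdep quiet.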



