Re: [PATCH][RFC] make take_dentry_name_snapshot() lockless

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Dec 09, 2024 at 03:52:51AM +0000, Al Viro wrote:
> There's a bunch of places where we are accessing dentry names without
> sufficient protection and where locking environment is not predictable
> enough to fix the things that way; take_dentry_name_snapshot() is
> one variant of solution.  It does, however, have a problem - copying
> is cheap, but bouncing ->d_lock may be nasty on seriously shared dentries.
> 
> How about the following (completely untested)?
> 
> Use ->d_seq instead of grabbing ->d_lock; in case of shortname dentries
> that avoids any stores to shared data objects and in case of long names
> we are down to (unavoidable) atomic_inc on the external_name refcount.
>     
> Makes the thing safer as well - the areas where ->d_seq is held odd are
> all nested inside the areas where ->d_lock is held, and the latter are
> much more numerous.

Is there a problem retaining the lock acquire if things fail?

As in maybe loop 2-3 times, but eventually take the lock to guarantee forward
progress.

I don't think there is a *real* workload where this would be a problem,
but with core counts seen today one may be able to purposefuly introduce
stalls when running this.

>     
> Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> ---
> diff --git a/fs/dcache.c b/fs/dcache.c
> index b4d5e9e1e43d..78fd7e2a3011 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -329,16 +329,34 @@ static inline int dname_external(const struct dentry *dentry)
>  
>  void take_dentry_name_snapshot(struct name_snapshot *name, struct dentry *dentry)
>  {
> -	spin_lock(&dentry->d_lock);
> +	unsigned seq;
> +
> +	rcu_read_lock();
> +retry:
> +	seq = read_seqcount_begin(&dentry->d_seq);
>  	name->name = dentry->d_name;
> -	if (unlikely(dname_external(dentry))) {
> -		atomic_inc(&external_name(dentry)->u.count);
> -	} else {
> -		memcpy(name->inline_name, dentry->d_iname,
> -		       dentry->d_name.len + 1);
> +	if (read_seqcount_retry(&dentry->d_seq, seq))
> +		goto retry;
> +	// ->name and ->len are at least consistent with each other, so if
> +	// ->name points to dentry->d_iname, ->len is below DNAME_INLINE_LEN
> +	if (likely(name->name.name == dentry->d_iname)) {
> +		memcpy(name->inline_name, dentry->d_iname, name->name.len + 1);
>  		name->name.name = name->inline_name;
> +		if (read_seqcount_retry(&dentry->d_seq, seq))
> +			goto retry;
> +	} else {
> +		struct external_name *p;
> +		p = container_of(name->name.name, struct external_name, name[0]);
> +		// get a valid reference
> +		if (unlikely(!atomic_inc_not_zero(&p->u.count)))
> +			goto retry;
> +		if (read_seqcount_retry(&dentry->d_seq, seq)) {
> +			if (unlikely(atomic_dec_and_test(&p->u.count)))
> +				kfree_rcu(p, u.head);
> +			goto retry;
> +		}
>  	}
> -	spin_unlock(&dentry->d_lock);
> +	rcu_read_unlock();
>  }
>  EXPORT_SYMBOL(take_dentry_name_snapshot);
>  




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux