Re: [PATCH 9/9] rename(): avoid a deadlock in the case of parents having no common ancestor

Jan Kara <jack@xxxxxxx> · Thu, 23 Nov 2023 11:36:22 +0100

On Wed 22-11-23 19:36:52, Al Viro wrote:
> ... and fix the directory locking documentation and proof of correctness.
> Holding ->s_vfs_rename_mutex *almost* prevents ->d_parent changes; the
> case where we really don't want it is splicing the root of disconnected
> tree to somewhere.
> 
> In other words, ->s_vfs_rename_mutex is sufficient to stabilize "X is an
> ancestor of Y" only if X and Y are already in the same tree.  Otherwise
> it can go from false to true, and one can construct a deadlock on that.
> 
> Make lock_two_directories() report an error in such case and update the
> callers of lock_rename()/lock_rename_child() to handle such errors.
> 
> And yes, such conditions are not impossible to create ;-/
> 
> Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>

Looks good to me. Just one nit below but whether you decide to address it
or not, feel free to add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

> +// p1 != p2, both are on the same filesystem, ->s_vfs_rename_mutex is held
>  static struct dentry *lock_two_directories(struct dentry *p1, struct dentry *p2)
>  {
> -	struct dentry *p;
> +	struct dentry *p = p1, *q = p2, *r;
>  
> -	p = d_ancestor(p2, p1);
> -	if (p) {
> +	while ((r = p->d_parent) != p2 && r != p)
> +		p = r;
> +	if (r == p2) {
> +		// p is a child of p2 and an ancestor of p1 or p1 itself
>  		inode_lock_nested(p2->d_inode, I_MUTEX_PARENT);
>  		inode_lock_nested(p1->d_inode, I_MUTEX_PARENT2);
>  		return p;
>  	}
> -
> -	p = d_ancestor(p1, p2);
> -	inode_lock_nested(p1->d_inode, I_MUTEX_PARENT);
> -	inode_lock_nested(p2->d_inode, I_MUTEX_PARENT2);
> -	return p;
> +	// p is the root of connected component that contains p1
> +	// p2 does not occur on the path from p to p1
> +	while ((r = q->d_parent) != p1 && r != p && r != q)
> +		q = r;
> +	if (r == p1) {
> +		// q is a child of p1 and an ancestor of p2 or p2 itself
> +		inode_lock_nested(p1->d_inode, I_MUTEX_PARENT);
> +		inode_lock_nested(p2->d_inode, I_MUTEX_PARENT2);
> +		return q;
> +	} else if (likely(r == p)) {
> +		// both p2 and p1 are descendents of p
> +		inode_lock_nested(p1->d_inode, I_MUTEX_PARENT);
> +		inode_lock_nested(p2->d_inode, I_MUTEX_PARENT2);
> +		return NULL;
> +	} else { // no common ancestor at the time we'd been called
> +		mutex_unlock(&p1->d_sb->s_vfs_rename_mutex);

It would look more natural to me if s_vfs_rename_mutex got dropped in the
callers (lock_rename(), lock_rename_child()) which have acquired the lock
instead of here. I agree it results in a bit more boiler plate code though.

> +		return ERR_PTR(-EXDEV);
> +	}
>  }

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR