Re: [PATCH 20/20] 9p: fix ->rename_sem exclusion

On Thu, Jan 09, 2025 at 07:11:46PM -0800, Linus Torvalds wrote:
> On Thu, 9 Jan 2025 at 18:45, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > However, to reduce dentry_operations bloat, let's add one method instead -
> > ->d_want_unalias(alias, true) instead of ->d_unalias_trylock(alias) and
> > ->d_want_unalias(alias, false) instead of ->d_unalias_unlock(alias).
> 
> Ugh.
> 
> So of all the patches, this is the one that I hate.
> 
> I absolutely detest interfaces with random true/false arguments, and
> when it is about locking, the "detest" becomes something even darker
> and more visceral.
> 
> I think it would be a lot better as separate ops, considering that
> 
>  (a) we'll probably have only one or two actual users, so it's not
> like it complicates things on that side
> 
>  (b) we don't have *that* many "struct dentry_operations" structures:
> I think they are all statically generated constant structures
> (typically one or two per filesystem), so it's not like we're saving
> memory by merging those pointers into one.

ACK.

> Please?

Done and force-pushed; see below for the updated variant of that commit.

commit 1f28d77e868e63a07ab50e7fe161fc366b2fb23b
Author: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Date:   Sun Jan 5 21:33:17 2025 -0500

    9p: fix ->rename_sem exclusion
    
    9p wants to be able to build a path from given dentry to fs root and keep
    it valid over a blocking operation.
    
    ->s_vfs_rename_mutex would be a natural candidate, but there are places
    where we need that and where we have no way to tell if ->s_vfs_rename_mutex
    is already held deeper in callchain.  Moreover, it's only held for
    cross-directory renames; name changes within the same directory happen
    without it.
    
    Solution:
            * have d_move() done in ->rename() rather than in its caller
            * maintain a 9p-private rwsem (per-filesystem)
            * hold it exclusive over the relevant part of ->rename()
            * hold it shared over the places where we want the path.
    
    That almost works.  FS_RENAME_DOES_D_MOVE is enough to put all d_move()
    and d_exchange() calls under filesystem's control.  However, there's
    also __d_unalias(), which isn't covered by any of that.
    
    If ->lookup() hits a directory inode with preexisting dentry elsewhere
    (due to e.g. rename done on server behind our back), d_splice_alias()
    called by ->lookup() will move/rename that alias.
    
    Add a couple of optional methods, so that __d_unalias() would do
            if alias->d_op->d_unalias_trylock != NULL
                    if (!alias->d_op->d_unalias_trylock(alias))
                            fail (resulting in -ESTALE from lookup)
            __d_move(...)
            if alias->d_op->d_unalias_unlock != NULL
                    alias->d_op->d_unalias_unlock(alias)
    where it currently does __d_move().  9p instances do down_write_trylock()
    and up_write() of ->rename_sem.
    
    Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
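
The trylock / move / unlock sequence described in the commit message can be modelled outside the kernel. The following is only a userspace sketch, not kernel code: every `model_*` name is hypothetical, the lock is a toy built on C11 atomics standing in for the 9p rwsem, and the NULL check on `d_op` is a defensive assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct dentry / struct dentry_operations. */
struct model_dentry;

struct model_dentry_ops {
	bool (*d_unalias_trylock)(const struct model_dentry *);
	void (*d_unalias_unlock)(const struct model_dentry *);
};

struct model_dentry {
	const struct model_dentry_ops *d_op;	/* may be NULL */
	int moves;				/* counts simulated __d_move() calls */
};

/* Toy lock: 0 = free, -1 = held exclusive, n > 0 = n shared holders. */
static atomic_int model_rename_sem;

/* Shared side: what a path-building caller would take. */
static bool model_read_trylock(void)
{
	int v = atomic_load(&model_rename_sem);

	while (v >= 0)
		if (atomic_compare_exchange_weak(&model_rename_sem, &v, v + 1))
			return true;
	return false;
}

static void model_read_unlock(void)
{
	atomic_fetch_sub(&model_rename_sem, 1);
}

/* Exclusive side: succeeds only if nobody holds the lock at all. */
static bool model_unalias_trylock(const struct model_dentry *d)
{
	int expected = 0;

	(void)d;
	return atomic_compare_exchange_strong(&model_rename_sem, &expected, -1);
}

static void model_unalias_unlock(const struct model_dentry *d)
{
	(void)d;
	atomic_store(&model_rename_sem, 0);
}

static const struct model_dentry_ops model_ops = {
	.d_unalias_trylock = model_unalias_trylock,
	.d_unalias_unlock = model_unalias_unlock,
};

/* Mirrors the shape of the hooked section: both methods are optional. */
static int model_unalias(struct model_dentry *alias)
{
	const struct model_dentry_ops *ops = alias->d_op;

	if (ops && ops->d_unalias_trylock && !ops->d_unalias_trylock(alias))
		return -ESTALE;		/* lock busy: refuse to move */
	alias->moves++;			/* stands in for __d_move() */
	if (ops && ops->d_unalias_unlock)
		ops->d_unalias_unlock(alias);
	return 0;
}
```

In this model, a dentry without the methods is always moved, while a dentry with them is moved only when no shared or exclusive holder exists; the attempt never blocks, which is why a trylock is used.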

diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 146e7d8aa736..d20a32b77b60 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -31,6 +31,8 @@ prototypes::
 	struct vfsmount *(*d_automount)(struct path *path);
 	int (*d_manage)(const struct path *, bool);
 	struct dentry *(*d_real)(struct dentry *, enum d_real_type type);
+	bool (*d_unalias_trylock)(const struct dentry *);
+	void (*d_unalias_unlock)(const struct dentry *);
 
 locking rules:
 
@@ -50,6 +52,8 @@ d_dname:	   no		no		no		no
 d_automount:	   no		no		yes		no
 d_manage:	   no		no		yes (ref-walk)	maybe
 d_real		   no		no		yes 		no
+d_unalias_trylock  yes		no		no 		no
+d_unalias_unlock   yes		no		no 		no
 ================== ===========	========	==============	========
 
 inode_operations
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 7c352ebaae98..31eea688609a 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -1265,6 +1265,8 @@ defined:
 		struct vfsmount *(*d_automount)(struct path *);
 		int (*d_manage)(const struct path *, bool);
 		struct dentry *(*d_real)(struct dentry *, enum d_real_type type);
+		bool (*d_unalias_trylock)(const struct dentry *);
+		void (*d_unalias_unlock)(const struct dentry *);
 	};
 
 ``d_revalidate``
@@ -1428,6 +1430,25 @@ defined:
 
 	For non-regular files, the 'dentry' argument is returned.
 
+``d_unalias_trylock``
+	if present, will be called by d_splice_alias() before moving a
+	preexisting attached alias.  Returning false prevents __d_move(),
+	making d_splice_alias() fail with -ESTALE.
+
+	Rationale: setting FS_RENAME_DOES_D_MOVE will prevent d_move()
+	and d_exchange() calls from the outside of filesystem methods;
+	however, it does not guarantee that attached dentries won't
+	be renamed or moved by d_splice_alias() finding a preexisting
+	alias for a directory inode.  Normally we would not care;
+	however, something that wants to stabilize the entire path to
+	root over a blocking operation might need that.  See 9p for one
+	(and hopefully only) example.
+
+``d_unalias_unlock``
+	should be paired with ``d_unalias_trylock``; that one is called after
+	__d_move() call in __d_unalias().
+
+
 Each dentry has a pointer to its parent dentry, as well as a hash list
 of child dentries.  Child dentries are basically like files in a
 directory.
diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h
index 698c43dd5dc8..f28bc763847a 100644
--- a/fs/9p/v9fs.h
+++ b/fs/9p/v9fs.h
@@ -202,7 +202,7 @@ static inline struct v9fs_session_info *v9fs_inode2v9ses(struct inode *inode)
 	return inode->i_sb->s_fs_info;
 }
 
-static inline struct v9fs_session_info *v9fs_dentry2v9ses(struct dentry *dentry)
+static inline struct v9fs_session_info *v9fs_dentry2v9ses(const struct dentry *dentry)
 {
 	return dentry->d_sb->s_fs_info;
 }
diff --git a/fs/9p/vfs_dentry.c b/fs/9p/vfs_dentry.c
index 872c1abe3295..5061f192eafd 100644
--- a/fs/9p/vfs_dentry.c
+++ b/fs/9p/vfs_dentry.c
@@ -105,14 +105,30 @@ static int v9fs_lookup_revalidate(struct inode *dir, const struct qstr *name,
 	return __v9fs_lookup_revalidate(dentry, flags);
 }
 
+static bool v9fs_dentry_unalias_trylock(const struct dentry *dentry)
+{
+	struct v9fs_session_info *v9ses = v9fs_dentry2v9ses(dentry);
+	return down_write_trylock(&v9ses->rename_sem);
+}
+
+static void v9fs_dentry_unalias_unlock(const struct dentry *dentry)
+{
+	struct v9fs_session_info *v9ses = v9fs_dentry2v9ses(dentry);
+	up_write(&v9ses->rename_sem);
+}
+
 const struct dentry_operations v9fs_cached_dentry_operations = {
 	.d_revalidate = v9fs_lookup_revalidate,
 	.d_weak_revalidate = __v9fs_lookup_revalidate,
 	.d_delete = v9fs_cached_dentry_delete,
 	.d_release = v9fs_dentry_release,
+	.d_unalias_trylock = v9fs_dentry_unalias_trylock,
+	.d_unalias_unlock = v9fs_dentry_unalias_unlock,
 };
 
 const struct dentry_operations v9fs_dentry_operations = {
 	.d_delete = always_delete_dentry,
 	.d_release = v9fs_dentry_release,
+	.d_unalias_trylock = v9fs_dentry_unalias_trylock,
+	.d_unalias_unlock = v9fs_dentry_unalias_unlock,
 };
diff --git a/fs/dcache.c b/fs/dcache.c
index 7d42ca367522..2ac614fc8bba 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -2961,7 +2961,12 @@ static int __d_unalias(struct dentry *dentry, struct dentry *alias)
 		goto out_err;
 	m2 = &alias->d_parent->d_inode->i_rwsem;
 out_unalias:
+	if (alias->d_op && alias->d_op->d_unalias_trylock &&
+	    !alias->d_op->d_unalias_trylock(alias))
+		goto out_err;
 	__d_move(alias, dentry, false);
+	if (alias->d_op && alias->d_op->d_unalias_unlock)
+		alias->d_op->d_unalias_unlock(alias);
 	ret = 0;
 out_err:
 	if (m2)
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 4a6bdadf2f29..9a1a30857763 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -159,6 +159,8 @@ struct dentry_operations {
 	struct vfsmount *(*d_automount)(struct path *);
 	int (*d_manage)(const struct path *, bool);
 	struct dentry *(*d_real)(struct dentry *, enum d_real_type type);
+	bool (*d_unalias_trylock)(const struct dentry *);
+	void (*d_unalias_unlock)(const struct dentry *);
 } ____cacheline_aligned;
 
 /*
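
For readers wondering what "build a path from given dentry to fs root" amounts to, here is a minimal userspace sketch. It is not the 9p code: `model_node` and `model_build_path` are hypothetical names, and the sketch only assumes that the root is its own parent. The comment marks where, per the commit message, the rename lock would be held shared so that no rename or unalias can reshuffle the parent chain mid-walk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of a dentry: a name plus a parent pointer. */
struct model_node {
	const char *name;
	const struct model_node *parent;	/* the root points to itself */
};

/*
 * Fill @out with the component names leading from the root down to @node,
 * root-most component first.  Returns the number of components, or -1 if
 * @out (capacity @max) is too small.
 *
 * Assumption: the caller holds the rename lock shared for the whole walk
 * and for as long as it keeps using the names in @out.
 */
static int model_build_path(const struct model_node *node,
			    const char **out, size_t max)
{
	const struct model_node *p;
	size_t depth = 0, i;

	/* First pass: count the components between @node and the root. */
	for (p = node; p->parent != p; p = p->parent)
		depth++;
	if (depth > max)
		return -1;

	/* Second pass: store names back to front so the result is root-first. */
	i = depth;
	for (p = node; p->parent != p; p = p->parent)
		out[--i] = p->name;
	return (int)depth;
}
```

The two passes rely on the parent chain not changing between them, which is the kind of stability the shared hold on the lock is meant to provide.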