On Thu, Jan 09, 2025 at 07:11:46PM -0800, Linus Torvalds wrote:
> On Thu, 9 Jan 2025 at 18:45, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > However, to reduce dentry_operations bloat, let's add one method instead -
> > ->d_want_unalias(alias, true) instead of ->d_unalias_trylock(alias) and
> > ->d_want_unalias(alias, false) instead of ->d_unalias_unlock(alias).
>
> Ugh.
>
> So of all the patches, this is the one that I hate.
>
> I absolutely detest interfaces with random true/false arguments, and
> when it is about locking, the "detest" becomes something even darker
> and more visceral.
>
> I think it would be a lot better as separate ops, considering that
>
>  (a) we'll probably have only one or two actual users, so it's not
> like it complicates things on that side
>
>  (b) we don't have *that* many "struct dentry_operations" structures:
> I think they are all statically generated constant structures
> (typically one or two per filesystem), so it's not like we're saving
> memory by merging those pointers into one.

ACK.

> Please?

Done and force-pushed; see below for updated variant of that commit

commit 1f28d77e868e63a07ab50e7fe161fc366b2fb23b
Author: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Date:   Sun Jan 5 21:33:17 2025 -0500

    9p: fix ->rename_sem exclusion

    9p wants to be able to build a path from given dentry to fs root and keep
    it valid over a blocking operation.

    ->s_vfs_rename_mutex would be a natural candidate, but there are places
    where we need that and where we have no way to tell if ->s_vfs_rename_mutex
    is already held deeper in callchain.  Moreover, it's only held for
    cross-directory renames; name changes within the same directory happen
    without it.

    Solution:
            * have d_move() done in ->rename() rather than in its caller
            * maintain a 9p-private rwsem (per-filesystem)
            * hold it exclusive over the relevant part of ->rename()
            * hold it shared over the places where we want the path.

    That almost works.
    FS_RENAME_DOES_D_MOVE is enough to put all d_move() and d_exchange()
    calls under filesystem's control.  However, there's also __d_unalias(),
    which isn't covered by any of that.

    If ->lookup() hits a directory inode with preexisting dentry elsewhere
    (due to e.g. rename done on server behind our back), d_splice_alias()
    called by ->lookup() will move/rename that alias.

    Add a couple of optional methods, so that __d_unalias() would do
            if alias->d_op->d_unalias_trylock != NULL
                    if (!alias->d_op->d_unalias_trylock(alias))
                            fail (resulting in -ESTALE from lookup)
            __d_move(...)
            if alias->d_op->d_unalias_unlock != NULL
                    alias->d_op->d_unalias_unlock(alias)
    where it currently does __d_move().  9p instances do down_write_trylock()
    and up_write() of ->rename_sem.

    Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>

diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 146e7d8aa736..d20a32b77b60 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -31,6 +31,8 @@ prototypes::
 	struct vfsmount *(*d_automount)(struct path *path);
 	int (*d_manage)(const struct path *, bool);
 	struct dentry *(*d_real)(struct dentry *, enum d_real_type type);
+	bool (*d_unalias_trylock)(const struct dentry *);
+	void (*d_unalias_unlock)(const struct dentry *);
 
 locking rules:
 
@@ -50,6 +52,8 @@ d_dname:	   no		no		no		no
 d_automount:	   no		no		yes		no
 d_manage:	   no		no		yes (ref-walk)	maybe
 d_real		   no		no		yes		no
+d_unalias_trylock  yes		no		no		no
+d_unalias_unlock   yes		no		no		no
 ================== ===========	========	==============	========
 
 inode_operations
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 7c352ebaae98..31eea688609a 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -1265,6 +1265,8 @@ defined:
 		struct vfsmount *(*d_automount)(struct path *);
 		int (*d_manage)(const struct path *, bool);
 		struct dentry *(*d_real)(struct dentry *, enum d_real_type type);
+		bool (*d_unalias_trylock)(const struct dentry *);
+		void (*d_unalias_unlock)(const struct dentry *);
 	};
 
 ``d_revalidate``
@@ -1428,6 +1430,25 @@ defined:
 
 	For non-regular files, the 'dentry' argument is returned.
 
+``d_unalias_trylock``
+	if present, will be called by d_splice_alias() before moving a
+	preexisting attached alias.  Returning false prevents __d_move(),
+	making d_splice_alias() fail with -ESTALE.
+
+	Rationale: setting FS_RENAME_DOES_D_MOVE will prevent d_move()
+	and d_exchange() calls from the outside of filesystem methods;
+	however, it does not guarantee that attached dentries won't
+	be renamed or moved by d_splice_alias() finding a preexisting
+	alias for a directory inode.  Normally we would not care;
+	however, something that wants to stabilize the entire path to
+	root over a blocking operation might need that.  See 9p for one
+	(and hopefully only) example.
+
+``d_unalias_unlock``
+	should be paired with ``d_unalias_trylock``; that one is called after
+	__d_move() call in __d_unalias().
+
+
 Each dentry has a pointer to its parent dentry, as well as a hash list
 of child dentries.  Child dentries are basically like files in a
 directory.
diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h
index 698c43dd5dc8..f28bc763847a 100644
--- a/fs/9p/v9fs.h
+++ b/fs/9p/v9fs.h
@@ -202,7 +202,7 @@ static inline struct v9fs_session_info *v9fs_inode2v9ses(struct inode *inode)
 	return inode->i_sb->s_fs_info;
 }
 
-static inline struct v9fs_session_info *v9fs_dentry2v9ses(struct dentry *dentry)
+static inline struct v9fs_session_info *v9fs_dentry2v9ses(const struct dentry *dentry)
 {
 	return dentry->d_sb->s_fs_info;
 }
diff --git a/fs/9p/vfs_dentry.c b/fs/9p/vfs_dentry.c
index 872c1abe3295..5061f192eafd 100644
--- a/fs/9p/vfs_dentry.c
+++ b/fs/9p/vfs_dentry.c
@@ -105,14 +105,30 @@ static int v9fs_lookup_revalidate(struct inode *dir, const struct qstr *name,
 	return __v9fs_lookup_revalidate(dentry, flags);
 }
 
+static bool v9fs_dentry_unalias_trylock(const struct dentry *dentry)
+{
+	struct v9fs_session_info *v9ses = v9fs_dentry2v9ses(dentry);
+	return down_write_trylock(&v9ses->rename_sem);
+}
+
+static void v9fs_dentry_unalias_unlock(const struct dentry *dentry)
+{
+	struct v9fs_session_info *v9ses = v9fs_dentry2v9ses(dentry);
+	up_write(&v9ses->rename_sem);
+}
+
 const struct dentry_operations v9fs_cached_dentry_operations = {
 	.d_revalidate = v9fs_lookup_revalidate,
 	.d_weak_revalidate = __v9fs_lookup_revalidate,
 	.d_delete = v9fs_cached_dentry_delete,
 	.d_release = v9fs_dentry_release,
+	.d_unalias_trylock = v9fs_dentry_unalias_trylock,
+	.d_unalias_unlock = v9fs_dentry_unalias_unlock,
 };
 
 const struct dentry_operations v9fs_dentry_operations = {
 	.d_delete = always_delete_dentry,
 	.d_release = v9fs_dentry_release,
+	.d_unalias_trylock = v9fs_dentry_unalias_trylock,
+	.d_unalias_unlock = v9fs_dentry_unalias_unlock,
 };
diff --git a/fs/dcache.c b/fs/dcache.c
index 7d42ca367522..2ac614fc8bba 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -2961,7 +2961,12 @@ static int __d_unalias(struct dentry *dentry, struct dentry *alias)
 		goto out_err;
 	m2 = &alias->d_parent->d_inode->i_rwsem;
 out_unalias:
+	if (alias->d_op && alias->d_op->d_unalias_trylock &&
+	    !alias->d_op->d_unalias_trylock(alias))
+		goto out_err;
 	__d_move(alias, dentry, false);
+	if (alias->d_op && alias->d_op->d_unalias_unlock)
+		alias->d_op->d_unalias_unlock(alias);
 	ret = 0;
 out_err:
 	if (m2)
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 4a6bdadf2f29..9a1a30857763 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -159,6 +159,8 @@ struct dentry_operations {
 	struct vfsmount *(*d_automount)(struct path *);
 	int (*d_manage)(const struct path *, bool);
 	struct dentry *(*d_real)(struct dentry *, enum d_real_type type);
+	bool (*d_unalias_trylock)(const struct dentry *);
+	void (*d_unalias_unlock)(const struct dentry *);
 } ____cacheline_aligned;
 
 /*