On Tue 18-01-22 14:00:31, Amir Goldstein wrote: > Apparently, there are some applications that use IN_DELETE event as an > invalidation mechanism and expect that if they try to open a file with > the name reported with the delete event, that it should not contain the > content of the deleted file. > > Commit 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of > d_delete()") moved the fsnotify delete hook before d_delete() so fsnotify > will have access to a positive dentry. > > This allowed a race where opening the deleted file via cached dentry > is now possible after receiving the IN_DELETE event. > > To fix the regression, we use two different techniques: > 1) For call sites that call d_delete() with elevated refcount, convert > the call to d_drop() and move the fsnotify hook after d_drop(). Maybe do this in a separate patch? It's quite a bit of mostly mechanical changes, after separating them it is more obvious what the logical changes actually are (and backporting is actually less error prone as well). > 2) For the vfs helpers that may turn dentry to negative on d_delete(), > use a helper d_delete_notify() to pin the inode, so we can pass it > to an fsnotify hook after d_delete(). > > Create a new hook fsnotify_delete() that allows to pass a negative > dentry and takes the unlinked inode as an argument. > > Add a missing fsnotify_unlink() hook in nfsdfs that was found during > the call sites audit. > > Note that the call sites in simple_recursive_removal() follow > d_invalidate(), so they require no change. > > Backporting hint: this regression is from v5.3. Although patch will > apply with only trivial conflicts to v5.4 and v5.10, it won't build, > because fsnotify_delete() implementation is different in each of those > versions (see fsnotify_link()). > > Reported-by: Ivan Delalande <colona@xxxxxxxxxx> > Link: https://lore.kernel.org/linux-fsdevel/YeNyzoDM5hP5LtGW@visor/ > Fixes: 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()") > Cc: stable@xxxxxxxxxxxxxxx # v5.3+ > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx> ... > diff --git a/fs/namei.c b/fs/namei.c > index 1f9d2187c765..b11991b57f9b 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -3929,6 +3929,23 @@ SYSCALL_DEFINE2(mkdir, const char __user *, pathname, umode_t, mode) > return do_mkdirat(AT_FDCWD, getname(pathname), mode); > } > > +/** > + * d_delete_notify - delete a dentry and call fsnotify_delete() > + * @dentry: The dentry to delete > + * > + * This helper is used to guaranty that the unlinked inode cannot be found ^^^ guarantee > + * by lookup of this name after fsnotify_delete() event has been delivered. > + */ > +static void d_delete_notify(struct inode *dir, struct dentry *dentry) > +{ > + struct inode *inode = d_inode(dentry); > + > + ihold(inode); > + d_delete(dentry); > + fsnotify_delete(dir, inode, dentry); > + iput(inode); > +} > + > /** > * vfs_rmdir - remove directory > * @mnt_userns: user namespace of the mount the inode was found from ... > @@ -4101,7 +4117,6 @@ int vfs_unlink(struct user_namespace *mnt_userns, struct inode *dir, > if (!error) { > dont_mount(dentry); > detach_mounts(dentry); > - fsnotify_unlink(dir, dentry); > } > } > } > @@ -4109,9 +4124,11 @@ int vfs_unlink(struct user_namespace *mnt_userns, struct inode *dir, > inode_unlock(target); > > /* We don't d_delete() NFS sillyrenamed files--they still exist. */ > - if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED)) { > + if (dentry->d_flags & DCACHE_NFSFS_RENAMED) { > + fsnotify_unlink(dir, dentry); > + } else if (!error) { > fsnotify_link_count(target); > - d_delete(dentry); > + d_delete_notify(dir, dentry); > } Are we sure that if DCACHE_NFSFS_RENAMED is set, error == 0? Maybe yes but it is not completely clear to me - e.g. if you try to rename something to a name that is taken by sillyrenamed file, the unlink will fail but dentry has DCACHE_NFSFS_RENAMED set... > +/* > + * fsnotify_delete - @dentry was unlinked and unhashed > + * > + * Caller must make sure that dentry->d_name is stable. > + * > + * Note: unlike fsnotify_unlink(), we have to pass also the unlinked inode > + * as this may be called after d_delete() and old_dentry may be negative. > + */ > +static inline void fsnotify_delete(struct inode *dir, struct inode *inode, > + struct dentry *dentry) > +{ > + __u32 mask = FS_DELETE; > + > + if (S_ISDIR(inode->i_mode)) > + mask |= FS_ISDIR; > + > + fsnotify_name(mask, inode, FSNOTIFY_EVENT_INODE, dir, &dentry->d_name, > + 0); > +} > + OK, this is fine because we use dentry only for FAN_RENAME event, don't we? In all other cases we always use only inode anyway. Can we perhaps cleanup include/linux/fsnotify.h to use FSNOTIFY_EVENT_DENTRY only in that one call site inside fsnotify_move() and use FSNOTIFY_EVENT_INODE in all the other cases? So that this is clear and also so that we don't start using dentry inadvertedly for something inside fsnotify thus breaking unlink reporting in subtle ways... Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR