Re: [PATCH] fnotify: invalidate dcache before IN_DELETE event

Amir Goldstein <amir73il@xxxxxxxxx> · Thu, 20 Jan 2022 16:31:11 +0200

On Thu, Jan 20, 2022 at 2:52 PM Jan Kara <jack@xxxxxxx> wrote:
>
> On Tue 18-01-22 14:00:31, Amir Goldstein wrote:
> > Apparently, there are some applications that use IN_DELETE event as an
> > invalidation mechanism and expect that if they try to open a file with
> > the name reported with the delete event, that it should not contain the
> > content of the deleted file.
> >
> > Commit 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of
> > d_delete()") moved the fsnotify delete hook before d_delete() so fsnotify
> > will have access to a positive dentry.
> >
> > This allowed a race where opening the deleted file via cached dentry
> > is now possible after receiving the IN_DELETE event.
> >
> > To fix the regression, we use two different techniques:
> > 1) For call sites that call d_delete() with elevated refcount, convert
> >    the call to d_drop() and move the fsnotify hook after d_drop().
>
> Maybe do this in a separate patch? It's quite a bit of mostly mechanical
> changes, after separating them it is more obvious what the logical changes
> actually are (and backporting is actually less error prone as well).

ok.

>
> > 2) For the vfs helpers that may turn dentry to negative on d_delete(),
> >    use a helper d_delete_notify() to pin the inode, so we can pass it
> >    to an fsnotify hook after d_delete().
> >
> > Create a new hook fsnotify_delete() that allows to pass a negative
> > dentry and takes the unlinked inode as an argument.
> >
> > Add a missing fsnotify_unlink() hook in nfsdfs that was found during
> > the call sites audit.
> >
> > Note that the call sites in simple_recursive_removal() follow
> > d_invalidate(), so they require no change.
> >
> > Backporting hint: this regression is from v5.3. Although patch will
> > apply with only trivial conflicts to v5.4 and v5.10, it won't build,
> > because fsnotify_delete() implementation is different in each of those
> > versions (see fsnotify_link()).
> >
> > Reported-by: Ivan Delalande <colona@xxxxxxxxxx>
> > Link: https://lore.kernel.org/linux-fsdevel/YeNyzoDM5hP5LtGW@visor/
> > Fixes: 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()")
> > Cc: stable@xxxxxxxxxxxxxxx # v5.3+
> > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
>
> ...
>
> > diff --git a/fs/namei.c b/fs/namei.c
> > index 1f9d2187c765..b11991b57f9b 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -3929,6 +3929,23 @@ SYSCALL_DEFINE2(mkdir, const char __user *, pathname, umode_t, mode)
> >       return do_mkdirat(AT_FDCWD, getname(pathname), mode);
> >  }
> >
> > +/**
> > + * d_delete_notify - delete a dentry and call fsnotify_delete()
> > + * @dentry: The dentry to delete
> > + *
> > + * This helper is used to guaranty that the unlinked inode cannot be found
>                              ^^^ guarantee
>
> > + * by lookup of this name after fsnotify_delete() event has been delivered.
> > + */
> > +static void d_delete_notify(struct inode *dir, struct dentry *dentry)
> > +{
> > +     struct inode *inode = d_inode(dentry);
> > +
> > +     ihold(inode);
> > +     d_delete(dentry);
> > +     fsnotify_delete(dir, inode, dentry);
> > +     iput(inode);
> > +}
> > +
> >  /**
> >   * vfs_rmdir - remove directory
> >   * @mnt_userns:      user namespace of the mount the inode was found from
> ...
> > @@ -4101,7 +4117,6 @@ int vfs_unlink(struct user_namespace *mnt_userns, struct inode *dir,
> >                       if (!error) {
> >                               dont_mount(dentry);
> >                               detach_mounts(dentry);
> > -                             fsnotify_unlink(dir, dentry);
> >                       }
> >               }
> >       }
> > @@ -4109,9 +4124,11 @@ int vfs_unlink(struct user_namespace *mnt_userns, struct inode *dir,
> >       inode_unlock(target);
> >
> >       /* We don't d_delete() NFS sillyrenamed files--they still exist. */
> > -     if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED)) {
> > +     if (dentry->d_flags & DCACHE_NFSFS_RENAMED) {
> > +             fsnotify_unlink(dir, dentry);
> > +     } else if (!error) {
> >               fsnotify_link_count(target);
> > -             d_delete(dentry);
> > +             d_delete_notify(dir, dentry);
> >       }
>
> Are we sure that if DCACHE_NFSFS_RENAMED is set, error == 0? Maybe yes but
> it is not completely clear to me - e.g. if you try to rename something to a
> name that is taken by sillyrenamed file, the unlink will fail but dentry
> has DCACHE_NFSFS_RENAMED set...
>

That's an oversight.

> > +/*
> > + * fsnotify_delete - @dentry was unlinked and unhashed
> > + *
> > + * Caller must make sure that dentry->d_name is stable.
> > + *
> > + * Note: unlike fsnotify_unlink(), we have to pass also the unlinked inode
> > + * as this may be called after d_delete() and old_dentry may be negative.
> > + */
> > +static inline void fsnotify_delete(struct inode *dir, struct inode *inode,
> > +                                struct dentry *dentry)
> > +{
> > +     __u32 mask = FS_DELETE;
> > +
> > +     if (S_ISDIR(inode->i_mode))
> > +             mask |= FS_ISDIR;
> > +
> > +     fsnotify_name(mask, inode, FSNOTIFY_EVENT_INODE, dir, &dentry->d_name,
> > +                   0);
> > +}
> > +
>
> OK, this is fine because we use dentry only for FAN_RENAME event, don't we?

Almost.
We also use dentry in FS_CREATE to get sb from d_sb for error event, because:
 * Note: some filesystems (e.g. kernfs) leave @dentry negative and instantiate
 * ->d_inode later

> In all other cases we always use only inode anyway. Can we perhaps cleanup
> include/linux/fsnotify.h to use FSNOTIFY_EVENT_DENTRY only in that one call
> site inside fsnotify_move() and use FSNOTIFY_EVENT_INODE in all the other
> cases? So that this is clear and also so that we don't start using dentry
> inadvertedly for something inside fsnotify thus breaking unlink reporting
> in subtle ways...
>

I don't know.
For fsnotify_unlink/rmdir we check d_is_negative, so it's fine to use
FSNOTIFY_EVENT_INODE.
For fsnotify_link,fsnotify_move we get the inode explicitly, but we already
use FSNOTIFY_EVENT_INODE in those cases (except FS_RENAME).

Thanks,
Amir.