Re: fsnotify path hooks

Amir Goldstein <amir73il@xxxxxxxxx> · Tue, 20 Apr 2021 01:04:29 +0300

On Mon, Apr 19, 2021 at 8:02 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Apr 19, 2021 at 07:41:51PM +0300, Amir Goldstein wrote:
>
> > Would you be willing to make an exception for notify_change()
> > and pass mnt arg to the helper? and if so, which of the following
> > is the lesser evil in your opinion:
> >
> > 1. Optional mnt arg
> > --------------------------
> > int notify_change(struct vfsmount *mnt,
> >                  struct user_namespace *mnt_userns,
> >                  struct dentry *dentry, struct iattr *attr,
> >                  struct inode **delegated_inode)
> >
> > @mnt is non-NULL from syscalls and nfsd and NULL from other callers.
> >
> > 2. path instead of dentry
> > --------------------------------
> > int notify_change(struct user_namespace *mnt_userns,
> >                  struct path *path, struct iattr *attr,
> >                  struct inode **delegated_inode)
> >
> > This is symmetric with vfs_getattr().
> > syscalls and nfsd use the actual path.
> > overlayfs, ecryptfs, cachefiles compose a path from the private mount
> > (Christian posted patches to make ecryptfs, cachefiles mount private).
> >
> > 3. Mandatory mnt arg
> > -----------------------------
> > Like #1, but use some private mount instead of NULL, similar to the
> > mnt_userns arg.
> >
> > Any of the above acceptable?
> >
> > Pushed option #1 (along with rest of the work) to:
> > https://github.com/amir73il/linux/commits/fsnotify_path_hooks
> >
> > It's only sanity tested.
>
>         Out of that bunch only #2 is more or less tolerable.

Tolerable works for me.

> HOWEVER, if we go that way, mnt_user_ns crap must go, and

Christian requested that I refrain from re-acquiring mnt_user_ns
from mnt after it had already been used for security checks,
for example:
 do_open()
    may_create_in_sticky(mnt_userns,...)
    may_open(mnt_userns,...)
    handle_truncate(mnt_userns,...
        do_truncate(mnt_userns,...
              notify_change(mnt_userns,...

Although, I am not sure exactly why.
Isn't mnt_userns supposed to be stable after the mount is
connected to the namespace?
What is the concern from re-quiring mnt_userns from path->mnt
inside notify_change()?

> I really want to see details on all callers - which mount are
> you going to use in each case.

The callers are:
cachefiles, ecryptfs, nfsd, devtmpfs,
do_truncate(), vfs_utimes() and file_remove_privs()

* cachefiles, ecryptfs, nfsd compose paths from stashed
mount like this all the time (e.g. for vfs_truncate(), vf_getattr()).

* devtmpfs has the parent path from and also uses it to
compose child path for vfs_getattr().

* vfs_utimes() and all callers of do_truncate() already have the
path, just need to pass it through to notify_change()

>
>         The thing that is not going to be acceptable is
> a combination of mount from one filesystem and dentry from
> another.  In particular, file_remove_privs() is going to be
> interesting.
>
>         Note, BTW, that ftruncate() and file_remove_privs()
> are different in that respect - the latter hits d_real()
> (by way of file_dentry()), the former does not.  Which one
> is correct and if both are, why are their needs different?

Nowadays (>= v4.19) I think the only files whose file_inode() and
f_path do not agree are the overlayfs "real.file" that can find their
way to f_mapping and to some vfs helpers and from there to
filesystem ops and to file_modified() or generic_file_write_iter()
and to file_remove_privs().

Contrary to that, overlayfs does not call any vfs truncate()
helper, it calls notify_change() directly (with a composed path).

So what should we do about file_remove_privs()?
Since I don't think we really need to care about generating an
event on file_remove_privs(), perhaps it could call __notify_change()
that does not generate an event and the rest of the callers call this wrapper:

int notify_change(struct path *path, struct iattr *attr,
                            struct inode **delegated_inode)
{
        unsigned int ia_valid;
        int error = __notify_change(mnt_user_ns(path->mnt), path->dentry,
                                                    attr, &ia_valid,
delegated_inode);

        if (!error)
                fsnotify_change(path, ia_valid);
        return error;
}

Does this make sense?

Thanks,
Amir.