On Sun, 8 Dec 2024 at 22:26, Christian Brauner <brauner@xxxxxxxxxx> wrote: > > On Fri, Dec 06, 2024 at 04:11:52PM +0100, Miklos Szeredi wrote: > I wanted to see how feasible this would be and so I've added my changes > on top of your patch. Please see the appended UNTESTED DIFF. Why a separate list for connected unmounts and for mounts? Can't the same list be used for both? > > +static void mnt_add_to_ns(struct mnt_namespace *ns, struct mount *mnt, struct list_head *notif) > > +{ > > + __mnt_add_to_ns(ns, mnt); > > + queue_notify(ns, mnt, notif); > > All but one call to mnt_add_to_ns() passes NULL. I would just add a > mnt_add_to_ns_notify() helper and leave all the other callers as is. Still need the else branch from queue_notify() otherwise the prev_ns logic breaks. > > > void dissolve_on_fput(struct vfsmount *mnt) > > { > > struct mnt_namespace *ns; > > + LIST_HEAD(notif); > > + > > namespace_lock(); > > lock_mount_hash(); > > ns = real_mount(mnt)->mnt_ns; > > if (ns) { > > if (is_anon_ns(ns)) > > - umount_tree(real_mount(mnt), UMOUNT_CONNECTED); > > + umount_tree(real_mount(mnt), ¬if, UMOUNT_CONNECTED); > > This shouldn't notify as it's currently impossible to place mark on an > anonymous mount. Yeah, I was first undecided whether to allow notification on anon namespaces, but then opted not to for simplicity. > > @@ -1855,8 +1906,18 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask, > > mnt = path.mnt; > > if (mark_type == FAN_MARK_MOUNT) > > obj = mnt; > > - else > > + else if (mark_type == FAN_MARK_FILESYSTEM) > > obj = mnt->mnt_sb; > > + else /* if (mark_type == FAN_MARK_MNTNS) */ { > > + mntns = get_ns_from_mnt(mnt); > > I would prefer to be strict here and require that an actual mount > namespace file descriptor is passed instead of allowing the mount > namespace to be derived from any file descriptor. Okay. > > > + ret = -EINVAL; > > + if (!mntns) > > + goto path_put_and_out; > > + /* don't allow anon ns yet */ > > + if (is_anon_ns(mntns)) > > + goto path_put_and_out; > > Watching an anoymous mount namespace doesn't yet make sense because you > currently cannot add or remove mounts in them apart from closing the > file descriptor and destroying the whole mount namespace. I just > remember that I have a pending patch series related to this comment. I > haven't had the time to finish it with tests yet though maybe I can find > a few days in December to finish the tests... Okay. > > > @@ -549,8 +549,10 @@ static void restore_mounts(struct list_head *to_restore) > > mp = parent->mnt_mp; > > parent = parent->mnt_parent; > > } > > - if (parent != mnt->mnt_parent) > > + if (parent != mnt->mnt_parent) { > > + /* FIXME: does this need to trigger a MOVE fsnotify event */ > > mnt_change_mountpoint(parent, mp, mnt); > > This is what I mentally always referred to as "rug-pulling umount > propagation". So basically for the case where we have a locked mount > (stuff that was overmounted when the mntns was created) or a mount with > children that aren't going/can't be unmounted. In both cases it's > necessary to reparent the mount. > > The watcher will see a umount event for the parent of that mount but > that's not enough information because the watcher could end up infering > that all child mounts of the mount have vanished as well which is > obviously not the case. > > So I think that we need to generate a FS_MNT_MOVE event for mounts that > got reparented. Yep. Thanks, Miklos