On Mon, Dec 09, 2024 at 06:02:06PM +0100, Miklos Szeredi wrote: > On Sun, 8 Dec 2024 at 22:26, Christian Brauner <brauner@xxxxxxxxxx> wrote: > > > > On Fri, Dec 06, 2024 at 04:11:52PM +0100, Miklos Szeredi wrote: > > > I wanted to see how feasible this would be and so I've added my changes > > on top of your patch. Please see the appended UNTESTED DIFF. > > Why a separate list for connected unmounts and for mounts? Can't the > same list be used for both? Yes, they sure can. I was just being overly explicit. > > > > +static void mnt_add_to_ns(struct mnt_namespace *ns, struct mount *mnt, struct list_head *notif) > > > +{ > > > + __mnt_add_to_ns(ns, mnt); > > > + queue_notify(ns, mnt, notif); > > > > All but one call to mnt_add_to_ns() passes NULL. I would just add a > > mnt_add_to_ns_notify() helper and leave all the other callers as is. > > Still need the else branch from queue_notify() otherwise the prev_ns > logic breaks. Yep. > > > > > > void dissolve_on_fput(struct vfsmount *mnt) > > > { > > > struct mnt_namespace *ns; > > > + LIST_HEAD(notif); > > > + > > > namespace_lock(); > > > lock_mount_hash(); > > > ns = real_mount(mnt)->mnt_ns; > > > if (ns) { > > > if (is_anon_ns(ns)) > > > - umount_tree(real_mount(mnt), UMOUNT_CONNECTED); > > > + umount_tree(real_mount(mnt), ¬if, UMOUNT_CONNECTED); > > > > This shouldn't notify as it's currently impossible to place mark on an > > anonymous mount. > > Yeah, I was first undecided whether to allow notification on anon > namespaces, but then opted not to for simplicity. > > > > @@ -1855,8 +1906,18 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask, > > > mnt = path.mnt; > > > if (mark_type == FAN_MARK_MOUNT) > > > obj = mnt; > > > - else > > > + else if (mark_type == FAN_MARK_FILESYSTEM) > > > obj = mnt->mnt_sb; > > > + else /* if (mark_type == FAN_MARK_MNTNS) */ { > > > + mntns = get_ns_from_mnt(mnt); > > > > I would prefer to be strict here and require that an actual mount > > namespace file descriptor is passed instead of allowing the mount > > namespace to be derived from any file descriptor. > > Okay. > > > > > > + ret = -EINVAL; > > > + if (!mntns) > > > + goto path_put_and_out; > > > + /* don't allow anon ns yet */ > > > + if (is_anon_ns(mntns)) > > > + goto path_put_and_out; > > > > Watching an anoymous mount namespace doesn't yet make sense because you > > currently cannot add or remove mounts in them apart from closing the > > file descriptor and destroying the whole mount namespace. I just > > remember that I have a pending patch series related to this comment. I > > haven't had the time to finish it with tests yet though maybe I can find > > a few days in December to finish the tests... > > Okay. > > > > > > @@ -549,8 +549,10 @@ static void restore_mounts(struct list_head *to_restore) > > > mp = parent->mnt_mp; > > > parent = parent->mnt_parent; > > > } > > > - if (parent != mnt->mnt_parent) > > > + if (parent != mnt->mnt_parent) { > > > + /* FIXME: does this need to trigger a MOVE fsnotify event */ > > > mnt_change_mountpoint(parent, mp, mnt); > > > > This is what I mentally always referred to as "rug-pulling umount > > propagation". So basically for the case where we have a locked mount > > (stuff that was overmounted when the mntns was created) or a mount with > > children that aren't going/can't be unmounted. In both cases it's > > necessary to reparent the mount. > > > > The watcher will see a umount event for the parent of that mount but > > that's not enough information because the watcher could end up infering > > that all child mounts of the mount have vanished as well which is > > obviously not the case. > > > > So I think that we need to generate a FS_MNT_MOVE event for mounts that > > got reparented. > > Yep. > > Thanks, > Miklos