> On Apr 14, 2023, at 09:41, Christian Brauner <brauner@xxxxxxxxxx> wrote: > > On Fri, Apr 14, 2023 at 06:06:38AM -0400, Jeff Layton wrote: >> On Fri, 2023-04-14 at 03:43 +0100, Al Viro wrote: >>> On Fri, Apr 14, 2023 at 08:41:03AM +1000, NeilBrown wrote: >>> >>>> The path name that appears in /proc/mounts is the key that must be used >>>> to find and unmount a filesystem. When you do that "find"ing you are >>>> not looking up a name in a filesystem, you are looking up a key in the >>>> mount table. >>> >>> No. The path name in /proc/mounts is *NOT* a key - it's a best-effort >>> attempt to describe the mountpoint. Pathname resolution does not work >>> in terms of "the longest prefix is found and we handle the rest within >>> that filesystem". >>> >>>> We could, instead, create an api that is given a mount-id (first number >>>> in /proc/self/mountinfo) and unmounts that. Then /sbin/umount could >>>> read /proc/self/mountinfo, find the mount-id, and unmount it - all >>>> without ever doing path name lookup in the traditional sense. >>>> >>>> But I prefer your suggestion. LOOKUP_MOUNTPOINT could be renamed >>>> LOOKUP_CACHED, and it only finds paths that are in the dcache, never >>>> revalidates, at most performs simple permission checks based on cached >>>> content. >>> >>> umount /proc/self/fd/42/barf/something >>> >> >> Does any of that involve talking to the server? I don't necessarily see >> a problem with doing the above. If "something" is in cache then that >> should still work. >> >> The main idea here is that we want to avoid communicating with the >> backing store during the umount(2) pathwalk. >> >>> Discuss. >>> >>> OTON, umount-by-mount-id is an interesting idea, but we'll need to decide >>> what would the right permissions be for it. >>> >>> But please, lose the "mount table is a mapping from path prefix to filesystem" >>> notion - it really, really is not. IIRC, there are systems that work that way, >>> but it's nowhere near the semantics used by any Unices, all variants of Linux >>> included. >> >> I'm not opposed to something by umount-by-mount-id either. All of this >> seems like something that should probably rely on CAP_SYS_ADMIN. > > The permission model needs to account for the fact that mount ids are > global and as such you could in principle unmount any mount in any mount > namespace. IOW, you can circumvent lookup restrictions completely. > > So we could resolve the mnt-id to an FMODE_PATH and then very roughly > with no claim to solving everything: > > may_umount_by_mnt_id(struct path *opath) > { > struct path root; > bool reachable; > > // caller in principle able to circumvent lookup restrictions > if (!may_cap_dac_readsearch()) > return false; > > // caller can mount in their mountns > if (!may_mount()) > return false; > > // target mount and caller in the same mountns > if (!check_mnt()) > return false; > > // caller could in principle reach mount from it's root > get_fs_root(current->fs, &root); > reachable = is_path_reachable(real_mount(opath->mnt), opath->dentry, &root); > path_put(&root); > > return reachable; > } > > However, that still means that we have laxer restrictions on unmounting > by mount-id then on unmount with lookup as for lookup just having > CAP_DAC_READ_SEARCH isn't enough. Usually - at least for filesytems > without custom permission handlers - we also establish that the inode > can be mapped into the caller's idmapping. > > So that would mean that unmounting by mount-id would allow you to > unmount mounts in cases where you wouldn't with umount. That might be > fine though as that's ultimately the goal here in a way. > > One could also see a very useful feature in this where you require > capable(CAP_DAC_READ_SEARCH) and capable(CAP_SYS_ADMIN) and then allow > unmounting any mount in the system by mount-id. This would obviously be > very useful for privileged service managers but I haven't thought this > Through. That is exactly why having a separate syscall to do the lookup of the mount-id is good: it provides separation of privilege. The conversion of mount-id to an O_PATH file descriptor is just akin to a path lookup, so only needs CAP_DAC_READ_SEARCH (since you require privilege only to bypass the ACL directory read and lookup restrictions). The resulting O_PATH file descriptor has no special properties that require any further privilege. Then use that resulting file descriptor for the umount, which normally requires CAP_SYS_ADMIN. _________________________________ Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx