On Fri, Apr 14, 2023 at 02:21:00PM +0000, Trond Myklebust wrote: > > > > On Apr 14, 2023, at 09:41, Christian Brauner <brauner@xxxxxxxxxx> wrote: > > > > On Fri, Apr 14, 2023 at 06:06:38AM -0400, Jeff Layton wrote: > >> On Fri, 2023-04-14 at 03:43 +0100, Al Viro wrote: > >>> On Fri, Apr 14, 2023 at 08:41:03AM +1000, NeilBrown wrote: > >>> > >>>> The path name that appears in /proc/mounts is the key that must be used > >>>> to find and unmount a filesystem. When you do that "find"ing you are > >>>> not looking up a name in a filesystem, you are looking up a key in the > >>>> mount table. > >>> > >>> No. The path name in /proc/mounts is *NOT* a key - it's a best-effort > >>> attempt to describe the mountpoint. Pathname resolution does not work > >>> in terms of "the longest prefix is found and we handle the rest within > >>> that filesystem". > >>> > >>>> We could, instead, create an api that is given a mount-id (first number > >>>> in /proc/self/mountinfo) and unmounts that. Then /sbin/umount could > >>>> read /proc/self/mountinfo, find the mount-id, and unmount it - all > >>>> without ever doing path name lookup in the traditional sense. > >>>> > >>>> But I prefer your suggestion. LOOKUP_MOUNTPOINT could be renamed > >>>> LOOKUP_CACHED, and it only finds paths that are in the dcache, never > >>>> revalidates, at most performs simple permission checks based on cached > >>>> content. > >>> > >>> umount /proc/self/fd/42/barf/something > >>> > >> > >> Does any of that involve talking to the server? I don't necessarily see > >> a problem with doing the above. If "something" is in cache then that > >> should still work. > >> > >> The main idea here is that we want to avoid communicating with the > >> backing store during the umount(2) pathwalk. > >> > >>> Discuss. > >>> > >>> OTON, umount-by-mount-id is an interesting idea, but we'll need to decide > >>> what would the right permissions be for it. > >>> > >>> But please, lose the "mount table is a mapping from path prefix to filesystem" > >>> notion - it really, really is not. IIRC, there are systems that work that way, > >>> but it's nowhere near the semantics used by any Unices, all variants of Linux > >>> included. > >> > >> I'm not opposed to something by umount-by-mount-id either. All of this > >> seems like something that should probably rely on CAP_SYS_ADMIN. > > > > The permission model needs to account for the fact that mount ids are > > global and as such you could in principle unmount any mount in any mount > > namespace. IOW, you can circumvent lookup restrictions completely. > > > > So we could resolve the mnt-id to an FMODE_PATH and then very roughly > > with no claim to solving everything: > > > > may_umount_by_mnt_id(struct path *opath) > > { > > struct path root; > > bool reachable; > > > > // caller in principle able to circumvent lookup restrictions > > if (!may_cap_dac_readsearch()) > > return false; > > > > // caller can mount in their mountns > > if (!may_mount()) > > return false; > > > > // target mount and caller in the same mountns > > if (!check_mnt()) > > return false; > > > > // caller could in principle reach mount from it's root > > get_fs_root(current->fs, &root); > > reachable = is_path_reachable(real_mount(opath->mnt), opath->dentry, &root); > > path_put(&root); > > > > return reachable; > > } > > > > However, that still means that we have laxer restrictions on unmounting > > by mount-id then on unmount with lookup as for lookup just having > > CAP_DAC_READ_SEARCH isn't enough. Usually - at least for filesytems > > without custom permission handlers - we also establish that the inode > > can be mapped into the caller's idmapping. > > > > So that would mean that unmounting by mount-id would allow you to > > unmount mounts in cases where you wouldn't with umount. That might be > > fine though as that's ultimately the goal here in a way. > > > > One could also see a very useful feature in this where you require > > capable(CAP_DAC_READ_SEARCH) and capable(CAP_SYS_ADMIN) and then allow > > unmounting any mount in the system by mount-id. This would obviously be > > very useful for privileged service managers but I haven't thought this > > Through. > > That is exactly why having a separate syscall to do the lookup of the mount-id is good: it provides separation of privilege. > > The conversion of mount-id to an O_PATH file descriptor is just akin to a path lookup, so only needs CAP_DAC_READ_SEARCH (since you require privilege only to bypass the ACL directory read and lookup restrictions). The resulting O_PATH file descriptor has no special properties that require any further privilege. > > Then use that resulting file descriptor for the umount, which normally requires CAP_SYS_ADMIN. There's a difference between unmounting directly by providing a mount id and getting an O_PATH file descriptor from a mnt-id. If you can simply unmount by mount-id it's useful for users that have CAP_DAC_READ_SEARCH in a container. Without it you likely need to require capable(CAP_DAC_READ_SEARCH) aka system level privileges just like open_to_handle_at() which makes this interface way less generic and usable. Otherwise you'd be able to get an O_PATH fd to something that you wouldn't be able to access through normal path lookup.