On Fri, 2023-04-14 at 13:16 +0200, Christian Brauner wrote: > On Fri, Apr 14, 2023 at 06:09:46AM -0400, Jeff Layton wrote: > > On Fri, 2023-04-14 at 11:41 +0200, Christian Brauner wrote: > > > On Fri, Apr 14, 2023 at 04:51:04AM +0100, Al Viro wrote: > > > > On Fri, Apr 14, 2023 at 03:28:45AM +0000, Trond Myklebust wrote: > > > > > > > > > We already have support for directory file descriptors when mounting with move_mount(). Why not add a umountat() with similar support for the unmount side? > > > > > Then add a syscall to allow users with (e.g.) the CAP_DAC_OVERRIDE privilege to convert the mount-id into an O_PATH file descriptor. > > > > > > > > You can already do umount -l /proc/self/fd/69 if you have a descriptor. > > > > > > Way back when we put together stuff for [2] we had umountat() as an item > > > but decided against it because it's mostely useful when used as AT_EMPTY_PATH. > > > > > > umount("/proc/self/fd/<nr>", ...) is useful when you don't trust the > > > path and you need to resolve it with lookup restrictions. Then path > > > resolution restrictions of openat2() can be used to get an fd. Which can > > > be passed to umount(). > > > > > > I need to step outside so this is a halfway-out-the-door thought but > > > given your description of the problem Jeff, why doesn't the following > > > work (Just sketching this, you can't call openat2() like that.): > > > > > > fd_mnt = openat2("/my/funky/nfs/share/mount", RESOLVE_CACHED) > > > umount("/proc/self/fd/fd_mnt", MNT_DETACH) > > > > Something like that might work. A RESOLVE_CACHED flag is something that > > would involve more than just umount(2) though. That said, it could be > > useful in other situations. > > I think I introduced an ambiguity by accident. What I meant by "you > can't call openat2() like that" is that it takes a struct open_how > argument not just a simple flags argument. > > The flag I was talking about, RESOLVE_CACHED, does exist already. So it > is already possible to use openat2() to resolve paths like that. See > include/uapi/linux/openat2.h > Oh, thanks! I haven't tried this yet, but I doubt it'd fix the problem David was reproducing. There, the problem is mostly permission checks during the pathwalk. It tries to contact the server to check the permission, but because it's unexported, EACCES bubbles up. Also, to follow up: I tried Al's suggestion of lazily unmounting the bad intermediate mount closest to the root, and it does seem to clean up child mounts as expected. So, it does look like there is recourse in current kernels, even if the method is a bit unintuitive. Still, it might be worth discussing at LSF or here on the list. It'd be nice if umount "just worked" in this situation. Is there any reason to do a full-blown d_revalidating and permission-checking pathwalk on umount, assuming the caller already has CAP_SYS_ADMIN or similar? -- Jeff Layton <jlayton@xxxxxxxxxx>