On 2022-01-12, Andrey Zhadchenko <andrey.zhadchenko@xxxxxxxxxxxxx> wrote: > If you have an opened O_PATH file, currently there is no way to re-open > it with other flags with openat/openat2. As a workaround it is possible > to open it via /proc/self/fd/<X>, however > 1) You need to ensure that /proc exists > 2) You cannot use O_NOFOLLOW flag There is also another issue -- you can mount on top of magic-links so if you're a container runtime that has been tricked into creating bad mounts of top of /proc/ subdirectories there's no way of detecting that this has happened. (Though I think in the long-term we will need to make it possible for unprivileged users to create a procfs mountfd if they have hidepid=4,subset=pids set -- there are loads of things containers need to touch in procfs which can be overmounted in malicious ways.) > Both problems may look insignificant, but they are sensitive for CRIU. > First of all, procfs may not be mounted in the namespace where we are > restoring the process. Secondly, if someone opens a file with O_NOFOLLOW > flag, it is exposed in /proc/pid/fdinfo/<X>. So CRIU must also open the > file with this flag during restore. > > This patch adds new constant RESOLVE_EMPTY_PATH for resolve field of > struct open_how and changes getname() call to getname_flags() to avoid > ENOENT for empty filenames. This is something I've wanted to implement for a while, but from memory we need to add some other protections in place before enabling this. The main one is disallowing re-opening of a path when it was originally opened with a different set of modes. [1] is the patch I originally wrote as part of the openat2(2) (but I dropped it since I wasn't sure whether it might break some systems in subtle ways -- though according to my testing there wasn't an issue on any of my machines). I'd can try to revive that patchset if you'd like. Being able to re-open paths without going through procfs is something I've wanted to finish up for a while, so thanks for sending this patch. :D [1]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@xxxxxxxxxx/ > Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@xxxxxxxxxxxxx> > --- > > Why does even CRIU needs to reopen O_PATH files? > Long story short: to support restoring opened files that are overmounted > with single file bindmounts. > In-depth explanation: when restoring mount tree, before doing mount() > call, CRIU opens mountpoint with O_PATH and saves this fd for later use > for each mount. If we need to restore overmounted file, we look at the > mount which overmounts file mount and use its saved mountpoint fd in > openat(<saved_fd>, <relative_path>, flags). > If we need to open an overmounted mountpoint directory itself, we can use > openat(<saved_fd>, ".", flags). However, if we have a bindmount, its > mountpoint is a regular file. Therefore to open it we need to be able to > reopen O_PATH descriptor. As I mentioned above, procfs workaround is > possible but imposes several restrictions. Not to mention a hussle with > /proc. > > Important note: the algorithm above relies on Virtozzo CRIU "mount-v2" > engine, which is currently being prepared for mainstream CRIU. > This patch ensures that CRIU will support all kinds of overmounted files. > > fs/open.c | 4 +++- > include/linux/fcntl.h | 2 +- > include/uapi/linux/openat2.h | 2 ++ > 3 files changed, 6 insertions(+), 2 deletions(-) > > diff --git a/fs/open.c b/fs/open.c > index f732fb9..cfde988 100644 > --- a/fs/open.c > +++ b/fs/open.c > @@ -1131,6 +1131,8 @@ inline int build_open_flags(const struct open_how *how, struct open_flags *op) > return -EAGAIN; > lookup_flags |= LOOKUP_CACHED; > } > + if (how->resolve & RESOLVE_EMPTY_PATH) > + lookup_flags |= LOOKUP_EMPTY; > > op->lookup_flags = lookup_flags; > return 0; > @@ -1203,7 +1205,7 @@ static long do_sys_openat2(int dfd, const char __user *filename, > if (fd) > return fd; > > - tmp = getname(filename); > + tmp = getname_flags(filename, op.lookup_flags, 0); > if (IS_ERR(tmp)) > return PTR_ERR(tmp); > > diff --git a/include/linux/fcntl.h b/include/linux/fcntl.h > index a332e79..eabc7a8 100644 > --- a/include/linux/fcntl.h > +++ b/include/linux/fcntl.h > @@ -15,7 +15,7 @@ > /* List of all valid flags for the how->resolve argument: */ > #define VALID_RESOLVE_FLAGS \ > (RESOLVE_NO_XDEV | RESOLVE_NO_MAGICLINKS | RESOLVE_NO_SYMLINKS | \ > - RESOLVE_BENEATH | RESOLVE_IN_ROOT | RESOLVE_CACHED) > + RESOLVE_BENEATH | RESOLVE_IN_ROOT | RESOLVE_CACHED | RESOLVE_EMPTY_PATH) > > /* List of all open_how "versions". */ > #define OPEN_HOW_SIZE_VER0 24 /* sizeof first published struct */ > diff --git a/include/uapi/linux/openat2.h b/include/uapi/linux/openat2.h > index a5feb76..a42cf88 100644 > --- a/include/uapi/linux/openat2.h > +++ b/include/uapi/linux/openat2.h > @@ -39,5 +39,7 @@ struct open_how { > completed through cached lookup. May > return -EAGAIN if that's not > possible. */ > +#define RESOLVE_EMPTY_PATH 0x40 /* If pathname is an empty string, open > + the file referred by dirfd */ > > #endif /* _UAPI_LINUX_OPENAT2_H */ > -- > 1.8.3.1 > -- Aleksa Sarai Senior Software Engineer (Containers) SUSE Linux GmbH <https://www.cyphar.com/>
Attachment:
signature.asc
Description: PGP signature