Re: [PATCH] fs/open: add new RESOLVE_EMPTY_PATH flag for openat2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2022-01-12, Andrey Zhadchenko <andrey.zhadchenko@xxxxxxxxxxxxx> wrote:
> If you have an opened O_PATH file, currently there is no way to re-open
> it with other flags with openat/openat2. As a workaround it is possible
> to open it via /proc/self/fd/<X>, however
> 1) You need to ensure that /proc exists
> 2) You cannot use O_NOFOLLOW flag

There is also another issue -- you can mount on top of magic-links so if
you're a container runtime that has been tricked into creating bad
mounts of top of /proc/ subdirectories there's no way of detecting that
this has happened. (Though I think in the long-term we will need to
make it possible for unprivileged users to create a procfs mountfd if
they have hidepid=4,subset=pids set -- there are loads of things
containers need to touch in procfs which can be overmounted in malicious
ways.)

> Both problems may look insignificant, but they are sensitive for CRIU.
> First of all, procfs may not be mounted in the namespace where we are
> restoring the process. Secondly, if someone opens a file with O_NOFOLLOW
> flag, it is exposed in /proc/pid/fdinfo/<X>. So CRIU must also open the
> file with this flag during restore.
> 
> This patch adds new constant RESOLVE_EMPTY_PATH for resolve field of
> struct open_how and changes getname() call to getname_flags() to avoid
> ENOENT for empty filenames.

This is something I've wanted to implement for a while, but from memory
we need to add some other protections in place before enabling this.

The main one is disallowing re-opening of a path when it was originally
opened with a different set of modes. [1] is the patch I originally
wrote as part of the openat2(2) (but I dropped it since I wasn't sure
whether it might break some systems in subtle ways -- though according
to my testing there wasn't an issue on any of my machines).

I'd can try to revive that patchset if you'd like. Being able to re-open
paths without going through procfs is something I've wanted to finish up
for a while, so thanks for sending this patch. :D

[1]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@xxxxxxxxxx/

> Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@xxxxxxxxxxxxx>
> ---
> 
> Why does even CRIU needs to reopen O_PATH files?
> Long story short: to support restoring opened files that are overmounted
> with single file bindmounts.
> In-depth explanation: when restoring mount tree, before doing mount()
> call, CRIU opens mountpoint with O_PATH and saves this fd for later use
> for each mount. If we need to restore overmounted file, we look at the
> mount which overmounts file mount and use its saved mountpoint fd in
> openat(<saved_fd>, <relative_path>, flags).
> If we need to open an overmounted mountpoint directory itself, we can use
> openat(<saved_fd>, ".", flags). However, if we have a bindmount, its
> mountpoint is a regular file. Therefore to open it we need to be able to
> reopen O_PATH descriptor. As I mentioned above, procfs workaround is
> possible but imposes several restrictions. Not to mention a hussle with
> /proc.
> 
> Important note: the algorithm above relies on Virtozzo CRIU "mount-v2"
> engine, which is currently being prepared for mainstream CRIU.
> This patch ensures that CRIU will support all kinds of overmounted files.
> 
>  fs/open.c                    | 4 +++-
>  include/linux/fcntl.h        | 2 +-
>  include/uapi/linux/openat2.h | 2 ++
>  3 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/open.c b/fs/open.c
> index f732fb9..cfde988 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -1131,6 +1131,8 @@ inline int build_open_flags(const struct open_how *how, struct open_flags *op)
>  			return -EAGAIN;
>  		lookup_flags |= LOOKUP_CACHED;
>  	}
> +	if (how->resolve & RESOLVE_EMPTY_PATH)
> +		lookup_flags |= LOOKUP_EMPTY;
>  
>  	op->lookup_flags = lookup_flags;
>  	return 0;
> @@ -1203,7 +1205,7 @@ static long do_sys_openat2(int dfd, const char __user *filename,
>  	if (fd)
>  		return fd;
>  
> -	tmp = getname(filename);
> +	tmp = getname_flags(filename, op.lookup_flags, 0);
>  	if (IS_ERR(tmp))
>  		return PTR_ERR(tmp);
>  
> diff --git a/include/linux/fcntl.h b/include/linux/fcntl.h
> index a332e79..eabc7a8 100644
> --- a/include/linux/fcntl.h
> +++ b/include/linux/fcntl.h
> @@ -15,7 +15,7 @@
>  /* List of all valid flags for the how->resolve argument: */
>  #define VALID_RESOLVE_FLAGS \
>  	(RESOLVE_NO_XDEV | RESOLVE_NO_MAGICLINKS | RESOLVE_NO_SYMLINKS | \
> -	 RESOLVE_BENEATH | RESOLVE_IN_ROOT | RESOLVE_CACHED)
> +	 RESOLVE_BENEATH | RESOLVE_IN_ROOT | RESOLVE_CACHED | RESOLVE_EMPTY_PATH)
>  
>  /* List of all open_how "versions". */
>  #define OPEN_HOW_SIZE_VER0	24 /* sizeof first published struct */
> diff --git a/include/uapi/linux/openat2.h b/include/uapi/linux/openat2.h
> index a5feb76..a42cf88 100644
> --- a/include/uapi/linux/openat2.h
> +++ b/include/uapi/linux/openat2.h
> @@ -39,5 +39,7 @@ struct open_how {
>  					completed through cached lookup. May
>  					return -EAGAIN if that's not
>  					possible. */
> +#define RESOLVE_EMPTY_PATH	0x40 /* If pathname is an empty string, open
> +					the file referred by dirfd */
>  
>  #endif /* _UAPI_LINUX_OPENAT2_H */
> -- 
> 1.8.3.1
> 

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux