Re: [PATCH] fs/open: add new RESOLVE_EMPTY_PATH flag for openat2

Aleksa Sarai <cyphar@xxxxxxxxxx> · Thu, 13 Jan 2022 17:55:00 +1100

On 2022-01-12, Christian Brauner <christian.brauner@xxxxxxxxxx> wrote:
> On Thu, Jan 13, 2022 at 01:34:19AM +1100, Aleksa Sarai wrote:
> > On 2022-01-12, Andrey Zhadchenko <andrey.zhadchenko@xxxxxxxxxxxxx> wrote:
> > > If you have an opened O_PATH file, currently there is no way to re-open
> > > it with other flags with openat/openat2. As a workaround it is possible
> > > to open it via /proc/self/fd/<X>, however
> > > 1) You need to ensure that /proc exists
> > > 2) You cannot use O_NOFOLLOW flag
> > 
> > There is also another issue -- you can mount on top of magic-links so if
> > you're a container runtime that has been tricked into creating bad
> > mounts of top of /proc/ subdirectories there's no way of detecting that
> > this has happened. (Though I think in the long-term we will need to
> > make it possible for unprivileged users to create a procfs mountfd if
> > they have hidepid=4,subset=pids set -- there are loads of things
> > containers need to touch in procfs which can be overmounted in malicious
> > ways.)
> 
> Yeah, though I see this as a less pressing issue for now. I'd rather
> postpone this and make userspace work a bit more. There are ways to
> design programs so you know that the procfs instance you're interacting
> with is the one you want to interact with without requiring unprivileged
> mounting outside of a userns+pidns+mountns pair. ;)

I'm not sure I agree. While with openat2(RESOLVE_NO_XDEV|RESOLVE_NO_SYMLINKS)
you can access the vast majority of procfs through a checked procfs
handle (fstatfs and the other checks you can do), you cannot jump
through magic links safely because RESOLVE_NO_XDEV blocks magic-link
jumps.

You can't even use a safe handle to /proc/self/fd and then just follow
the link because you can mount magiclinks on top of magiclinks, so even
a hypothetical RESOLVE_ONLY_MAGICLINKS wouldn't really work. Maybe a
RESOLVE_ONLY_MAGICLINKS that didn't cross mounts except if the crossing
was through a magiclink would work but I suspect implementing that would
be tricky (there's loads of places where you might trip LOOKUP_NO_XDEV).

O_EMPTYPATH would solve this issue for the /proc/self/fd/... magiclinks,
but /proc/self/{exe,cwd,root,ns/*} are all still susceptible to the same
issue. We use /proc/self/exe in runc, and everyone uses /proc/self/ns/*.

But yeah we can definitely solve this in a separate patchseries, and
O_EMPTYPATH is something we should have regardless of whether we solve
the procfs issue another way.

> > > Both problems may look insignificant, but they are sensitive for CRIU.
> > > First of all, procfs may not be mounted in the namespace where we are
> > > restoring the process. Secondly, if someone opens a file with O_NOFOLLOW
> > > flag, it is exposed in /proc/pid/fdinfo/<X>. So CRIU must also open the
> > > file with this flag during restore.
> > > 
> > > This patch adds new constant RESOLVE_EMPTY_PATH for resolve field of
> > > struct open_how and changes getname() call to getname_flags() to avoid
> > > ENOENT for empty filenames.
> > 
> > This is something I've wanted to implement for a while, but from memory
> > we need to add some other protections in place before enabling this.
> > 
> > The main one is disallowing re-opening of a path when it was originally
> > opened with a different set of modes. [1] is the patch I originally
> > wrote as part of the openat2(2) (but I dropped it since I wasn't sure
> > whether it might break some systems in subtle ways -- though according
> > to my testing there wasn't an issue on any of my machines).
> 
> Oh this is the discussion we had around turning an opath fd into a say
> O_RDWR fd, I think.
> So yes, I think restricting fd reopening makes sense. However, going
> from an O_PATH fd to e.g. an fd with O_RDWR does make sense and needs to
> be the default anyway. So we would need to implement this as a denylist
> anyway. The default is that opath fds can be reopened with whatever and
> only if the opath creator has restricted reopening will it fail, i.e.
> it's similar to a denylist.
> 
> But this patch wouldn't prevent that or hinder the upgrade mask
> restriction afaict.

Yeah the patch I linked implements the behaviour you mentioned (O_PATH
of a regular path lets you re-open with any mode, O_PATH of an O_PATH
copies the permitted re-opening modes of the old O_PATH, and O_PATH of a
magic link copies the file mode of the magic link).

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
Attachment:
signature.asc

Description: PGP signature