> On May 6, 2019, at 12:17 PM, Aleksa Sarai <cyphar@xxxxxxxxxx> wrote: > >> On 2019-05-06, Jann Horn <jannh@xxxxxxxxxx> wrote: >>> On Mon, May 6, 2019 at 6:56 PM Aleksa Sarai <cyphar@xxxxxxxxxx> wrote: >>> The need to be able to scope path resolution of interpreters became >>> clear with one of the possible vectors used in CVE-2019-5736 (which >>> most major container runtimes were vulnerable to). >>> >>> Naively, it might seem that openat(2) -- which supports path scoping -- >>> can be combined with execveat(AT_EMPTY_PATH) to trivially scope the >>> binary being executed. Unfortunately, a "bad binary" (usually a symlink) >>> could be written as a #!-style script with the symlink target as the >>> interpreter -- which would be completely missed by just scoping the >>> openat(2). An example of this being exploitable is CVE-2019-5736. >>> >>> In order to get around this, we need to pass down to each binfmt_* >>> implementation the scoping flags requested in execveat(2). In order to >>> maintain backwards-compatibility we only pass the scoping AT_* flags. >>> >>> To avoid breaking userspace (in the exceptionally rare cases where you >>> have #!-scripts with a relative path being execveat(2)-ed with dfd != >>> AT_FDCWD), we only pass dfd down to binfmt_* if any of our new flags are >>> set in execveat(2). >> >> This seems extremely dangerous. I like the overall series, but not this patch. >> >>> @@ -1762,6 +1774,12 @@ static int __do_execve_file(int fd, struct filename *filename, >>> >>> sched_exec(); >>> >>> + bprm->flags = flags & (AT_XDEV | AT_NO_MAGICLINKS | AT_NO_SYMLINKS | >>> + AT_THIS_ROOT); >> [...] >>> +#define AT_THIS_ROOT 0x100000 /* - Scope ".." resolution to dirfd (like chroot(2)). */ >> >> So now what happens if there is a setuid root ELF binary with program >> interpreter "/lib64/ld-linux-x86-64.so.2" (like /bin/su), and an >> unprivileged user runs it with execveat(..., AT_THIS_ROOT)? Is that >> going to let the unprivileged user decide which interpreter the >> setuid-root process should use? From a high-level perspective, opening >> the interpreter should be controlled by the program that is being >> loaded, not by the program that invoked it. > > I went a bit nuts with openat_exec(), and I did end up adding it to the > ELF interpreter lookup (and you're completely right that this is a bad > idea -- I will drop it from this patch if it's included in the next > series). > > The proposed solutions you give below are much nicer than this patch so > I can drop it and work on fixing those issues separately. > >> In my opinion, CVE-2019-5736 points out two different problems: >> >> The big problem: The __ptrace_may_access() logic has a special-case >> short-circuit for "introspection" that you can't opt out of; this >> makes it possible to open things in procfs that are related to the >> current process even if the credentials of the process wouldn't permit >> accessing another process like it. I think the proper fix to deal with >> this would be to add a prctl() flag for "set whether introspection is >> allowed for this process", and if userspace has manually un-set that >> flag, any introspection special-case logic would be skipped. > > We could do PR_SET_DUMPABLE=3 for this, I guess? > >> An additional problem: /proc/*/exe can be used to open a file for >> writing; I think it may have been Andy Lutomirski who pointed out some >> time ago that it would be nice if you couldn't use /proc/*/fd/* to >> re-open files with more privileges, which is sort of the same thing. > > This is something I'm currently working on a series for, which would > boil down to some restrictions on how re-opening of file descriptors > works through procfs. > > However, execveat() of a procfs magiclink is a bit hard to block -- > there is no way for userspace to to represent a file being "open for > execute" so they are all "open for execute" by default and blocking it > outright seems a bit extreme (though I actually hope to eventually add > the ability to mark an O_PATH as "open for X" to resolveat(2) -- hence > why I've reserved some bits). There’s an O_MAYEXEC series floating around. > > (Thinking more about it, there is an argument that I should include the > above patch into this series so that we can block re-opening of fds > opened through resolveat(2) without explicit flags from the outset.) > > -- > Aleksa Sarai > Senior Software Engineer (Containers) > SUSE Linux GmbH > <https://www.cyphar.com/>