Re: [PATCH RFC v2 0/4] coredump: mitigate privilege escalation of process coredump

ebiederm@xxxxxxxxxxxx (Eric W. Biederman) · Mon, 03 Jan 2022 16:11:38 -0600

Wander Lairson Costa <wander@xxxxxxxxxx> writes:

Have you seen the discussion at:
https://lkml.kernel.org/r/20211221021744.864115-1-longman@xxxxxxxxxx
?

Adding a few people from that conversation.

> v2
> ==
>
> Patch 02 conflicted with commit 92307383082d("coredump:  Don't perform
> any cleanups before dumping core") which I didn't have in my tree. V2
> just changes the hunk
>
> +#define PF_SUID   0x00000008
>
> To
>
> +#define PF_SUID   0x01000000
>
> To merge cleanly. Other than that, it is the same patch as v1.
>
> v1
> ==
>
> A set-uid executable might be a vector to privilege escalation if the
> system configures the coredump file name pattern as a relative
> directory destiny. The full description of the vulnerability and
> a demonstration of how we can exploit it can be found at [1].
>
> This patch series adds a PF_SUID flag to the process in execve if it is
> set-[ug]id binary and elevates the new image's privileges.
>
> In the do_coredump function, we check if:
>
> 1) We have the SUID_FLAG set
> 2) We have CAP_SYS_ADMIN (the process might have decreased its
>    privileges)
> 3) The current directory is owned by root (the current code already
>    checks for core_pattern being a relative path).
> 4) non-privileged users don't have permission to write to the current
>    directory.

Which is a slightly different approach than we have discussed
previously.

Something persistent to mark the processes descended from the suid exec
is something commonly agreed upon.

How we can dump core after that (with the least disruption is the
remaining question).

You would always allow coredumps unless the target directory is
problematic.  I remember it being suggested that even dumping to a pipe
might not be safe in practice.  Can someone remember why?

The other approach we have discussed is simply not allowing coredumps
unless the target process takes appropriate action to reenable them.

Willy posted a patch to that effect.

>From a proof of concept perspective PF_SUID and your directory checks look
like fine.  From a production implementation standpoint I think we would
want to make them a bit more general.  PF_SUID because it is more than
uid changes that can grant privilege during exec.  We especially want to
watch out for setcap executables.  The directory checks similarly look
very inflexible.  I think what we want is to test if the process before
the privilege change of the exec could write to the directory.

Even with your directory test approach you are going to run into
the semi-common idio of becomming root and then starting a daemon
in debugging mode so you can get a core dump.

> If all four conditions match, we set the need_suid_safe flag.
>
> An alternative implementation (and more elegant IMO) would be saving
> the fsuid and fsgid of the process in the task_struct before loading the
> new image to the memory. But this approach would add eight bytes to all
> task_struct instances where only a tiny fraction of the processes need
> it and under a configuration that not all (most?) distributions don't
> adopt by default.

One possibility is to save a struct cred on the mm_struct.  If done
carefully I think that would allow commit_creds to avoid the need
for dumpability changes (there would always be enough information to
directly compute it).

I can see that working for detecting dropped privileges.  I don't know
if that would work for raised privileges.

Definitely focusing in on the mm_struct for where to save the needed
information seems preferable, as in general it is an mm property not a
per task property.

Eric

> Wander Lairson Costa (4):
>   exec: add a flag indicating if an exec file is a suid/sgid
>   process: add the PF_SUID flag
>   coredump: mitigate privilege escalation of process coredump
>   exec: only set the suid flag if the current proc isn't root
>
>  fs/coredump.c           | 15 +++++++++++++++
>  fs/exec.c               | 10 ++++++++++
>  include/linux/binfmts.h |  6 +++++-
>  include/linux/sched.h   |  1 +
>  kernel/fork.c           |  2 ++
>  5 files changed, 33 insertions(+), 1 deletion(-)