Re: [RFC,PATCH 1/2] seccomp_filters: system call filtering using BPF

Andrew Lutomirski <luto@xxxxxxx> · Thu, 12 Jan 2012 08:51:38 -0800

On Thu, Jan 12, 2012 at 8:27 AM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> On Thu, 2012-01-12 at 08:14 -0800, Andrew Lutomirski wrote:
>
>> The longer I linger on lists and see neat ideas like this, the more I
>> get annoyed that execve is magical.  I dream of a distribution that
>> doesn't use setuid, file capabilities, selinux transitions on exec, or
>> any other privilege changes on exec *at all*.
>
> Is that the fear with filtering on execv? That if we have filters on an
> execv calling a setuid program that we change the behavior of that
> privileged program and might cause unexpected results?

Exactly.

>
> In that case, just have execv fail if filtering is enabled and we are
> execing a setuid program. But I don't see why non "magical" execv's
> should be prohibited.
>

How do you define "non-magical"?

If setuid is set, then it's obviously magical.  On a nosuid
filesystem, strange things happen.  If file capabilities are enabled
and set, then different magic happens.  With LSMs involved, anything
can be magical.  (SELinux AFAICT looks up rules on every single exec,
so it might be impossible for execve to be non-magical.)  If execve is
banned entirely when seccomp is enabled, then there will never be any
attacks based on abusing these mechanisms.

My proposal is to have an alternative mechanism that, from a security
POV, does nothing that the caller couldn't have done on its own.  The
only reason it would be needed at all is because implementing execve
with correct semantics from userspace is a PITA -- the right set of
fds needs to be closed, threads need to be killed (without races),
vmas need to be found an unmapped, a new program needs to be mapped in
(possibly at the same place that the old one was mapped at),
/proc/self/exe needs to be updated, auxv needs to be recreated
(including using values that glibc might have erased already), etc.

The code is short and it works (although I have no idea whether it
applies to current kernels).

Oleg: my only issue with setting something like LSM_UNSAFE_SECCOMP is
that a different class of vulnerability might be introduced: take a
setuid program that calls other setuid programs (or just uses execve
as a way to get the default execve capability handling, SELinux
handling, etc), run it (as root!) inside seccomp, and watch it
possibly develop security holes.  If the alternate execve is a
different syscall, then this can't happen.  And if someone remaps
execve to execve_nosecurity (from userspace or via some in-kernel
mechanism) and causes problems, it's entirely clear who to blame.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html