Re: [RFC,PATCH 1/2] seccomp_filters: system call filtering using BPF

Andrew Lutomirski <luto@xxxxxxx> · Thu, 12 Jan 2012 08:14:25 -0800

On Thu, Jan 12, 2012 at 7:43 AM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> On Wed, 2012-01-11 at 11:25 -0600, Will Drewry wrote:
>
>> Filter programs may _only_ cross the execve(2) barrier if last filter
>> program was attached by a task with CAP_SYS_ADMIN capabilities in its
>> user namespace.  Once a task-local filter program is attached from a
>> process without privileges, execve will fail.  This ensures that only
>> privileged parent task can affect its privileged children (e.g., setuid
>> binary).
>
> This means that a non privileged user can not run another program with
> limited features? How would a process exec another program and filter
> it? I would assume that the filter would need to be attached first and
> then the execv() would be performed. But after the filter is attached,
> the execv is prevented?
>
> Maybe I don't understand this correctly.

Time to resurrect execve_nosecurity?  If so, then the rule could be
simplified to: seccomp programs cannot use normal execve at all.

The longer I linger on lists and see neat ideas like this, the more I
get annoyed that execve is magical.  I dream of a distribution that
doesn't use setuid, file capabilities, selinux transitions on exec, or
any other privilege changes on exec *at all*.  I think that the only
things missing in the kernel (other than something intelligent to do
about SELinux) are execve_nosecurity and the ability for a normal
program to wait for an unrelated program to finish (or some other way
that a program can ask a daemon to spawn a privileged program for it
and then to cleanly wait for that program to finish in a way that
could survive re-exec of the daemon).

--Andy

>
> -- Steve
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html