Re: [net-next v3 0/2] eBPF seccomp filters

Sargun Dhillon <sargun@xxxxxxxxx> · Mon, 26 Feb 2018 16:01:13 -0800



On Mon, Feb 26, 2018 at 3:04 PM, Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
> On Mon, Feb 26, 2018 at 07:26:54AM +0000, Sargun Dhillon wrote:
>> This patchset enables seccomp filters to be written in eBPF. Although, this
>> patchset doesn't introduce much of the functionality enabled by eBPF, it lays
>> the ground work for it. Currently, you have to disable CHECKPOINT_RESTORE
>> support in order to utilize eBPF seccomp filters, as eBPF filters cannot be
>> retrieved via the ptrace GET_FILTER API.
>
> this was discussed multiple times in the past.
> In eBPF land it's practically impossible to do checkpoint/restore
> of the whole bpf program/map graph.
>
>> Any user can load a bpf seccomp filter program, and it can be pinned and
>> reused without requiring access to the bpf syscalls. A user only requires
>> the traditional permissions of either being cap_sys_admin, or have
>> no_new_privs set in order to install their rule.
>>
>> The primary reason for not adding maps support in this patchset is
>> to avoid introducing new complexities around PR_SET_NO_NEW_PRIVS.
>> If we have a map that the BPF program can read, it can potentially
>> "change" privileges after running. It seems like doing writes only
>> is safe, because it can be pure, and side effect free, and therefore
>> not negatively effect PR_SET_NO_NEW_PRIVS. Nonetheless, if we come
>> to an agreement, this can be in a follow-up patchset.
>
> readonly maps already exist. See BPF_F_RDONLY.
> Is that not enough?
>
With BPF_F_RDONLY, is there a mechanism to populate a prog_array, and
then mark it rd_only?

>> A benchmark of this patchset is as follows for a very standard eBPF filter:
>>
>> Given this test program:
>> for (i = 10; i < 99999999; i++) syscall(__NR_getpid);
>>
>> If I implement an eBPF filter with PROG_ARRAYs with a program per syscall,
>> and tail call, the numbers are such:
>> ebpf JIT 12.3% slower than native
>> ebpf no JIT 13.6% slower than native
>> seccomp JIT 17.6% slower than native
>> seccomp no JIT 37% slower than native
>
> the perf gains are misleading, since patches don't enable bpf_tail_call.
>
> The main statement I want to hear from seccomp maintainers before
> proceeding any further on this that enabling eBPF in seccomp won't lead
> to seccomp folks arguing against changes in bpf core (like verifier)
> just because it's used by seccomp.
> It must be spelled out in the commit log with explicit Ack.
>
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers