Re: [PATCH net-next 0/3] eBPF Seccomp filters


On 02/13/2018 01:35 PM, Kees Cook wrote:
On Tue, Feb 13, 2018 at 12:33 PM, Tom Hromatka <tom.hromatka@xxxxxxxxxx> wrote:
On Tue, Feb 13, 2018 at 7:42 AM, Sargun Dhillon <sargun@xxxxxxxxx> wrote:
This patchset enables seccomp filters to be written in eBPF. Although
this patchset doesn't introduce much of the functionality enabled by
eBPF, it lays the groundwork for it.

It also introduces the capability to dump eBPF filters via the PTRACE
API so that CHECKPOINT_RESTORE is satisfied. The attached samples
include an example of this. One can then use BPF_OBJ_GET_INFO_BY_FD
to get the actual code of the program, and use that at reload time.

The primary reason for not adding maps support in this patchset is
to avoid introducing new complexities around PR_SET_NO_NEW_PRIVS.
If we have a map that the BPF program can read, it can potentially
"change" privileges after running. Doing writes only seems safe,
because the program can still be pure and side-effect free, and
therefore not negatively affect PR_SET_NO_NEW_PRIVS. Nonetheless, if
we come to an agreement, this can be in a follow-up patchset.


Coincidentally I also sent an RFC for adding eBPF hash maps to the seccomp
userspace mailing list just last week:
https://groups.google.com/forum/#!topic/libseccomp/pX6QkVF0F74

The kernel changes I proposed are in this email:
https://groups.google.com/d/msg/libseccomp/pX6QkVF0F74/ZUJlwI5qAwAJ

In that email thread, Kees requested that I try out a binary tree in cBPF
and evaluate its performance.  I just got a rough prototype working, and
while not as fast as an eBPF hash map, the cBPF binary tree was a
significant improvement over the linear list of ifs that are currently
generated.  Also, it only required changing a single function within the
libseccomp library itself.

https://github.com/drakenclimber/libseccomp/commit/87b36369f17385f5a7a4d95101185577fbf6203b
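
To illustrate why the tree helps, here is a rough sketch that counts the
comparisons needed to match one syscall number with a linear chain of
ifs versus a balanced binary search. It is a plain Python simulation,
not the libseccomp code from the commit above; the function names and
the use of 0..305 as syscall numbers are assumptions made only for the
illustration.

```python
def linear_comparisons(sorted_syscalls, target):
    """Comparisons made by a chain of 'if nr == X' tests, in list order."""
    for count, nr in enumerate(sorted_syscalls, start=1):
        if nr == target:
            return count
    # No match: every entry was compared once.
    return len(sorted_syscalls)


def tree_comparisons(sorted_syscalls, target):
    """Comparisons made by a balanced binary search over the same list.

    Each node is modelled as an equality test followed, on a miss,
    by a 'greater than' test that picks the subtree to descend into.
    """
    lo, hi = 0, len(sorted_syscalls) - 1
    count = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        count += 1                      # equality test at this node
        if sorted_syscalls[mid] == target:
            return count
        count += 1                      # 'greater than' test at this node
        if target > sorted_syscalls[mid]:
            lo = mid + 1
        else:
            hi = mid - 1
    return count


# Hypothetical filter: 306 syscalls, numbered 0..305 for simplicity.
syscalls = list(range(306))

if __name__ == "__main__":
    last = syscalls[-1]
    print("linear, last entry:", linear_comparisons(syscalls, last))
    print("tree, worst case:",
          max(tree_comparisons(syscalls, nr) for nr in syscalls))
```

The linear cost grows with the position of the syscall in the list,
while the tree cost grows with the logarithm of the list length.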

Here are the results I am currently seeing using an in-house customer's
seccomp filter and a simplistic test program that runs getppid() thousands
of times.

Test Case                      minimum TSC ticks to make syscall
----------------------------------------------------------------
seccomp disabled                                             620
getppid() at the front of 306-syscall seccomp filter         722
getppid() in middle of 306-syscall seccomp filter           1392
getppid() at the end of the 306-syscall filter              2452
seccomp using a 306-syscall-sized eBPF hash map              800
cBPF filter using a binary tree                              922
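
The numbers above are minimums over many getppid() calls. A rough
analogue of that measurement loop is sketched below. It uses Python's
nanosecond clock rather than the TSC, so interpreter overhead dominates
and the absolute values are not comparable to the table; the function
name and iteration count are assumptions for the illustration.

```python
import os
import time


def min_syscall_ns(iterations=10000):
    """Return the smallest observed wall time, in ns, of one os.getppid() call."""
    best = None
    for _ in range(iterations):
        start = time.perf_counter_ns()
        os.getppid()
        elapsed = time.perf_counter_ns() - start
        # Keep the minimum: it is the sample least disturbed by scheduling noise.
        if best is None or elapsed < best:
            best = elapsed
    return best


if __name__ == "__main__":
    print("minimum ns per getppid():", min_syscall_ns())
```

Taking the minimum rather than the mean filters out samples inflated by
interrupts or context switches.
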

I still think that's a crazy filter. :) It should be inverted to just
check the 26 syscalls and a final "greater than" test. I would expect
it to be faster still. :)

-Kees
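
A minimal sketch of the inversion suggested above, assuming the 26
syscalls are the ones the 306-entry filter does not allow and that the
"greater than" test rejects any number past the highest known syscall.
The total of 332 syscalls and the choice of which ones are denied are
made up for the example; the function names are hypothetical.

```python
def allow_by_allowlist(nr, allowed):
    """Original style: allow only syscalls listed explicitly."""
    return nr in allowed


def build_inverted(allowed, max_nr):
    """Return the sorted syscall numbers in [0, max_nr] that are NOT allowed."""
    return sorted(set(range(max_nr + 1)) - set(allowed))


def allow_by_inverted(nr, denied, max_nr):
    """Inverted style: check the short deny list, then one bound test."""
    if nr in denied:
        return False
    if nr > max_nr:          # final "greater than" test
        return False
    return True


# Hypothetical numbers: 332 syscalls (0..331), 26 of them denied.
MAX_NR = 331
denied_example = set(range(0, MAX_NR + 1, 13))
allowed_example = set(range(MAX_NR + 1)) - denied_example
denied_list = build_inverted(allowed_example, MAX_NR)

if __name__ == "__main__":
    print("allowed entries:", len(allowed_example))
    print("denied entries:", len(denied_list))
```

Both forms give the same verdict for every non-negative syscall number,
but the inverted form needs far fewer entries to check.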

I completely agree it's a crazy filter, but it seems to be a
common "mistake" our users are making.  It would be nice to
help them out if we can.

Tom

_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers



