On Tue, Feb 13, 2018 at 12:33 PM, Tom Hromatka <tom.hromatka@xxxxxxxxxx> wrote:
> On Tue, Feb 13, 2018 at 7:42 AM, Sargun Dhillon <sargun@xxxxxxxxx> wrote:
>>
>> This patchset enables seccomp filters to be written in eBPF. Although,
>> this patchset doesn't introduce much of the functionality enabled by
>> eBPF, it lays the ground work for it.
>>
>> It also introduces the capability to dump eBPF filters via the PTRACE
>> API in order to make it so that CHECKPOINT_RESTORE will be satisfied.
>> In the attached samples, there's an example of this. One can then use
>> BPF_OBJ_GET_INFO_BY_FD in order to get the actual code of the program,
>> and use that at reload time.
>>
>> The primary reason for not adding maps support in this patchset is
>> to avoid introducing new complexities around PR_SET_NO_NEW_PRIVS.
>> If we have a map that the BPF program can read, it can potentially
>> "change" privileges after running. It seems like doing writes only
>> is safe, because it can be pure, and side effect free, and therefore
>> not negatively affect PR_SET_NO_NEW_PRIVS. Nonetheless, if we come
>> to an agreement, this can be in a follow-up patchset.
>
>
>
> Coincidentally I also sent an RFC for adding eBPF hash maps to the seccomp
> userspace mailing list just last week:
> https://groups.google.com/forum/#!topic/libseccomp/pX6QkVF0F74
>
> The kernel changes I proposed are in this email:
> https://groups.google.com/d/msg/libseccomp/pX6QkVF0F74/ZUJlwI5qAwAJ
>
> In that email thread, Kees requested that I try out a binary tree in cBPF
> and evaluate its performance. I just got a rough prototype working, and
> while not as fast as an eBPF hash map, the cBPF binary tree was a
> significant improvement over the linear list of ifs that are currently
> generated. Also, it only required changing a single function within the
> libseccomp library itself.
>
> https://github.com/drakenclimber/libseccomp/commit/87b36369f17385f5a7a4d95101185577fbf6203b
>
> Here are the results I am currently seeing using an in-house customer's
> seccomp filter and a simplistic test program that runs getppid() thousands
> of times.
>
> Test Case                                              minimum TSC ticks
>                                                        to make syscall
> ------------------------------------------------------------------------
> seccomp disabled                                                     620
> getppid() at the front of 306-syscall seccomp filter                 722
> getppid() in middle of 306-syscall seccomp filter                   1392
> getppid() at the end of the 306-syscall filter                      2452
> seccomp using a 306-syscall-sized eBPF hash map                      800
> cBPF filter using a binary tree                                      922

I still think that's a crazy filter. :) It should be inverted to just
check the 26 syscalls and a final "greater than" test. I would expect
it to be faster still. :)

-Kees

-- 
Kees Cook
Pixel Security
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers
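
For anyone who wants a feel for why the binary tree helps, here is a rough
sketch that only counts comparisons for a 306-entry filter laid out as a
linear list of ifs versus a balanced binary search. It is not cBPF and it
says nothing about TSC ticks. The syscall numbers (0..305) and the function
names are made up for illustration, and a real cBPF tree node may cost more
than one comparison.

```python
# Hypothetical allow-list: 306 distinct syscall numbers (values are made up).
ALLOWED = list(range(306))


def linear_comparisons(allowed, nr):
    """Equality tests performed by a chain of 'if nr == X' checks, in list order."""
    for count, candidate in enumerate(allowed, start=1):
        if candidate == nr:
            return count
    return len(allowed)  # fell through every test without a match


def tree_comparisons(sorted_allowed, nr):
    """Nodes visited by a balanced binary search over the sorted numbers."""
    lo, hi = 0, len(sorted_allowed) - 1
    count = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        count += 1
        if sorted_allowed[mid] == nr:
            return count
        if sorted_allowed[mid] < nr:
            lo = mid + 1
        else:
            hi = mid - 1
    return count  # not found; every visited node was still compared


# Worst case over every syscall that is actually in the filter.
worst_linear = max(linear_comparisons(ALLOWED, nr) for nr in ALLOWED)
worst_tree = max(tree_comparisons(ALLOWED, nr) for nr in ALLOWED)
print(worst_linear, worst_tree)
```

Under these assumptions the worst case drops from 306 tests to 9 visited
nodes, which is in line with the direction of the measurements quoted above,
though the model is far too crude to predict the actual tick counts.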