On Wed, May 8, 2019 at 9:47 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Wed, May 08, 2019 at 04:17:29PM -0700, Eric Dumazet wrote:
> > On Wed, May 8, 2019 at 4:09 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Wed, May 08, 2019 at 02:21:52PM -0700, Eric Dumazet wrote:
> > > > Hi Alexei and Daniel
> > > >
> > > > I have a question about seccomp.
> > > >
> > > > It seems that after this patch, seccomp no longer needs a helper
> > > > (seccomp_bpf_load())
> > > >
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bd4cf0ed331a275e9bf5a49e6d0fd55dffc551b8
> > > >
> > > > Are we detecting that a particular JIT code needs to call at least one
> > > > function from the kernel at all ?
> > >
> > > Currently we don't track such things and trying very hard to avoid
> > > any special cases for classic vs extended.
> > >
> > > > If the filter contains self-contained code (no call, just inline
> > > > code), then we could use any room in whole vmalloc space,
> > > > not only from the modules (which is something like 2GB total on x86_64)
> > >
> > > I believe there was an effort to make bpf progs and other executable things
> > > to be everywhere too, but I lost the track of it.
> > > It's not that hard to tweak x64 jit to emit 64-bit calls to helpers
> > > when delta between call insn and a helper is more than 32-bit that fits
> > > into call insn. iirc there was even such patch floating around.
> > >
> > > but what motivated you question? do you see 2GB space being full?!
> >
> >
> > A customer seems to hit the limit, with about 75,000 threads,
> > each one having a seccomp filter with 6 pages (plus one guard page
> > given by vmalloc)
>
> Since cbpf doesn't have "fd as a program" concept I suspect
> the same program was loaded 75k times. What a waste of kernel memory.
> And, no, we're not going to extend or fix cbpf for this.
> cbpf is frozen. seccomp needs to start using ebpf.
> It can have one program to secure all threads.
> If necessary single program can be customized via bpf maps
> for each thread.

Yes, docker seems to have a very generic implementation and should
probably be fixed
( https://github.com/moby/moby/blob/v17.03.2-ce/profiles/seccomp/seccomp.go )
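
For scale, here is a back-of-the-envelope sketch of the numbers quoted above. It assumes 4 KiB pages (the usual x86_64 base page size, not stated in the thread) and takes the "something like 2GB" module-area figure at face value as exactly 2 GiB. The names filter_footprint() and fits_rel32() are made up for this illustration and are not kernel functions. fits_rel32() only restates the constraint mentioned above: a near call encodes a signed 32-bit displacement, so the target has to sit within roughly +/-2 GiB of the call site.

```python
# Back-of-the-envelope check of the figures quoted in this thread.
# Assumptions (not from the thread): 4 KiB pages, and a module area of
# exactly 2 GiB standing in for "something like 2GB".

PAGE_SIZE = 4096            # assumed x86_64 base page size, in bytes
MODULE_AREA = 2 * 1024**3   # assumed: "something like 2GB" taken as 2 GiB


def filter_footprint(threads, pages_per_filter, guard_pages=1):
    """Bytes of address space used when each thread has its own JITed copy.

    Hypothetical helper: counts the filter pages plus the vmalloc guard
    page(s) for every copy.
    """
    return threads * (pages_per_filter + guard_pages) * PAGE_SIZE


def fits_rel32(next_insn_addr, target_addr):
    """True if a near call can reach target_addr with a signed 32-bit delta.

    Hypothetical helper: the displacement is taken relative to the address
    of the instruction that follows the call.
    """
    delta = target_addr - next_insn_addr
    return -(1 << 31) <= delta <= (1 << 31) - 1


per_thread = filter_footprint(75_000, 6)   # one copy per thread
shared = filter_footprint(1, 6)            # one copy shared by all threads

print(f"one filter per thread: {per_thread / 1024**3:.2f} GiB")
print(f"one shared filter:     {shared / 1024:.0f} KiB")
print("exceeds assumed module area:", per_thread > MODULE_AREA)
```

Under these assumptions, 75,000 private copies need 525,000 pages of address space, slightly more than the assumed 2 GiB area, while a single shared program would need 7 pages. The exact size of the module area depends on kernel configuration, so the comparison is only indicative.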