Re: bpf_jit_limit close shave

Lorenz Bauer <lmb@xxxxxxxxxxxxxx> · Fri, 24 Sep 2021 11:35:01 +0100

On Thu, 23 Sept 2021 at 12:52, Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>
> See bpf_jit_alloc_exec() which calls module_alloc() for the images' r+x memory
> holding the generated opcodes, and there's only one such pool for the system
> on the latter: on x86 in particular, the rationale for module_alloc() use is
> so that the image is guaranteed to be within +/- 2GB of where the kernel image
> resides. See the encoding of BPF_CALL with __bpf_call_base + imm32, for example.

Thanks, that makes a lot more sense now. I sent some more cleanup patches your way.
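
For anyone following along, here is a rough userspace sketch of the range
constraint as I understand it. The call target is recovered as
__bpf_call_base + imm32, so the distance between the two has to fit in a
signed 32-bit value. The function names and the addresses you feed in are
made up for illustration; this is not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the imm32 that would encode a call to
 * `target` relative to `base` (standing in for __bpf_call_base).
 * Returns false when the displacement does not fit in a signed 32-bit
 * immediate, i.e. the target is outside roughly +/- 2GB of the base.
 * Assumes a two's-complement platform, which holds for gcc and clang.
 */
bool call_imm32(uint64_t base, uint64_t target, int32_t *imm)
{
	int64_t delta = (int64_t)(target - base);

	if (delta < INT32_MIN || delta > INT32_MAX)
		return false;

	*imm = (int32_t)delta;
	return true;
}

/* Inverse direction: recover the call target from base + imm32. */
uint64_t call_target(uint64_t base, int32_t imm)
{
	/* Sign-extend imm to 64 bits before adding it to the base. */
	return base + (uint64_t)(int64_t)imm;
}
```

A target 0x1000 bytes above the base encodes as imm 0x1000, while one
exactly 2GB above the base is rejected because 0x80000000 does not fit
in an s32.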

> > How does the knob solve the "can't load a new module" problem if our
> > suggestion / preference is to steer people towards CAP_BPF anyways
> > (since unpriv BPF is trouble)? Over time all BPF will be privileged
> > and we're in the same mess again?
>
> Keep in mind that the knob was added before CAP_BPF. In general, unprivileged
> cBPF->eBPF is also using the same bpf_jit_alloc_exec() for the JIT, so that
> needs to be taken into consideration as well, but if you grant an application
> CAP_BPF then you're essentially privileged. The knob's point was to prevent
> fully unprivileged users to play bad games.

You're right, it does help with that. Now, how do I solve the problem
of our privileged (but automated!) tooling eating up all the memory
anyway?

As an aside: it's _really_ hard (impossible?) to track down where this
memory is used. cBPF -> eBPF conversions don't show up in bpftool, so
where does one go to look?
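
For reference, the knob itself is the net.core.bpf_jit_limit sysctl, and
its value is a byte count. Below is a minimal sketch for reading it from
userspace. The helper name is made up, and as far as I can tell this only
gives you the configured limit, not how much of it is in use or by whom.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read a single decimal integer from a file such
 * as a /proc/sys entry. Returns 0 on success, -1 if the file cannot be
 * opened or does not start with a number.
 */
int read_ll_from_file(const char *path, long long *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;

	ok = (fscanf(f, "%lld", out) == 1);
	fclose(f);

	return ok ? 0 : -1;
}
```

Calling it with "/proc/sys/net/core/bpf_jit_limit" should return the
limit in bytes. Expect it to fail without root, since the file is
normally not world-readable.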

Lorenz

-- 
Lorenz Bauer  |  Systems Engineer