On Tue, Sep 21, 2021 at 8:59 PM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Tue, Sep 21, 2021 at 12:11 PM Frank Hofmann <fhofmann@xxxxxxxxxxxxxx> wrote:
> >
> > Wouldn't that (updating the variable only for unpriv use) also make the leak impossible to notice that we ran into ?
>
> impossible?
> That jit limit is not there on older kernels and doesn't apply to root.
> How would you notice such a kernel bug in such conditions?

I'm talking about bpf_jit_current - it's an "overall gauge" for allocation, priv and unpriv.
I understood Lorenz' note as "change it so it only tracks unpriv BPF mem usage - since we'll never act on privileged usage anyway"

FrankH.

> >
> > (we have something near to a simple reproducer for https://www.spinics.net/lists/kernel/msg4029472.html ... need to extract the relevant parts of an app of ours, will update separately when there)
> >
> > FrankH.
> >
> > On Tue, Sep 21, 2021 at 4:52 PM Lorenz Bauer <lmb@xxxxxxxxxxxxxx> wrote:
> >>
> >> On Tue, 21 Sept 2021 at 15:34, Alexei Starovoitov
> >> <alexei.starovoitov@xxxxxxxxx> wrote:
> >> >
> >> > On Tue, Sep 21, 2021 at 4:50 AM Lorenz Bauer <lmb@xxxxxxxxxxxxxx> wrote:
> >> > >
> >> > > Does it make sense to include !capable(CAP_BPF) in the check?
> >> >
> >> > Good point. Makes sense to add CAP_BPF there.
> >> > Taking down critical networking infrastructure because of this limit
> >> > that supposed to apply to unpriv users only is scary indeed.
> >>
> >> Ok, I'll send a patch. Can I add a Fixes: 2c78ee898d8f ("bpf:
> >> Implement CAP_BPF")?
> >>
> >> Another thought: move the check for bpf_capable before the
> >> atomic_long_add_return? This means we only track JIT allocations from
> >> unprivileged users. As it stands a privileged user can easily "lock
> >> out" unprivileged users, which on our set up is a real concern. We
> >> have several socket filters / SO_REUSEPORT programs which are
> >> critical, and also use lots of XDP from privileged processes as you
> >> know.
> >>
> >> >
> >> > > This limit reminds me a bit of the memlock issue, where a global limit
> >> > > causes coupling between independent systems / processes. Can we remove
> >> > > the limit in favour of something more fine grained?
> >> >
> >> > Right. Unfortunately memcg doesn't distinguish kernel module
> >> > memory vs any other memory. All types of memory are memory.
> >> > Regardless of whether its type is per-cpu, bpf map memory, bpf jit memory, etc.
> >> > That's the main reason for the independent knob for JITed memory.
> >> > Since it's a bit special. It's a crude knob. Certainly not perfect.
> >>
> >> I'm missing context, how is JIT memory different from these other kinds of code?
> >>
> >> Lorenz
> >>
> >> --
> >> Lorenz Bauer  |  Systems Engineer
> >> 6th Floor, County Hall/The Riverside Building, SE1 7PB, UK
> >>
> >> www.cloudflare.com
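
For readers following the accounting question, here is a minimal userspace sketch of the two orderings discussed in this thread. It is not the kernel code: jit_current only stands in for bpf_jit_current, and jit_limit_pages, the "privileged" flag (standing in for the capability check) and both function names are assumptions made up for illustration. Variant A charges every caller and enforces the limit on unprivileged callers only, so the counter stays an overall gauge. Variant B checks privilege first, so privileged allocations never appear in the counter.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model only; names and values are hypothetical. */
static atomic_long jit_current;      /* stands in for bpf_jit_current, in pages */
static long jit_limit_pages = 4;     /* made-up limit, in pages */

/*
 * Variant A: charge first, then check privilege.
 * Every allocation is counted; only unprivileged callers are refused
 * (and rolled back) once the total exceeds the limit.
 */
static int charge_track_all(long pages, bool privileged)
{
	if (atomic_fetch_add(&jit_current, pages) + pages > jit_limit_pages) {
		if (!privileged) {
			atomic_fetch_sub(&jit_current, pages);
			return -EPERM;
		}
	}
	return 0;
}

/*
 * Variant B: check privilege before charging.
 * Privileged allocations bypass the counter, so they can neither
 * lock out unprivileged users nor show up in the gauge.
 */
static int charge_track_unpriv(long pages, bool privileged)
{
	if (privileged)
		return 0;
	if (atomic_fetch_add(&jit_current, pages) + pages > jit_limit_pages) {
		atomic_fetch_sub(&jit_current, pages);
		return -EPERM;
	}
	return 0;
}
```

In this model, a privileged caller that fills the limit under variant A causes the next unprivileged charge to fail, which is the "lock out" case. Under variant B the same privileged charge leaves the counter at zero, which is why a leak in privileged allocations would no longer be visible through it. A matching uncharge path would need the same privilege test in variant B to keep the counter balanced.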