On Mon, 27 Jun 2022, Alexei Starovoitov wrote:

> On Mon, Jun 27, 2022 at 5:17 PM Christoph Lameter <cl@xxxxxxxxx> wrote:
> >
> > > From: Alexei Starovoitov <ast@xxxxxxxxxx>
> > >
> > > Introduce any context BPF specific memory allocator.
> > >
> > > Tracing BPF programs can attach to kprobe and fentry. Hence they
> > > run in unknown context where calling plain kmalloc() might not be safe.
> > > Front-end kmalloc() with per-cpu per-bucket cache of free elements.
> > > Refill this cache asynchronously from irq_work.
> >
> > GFP_ATOMIC etc is not going to work for you?
>
> slab_alloc_node->slab_alloc->local_lock_irqsave
> kprobe -> bpf prog -> slab_alloc_node -> deadlock.
> In other words, the slow path of slab allocator takes locks.

That is a relatively new feature due to RT logic support. Without RT this
would be a simple irq disable.

Generally, doing slab allocation while debugging slab allocation is not
something that can work. Can we exempt RT locks/irqsave or slab alloc from
BPF tracing?

I would assume that other key items of kernel logic will have similar
issues.

> Which makes it unsafe to use from tracing bpf progs.
> That's why we preallocated all elements in bpf maps,
> so there are no calls to mm or rcu logic.
> bpf specific allocator cannot use locks at all.
> try_lock approach could have been used in alloc path,
> but free path cannot fail with try_lock.
> Hence the algorithm in this patch is purely lockless.
> bpf prog can attach to spin_unlock_irqrestore and
> safely do bpf_mem_alloc.

That is generally safe unless you get into reentrancy issues with memory
allocation.

Which begs the question:

What happens if I try to use BPF to trace *your* shiny new memory
allocation functions in the BPF logic like bpf_mem_alloc? How do you stop
that from happening?
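
To make the reentrancy question concrete, here is a rough userspace sketch
of one possible guard: a per-CPU nesting counter that makes a nested
allocation fail instead of touching a free list that is in the middle of an
update. This is not taken from the patch. All names (demo_cache, demo_alloc,
demo_probe and so on) are hypothetical, a thread-local variable stands in
for per-CPU data, and a plain function pointer stands in for a probe firing
inside the allocator.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 4

/* Hypothetical model of one per-CPU cache of free elements. */
struct demo_cache {
    int active;                   /* nesting depth of demo_alloc() here */
    int nr_free;                  /* elements currently on the free list */
    void *free_list[CACHE_SLOTS];
};

/* Thread-local storage stands in for per-CPU data (an assumption). */
static _Thread_local struct demo_cache cache;
static _Thread_local char backing[CACHE_SLOTS][64];

/* Stand-in for a probe that fires in the middle of the allocator. */
static void (*demo_probe)(void);

/* Prefill the cache; models the asynchronous refill, done synchronously. */
static void demo_refill(void)
{
    assert(cache.active == 0);    /* never refill from inside demo_alloc() */
    cache.nr_free = 0;
    for (int i = 0; i < CACHE_SLOTS; i++)
        cache.free_list[cache.nr_free++] = backing[i];
}

static void *demo_alloc(void)
{
    void *obj = NULL;

    if (++cache.active == 1) {
        /* Outermost entry on this "CPU": safe to touch the free list. */
        if (demo_probe)
            demo_probe();         /* a traced point inside the allocator */
        if (cache.nr_free > 0)
            obj = cache.free_list[--cache.nr_free];
    }
    /* A nested entry skips the list and returns NULL instead of recursing. */
    cache.active--;
    return obj;
}

/* A "tracing program" that tries to allocate from inside the allocator. */
static void *nested_result = (void *)1;
static int nested_calls;

static void nested_probe(void)
{
    nested_result = demo_alloc();
    nested_calls++;
}
```

With demo_probe set to nested_probe, the outer demo_alloc() still succeeds
while the nested call gets NULL and leaves the list alone. This only covers
the alloc side: as noted above, a free path cannot simply fail, so it would
need some other fallback.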