David Miller <davem@xxxxxxxxxxxxx> writes:

> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Date: Fri, 14 Feb 2020 14:39:17 +0100
>
>> This is a follow up to the initial patch series which David posted a
>> while ago:
>>
>>  https://lore.kernel.org/bpf/20191207.160357.828344895192682546.davem@xxxxxxxxxxxxx/
>>
>> which was (while non-functional on RT) a good starting point for further
>> investigations.
>
> This looks really good after a cursory review, thanks for doing this work.
>
> I was personally unaware of the pre-allocation rules for MAPs used by
> tracing et al.  And that definitely shapes how this should be handled.

Hmm. I just noticed that my analysis only holds for PERF events. But
that's broken on mainline already.

Assume the following simplified callchain:

  kmalloc() from regular non BPF context
    cache empty
      freelist empty
        lock(zone->lock);
          tracepoint or kprobe
            BPF()
              update_elem()
                lock(bucket)
                  kmalloc()
                    cache empty
                      freelist empty
                        lock(zone->lock);  <- DEADLOCK

So really, preallocation _must_ be enforced for all variants of
intrusive instrumentation. There are no ifs or buts, it's simply
mandatory, as all intrusive instrumentation has to follow the only
sensible principle: KISS = Keep It Safe and Simple.

The above is a perfectly valid scenario and works with perf and tracing,
so it has to work with BPF in the same safe way.

I might be missing some magic enforcement of that, but I got lost in the
maze.

Thanks,

        tglx
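
For illustration only, the callchain above can be modelled in plain
userspace C. This is a rough sketch, not kernel code: toy_lock,
toy_kmalloc(), update_elem_runtime_alloc(), update_elem_prealloc() and
the element pool are all hypothetical names made up for this example,
and the toy lock reports -EDEADLK where a real non-recursive spinlock
would simply spin forever.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a non-recursive spinlock (hypothetical, not a kernel API). */
struct toy_lock { bool held; };

static struct toy_lock zone_lock;   /* models zone->lock */
static struct toy_lock bucket_lock; /* models the hash bucket lock */

/* Hypothetical preallocated element pool, filled at map creation time. */
#define POOL_SIZE 4
static int pool[POOL_SIZE];
static size_t pool_used;

int toy_lock_acquire(struct toy_lock *l)
{
    if (l->held)
        return -EDEADLK; /* a real spinlock would spin forever here */
    l->held = true;
    return 0;
}

void toy_lock_release(struct toy_lock *l)
{
    assert(l->held); /* releasing a lock that is not held is a bug */
    l->held = false;
}

/*
 * Models the allocator slow path: cache empty, freelist empty, so the
 * zone lock is taken. trace_hook models a tracepoint or kprobe that
 * fires while the zone lock is held; pass NULL for "no probe attached".
 */
int toy_kmalloc(int (*trace_hook)(void))
{
    int ret = toy_lock_acquire(&zone_lock);

    if (ret)
        return ret;
    if (trace_hook)
        ret = trace_hook();
    toy_lock_release(&zone_lock);
    return ret;
}

/* Map update that allocates at run time: re-enters the allocator. */
int update_elem_runtime_alloc(void)
{
    int ret = toy_lock_acquire(&bucket_lock);

    if (ret)
        return ret;
    ret = toy_kmalloc(NULL); /* needs zone_lock, which the caller holds */
    toy_lock_release(&bucket_lock);
    return ret;
}

/* Map update that only consumes preallocated elements: no allocator call. */
int update_elem_prealloc(void)
{
    int ret = toy_lock_acquire(&bucket_lock);

    if (ret)
        return ret;
    if (pool_used < POOL_SIZE)
        pool[pool_used++] = 1;
    else
        ret = -ENOMEM;
    toy_lock_release(&bucket_lock);
    return ret;
}
```

Calling toy_kmalloc(update_elem_runtime_alloc) hits the nested
lock(zone->lock) and reports the deadlock, while
toy_kmalloc(update_elem_prealloc) completes because the update never
goes back into the allocator.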