On Thu, 12 Dec 2024 at 22:26, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Thu, Dec 12, 2024 at 4:41 PM Siddharth Chintamaneni
> <sidchintamaneni@xxxxxxxxx> wrote:
> >
> > On Thu, 12 Dec 2024 at 18:58, Priya Bala Govindasamy <pgovind2@xxxxxxx> wrote:
> > >
> > > BPF program types like kprobe and fentry can cause deadlocks in certain
> > > situations. If a function takes a lock and one of these bpf programs is
> > > hooked to some point in the function's critical section, and if the
> > > bpf program tries to call the same function and take the same lock it will
> > > lead to deadlock. These situations have been reported in the following
> > > bug reports.
> > >
> > > In percpu_freelist -
> > > Link: https://lore.kernel.org/bpf/CAADnVQLAHwsa+2C6j9+UC6ScrDaN9Fjqv1WjB1pP9AzJLhKuLQ@xxxxxxxxxxxxxx/T/
> > > Link: https://lore.kernel.org/bpf/CAPPBnEYm+9zduStsZaDnq93q1jPLqO-PiKX9jy0MuL8LCXmCrQ@xxxxxxxxxxxxxx/T/
> > > In bpf_lru_list -
> > > Link: https://lore.kernel.org/bpf/CAPPBnEajj+DMfiR_WRWU5=6A7KKULdB5Rob_NJopFLWF+i9gCA@xxxxxxxxxxxxxx/T/
> > > Link: https://lore.kernel.org/bpf/CAPPBnEZQDVN6VqnQXvVqGoB+ukOtHGZ9b9U0OLJJYvRoSsMY_g@xxxxxxxxxxxxxx/T/
> > > Link: https://lore.kernel.org/bpf/CAPPBnEaCB1rFAYU7Wf8UxqcqOWKmRPU1Nuzk3_oLk6qXR7LBOA@xxxxxxxxxxxxxx/T/
> > >
> > > Similar bugs have been reported by syzbot.
> > > In queue_stack_maps -
> > > Link: https://lore.kernel.org/lkml/0000000000004c3fc90615f37756@xxxxxxxxxx/
> > > Link: https://lore.kernel.org/all/20240418230932.2689-1-hdanton@xxxxxxxx/T/
> > > In lpm_trie -
> > > Link: https://lore.kernel.org/linux-kernel/00000000000035168a061a47fa38@xxxxxxxxxx/T/
> > > In ringbuf -
> > > Link: https://lore.kernel.org/bpf/20240313121345.2292-1-hdanton@xxxxxxxx/T/
> > >
> > > Prevent kprobe and fentry bpf programs from attaching to these critical
> > > sections by removing CC_FLAGS_FTRACE for percpu_freelist.o,
> > > bpf_lru_list.o, queue_stack_maps.o, lpm_trie.o, ringbuf.o files.
> > >
> >
> > I think the current solution is to use a per-CPU variable to prevent
> > deadlocks. You can look at the hashmap implementation for reference.
> > However, ABBA deadlocks are still possible, so to avoid these, I think
> > the BPF community is working towards implementing resilient spinlocks.
>
> Right. The resilient spinlocks are wip, but in the meantime
> we need to stop the bleeding.
>

Ok I can resend the patches I was working on.
https://lore.kernel.org/all/202405041108.2Up5HT0H-lkp@xxxxxxxxx/T/

I remember that you shared the RFC patch set for resilient spinlocks
with me, but I didn't get a chance to check them at the time. Now that
I have more free time, I'd be happy to help you test that work if you'd
like.

> > I was planning to send patches for some of these bugs earlier. I'm
> > wondering if per-CPU checks would still be valid once resilient
> > spinlocks are introduced?
>
> The wip patches with res_spin_lock remove these per-cpu
> recursion counters from hash map and other places.
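
For anyone following the thread, here is a rough userspace sketch of the
per-CPU recursion counter idea discussed above. It is only an
illustration under assumptions, not the hash map code: a thread-local
variable stands in for the per-CPU counter, a pthread mutex stands in
for the bucket spinlock, and every name in it (map_busy, bucket_lock,
traced_hook, map_update, reentrant_prog, demo_reentry) is made up for
the example. The hook models a kprobe/fentry program attached inside the
critical section that calls back into the same function.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Assumption: thread-local storage stands in for a per-CPU counter. */
static _Thread_local int map_busy;

/* Assumption: a plain mutex stands in for the map's spinlock. */
static pthread_mutex_t bucket_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical attach point: models a BPF program hooked inside the
 * critical section of map_update(). */
static void (*traced_hook)(void);

static int slot;          /* the "map" being updated */
static int nested_result; /* what the re-entrant call returned */

static int map_update(int *dst, int value)
{
	/* Already inside map_update() on this context: fail fast instead
	 * of trying to take a lock we already hold. */
	if (++map_busy != 1) {
		--map_busy;
		return -EBUSY;
	}

	pthread_mutex_lock(&bucket_lock);
	*dst = value;
	if (traced_hook)
		traced_hook();	/* hook fires while the lock is held */
	pthread_mutex_unlock(&bucket_lock);

	--map_busy;
	return 0;
}

/* Models the attached program: it calls the function it is hooked to. */
static void reentrant_prog(void)
{
	nested_result = map_update(&slot, 99);
}

/* Runs one outer update with the hook installed and returns the result
 * of the nested call made from inside the critical section. */
static int demo_reentry(void)
{
	int outer;

	traced_hook = reentrant_prog;
	outer = map_update(&slot, 1);
	traced_hook = NULL;

	assert(outer == 0);	/* the outer update itself succeeds */
	assert(map_busy == 0);	/* the counter is balanced afterwards */
	return nested_result;
}
```

Without the counter check, the nested call would block forever on
bucket_lock. With it, the nested call fails with -EBUSY and the outer
update finishes normally. As noted above, a guard like this only covers
re-entry on the same lock from the same context; it does nothing for
ABBA ordering between two different locks.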