On Wed, May 29, 2024 at 2:50 PM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Wed, May 29, 2024 at 2:46 PM Amery Hung <ameryhung@xxxxxxxxx> wrote: > > > > On Wed, May 29, 2024 at 2:20 PM Alexei Starovoitov > > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > On Wed, May 29, 2024 at 8:53 AM Amery Hung <ameryhung@xxxxxxxxx> wrote: > > > > > > > > Hello, > > > > > > > > We are developing a tool to perform static analysis on the bpf > > > > subsystem to detect locking violations. Our tool reported the > > > > spin_lock_irqsave() in trie_delete_elem() and trie_update_elem() that > > > > could be called from an NMI. If a bpf program holding the lock is > > > > interrupted by the same program in NMI, a deadlock can happen. The > > > > report was generated for kernel version 6.6-rc4, however, we believe > > > > this should still exist in the latest kernel. > > > > > > Fix it similar to > > > https://lore.kernel.org/all/20230911132815.717240-1-toke@xxxxxxxxxx/ > > > ? > > > > I applied the similar fixing approach to trie->lock, and then I found > > the two other locks mentioned earlier. My feeling is that there might > > not be a use case to justify doing trylocks in memcg and rcu. If you > > think the approach below is okay. I can send a fixing patch. > > > > trie_update_elem() { > > + if (in_nmi()) > > + return -EBUSY; > > } > > That's too crude. Trylock is less surprising. > re: other locks (memcg and rcu). > I think it's time to switch the whole lpm map to bpf_mem_alloc like we > did for the hash table. Thanks for the pointer. That makes sense. I will test and send a fix.