On Wed, Nov 6, 2024 at 1:53 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote: > > > > On 11/6/2024 2:35 PM, Hou Tao wrote: > > From: Hou Tao <houtao1@xxxxxxxxxx> > > > > For htab of maps, when the map is removed from the htab, it may hold the > > last reference of the map. bpf_map_fd_put_ptr() will invoke > > bpf_map_free_id() to free the id of the removed map element. However, > > bpf_map_fd_put_ptr() is invoked while holding a bucket lock > > (raw_spin_lock_t), and bpf_map_free_id() attempts to acquire map_idr_lock > > (spinlock_t), triggering the following lockdep warning: > > > > ============================= > > [ BUG: Invalid wait context ] > > 6.11.0-rc4+ #49 Not tainted > > ----------------------------- > > test_maps/4881 is trying to lock: > > ffffffff84884578 (map_idr_lock){+...}-{3:3}, at: bpf_map_free_id.part.0+0x21/0x70 > > other info that might help us debug this: > > context-{5:5} > > 2 locks held by test_maps/4881: > > #0: ffffffff846caf60 (rcu_read_lock){....}-{1:3}, at: bpf_fd_htab_map_update_elem+0xf9/0x270 > > #1: ffff888149ced148 (&htab->lockdep_key#2){....}-{2:2}, at: htab_map_update_elem+0x178/0xa80 > > stack backtrace: > > CPU: 0 UID: 0 PID: 4881 Comm: test_maps Not tainted 6.11.0-rc4+ #49 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ... > > Call Trace: > > <TASK> > > dump_stack_lvl+0x6e/0xb0 > > dump_stack+0x10/0x20 > > __lock_acquire+0x73e/0x36c0 > > lock_acquire+0x182/0x450 > > _raw_spin_lock_irqsave+0x43/0x70 > > bpf_map_free_id.part.0+0x21/0x70 > > bpf_map_put+0xcf/0x110 > > bpf_map_fd_put_ptr+0x9a/0xb0 > > free_htab_elem+0x69/0xe0 > > htab_map_update_elem+0x50f/0xa80 > > bpf_fd_htab_map_update_elem+0x131/0x270 > > htab_map_update_elem+0x50f/0xa80 > > bpf_fd_htab_map_update_elem+0x131/0x270 > > bpf_map_update_value+0x266/0x380 > > __sys_bpf+0x21bb/0x36b0 > > __x64_sys_bpf+0x45/0x60 > > x64_sys_call+0x1b2a/0x20d0 > > do_syscall_64+0x5d/0x100 > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > > > One way to fix the lockdep warning is using raw_spinlock_t for > > map_idr_lock as well. However, bpf_map_alloc_id() invokes > > idr_alloc_cyclic() after acquiring map_idr_lock, it will trigger a > > similar lockdep warning because the slab's lock (s->cpu_slab->lock) is > > still a spinlock. > > Is it OK to move the calling of bpf_map_free_id() from bpf_map_put() to > bpf_map_free_deferred() ? It could fix the lockdep warning for htab of > maps. Its downside is that the free of map id will be delayed, but I > think it will not make a visible effect to the user, because the refcnt > is already 0, trying to get the map fd through map id will return -ENOENT. I've applied the current patch, since doing free outside of bucket lock is a good thing to do anyway. With that no need to move bpf_map_free_id(), right? At least offload case relies on id being removed immediately. I don't remember what else might care.