Hi,

On 11/9/2024 3:55 AM, Alexei Starovoitov wrote:
> On Wed, Nov 6, 2024 at 1:53 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>>
>>
>> On 11/6/2024 2:35 PM, Hou Tao wrote:
>>> From: Hou Tao <houtao1@xxxxxxxxxx>
>>>
>>> For htab of maps, when the map is removed from the htab, it may hold the
>>> last reference of the map. bpf_map_fd_put_ptr() will invoke
>>> bpf_map_free_id() to free the id of the removed map element. However,
>>> bpf_map_fd_put_ptr() is invoked while holding a bucket lock
>>> (raw_spin_lock_t), and bpf_map_free_id() attempts to acquire map_idr_lock
>>> (spinlock_t), triggering the following lockdep warning:
>>>
>>> =============================
>>> [ BUG: Invalid wait context ]
>>> 6.11.0-rc4+ #49 Not tainted
>>> -----------------------------
>>> test_maps/4881 is trying to lock:
>>> ffffffff84884578 (map_idr_lock){+...}-{3:3}, at: bpf_map_free_id.part.0+0x21/0x70
>>> other info that might help us debug this:
>>> context-{5:5}
>>> 2 locks held by test_maps/4881:
>>>  #0: ffffffff846caf60 (rcu_read_lock){....}-{1:3}, at: bpf_fd_htab_map_update_elem+0xf9/0x270
>>>  #1: ffff888149ced148 (&htab->lockdep_key#2){....}-{2:2}, at: htab_map_update_elem+0x178/0xa80
>>> stack backtrace:
>>> CPU: 0 UID: 0 PID: 4881 Comm: test_maps Not tainted 6.11.0-rc4+ #49
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
>>> Call Trace:
>>>  <TASK>
>>>  dump_stack_lvl+0x6e/0xb0
>>>  dump_stack+0x10/0x20
>>>  __lock_acquire+0x73e/0x36c0
>>>  lock_acquire+0x182/0x450
>>>  _raw_spin_lock_irqsave+0x43/0x70
>>>  bpf_map_free_id.part.0+0x21/0x70
>>>  bpf_map_put+0xcf/0x110
>>>  bpf_map_fd_put_ptr+0x9a/0xb0
>>>  free_htab_elem+0x69/0xe0
>>>  htab_map_update_elem+0x50f/0xa80
>>>  bpf_fd_htab_map_update_elem+0x131/0x270
>>>  htab_map_update_elem+0x50f/0xa80
>>>  bpf_fd_htab_map_update_elem+0x131/0x270
>>>  bpf_map_update_value+0x266/0x380
>>>  __sys_bpf+0x21bb/0x36b0
>>>  __x64_sys_bpf+0x45/0x60
>>>  x64_sys_call+0x1b2a/0x20d0
>>>  do_syscall_64+0x5d/0x100
>>>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>>
>>> One way to fix the lockdep warning is using raw_spinlock_t for
>>> map_idr_lock as well. However, bpf_map_alloc_id() invokes
>>> idr_alloc_cyclic() after acquiring map_idr_lock, it will trigger a
>>> similar lockdep warning because the slab's lock (s->cpu_slab->lock) is
>>> still a spinlock.
>>
>> Is it OK to move the calling of bpf_map_free_id() from bpf_map_put() to
>> bpf_map_free_deferred() ? It could fix the lockdep warning for htab of
>> maps. Its downside is that the free of map id will be delayed, but I
>> think it will not make a visible effect to the user, because the refcnt
>> is already 0, trying to get the map fd through map id will return -ENOENT.
>
> I've applied the current patch, since doing free outside of bucket lock
> is a good thing to do anyway.
> With that no need to move bpf_map_free_id(), right?
> At least offload case relies on id being removed immediately.
> I don't remember what else might care.
> .

Yes. There is no need to move bpf_map_free_id() after applying the patch.

Thanks for the information about the offload map. According to the
implementation of __bpf_map_offload_destroy(), it seems that it may end up
calling bpf_map_free_id() twice, so bpf_map_free_id() frees the id
immediately and sets map->id to 0.
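
To make the "called twice" point concrete, the id release logic I have in
mind looks roughly like the sketch below (paraphrased from memory, not the
verbatim code in kernel/bpf/syscall.c):

    /* Sketch only: free the id eagerly and clear map->id under
     * map_idr_lock, so a later call for the same map is a no-op.
     */
    void bpf_map_free_id(struct bpf_map *map)
    {
            unsigned long flags;

            /* id 0 means the id was never assigned or already freed */
            if (!map->id)
                    return;

            spin_lock_irqsave(&map_idr_lock, flags);
            idr_remove(&map_idr, map->id);
            map->id = 0;
            spin_unlock_irqrestore(&map_idr_lock, flags);
    }

Because map->id is cleared under map_idr_lock, a second invocation (e.g.
from the offload teardown path) simply returns early, while the id itself
is still removed immediately, which is what the offload case relies on.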