Hi, On 11/9/2024 3:52 AM, Alexei Starovoitov wrote: > On Wed, Nov 6, 2024 at 1:49 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote: >> Hi, >> >> On 11/6/2024 4:45 PM, Sebastian Andrzej Siewior wrote: >>> On 2024-11-06 14:35:39 [+0800], Hou Tao wrote: >>>> From: Hou Tao <houtao1@xxxxxxxxxx> >>>> >>>> Hi, >>> Hi Hou, >>> >>>> The patch set fixes a lockdep warning for htab of map. The >>>> warning is found when running test_maps. The warning occurs when >>>> htab_put_fd_value() attempts to acquire map_idr_lock to free the map id >>>> of the inner map while already holding the bucket lock (raw_spinlock_t). >>>> >>>> The fix moves the invocation of free_htab_elem() after >>>> htab_unlock_bucket() and adds a test case to verify the solution. Please >>>> see the individual patches for details. Comments are always welcome. > The fix makes sense. > I manually resolved merge conflict and applied. Thanks for the manually conflict resolving. However, the patch set doesn't move all free operations outside of lock scope (e.g., for bpf_map_lookup_and_delete()), because htab of maps doesn't support it. I could post another patch set to do that. > >>> Thank you. >>> >>> Acked-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> >>> >>> I've seen that you didn't move check_and_free_fields() out of the bucket >>> locked section. Type BPF_TIMER does hrtimer_cancel() if the timer >>> happens to run on a remote CPU. On PREEMPT_RT this will acquire a >>> sleeping lock which is problematic due to the raw_spinlock_t. >>> Would it be okay, to cleanup the timer unconditionally via the >>> workqueue? >> Yes. The patch set still invokes check_and_free_fields() under the >> bucket lock when updating an existing element in a pre-allocated htab. I >> missed the hrtimer case. For the sleeping lock, you mean the >> cpu_base->softirq_expiry_lock in hrtimer_cancel_waiting_running(), right >> ? Instead of cancelling the timer in workqueue, maybe we could save the >> old value temporarily in the bucket lock, and try to free it outside of >> the bucket lock or disabling the extra_elems logic temporarily for the >> case ? > We definitely need to avoid spamming wq when cancelling timers. > wq may not be able to handle the volume. > Moving it outside of bucket lock is certainly better. > . OK. Will try whether there is a better way to handle the cancelling of timer case.