On Fri, 2024-09-20 at 10:41 +0200, Remi Pommarel wrote: > > > That won't work, it's cancel_delayed_work_sync() under the wiphy mutex, > > > so that'll cause deadlocks (and should cause lockdep complaints about > > > them.) > > > > Yes you are right, and AFAIU that is this kind of issue using wiphy work > > queue would prevent. With wiphy work queue if wiphy_delayed_work_cancel > > is called with wiphy lock held, work cannot be running after it returns; > > making it handy to replace cancel_delayed_work_sync() with. Right, and safe under the lock, since the lock is taken externally to the actual work function. It either is completed or still on the list, both cases are handled safely (do nothing and remove, respectively.) > > So, in my opinion, switching to wiphy work queue seems to be the > > prefered solution here. Yes. > > While there is no wiphy work queue equivalent of delayed_work_pending(), Maybe we should add it? > > I think using timer_pending(&link->color_collision_detect_work->timer) > > to replace delayed_work_pending(), even if the semantic is a bit > > different, would be ok to fulfill the rate limiting purpose. I think you're right. We could as well check list_empty() or so, but it wouldn't make a big difference. As you say: > > Having the > > same delayed_work_pending() semantics on wiphy work queue would require > > to take wiphy lock To really have it known _precisely_, that's true. > > which seem a bit superfluous here. It's actually simply also not possible - if we could sleep in ieee80211_obss_color_collision_notify() and take the wiphy mutex, we'd probably have done this completely differently :) And a hypothetical wiphy_delayed_work_pending() API should therefore not be required to be called with the wiphy mutex held. I think that perhaps we should add such a trivial inline instead of open-coding the timer_pending() check, but I'm not sure whether or not it should also check the list (i.e. check if timer has expired, but work hasn't started yet): on the one hand it seems more appropriate, and if actually holding the wiphy mutex it would in fact be completely correct, on the other hand maybe it encourages being sloppily thinking the return value is always perfect? Right now I'd tend to have the check and document that it's only guaranteed when under the wiphy mutex. > Forgot to mention that I am currently running a wiphy work version of > ieee80211_color_collision_detection_work() using timer_pending() as a > delayed_work_pending() substitute since a couple of hours and lockdep > did not complain so far. :) johannes