On 20/09/2024 10:28, Remi Pommarel wrote:
On Thu, Sep 19, 2024 at 12:22:10PM +0200, Johannes Berg wrote:
On Thu, 2024-09-19 at 12:02 +0200, Nicolas Cavallari wrote:
Did I miss something ? Which one should we do ? I'm not sure of all the
implications of switching to the wiphy work queue and why it did not get
converted like the color_change_finalize_work stuff ?
ieee80211_color_collision_detection_work() used to lock the wdev mutex, now it
does not hold anything since 076fc8775da("wifi: cfg80211: remove wdev mutex")
Also the rate limiting uses delayed_work_pending(), There is no wiphy work queue
equivalent AFAIK, so the explicit lock is probably the way to go.
That won't work, it's cancel_delayed_work_sync() under the wiphy mutex,
so that'll cause deadlocks (and should cause lockdep complaints about
them.)
Yes you are right, and AFAIU that is this kind of issue using wiphy work
queue would prevent. With wiphy work queue if wiphy_delayed_work_cancel
is called with wiphy lock held, work cannot be running after it returns;
making it handy to replace cancel_delayed_work_sync() with.
So, in my opinion, switching to wiphy work queue seems to be the
prefered solution here.
While there is no wiphy work queue equivalent of delayed_work_pending(),
I think using timer_pending(&link->color_collision_detect_work->timer)
to replace delayed_work_pending(), even if the semantic is a bit
different, would be ok to fulfill the rate limiting purpose. Having the
same delayed_work_pending() semantics on wiphy work queue would require
to take wiphy lock which seem a bit superfluous here.
Does that make sense ?
At a cost of a spin_trylock, we can also simply take the __ratelimit() option
(or code an equivalent) and schedule a wiphy_work immediately.
The goal is just to rate limit the work (otherwise netlink is flooded and the
system is hosed). queue_delayed_work() and delayed_work_pending() was just an
easy way of implementing it since it had to be a work anyway.