On 19/09/2024 10:15, Nicolas Escande wrote:
Hi guys,
Running a pretty hacked up kernel under lockdep, I've had a few splats like this
[ 75.453180] Call trace:
[ 75.455595] cfg80211_bss_color_notify+0x220/0x260
[ 75.460341] ieee80211_color_collision_detection_work+0x4c/0x5c
[ 75.466205] process_one_work+0x434/0x6ec
[ 75.470169] worker_thread+0x9c/0x624
[ 75.473794] kthread+0x1a4/0x1ac
[ 75.476987] ret_from_fork+0x10/0x20
Which shows we are calling cfg80211_obss_color_collision_notify and thus
cfg80211_bss_color_notify from ieee80211_color_collision_detection_work without
holding the wiphy's mtx.
It seems that we should either explicitely lock the phy using something like
diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
index 847304a3a29a..481e1550cb21 100644
--- a/net/mac80211/cfg.c
+++ b/net/mac80211/cfg.c
@@ -4833,9 +4833,12 @@ void ieee80211_color_collision_detection_work(struct work_struct *work)
container_of(delayed_work, struct ieee80211_link_data,
color_collision_detect_work);
struct ieee80211_sub_if_data *sdata = link->sdata;
+ struct ieee80211_local *local = sdata->local;
+ wiphy_lock(local->hw.wiphy);
cfg80211_obss_color_collision_notify(sdata->dev, link->color_bitmap,
link->link_id);
+ wiphy_unlock(local->hw.wiphy);
}
void ieee80211_color_change_finish(struct ieee80211_vif *vif, u8 link_id)
Or switch to using the wiphy_work_queue infrastructure.
Did I miss something ? Which one should we do ? I'm not sure of all the
implications of switching to the wiphy work queue and why it did not get
converted like the color_change_finalize_work stuff ?
ieee80211_color_collision_detection_work() used to lock the wdev mutex, now it
does not hold anything since 076fc8775da("wifi: cfg80211: remove wdev mutex")
Also the rate limiting uses delayed_work_pending(), There is no wiphy work queue
equivalent AFAIK, so the explicit lock is probably the way to go.