On 15.09.22 06:35, venkatch@xxxxxxxxx wrote:
From: Venkat Chimata <venkata@shasta.cloud>
OpenWiFi's ucentral agent periodically (typically 120 seconds)
issues nl80211 call to get associated client list from the
WLAN driver. Somehow this operation was causing tx/rx delays
sometimes and the video calls on connected clients are experiencing
jitter. The associated client list was protected by
a mutex lock. I saw that ieee80211_check_fast_xmit_all uses
rcu_read_lock and rcu_read_unlock to iterate through sta_list.
I took it as a refernce and changed the lock to rcu_read lock
from mutex.
Also saw this this comment just above sta_mutex declaration.
/* Station data */
/*
* The mutex only protects the list, hash table and
* counter, reads are done with RCU.
*/
struct mutex sta_mtx;
Hence tried changing mutex_lock(/unlock) in ieee80211_dump_station
to rcu_read_lock(/unlock) and it resolved the jitter issue in the
video calls.
Tests:
We had this issue show up consistently and the patch fixed the issue.
We spent a good part of 2 weeks following up on this and with this
fix, the video calls are smooth.
Also tested if this could cause any crashes with below mentioned
process.
1. Connect 3 clients
2. A script running dumping clients in an infinite loop
3. Continuously disconnect / connect one client to see if
there is any crash. No crash was observed.
The drv_sta_statistics call (called from sta_set_sinfo) can sleep, so
you're not allowed to use RCU for iterating over the station list.
Also, please analyze and explain the actual root cause before proposing
another fix.
- Felix