Re: [PATCH v2] wifi: mac80211: Fix performance issue with mutex_lock

On 20.09.22 21:23, Venkat Ch wrote:
Hi Felix,

  Following is the background of the problem, how I traced it to
mutex_lock, and why I propose RCU locks.

Issue:
  On a 10 Mbps upload / 50 Mbps download connection, the following issue was reported.

Video periodically freezes and/or appears delayed when on Zoom or Teams.
1. Video will freeze for 10 or 15 seconds periodically when on a call,
but audio continues and the session doesn't drop.
2. The video still works but it appears delayed (I move, but the video
of my movement is noticeably delayed by a second or so)

Tracing to mutex_lock(sta_mtx):

  When I investigated, I found that the ucentral agent in openwifi
fetches the station list periodically.  Without the station list
fetch, the video quality is just fine. I investigated the station list
path and found this mutex_lock. I also see that this path was
previously protected by the RCU read lock. In this
commit, https://github.com/torvalds/linux/commit/66572cfc30a4b764150c83ee5d842a3ce17991c9,
the RCU lock was replaced by a mutex lock without any stated reason.
The reason seems clear to me, even though it was not explicitly stated in the commit message: in sta_set_sinfo it introduces a call to a driver op that is allowed to sleep.
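
To illustrate the constraint described above, here is a minimal userspace toy model in C. It is not kernel code: every name in it (model_rcu_read_lock, model_might_sleep, model_driver_sta_op, dump_stations_rcu, dump_stations_mutex) is hypothetical, and the counters only stand in for kernel state. The sketch assumes nothing beyond what the reply states: a driver op that may sleep must not run inside an RCU read-side critical section, whereas it may run while a mutex is held.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these counters stand in for kernel state. */
static int rcu_read_depth;   /* > 0 while inside a modelled RCU read-side section */
static int mutex_held;       /* 1 while the modelled station mutex is held */
static int sleep_violations; /* sleeping calls made where sleeping is not allowed */

static void model_rcu_read_lock(void)
{
    rcu_read_depth++;
}

static void model_rcu_read_unlock(void)
{
    assert(rcu_read_depth > 0);
    rcu_read_depth--;
}

static void model_mutex_lock(void)
{
    assert(!mutex_held);
    mutex_held = 1;
}

static void model_mutex_unlock(void)
{
    assert(mutex_held);
    mutex_held = 0;
}

/* Loosely modelled on a "may sleep here?" debug check. */
static void model_might_sleep(void)
{
    if (rcu_read_depth > 0)
        sleep_violations++;
}

/* Hypothetical driver callback that is allowed to sleep. */
static void model_driver_sta_op(size_t sta_index)
{
    (void)sta_index;
    model_might_sleep();
}

/* Station dump under the modelled RCU read lock: returns the violation count. */
int dump_stations_rcu(size_t n_sta)
{
    sleep_violations = 0;
    model_rcu_read_lock();
    for (size_t i = 0; i < n_sta; i++)
        model_driver_sta_op(i);
    model_rcu_read_unlock();
    return sleep_violations;
}

/* Station dump under the modelled mutex: sleeping is permitted here. */
int dump_stations_mutex(size_t n_sta)
{
    sleep_violations = 0;
    model_mutex_lock();
    for (size_t i = 0; i < n_sta; i++)
        model_driver_sta_op(i);
    model_mutex_unlock();
    return sleep_violations;
}
```

In this model, dump_stations_rcu() flags one violation per station visited, while dump_stations_mutex() flags none. That mirrors why switching the dump path back to RCU would only be safe if the sleeping driver call were moved out of the read-side section.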

I also saw this comment just above the sta_mtx declaration.
         /* Station data */
         /*
          * The mutex only protects the list, hash table and
          * counter, reads are done with RCU.
          */
         struct mutex sta_mtx;

So I reverted the mutex_lock back to rcu_lock and it worked
fine. We tested for more than 2 weeks before concluding this analysis.

I think the usage of mutex_lock is impacting the tx / rx path
somewhere and hence the issue. It is a challenge to trace the exact
function though.

I don't see any critical part in the tx/rx path which depends on the sta_mtx lock. My guess is that for some reason your change is simply accidentally papering over the real bug, which may be somewhere else entirely, maybe even in the driver. A freeze for 10-15 seconds definitely does not sound like simple lock contention on the mutex, since a single station dump will be much faster than that.

- Felix


