Search Linux Wireless

Re: [PATCH v2] wifi: mac80211: Fix performance issue with mutex_lock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Felix,

 Thanks again for patiently discussing things. I clearly understand
what you said about locks. Now that we know the background of the
problem, do you suggest any potential solution or any potential
direction that I should start investigating? Please let me know.

Thanks & Regards
Venkat

On Wed, 21 Sept 2022 at 13:35, Felix Fietkau <nbd@xxxxxxxx> wrote:
>
>
> On 20.09.22 21:23, Venkat Ch wrote:
> > Hi Felix,
> >
> >   Following is the background of the problem, how I traced to
> > mutex_lock and why I propose rcu locks.
> >
> > Issue:
> >   On a 10Mbps upload / 50 Mbps download connection, the following issue reported.
> >
> > Video periodically freezes and/or appears delayed when on Zoom or Teams.
> > 1. Video will freeze for 10 or 15 seconds periodically when on a call,
> > but audio continues and the session doesn't drop.
> > 2. The video  still works but it appears delay (I move, but the video
> > of my movement is noticeably delay by a second or so)
> >
> > Tracing to mutex_lock(sta_mutex):
> >
> >   When I investigated, I found that the ucentral agent in openwifi
> > fetches the station list periodically.  Without the station list
> > fetch, the video quality is just fine. I investigated the station list
> > path and found this mutex_lock. I also see that earlier it was
> > rcu_lock which protected the station list in this path. In this
> > commit, https://github.com/torvalds/linux/commit/66572cfc30a4b764150c83ee5d842a3ce17991c9,
> > rcu lock was changed to mutex lock without providing any reason.
> The reason seems clear to me, even though it was not explicitly stated
> in the commit message: in sta_set_sinfo it introduces a call to a driver
> op that is allowed to sleep.
>
> > I also saw this comment just above the sta_mutex declaration.
> >          /* Station data */
> >          /*
> >           * The mutex only protects the list, hash table and
> >           * counter, reads are done with RCU.
> >           */
> >          struct mutex sta_mtx;
> >
> > So I reverted back the mutex_lock with rcu_lock and it just worked
> > fine. We tested for more than 2 weeks before concluding this analysis.
> >
> > I think the usage of mutex_lock is impacting the tx / rx path
> > somewhere and hence the issue. It is a challenge to trace the exact
> > function though.
>
> I don't see any critical part in the tx/rx path which depends on the
> sta_mtx lock. My guess is that for some reason your change is simply
> accidentally papering over the real bug, which may be somewhere else
> entirely, maybe even in the driver. A freeze for 10-15 seconds
> definitely does not sound like simple lock contention on the mutex,
> since a single station dump will be much faster than that.
>
> - Felix



-- 
If you rest, you rust



[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Wireless Regulations]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux