Re: [PATCH] wifi: nl80211: fix deadlock in nl80211_set_cqm_rssi (6.6.x)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2023-12-11 at 07:47 +0100, Greg KH wrote:
> On Sun, Dec 10, 2023 at 09:39:30PM +0000, Léo Lam wrote:
> > Commit 4a7e92551618f3737b305f62451353ee05662f57 ("wifi: cfg80211: fix
> > CQM for non-range use" on 6.6.x) causes nl80211_set_cqm_rssi not to
> > release the wdev lock in some situations.
> > 
> > Of course, the ensuing deadlock causes userland network managers to
> > break pretty badly, and on typical systems this also causes lockups on
> > on suspend, poweroff and reboot. See [1], [2], [3] for example reports.
> > 
> > The upstream commit, 7e7efdda6adb385fbdfd6f819d76bc68c923c394
> > ("wifi: cfg80211: fix CQM for non-range use"), does not trigger this
> > issue because the wdev lock does not exist there.
> > 
> > Fix the deadlock by releasing the lock before returning.
> > 
> > [1] https://bugzilla.kernel.org/show_bug.cgi?id=218247
> > [2] https://bbs.archlinux.org/viewtopic.php?id=290976
> > [3] https://lore.kernel.org/all/87sf4belmm.fsf@xxxxxxxxxxxxx/
> > 
> > Fixes: 4a7e92551618 ("wifi: cfg80211: fix CQM for non-range use")
> > Cc: stable@xxxxxxxxxxxxxxx
> > Signed-off-by: Léo Lam <leo@xxxxxxxxx>
> > ---
> >  net/wireless/nl80211.c | 18 ++++++++++++------
> >  1 file changed, 12 insertions(+), 6 deletions(-)
> 

Apologies for the slow reply - been dealing with some eye soreness. :(

First of all, thank you for taking the time to review this and for
reverting the broken commit so quickly as it seems quite a few users
were hitting this.

> So this is only for the 6.6.y tree?  If so, you should at least cc: the
> other wireless developers involved in the original fix, right?
> 
You're right. Sorry I forgot to cc: johannes.berg@xxxxxxxxx; though just
to clarify, there is nothing wrong with their commit per se; the issue
comes from how it was backported without 076fc8775daf ("wifi: cfg80211:
remove wdev mutex").

> And what commit actually fixed this issue upstream, why not take that
> instead?
> 

As far as I understand, this was never an issue upstream because
076fc8775daf ("wifi: cfg80211: remove wdev mutex") was committed in
August, *before* commit 7e7efdda6adb ("wifi: cfg80211: fix CQM for non-
range use") added the early returns in late November. This only became
an issue on the 6.1.x and 6.6.x trees because the CQM fix commit was
applied without first applying the "remove wdev mutex" as well.

I did consider taking 076fc8775daf (i.e. removing the wdev mutex) and
applying it to the 6.6.x tree but that diff is much bigger than 100
lines long and I thought it would be simpler and safer to just fix the
buggy error handling. Especially for a newcomer who isn't very familiar
with the development process...


-- 
Thanks,
Leo






[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux