Search Linux Wireless

Re: [EXT] Re: [PATCH] mwifiex: fix nested rtnl locking on BG_SCAN_STOPPED

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Ganapathi,

For some reason, I'm daft enough to reply to this ancient thread. It
appears as if you probably have not resolved this issue yet though, so
I figured I could lend some advice.

On Wed, Apr 25, 2018 at 1:06 AM Ganapathi Bhat <gbhat@xxxxxxxxxxx> wrote:
> > Ganapathi Bhat <gbhat@xxxxxxxxxxx> writes:
> > > --- a/drivers/net/wireless/marvell/mwifiex/sta_event.c
> > > +++ b/drivers/net/wireless/marvell/mwifiex/sta_event.c
> > > @@ -848,7 +848,10 @@ int mwifiex_process_sta_event(struct
> > > mwifiex_private *priv)
> > >
> > >     case EVENT_BG_SCAN_STOPPED:
> > >             dev_dbg(adapter->dev, "event: BGS_STOPPED\n");
> > > -           cfg80211_sched_scan_stopped(priv->wdev.wiphy, 0);
> > > +           if (rtnl_is_locked())
> > > +                   cfg80211_sched_scan_stopped_rtnl(priv-
> > >wdev.wiphy, 0);
> > > +           else
> > > +                   cfg80211_sched_scan_stopped(priv->wdev.wiphy,
> > 0);
> >
> > IMHO checking if a lock is taking is rather ugly and an indication there's a
> > problem with the locking. Instead making an ugly workaround like this I
> > would rather investigate who is holding the rtnl and solve that.

Agreed, this is not good. You are now bound to hit ASSERT_RTNL()
warnings occasionally, since you might see rtnl locked here, but a
split second later, when you're running in
__cfg80211_stop_sched_scan(), it could be released by some other
thread.

> There can be applications trying to access driver(via cfg80211), holding rtnl_lock.
> For example(in our case):
> 1. "iw dev"  was called, when BG_SCAN was active.
> 2. NL80211_CMD_GET_INTERFACE requires rtnl_lock to be hold(specified in internal_flags)
> 3. cfg80211 will  hold rtnl_lock before calling "get_tx_power"(in pre_doit).
> 4. mwifiex will download RF_TX_PWR command to firmware
> 5. firmware will send BG_SCAN_STOPPED event(since BG_SCAN was active).
> 6. mwifiex will call "cfg80211_sched_scan_stopped" causing nested rtnl locking.

IIUC, it's not exactly a nested lock, but a lock inversion issue.

#1
NL80211_CMD_GET_INTERFACE thread is holding rtnl lock and later
waiting on one of its HostCmd_CMD_* to complete

In the meantime:
#2
a EVENT_BG_SCAN_STOPPED is queued, and it grabs the rtnl lock

Because events are served on the same thread as commands, you get #1
waiting on #2, which is waiting on the lock held in #1. i.e., ABBA.

The way to resolve this is to either move event processing to a
different thread than command processing (that looks it would be very
difficult to do correctly, given the ossified structure of your
driver; but that might be a wise move in the long term)...
...or else maybe you could defer this specific piece of work to its
own thread. That might require yet another workqueue?

Anyway, the key point is that you really don't want to hold rtnl_lock
in your main workqueue, or in any other way that might block the rest
of the operation of your driver.

Brian



[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Wireless Regulations]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux