Search Linux Wireless

Re: [PATCH v2] mwifiex: fix kernel crash after shutdown command timeout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Dmitry and Amit,

On Thu, Mar 16, 2017 at 11:41:15AM -0700, Brian Norris wrote:
> On Thu, Mar 16, 2017 at 11:33:17AM -0700, Dmitry Torokhov wrote:
> > On Thu, Mar 16, 2017 at 03:58:52PM +0530, Amitkumar Karwar wrote:
> > > We observed a SHUTDOWN command timeout during reboot stress test
> > > due to a corner case firmware bug. It leads to use-after-free on
> > > adapter structure pointer and crash.
> > > 
> > > Let's add MWIFIEX_IFACE_WORK_DONT_RUN work flag to avoid executing

BTW, the 'DONT_RUN' suggestion was more of a pseudo-code suggestion than
a real name, but I guess it's not terrible :)

> > > any work scheduled after cancel_work_sync() call in teardown path
> > > to resolve the issue.
> > > 
> > > Signed-off-by: Amitkumar Karwar <akarwar@xxxxxxxxxxx>
> > > ---
> > > v2: New work_flag has been added to resolve the issue cleanly as per
> > > Brian's suggestion.
> > > ---
> > >  drivers/net/wireless/marvell/mwifiex/main.h | 1 +
> > >  drivers/net/wireless/marvell/mwifiex/pcie.c | 4 ++++
> > >  drivers/net/wireless/marvell/mwifiex/sdio.c | 4 ++++
> > >  3 files changed, 9 insertions(+)
> > > 
> > > diff --git a/drivers/net/wireless/marvell/mwifiex/main.h b/drivers/net/wireless/marvell/mwifiex/main.h
> > > index 5c82972..d5b1fd6 100644
> > > --- a/drivers/net/wireless/marvell/mwifiex/main.h
> > > +++ b/drivers/net/wireless/marvell/mwifiex/main.h
> > > @@ -510,6 +510,7 @@ struct mwifiex_roc_cfg {
> > >  enum mwifiex_iface_work_flags {
> > >  	MWIFIEX_IFACE_WORK_DEVICE_DUMP,
> > >  	MWIFIEX_IFACE_WORK_CARD_RESET,
> > > +	MWIFIEX_IFACE_WORK_DONT_RUN,
> > >  };
> > >  
> > >  struct mwifiex_private {
> > > diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > index a0d9180..bb3d798 100644
> > > --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > @@ -294,6 +294,7 @@ static void mwifiex_pcie_remove(struct pci_dev *pdev)
> > >  	if (!adapter || !adapter->priv_num)
> > >  		return;
> > >  
> > > +	set_bit(MWIFIEX_IFACE_WORK_DONT_RUN, &card->work_flags);
> > >  	cancel_work_sync(&card->work);
> > >  
> > >  	reg = card->pcie.reg;
> > > @@ -2721,6 +2722,9 @@ static void mwifiex_pcie_work(struct work_struct *work)
> > >  	struct pcie_service_card *card =
> > >  		container_of(work, struct pcie_service_card, work);
> > >  
> > > +	if (test_bit(MWIFIEX_IFACE_WORK_DONT_RUN, &card->work_flags))
> > > +		return;
> > 
> > I do not see how this could possible prevent use-after-free, assuming
> > that the "card" memory is gone by the time mwifiex_pcie_work() gets to
> > run.
> 
> The 'card' memory isn't getting freed; it's the 'adapter' memory we're
> worried about. This is either already freed (because the FW init
> procedure failed), or else it's freed later in this function via
> mwifiex_remove_card().

I guess there was a slight miscommunication here: Dmitry pointed out to
me that he *was* actually talking about 'card' getting freed -- when it
gets freed after remove() finishes.

So the sequence would have to go like:

1. enter remove()
2. set DONT_RUN flag; cancel_work_sync()
3. begin to shutdown firmware
4. hit, e.g., a command timeout that schedules the work again
5. ** scheduler decides not to schedule the work for a while **
6. we finish mwifiex_remove_card(), and exit from remove() successfully
7. devm_* frees the pcie_service_card (and enclosed work_struct)
8. scheduler tries to run our work item
9. use-after-free!

However unlikely that the delay from 4 to 8 might be, this is indeed a
race condition.

> (We're also worried about having the FW dump race with the FW shutdown
> sequence, which can begin later in this function. This patch blocks both
> races AFAICT.)
> 
> > You need to check this flag before queueing firmware dump work, and
> > make sure it is not racy with setting this flag in mwifiex_pcie_remove()
> > (and sdio).
> 
> That's another approach that could work, but it's a little more
> invasive.

Never mind, that isn't too invasive. There's only one schedule_work() in
pcie.c and two in sdio.c. We could even factor out a helper, that knows
how to check the appropriate MWIFIEX_IFACE_* flags, if we really wanted
to...

Brian



[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux