Search Linux Wireless

RE: [PATCH v2 2/2] mwifiex: check hw_status in suspend and resume handlers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Brian,

Thanks for your thorough review.

> From: Brian Norris [mailto:briannorris@xxxxxxxxxxxx]
> Sent: Wednesday, October 05, 2016 2:35 AM
> To: Amitkumar Karwar
> Cc: linux-wireless@xxxxxxxxxxxxxxx; Cathy Luo; Nishant Sarmukadam;
> rajatja@xxxxxxxxxx; briannorris@xxxxxxxxxx; Xinming Hu
> Subject: Re: [PATCH v2 2/2] mwifiex: check hw_status in suspend and
> resume handlers
> 
> Hi,
> 
> On Tue, Oct 04, 2016 at 10:38:25PM +0530, Amitkumar Karwar wrote:
> > From: Xinming Hu <huxm@xxxxxxxxxxx>
> >
> > We have observed a kernel crash when system immediately suspends after
> > booting. There is a race between suspend and driver initialization
> > paths.
> > This patch adds hw_status checks to fix the problem
> >
> > Signed-off-by: Xinming Hu <huxm@xxxxxxxxxxx>
> > Signed-off-by: Amitkumar Karwar <akarwar@xxxxxxxxxxx>
> > ---
> > v2: Return failure in suspend/resume handler in this scenario.
> > ---
> >  drivers/net/wireless/marvell/mwifiex/pcie.c | 10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > index ba9e068..fa6bf85 100644
> > --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > @@ -122,9 +122,10 @@ static int mwifiex_pcie_suspend(struct device
> > *dev)
> >
> >  	if (pdev) {
> >  		card = pci_get_drvdata(pdev);
> > -		if (!card || !card->adapter) {
> > +		if (!card || !card->adapter ||
> > +		    card->adapter->hw_status != MWIFIEX_HW_STATUS_READY) {
> 
> Wait, is there no locking on the 'hw_status' field? That is inherently
> an unsafe race all on its own; you're not guaranteed that this will be
> read/written atomically. And you also aren't guaranteed that writes to
> this happen in the order they appear in the code -- in other words,
> reading this flag doesn't necessarily guarantee that initialization is
> actually complete (even if that's very likely to be true, given that
> it's probably just a single-instruction word-access, and any prior HW
> polling or interrupts likely have done some synchronization and can't be
> reordered).
> actually complete

Here is the brief info on how "hw_status" flag is updated.
1) It gets changed incrementally during initialization.
    MWIFIEX_HW_STATUS_INITIALIZING -> MWIFIEX_HW_STATUS_INIT_DONE -> MWIFIEX_HW_STATUS_READY

2) Status will remain READY once driver+firmware is up and running.

3) Below is status during teardown
MWIFIEX_HW_STATUS_READY -> MWIFIEX_HW_STATUS_RESET -> MWIFIEX_HW_STATUS_CLOSING -> MWIFIEX_HW_STATUS_NOT_READY

As the events occur one after another, we don't expect a race and don't need locking here for the flag. Flag status MWIFIEX_HW_STATUS_READY guarantees that initialization is completed.

In worst case scenario, only first system suspend attempt issued immediately after system boot will be aborted with BUSY error. I think, that should be fine.

Let me know if you have any concerns.

> 
> This is probably better than nothing, but it's not very good.
> 
> >  			pr_err("Card or adapter structure is not valid\n");
> > -			return 0;
> > +			return -EBUSY;
> 
> So the above cases all mean that the driver hasn't finished loading,
> right?
> 
> !card => can't happen (PCIe probe() would have failed
> mwifiex_add_card()), but fine to check

NULL "card" or "card->adapter" pointers are expected in below cases
1) Driver initialization failed after downloading the firmware. Driver is performing cleanup in init failure path. Now system suspends.
2) Race of teardown + suspend
 
> 
> !card->adapter => only happens after patch 1; i.e., when tearing down
> the device and detaching it from the driver
> 
> card->adapter->hw_status != MWIFIEX_HW_STATUS_READY => FW is not loaded
> (i.e., in the process of starting or stopping FW?)
> 
> I guess all of those cases make sense to be -EBUSY.

Yes. we can keep -EBUSY for all of the cases.

> 
> >  		}
> >  	} else {
> >  		pr_err("PCIE device is not specified\n");
> 
> I was going to complain about this branch (!pdev) returning 0, since
> that looked asymmetric. But this case will never happen, actually, since
> to_pci_dev() is just doing struct offset arithmetic on struct device,
> and struct device is never going to be NULL. It probably makes more
> sense to just remove this branch entirely, but that's for another patch.

Thanks for pointing this out. I will create separate patch for this.

> 
> > @@ -166,9 +167,10 @@ static int mwifiex_pcie_resume(struct device
> > *dev)
> >
> >  	if (pdev) {
> >  		card = pci_get_drvdata(pdev);
> > -		if (!card || !card->adapter) {
> > +		if (!card || !card->adapter ||
> > +		    card->adapter->hw_status != MWIFIEX_HW_STATUS_READY) {
> 
> Same complaint, except if you've screwed up here, you probably already
> screwed up in suspend().
> 
> Brian
> 
> >  			pr_err("Card or adapter structure is not valid\n");
> > -			return 0;
> > +			return -EBUSY;
> >  		}
> >  	} else {
> >  		pr_err("PCIE device is not specified\n");
> > --
> > 1.9.1
> >

Regards,
Amitkumar



[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux