Re: [PATCH net] usbnet: smsc95xx: Fix deadlock on runtime resume

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Apr 27, 2022 at 10:00:08AM -0400, Alan Stern wrote:
> On Wed, Apr 27, 2022 at 08:41:49AM +0200, Lukas Wunner wrote:
> > Commit 05b35e7eb9a1 ("smsc95xx: add phylib support") amended
> > smsc95xx_resume() to call phy_init_hw().  That function waits for the
> > device to runtime resume even though it is placed in the runtime resume
> > path, causing a deadlock.
> > 
> > The problem is that phy_init_hw() calls down to smsc95xx_mdiobus_read(),
> > which never uses the _nopm variant of usbnet_read_cmd().  Amend it to
> > autosense that it's called from the runtime resume/suspend path and use
> > the _nopm variant if so.
[...]
> > --- a/drivers/net/usb/smsc95xx.c
> > +++ b/drivers/net/usb/smsc95xx.c
> > @@ -285,11 +285,21 @@ static void smsc95xx_mdio_write_nopm(struct usbnet *dev, int idx, int regval)
> >  	__smsc95xx_mdio_write(dev, pdata->phydev->mdio.addr, idx, regval, 1);
> >  }
> >  
> > +static bool smsc95xx_in_pm(struct usbnet *dev)
> > +{
> > +#ifdef CONFIG_PM
> > +	return dev->udev->dev.power.runtime_status == RPM_RESUMING ||
> > +	       dev->udev->dev.power.runtime_status == RPM_SUSPENDING;
> > +#else
> > +	return false;
> > +#endif
> > +}
> 
> This does not do what you want.  You want to know if this function is 
> being called in the resume pathway, but all it really tells you is 
> whether the function is being called while a resume is in progress (and 
> it doesn't even do that very precisely because the code does not use the 
> runtime-pm spinlock).  The resume could be running in a different 
> thread, in which case you most definitely _would_ want to want for it to 
> complete.

I'm aware of that.  I've explored various approaches and none solved
the problem perfectly.  This one seems good enough for all practical
purposes.

One approach I've considered is to use current_work() to determine if
we're called from dev->power.work.  But that only works if the runtime
resume/suspend is asynchronous (RPM_ASYNC is set).  In this case, the
runtime resume is synchronous and called from a different work item
(hub_event).  So the approach is not feasible.

Another approach is to assign a dev_pm_domain to the usb_device, whose
->runtime_resume hook first calls usb_runtime_resume() (so that the
usb_device and usb_interface has status RPM_ACTIVE), *then* calls
phy_init_hw().  Problem is, this only works for runtime resume
and we need a solution for runtime suspend as well.  (The device already
has status RPM_SUSPENDING when the dev_pm_domain's ->runtime_suspend hook
is invoked.)  So not a feasible approach either.

Fudging the runtime_status in the ->runtime_suspend and ->runtime_resume
hooks via pm_runtime_set_active() / _set_suspended() is rejected by the
runtime PM core.

I've even considered walking up the callstack via _RET_IP_ to determine
if one of the callers is smsc95xx_resume() / _suspend().  But I'm not
sure that's reliable and portable across all arches.

And I don't want to clutter phylib with _nopm variants either.

So the approach I've chosen here, while not perfect, does its job,
is simple and uses very little code.  If you've got a better idea,
please let me know.

Thanks,

Lukas



[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux