Re: [PATCH 2/4] usb: musb: Fix sleeping function called from invalid context for hdrc glue

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Nov 10, 2016 at 10:41:50AM -0700, Tony Lindgren wrote:
> * Johan Hovold <johan@xxxxxxxxxx> [161110 09:04]:
> > On Wed, Nov 09, 2016 at 10:54:38AM -0700, Tony Lindgren wrote:
> > > * Johan Hovold <johan@xxxxxxxxxx> [161109 08:40]:
> > > > On Tue, Nov 08, 2016 at 06:26:07PM -0700, Tony Lindgren wrote:
> > > > > * Johan Hovold <johan@xxxxxxxxxx> [161108 12:03]:
> > > > > > On Tue, Nov 08, 2016 at 10:34:13AM -0700, Tony Lindgren wrote:
> > > > > > > * Johan Hovold <johan@xxxxxxxxxx> [161108 10:09]:
> > 
> > > > > > > > In fact, the dsps timer must also be cancelled on suspend, or you could
> > > > > > > > end up calling dsps_check_status() while suspended (it is currently not
> > > > > > > > cancelled until the parent device is suspended, which could be too
> > > > > > > > late).
> > > > > > > 
> > > > > > > And then this should no longer be an issue either.
> > > > > > 
> > > > > > It would still be an issue as a system-suspending device could already
> > > > > > have been runtime-resumed so that dsps_check_status() would be called
> > > > > > directly from the timer function.
> > > > > 
> > > > > The glue layers should do pm_runtime_get_sync(musb->controller) which
> > > > > dsps glue already does. So that's the musb_core.c device instance. And
> > > > > looks like we have dsps_suspend() call del_timer_sync(&glue->timer)
> > > > > already. I think we're safe here.
> > > > 
> > > > But the point is that the controller might be RPM_ACTIVE if the
> > > > controller was already runtime resumed when it is system suspended.
> > > > 
> > > > Since this (and the previous) patch run the work directly from the timer
> > > > callback if active, it could end up accessing the controller after it
> > > > has been system suspended. Specifically, stopping the timer in the glue
> > > > (parent) suspend callback is too late to avoid this.
> > > > 
> > > > 	pm_runtime_get_sync(musb->controller);
> > > > 		musb_runtime_resume()
> > > > 			musb_restore_context();
> > > > 	
> > > > 	...
> > > > 
> > > > 	musb_suspend()
> > > > 		musb_save_context();
> > > > 
> > > > 	otg_timer()
> > > > 		pm_runtime_get();
> > > > 		if (pm_runtime_active(musb->controller))
> > > > 			dsps_check_status();
> > > > 		pm_runtime_put_autosuspend();
> > > > 
> > > > 	dsps_suspend()
> > > > 		del_timer_sync();
> > > > 
> > > 
> > > OK so we need to return without doing anything from otg_timer() on
> > > pm_runtime_get() to avoid that.
> > 
> > I'm afraid that won't work as pm_runtime_get() would still succeed (i.e.
> > even after musb_suspend()).
> > 
> > See 6f3c77b040fc ("PM / Runtime: let rpm_resume() succeed if RPM_ACTIVE,
> > even when disabled, v2").
> 
> But doesn't that assume that we have musb core as in musb->controller
> in RPM_ACTIVE state? While if it's been suspended that's not the
> case meaning rpm_resume would fail?

Right, and it's still a good idea to check the return value of
pm_runtime_get(). It just won't be enough for when RPM_ACTIVE.

> If we have a window for a race there with RPM_ACTIVE set, we could
> add musb->is_disabled flag that gets set in musb_suspend().

Yes.

> > > In the long run it would be nice to make whatever optional state polling
> > > musb generic with just a glue layer callback.
> > 
> > Yes, and make sure to stop polling in musb_suspend(). Would it be
> > possible to use the enable and disable ops for this until then?
> 
> Hmm care to explain a bit more? That is assuming that rpm_resume()
> won't fail above.. And that using musb->is_disabled flag won't
> work..

By stopping the timer in musb->ops->disable which is called during
suspend, the race could perhaps also be avoided.

> > > > > +	if (musb->is_runtime_suspended) {
> > > > > +		list_add_tail(&w->node, &musb->pending_list);
> > > > > +		error = 0;
> > > > > +	} else {
> > > > > +		dev_err(musb->controller, "could not add resume work %p\n",
> > > > > +			callback);
> > > > > +		devm_kfree(musb->controller, w);
> > > > > +		error = -EINPROGRESS;
> > > > 
> > > > But this means you should be able to run the callback below, right? It
> > > > has to be run from somewhere so otherwise the caller needs to retry
> > > > instead.
> > > 
> > > Well there's no longer need to run the callback at that point any longer
> > > and with that removed that should not be an issue.
> > > 
> > > Anyways, musb->is_runtime_suspended is needed to protect anything from
> > > being queued between runtime resume having called musb_run_resume_work()
> > > and before pm_runtime_active() is true. At that point the caller could
> > > just wait for pm_runtime_active() to be set and run the code. But based
> > > on my tests that does not happen and queueing is faster than getting to
> > > the pm_runtime_active() state so we just print errors in those cases if
> > > we ever hit it later on.
> > 
> > But for a generic solution, this race could still be an issue. What
> > about musb_gadget_queue() for example? Would not failing to start I/O
> > if racing with resume be a problem?
> 
> It's probably safest to return an error for cases like that rather
> than attempt to do something and risk corrupting packets. I'll move
> the check to happen earlier musb_gadget_queue() so we don't do
> anything with the request if pm_runtime_get() fails there. That
> should make it easy to add further handling if we ever start really
> hitting that issue.

Ok.

Thanks,
Johan
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Arm (vger)]     [ARM Kernel]     [ARM MSM]     [Linux Tegra]     [Linux WPAN Networking]     [Linux Wireless Networking]     [Maemo Users]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux