* Johan Hovold <johan@xxxxxxxxxx> [161108 12:03]: > On Tue, Nov 08, 2016 at 10:34:13AM -0700, Tony Lindgren wrote: > > * Johan Hovold <johan@xxxxxxxxxx> [161108 10:09]: > > > On Mon, Nov 07, 2016 at 02:50:18PM -0700, Tony Lindgren wrote: > > > > @@ -2604,6 +2669,9 @@ static int musb_resume(struct device *dev) > > > > mask = MUSB_DEVCTL_BDEVICE | MUSB_DEVCTL_FSDEV | MUSB_DEVCTL_LSDEV; > > > > if ((devctl & mask) != (musb->context.devctl & mask)) > > > > musb->port1_status = 0; > > > > + > > > > + schedule_delayed_work(&musb->pending_resume_work, 0); > > > > + > > > > > > The interactions with system suspend looks a bit suspicious. It seems > > > you need to drain any pending resume work on suspend for example. > > > > > > And then the above should not be needed, right? > > > > Hmm that's an interesting idea. I think that would allow us to get > > rid of the delayed work if we check the list both on resume and > > suspend :) > > Note that I was referring to draining the runtime-resume work on system > suspend above. I think we still want the delayed work considering that > we could stay active for long periods of time. It seems it's the delayed work causing these race issues, so I'd like to avoid it. I think we can do it all from PM runtime resume and checking at the end of musb_queue_resume_work(). We need to introduce is_runtime_suspended though. This is along the lines of what we have in Documentation/power/runtime_pm.txt if you search for completion there. Below is a version doing things without delayed work, care to check again if you still see races there? > > > In fact, the dsps timer must also be cancelled on suspend, or you could > > > end up calling dsps_check_status() while suspended (it is currently not > > > cancelled until the parent device is suspended, which could be too > > > late). > > > > And then this should no longer be an issue either. > > It would still be an issue as a system-suspending device could already > have been runtime-resumed so that dsps_check_status() would be called > directly from the timer function. The glue layers should do pm_runtime_get_sync(musb->controller) which dsps glue already does. So that's the musb_core.c device instance. And looks like we have dsps_suspend() call del_timer_sync(&glue->timer) already. I think we're safe here. Regards, Tony 8< ---------------------- >From tony Mon Sep 17 00:00:00 2001 From: Tony Lindgren <tony@xxxxxxxxxxx> Date: Wed, 2 Nov 2016 19:59:05 -0700 Subject: [PATCH] usb: musb: Fix sleeping function called from invalid context for hdrc glue Commit 65b3f50ed6fa ("usb: musb: Add PM runtime support for MUSB DSPS glue layer") wrongly added a call for pm_runtime_get_sync to otg_timer that runs in softirq context. That causes a "BUG: sleeping function called from invalid context" every time when polling the cable status: [<c015ebb4>] (__might_sleep) from [<c0413d60>] (__pm_runtime_resume+0x9c/0xa0) [<c0413d60>] (__pm_runtime_resume) from [<c04d0bc4>] (otg_timer+0x3c/0x254) [<c04d0bc4>] (otg_timer) from [<c0191180>] (call_timer_fn+0xfc/0x41c) [<c0191180>] (call_timer_fn) from [<c01915c0>] (expire_timers+0x120/0x210) [<c01915c0>] (expire_timers) from [<c0191acc>] (run_timer_softirq+0xa4/0xdc) [<c0191acc>] (run_timer_softirq) from [<c010168c>] (__do_softirq+0x12c/0x594) I did not notice that as I did not have CONFIG_DEBUG_ATOMIC_SLEEP enabled. And looks like also musb_gadget_queue() suffers from the same problem. Let's fix the issue by using a list of delayed work then call it on resume. Note that we want to do this only when musb core and it's parent devices are awake as noted by Johan Hovold <johan@xxxxxxxxxx>. Later on we may be able to remove other delayed work in the musb driver and just do it from pending_resume_work. But this should be done only for delayed work that does not have other timing requirements beyond just being run on resume. Fixes: 65b3f50ed6fa ("usb: musb: Add PM runtime support for MUSB DSPS glue layer") Reported-by: Johan Hovold <johan@xxxxxxxxxx> Signed-off-by: Tony Lindgren <tony@xxxxxxxxxxx> --- drivers/usb/musb/musb_core.c | 114 +++++++++++++++++++++++++++++++++++++++-- drivers/usb/musb/musb_core.h | 7 +++ drivers/usb/musb/musb_dsps.c | 31 +++++++---- drivers/usb/musb/musb_gadget.c | 21 ++++++-- 4 files changed, 156 insertions(+), 17 deletions(-) diff --git a/drivers/usb/musb/musb_core.c b/drivers/usb/musb/musb_core.c --- a/drivers/usb/musb/musb_core.c +++ b/drivers/usb/musb/musb_core.c @@ -1969,6 +1969,7 @@ static struct musb *allocate_instance(struct device *dev, INIT_LIST_HEAD(&musb->control); INIT_LIST_HEAD(&musb->in_bulk); INIT_LIST_HEAD(&musb->out_bulk); + INIT_LIST_HEAD(&musb->pending_list); musb->vbuserr_retry = VBUSERR_RETRY_COUNT; musb->a_wait_bcon = OTG_TIME_A_WAIT_BCON; @@ -2018,6 +2019,87 @@ static void musb_free(struct musb *musb) musb_host_free(musb); } +struct musb_pending_work { + int (*callback)(struct musb *musb, void *data); + void *data; + struct list_head node; +}; + +/* + * Called from musb_runtime_resume(), musb_resume(), and + * musb_queue_resume_work(). Callers must take musb->lock. + */ +static int musb_run_resume_work(struct musb *musb) +{ + struct musb_pending_work *w, *_w; + unsigned long flags; + int error = 0; + + spin_lock_irqsave(&musb->list_lock, flags); + list_for_each_entry_safe(w, _w, &musb->pending_list, node) { + if (w->callback) { + error = w->callback(musb, w->data); + if (error < 0) { + dev_err(musb->controller, + "resume callback %p failed: %i\n", + w->callback, error); + } + } + list_del(&w->node); + devm_kfree(musb->controller, w); + } + spin_unlock_irqrestore(&musb->list_lock, flags); + + return error; +} + +/* + * Called to run work if device is active or else queue the work to happen + * on resume. Caller must take musb->lock. + * + * Note that we cowardly refuse queuing work after musb PM runtime + * resume is done calling musb_run_resume_work() and return -EINPROGRESS + * instead. + */ +int musb_queue_resume_work(struct musb *musb, + int (*callback)(struct musb *musb, void *data), + void *data) +{ + struct musb_pending_work *w; + unsigned long flags; + int error; + + if (WARN_ON(!callback)) + return -EINVAL; + + if (pm_runtime_active(musb->controller)) + return callback(musb, data); + + w = devm_kzalloc(musb->controller, sizeof(*w), GFP_ATOMIC); + if (!w) + return -ENOMEM; + + w->callback = callback; + w->data = data; + spin_lock_irqsave(&musb->list_lock, flags); + if (musb->is_runtime_suspended) { + list_add_tail(&w->node, &musb->pending_list); + error = 0; + } else { + dev_err(musb->controller, "could not add resume work %p\n", + callback); + devm_kfree(musb->controller, w); + error = -EINPROGRESS; + } + spin_unlock_irqrestore(&musb->list_lock, flags); + + if (pm_runtime_active(musb->controller)) + return musb_run_resume_work(musb); + + return error; +} +EXPORT_SYMBOL_GPL(musb_queue_resume_work); + static void musb_deassert_reset(struct work_struct *work) { struct musb *musb; @@ -2065,6 +2147,7 @@ musb_init_controller(struct device *dev, int nIrq, void __iomem *ctrl) } spin_lock_init(&musb->lock); + spin_lock_init(&musb->list_lock); musb->board_set_power = plat->set_power; musb->min_power = plat->min_power; musb->ops = plat->platform_ops; @@ -2556,6 +2639,7 @@ static int musb_suspend(struct device *dev) struct musb *musb = dev_to_musb(dev); unsigned long flags; + WARN_ON(!list_empty(&musb->pending_list)); musb_platform_disable(musb); musb_generic_disable(musb); @@ -2579,9 +2663,11 @@ static int musb_suspend(struct device *dev) static int musb_resume(struct device *dev) { - struct musb *musb = dev_to_musb(dev); - u8 devctl; - u8 mask; + struct musb *musb = dev_to_musb(dev); + unsigned long flags; + int error; + u8 devctl; + u8 mask; /* * For static cmos like DaVinci, register values were preserved @@ -2599,6 +2685,7 @@ static int musb_resume(struct device *dev) mask = MUSB_DEVCTL_BDEVICE | MUSB_DEVCTL_FSDEV | MUSB_DEVCTL_LSDEV; if ((devctl & mask) != (musb->context.devctl & mask)) musb->port1_status = 0; + if (musb->need_finish_resume) { musb->need_finish_resume = 0; schedule_delayed_work(&musb->finish_resume_work, @@ -2615,6 +2702,13 @@ static int musb_resume(struct device *dev) musb_start(musb); + spin_lock_irqsave(&musb->lock, flags); + error = musb_run_resume_work(musb); + if (error) + dev_err(musb->controller, "resume work failed with %i\n", + error); + spin_unlock_irqrestore(&musb->lock, flags); + return 0; } @@ -2622,14 +2716,18 @@ static int musb_runtime_suspend(struct device *dev) { struct musb *musb = dev_to_musb(dev); + WARN_ON(!list_empty(&musb->pending_list)); musb_save_context(musb); + musb->is_runtime_suspended = 1; return 0; } static int musb_runtime_resume(struct device *dev) { - struct musb *musb = dev_to_musb(dev); + struct musb *musb = dev_to_musb(dev); + unsigned long flags; + int error; /* * When pm_runtime_get_sync called for the first time in driver @@ -2651,6 +2749,14 @@ static int musb_runtime_resume(struct device *dev) msecs_to_jiffies(USB_RESUME_TIMEOUT)); } + spin_lock_irqsave(&musb->lock, flags); + error = musb_run_resume_work(musb); + if (error) + dev_err(musb->controller, "resume work failed with %i\n", + error); + musb->is_runtime_suspended = 0; + spin_unlock_irqrestore(&musb->lock, flags); + return 0; } diff --git a/drivers/usb/musb/musb_core.h b/drivers/usb/musb/musb_core.h --- a/drivers/usb/musb/musb_core.h +++ b/drivers/usb/musb/musb_core.h @@ -303,6 +303,7 @@ struct musb_context_registers { struct musb { /* device lock */ spinlock_t lock; + spinlock_t list_lock; /* resume work list lock */ struct musb_io io; const struct musb_platform_ops *ops; @@ -337,6 +338,7 @@ struct musb { struct list_head control; /* of musb_qh */ struct list_head in_bulk; /* of musb_qh */ struct list_head out_bulk; /* of musb_qh */ + struct list_head pending_list; /* pending work list */ struct timer_list otg_timer; struct notifier_block nb; @@ -386,6 +388,7 @@ struct musb { unsigned long idle_timeout; /* Next timeout in jiffies */ unsigned is_initialized:1; + unsigned is_runtime_suspended:1; /* active means connected and not suspended */ unsigned is_active:1; @@ -542,6 +545,10 @@ extern irqreturn_t musb_interrupt(struct musb *); extern void musb_hnp_stop(struct musb *musb); +int musb_queue_resume_work(struct musb *musb, + int (*callback)(struct musb *musb, void *data), + void *data); + static inline void musb_platform_set_vbus(struct musb *musb, int is_on) { if (musb->ops->set_vbus) diff --git a/drivers/usb/musb/musb_dsps.c b/drivers/usb/musb/musb_dsps.c --- a/drivers/usb/musb/musb_dsps.c +++ b/drivers/usb/musb/musb_dsps.c @@ -188,21 +188,15 @@ static void dsps_musb_disable(struct musb *musb) musb_writeb(musb->mregs, MUSB_DEVCTL, 0); } -static void otg_timer(unsigned long _musb) +/* Caller must take musb->lock */ +static int dsps_check_status(struct musb *musb, void *unused) { - struct musb *musb = (void *)_musb; void __iomem *mregs = musb->mregs; struct device *dev = musb->controller; struct dsps_glue *glue = dev_get_drvdata(dev->parent); const struct dsps_musb_wrapper *wrp = glue->wrp; u8 devctl; - unsigned long flags; int skip_session = 0; - int err; - - err = pm_runtime_get_sync(dev); - if (err < 0) - dev_err(dev, "Poll could not pm_runtime_get: %i\n", err); /* * We poll because DSPS IP's won't expose several OTG-critical @@ -212,7 +206,6 @@ static void otg_timer(unsigned long _musb) dev_dbg(musb->controller, "Poll devctl %02x (%s)\n", devctl, usb_otg_state_string(musb->xceiv->otg->state)); - spin_lock_irqsave(&musb->lock, flags); switch (musb->xceiv->otg->state) { case OTG_STATE_A_WAIT_VRISE: mod_timer(&glue->timer, jiffies + @@ -245,8 +238,26 @@ static void otg_timer(unsigned long _musb) default: break; } - spin_unlock_irqrestore(&musb->lock, flags); + return 0; +} + +static void otg_timer(unsigned long _musb) +{ + struct musb *musb = (void *)_musb; + struct device *dev = musb->controller; + unsigned long flags; + int err; + + err = pm_runtime_get(dev); + if (err < 0) + dev_err(dev, "Poll could not pm_runtime_get: %i\n", err); + + spin_lock_irqsave(&musb->lock, flags); + err = musb_queue_resume_work(musb, dsps_check_status, NULL); + if (err < 0) + dev_err(dev, "%s resume work: %i\n", __func__, err); + spin_unlock_irqrestore(&musb->lock, flags); pm_runtime_mark_last_busy(dev); pm_runtime_put_autosuspend(dev); } diff --git a/drivers/usb/musb/musb_gadget.c b/drivers/usb/musb/musb_gadget.c --- a/drivers/usb/musb/musb_gadget.c +++ b/drivers/usb/musb/musb_gadget.c @@ -1222,6 +1222,15 @@ void musb_ep_restart(struct musb *musb, struct musb_request *req) rxstate(musb, req); } +static int musb_ep_restart_resume_work(struct musb *musb, void *data) +{ + struct musb_request *req = data; + + musb_ep_restart(musb, req); + + return 0; +} + static int musb_gadget_queue(struct usb_ep *ep, struct usb_request *req, gfp_t gfp_flags) { @@ -1255,7 +1264,7 @@ static int musb_gadget_queue(struct usb_ep *ep, struct usb_request *req, map_dma_buffer(request, musb, musb_ep); - pm_runtime_get_sync(musb->controller); + pm_runtime_get(musb->controller); spin_lock_irqsave(&musb->lock, lockflags); /* don't queue if the ep is down */ @@ -1271,8 +1280,14 @@ static int musb_gadget_queue(struct usb_ep *ep, struct usb_request *req, list_add_tail(&request->list, &musb_ep->req_list); /* it this is the head of the queue, start i/o ... */ - if (!musb_ep->busy && &request->list == musb_ep->req_list.next) - musb_ep_restart(musb, request); + if (!musb_ep->busy && &request->list == musb_ep->req_list.next) { + status = musb_queue_resume_work(musb, + musb_ep_restart_resume_work, + request); + if (status < 0) + dev_err(musb->controller, "%s resume work: %i\n", + __func__, status); + } unlock: spin_unlock_irqrestore(&musb->lock, lockflags); -- 2.10.2 -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html