On Tue, Sep 05, 2017 at 04:21:10PM +0200, Johan Hovold wrote:
> These patches fix a couple of bugs introduced by the recent runtime-PM
> work.
>
> Note that the external abort was due to the irq work never being flushed
> on suspend, and that we may need similar fixes for the delayed reset and
> resume work which are likewise never cancelled on suspend.

Looks like there are even more issues with musb suspend.

With this series, which allows the controller to runtime suspend upon
system resume, I can now trigger the following external abort at resume:

        PM: Finishing wakeup.
        OOM killer enabled.
        Restarting tasks ... done.
        hrtimer: interrupt took 191917 ns
        Unhandled fault: external abort on non-linefetch (0x1008) at 0xc8249412
        pgd = c0004000
        [c8249412] *pgd=87350811, *pte=47401653, *ppte=47401453
        Internal error: : 1008 [#1] PREEMPT ARM
        Modules linked in:
        CPU: 0 PID: 572 Comm: kworker/0:2 Not tainted 4.12.0 #34
        Hardware name: Generic AM33XX (Flattened Device Tree)
        Workqueue: pm pm_runtime_work
        task: c72057c0 task.stack: c722a000
        PC is at musb_default_readw+0x4/0x10
        LR is at musb_is_tx_fifo_empty+0x3c/0x48
        <snip>
        [<c03ce444>] (musb_default_readw) from [<c03d4f68>] (musb_is_tx_fifo_empty+0x3c/0x48)
        [<c03d4f68>] (musb_is_tx_fifo_empty) from [<c03d5880>] (cppi41_recheck_tx_req+0x5c/0x118)
        [<c03d5880>] (cppi41_recheck_tx_req) from [<c016caf8>] (__hrtimer_run_queues.constprop.4+0x110/0x1bc)
        [<c016caf8>] (__hrtimer_run_queues.constprop.4) from [<c016cfa4>] (hrtimer_interrupt+0x98/0x230)
        [<c016cfa4>] (hrtimer_interrupt) from [<c0114018>] (omap2_gp_timer_interrupt+0x28/0x30)
        [<c0114018>] (omap2_gp_timer_interrupt) from [<c015bc08>] (__handle_irq_event_percpu+0x88/0x124)
        [<c015bc08>] (__handle_irq_event_percpu) from [<c015bcc0>] (handle_irq_event_percpu+0x1c/0x58)
        [<c015bcc0>] (handle_irq_event_percpu) from [<c015bd48>] (handle_irq_event+0x4c/0x84)
        [<c015bd48>] (handle_irq_event) from [<c015ebd8>] (handle_level_irq+0xb0/0x15c)
        [<c015ebd8>] (handle_level_irq) from [<c015af34>] (generic_handle_irq+0x24/0x34)
        [<c015af34>] (generic_handle_irq) from [<c015b4c0>] (__handle_domain_irq+0x70/0xdc)
        [<c015b4c0>] (__handle_domain_irq) from [<c010c20c>] (__irq_svc+0x6c/0xa8)
        [<c010c20c>] (__irq_svc) from [<c01168f4>] (omap_hwmod_idle+0x30/0x74)
        [<c01168f4>] (omap_hwmod_idle) from [<c0117cb8>] (omap_device_idle+0x40/0x90)
        [<c0117cb8>] (omap_device_idle) from [<c0360f88>] (__rpm_callback+0x15c/0x258)
        [<c0360f88>] (__rpm_callback) from [<c03610d4>] (rpm_callback+0x50/0x80)
        [<c03610d4>] (rpm_callback) from [<c0360000>] (rpm_suspend+0xe0/0x548)
        [<c0360000>] (rpm_suspend) from [<c036199c>] (pm_runtime_work+0xac/0xbc)
        [<c036199c>] (pm_runtime_work) from [<c013c0c0>] (process_one_work+0x11c/0x350)
        [<c013c0c0>] (process_one_work) from [<c013c32c>] (worker_thread+0x38/0x55c)
        [<c013c32c>] (worker_thread) from [<c0141a00>] (kthread+0x100/0x130)
        [<c0141a00>] (kthread) from [<c0108418>] (ret_from_fork+0x14/0x3c)

after having suspended with an active ECM gadget.

Turns out system suspend breaks musb in gadget mode. It seems I need to
manually restart the gadget to get it to work again even if it had just
been enumerated (and which does not trigger the above crash). (Bug 1)

But if an ECM gadget is also active (e.g. open SSH session) when
suspending, this in turn can trigger yet another bug in that the
early_tx dma-irq hrtimer is never cancelled when the tx-fifo does not
empty when the gadget driver initiates a transfer after resume. The
early_tx timer keeps rescheduling itself until the gadget is stopped
manually (keeping the BBB CPU busy at about 20-30%). (Bug 2)

If the controller is allowed to runtime suspend after system resume, as
with this series, this repeated scheduling triggers the above external
abort.

I've respun the series so that the session flag and runtime pm count are
left untouched unless we've already started the session-quirk timeout
handling.
This avoids the above crash, but does not address another problem with
the current code, namely that the controller is left active in case a
device is disconnected while suspended in host mode. (Bug 3)

Johan
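
For illustration only, the interaction behind Bug 2 can be modelled in
user space. This is a rough sketch and not musb or cppi41 code: struct
fake_ctrl, fake_readw(), recheck_tx(), fake_suspend() and run_scenario()
are hypothetical names, and a counter stands in for the external abort.
It only shows why a self-rearming timer that reads device registers
would need to be cancelled before the device is powered down.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a controller; not a kernel structure. */
struct fake_ctrl {
	bool powered;		/* false once "runtime suspended" */
	bool timer_armed;	/* models a self-rearming recheck timer */
	unsigned int fifo_busy;	/* non-zero: the TX FIFO never drains */
	unsigned int faults;	/* register reads made while unpowered */
};

/* Models a register read; a read while unpowered is counted as a fault. */
static unsigned int fake_readw(struct fake_ctrl *c)
{
	if (!c->powered) {
		c->faults++;
		return 0;
	}
	return c->fifo_busy;
}

/* Timer callback: stays armed for as long as the FIFO reads as busy. */
static void recheck_tx(struct fake_ctrl *c)
{
	if (fake_readw(c) == 0)
		c->timer_armed = false;
}

/* Powers the controller down, optionally cancelling the timer first. */
static void fake_suspend(struct fake_ctrl *c, bool cancel_timer)
{
	if (cancel_timer)
		c->timer_armed = false;
	c->powered = false;
}

/* Returns how many faulting reads happened after suspend. */
unsigned int run_scenario(bool cancel_timer)
{
	struct fake_ctrl c = {
		.powered = true,
		.timer_armed = true,
		.fifo_busy = 1,
		.faults = 0,
	};
	int tick;

	/* While powered, the timer keeps firing because the FIFO is busy. */
	for (tick = 0; tick < 3; tick++)
		if (c.timer_armed)
			recheck_tx(&c);

	fake_suspend(&c, cancel_timer);

	/* Timer expiries that arrive after the controller is powered down. */
	for (tick = 0; tick < 3; tick++)
		if (c.timer_armed)
			recheck_tx(&c);

	return c.faults;
}
```

In this model run_scenario(false) records a faulting read after suspend,
while run_scenario(true) records none. Whether cancelling the timer in
the suspend path is the right fix for the real driver is a separate
question that the model does not answer.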