On Mon, Mar 20, 2017 at 11:21 AM, Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx> wrote: > On 19.03.2017 23:29, Diego Viola wrote: >> >> On Fri, Mar 17, 2017 at 5:18 PM, Diego Viola <diego.viola@xxxxxxxxx> >> wrote: >>> >>> On Fri, Mar 17, 2017 at 1:57 PM, Diego Viola <diego.viola@xxxxxxxxx> >>> wrote: >>>> >>>> On Fri, Mar 17, 2017 at 1:24 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> >>>> wrote: >>>>> >>>>> On Fri, 17 Mar 2017, Diego Viola wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> Here's the log to the netconsole dmesg capture, I've used >>>>>> xhci_hcd.dyndbg no_console_suspend=1 as the kernel parameters. >>>>>> >>>>>> I did the usual suspend/resume cycle with i3lock, it hung after the >>>>>> third attempt when trying to resume from suspend. >>>>>> >>>>>> https://bugzilla.kernel.org/attachment.cgi?id=255309 >>>>> >>>>> >>>>> I'm not an expert on xHCI. This should be CC'ed to the xhci-hcd >>>>> maintainer. >>>>> >>>>> Alan Stern >>>>> >>>>>> >>>>>> Please let me know if I should provide something else. >>>>>> >>>>>> Thanks, >>>>>> Diego >>>>>> >>>>> >>>> >>>> I've forwarded my email to Mathias Nyman. >>>> >>>> Diego >>> >>> >>> Still a problem with 4.11.0-rc2-ARCH+ >>> > > From a quick glance it looks like rtsx_usb_ms probaly takes a mutex > (&ucr->dev_mutex) > and then issues a usb_bulk_msg() and waits for it to complete with mutex > held. > The usb message times out, usb core kills the urb but the URB probably never > gets completed, > and function never returns. > > Everyting using ucr->dev_mutex would block, for example the kthread, > rtsx_usb_detect_ms_card > that continuously tries to detect a ms card, takes and releases the same > ucr->dev_mutex for > each try. > > [ 614.026502] INFO: task kworker/u8:0:5 blocked for more than 120 seconds. > [ 614.027865] Not tainted 4.10.3-1-ARCH #1 > [ 614.029116] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > [ 614.030467] kworker/u8:0 D 0 5 2 0x00000000 > [ 614.031812] Workqueue: kacpi_hotplug acpi_hotplug_work_fn > [ 614.033179] Call Trace: > [ 614.034550] __schedule+0x22f/0x700 > [ 614.035940] schedule+0x3d/0x90 > [ 614.037334] schedule_preempt_disabled+0x15/0x20 > [ 614.038680] __mutex_lock_slowpath+0x19b/0x2d0 > [ 614.040067] ? flush_workqueue+0x204/0x580 > [ 614.041456] mutex_lock+0x23/0x30 > [ 614.042163] acpi_device_hotplug+0x43/0x3e7 > [ 614.042882] acpi_hotplug_work_fn+0x1e/0x29 > [ 614.043612] process_one_work+0x1e5/0x470 > [ 614.044356] worker_thread+0x48/0x4e0 > [ 614.045077] kthread+0x101/0x140 > [ 614.045788] ? process_one_work+0x470/0x470 > [ 614.046495] ? kthread_create_on_node+0x60/0x60 > [ 614.047215] ret_from_fork+0x2c/0x40 > [ 614.047950] INFO: task rtsx_usb_ms_1:235 blocked for more than 120 > seconds. > [ 614.048697] Not tainted 4.10.3-1-ARCH #1 > [ 614.049465] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > [ 614.050265] rtsx_usb_ms_1 D 0 235 2 0x00000000 > [ 614.051064] Call Trace: > [ 614.051841] __schedule+0x22f/0x700 > [ 614.052626] schedule+0x3d/0x90 > [ 614.053411] usb_kill_urb.part.4+0x6c/0xa0 [usbcore] > [ 614.054198] ? wake_atomic_t_function+0x60/0x60 > [ 614.055005] usb_kill_urb+0x21/0x30 [usbcore] > [ 614.055819] usb_start_wait_urb+0xe5/0x170 [usbcore] > [ 614.056652] usb_bulk_msg+0xbd/0x160 [usbcore] > [ 614.057489] rtsx_usb_send_cmd+0x63/0x90 [rtsx_usb] > [ 614.058306] rtsx_usb_read_register+0x6c/0xc0 [rtsx_usb] > [ 614.059118] rtsx_usb_detect_ms_card+0x98/0x120 [rtsx_usb_ms] > > There is a lot going on in xhci during the last suspend befor this. > URBs are canceled, devices reset and re-enumerated, timeout while reading > descriptor, > device firmware changed. > > It's possible we end up in a situation where xhci never givers back the > URB. > > 4.11-rc2 has better xhci tracing, it shows each URB enqueue and dequeue and > giveback. > > Could you try enabling xhci tracing before suspending (not the same as xhci > verbose dynamic debug) > It will generate a lot of data, so better to remove all extra USB devices. > > xhci tracing can be added with: > > mount -t debugfs none /sys/kernel/debug > echo xhci-hcd >> /sys/kernel/debug/tracing/set_event > > and then send the output of cat /sys/kernel/debug/tracing/trace > > -Mathias > > https://bugzilla.kernel.org/attachment.cgi?id=255367 This is with Linux 4.11.0-rc3-ARCH. USB mouse/keyboard was unplugged before booting the machine. I didn't do a suspend/resume before getting this trace, should I do that? Should I reproduce the hang and get a netconsole dmesg capture with tracing enabled? Thanks, Diego -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html