On Thu, 25 Jan 2018, Samuel Sadok wrote: > 2018-01-23 17:31 GMT+01:00 Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>: > > On Tue, 23 Jan 2018, Samuel Sadok wrote: > > > >> Thanks Alan, > >> > >> While I was at it I also enabled debug logs for xhci_hcd and xhci_pci. > >> > >> $ echo module usbcore =p > /sys/kernel/debug/dynamic_debug/control > >> $ echo module xhci_hcd =p > /sys/kernel/debug/dynamic_debug/control > >> $ echo module xhci_pci =p > /sys/kernel/debug/dynamic_debug/control > >> $ modprobe usbmon > >> $ cat /sys/kernel/debug/usb/usbmon/0u > /tmp/usbmon.log > >> $ # press power button > >> > >> dmesg: https://gist.github.com/anonymous/29b81574abf40605f999cfeefe98e341 > >> usbmon: https://gist.github.com/anonymous/55b6d9bbf8b8c8627230b10d2b09dcb6 > >> > >> Both logs were collected at the same time so the timestamps should match. > > > > In fact they do, quite closely. But there is a noticeable gap in the > > usbmon trace, between lines 197 and 198 (the timestamp jumps from > > 35923531 to 38925126) and there obviously was a lot of activity on bus > > 1 in between. > > Yes you're right, that's inconvenient because the "failed to disable > LTM" message occurs exactly in this gap. Are gaps like this expected / > known? Any ideas on how to mitigate them? (I kinda want to avoid > soldering an external bus monitor to the mainboard :) ) The text interface to usbmon uses a fixed-size buffer, and the size is a little on the low side if you're dealing with large latencies (which of course are unavoidable during a suspend/resume scenario). The size is determined by EVENT_MAX, defined around line 40 in drivers/usb/mon/mon_text.c: #define EVENT_MAX (4*PAGE_SIZE / sizeof(struct mon_event_text)) Just change the 4 to something considerably larger and you should be in good shape. > >> Resume starts at dmesg line 1255. > >> As far as I can judge, the earliest indication of something going > >> wrong is line 1514: > >> [ 36.087176] xhci_hcd 0000:00:14.0: xHCI xhci_urb_enqueue called with > >> unaddressed device > >> [ 36.087180] usb 2-4: Disable of device-initiated U1 failed. > >> [ 36.087212] xhci_hcd 0000:00:14.0: xHCI xhci_urb_enqueue called with > >> unaddressed device > >> [ 36.087224] usb 2-4: Disable of device-initiated U2 failed. > >> [ 36.087226] xhci_hcd 0000:00:14.0: xHCI xhci_urb_enqueue called with > >> unaddressed device > >> [ 36.087227] usb 2-4: usb_reset_and_verify_device Failed to disable LTM > > > > Yes, that's where the problem first seems to show up. Apparently > > usb_disable_ltm() fails because it can't communicate with the device, > > and usb_reset_and_verify_device() then aborts because of that failure. > > It shouldn't do this; after all, one very good reason for resetting the > > device is because we're unable to communicate with it. > > > > You could try editing the source code for usb_reset_and_verify_device() > > in drivers/usb/core/hub.c, to see what happens if it doesn't give up > > after usb_disable_ltm() fails. > > For the following logs I modified the "Failed to disable LTM" message > and commented out the goto in the line below. Also I used kernel > version 4.15-rc9 this time. I omitted upgrading the out-of-tree > modules broadcom-wl and nouveau-fw, so don't mind errors related to > that. > The SD card reader is still missing after resume. > > dmesg & usbmon: > https://gist.github.com/anonymous/fc597612d5ebbac40d7bef9f8f0b579a > > `usb_reset_and_verify_device()` is executed around dmesg line 1524. > Again, the usbmon log has a major gap around that time. It looks like the system is trying to carry out multiple warm resets, or a single extremely long warm reset, when it really should give up and do a cold reset instead. It would be best to get Mathias's input on this. I don't know what is the right way to fix this problem. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html