On 01/13/2017 05:12 PM, Patrice Chotard wrote: > On 01/12/2017 06:17 PM, Mathias Nyman wrote: >> On 09.01.2017 18:42, Mathias Nyman wrote: >>> On 09.01.2017 16:23, Patrice Chotard wrote: >>>> On 01/09/2017 01:30 PM, Mathias Nyman wrote: >>>>> >>>>> On 09.01.2017 11:51, Patrice Chotard wrote: >>>>>> Hi Mathias, Greg >>>>> >>>>> Hi >>>>> >>>>>> >>>>>> I am working on ARM STi platform, since v4.10-rc1, when booting B2260 or B2120 STi boards platform >>>>>> with nothing plugged on USB3 connector, i observed the following kernel logs : >>>>>> >>>>>> .... >>>>>> [ 801.953836] usb usb6-port1: Cannot enable. Maybe the USB cable is bad? >>>>> >>>>> That's odd, why does it try to reset the port if there's nothing there? >>>>> >>>>>> [ 801.960455] xhci-hcd xhci-hcd.0.auto: Cannot set link state. >>>>> >>>>> This makes sense, nothing is connected, and we try to set link to U3 (PORT_PE == 0) >>>>> >>>>>> [ 801.966611] usb usb6-port1: cannot disable (err = -32) >>>>> xhci reutns -EPIPE as we try to set link state to U3 while port is not enabled (PORT_PE == 0) >>>>> >>>>>> [ 806.083772] usb usb6-port1: Cannot enable. Maybe the USB cable is bad? >>>>> >>>>> Again, about 5 seconds later the same port is reset, while nothing is connected. >>>>> Odd >>>>> >>>>>> [ 806.090370] xhci-hcd xhci-hcd.0.auto: Cannot set link state. >>>>>> [ 806.096494] usb usb6-port1: cannot disable (err = -32) >>>>>> [ 810.208766] usb usb6-port1: Cannot enable. Maybe the USB cable is bad? >>>>>> [ 810.215374] xhci-hcd xhci-hcd.0.auto: Cannot set link state. >>>>>> [ 810.221478] usb usb6-port1: cannot disable (err = -32) >>>>>> [ 814.333767] usb usb6-port1: Cannot enable. Maybe the USB cable is bad? >>>>>> [ 814.340364] xhci-hcd xhci-hcd.0.auto: Cannot set link state. >>>>>> ..... >>>>>> >>>>>> Another interesting thing is even if i plugged a device (a mass storage device in my case) on the USB3 connector, >>>>>> the above logs continue to appear. >>>>>> >>>>>> This is due to commit 37be66767e3ca "usb: hub: Fix auto-remount of safely removed or ejected USB-3 devices". >>>>>> >>>>>> I don't know if STi platforms are the only impacted by this issue. >>>>>> >>>>>> On v4.9 everything was ok. >>>>>> >>>>> >>>>> Was there not a single "Cannot enable. Maybe USB cable is bad" message with 4.9? >>>> >>>> There is no "Cannot enable. Maybe USB cable is bad" message with 4.9 >>>> >>>>> >>>>> Before commit 37be66767e3ca we forced the port to RxDetect state via Disabled state, >>>>> and cleared port change flags when a usb3_port_disable() was called. >>>>> >>>>> after we just set the port to U3. Maybe those STi platforms depend on that cycle somehow? >>>> >>>> I really don't know :-( >>>> >>>>> >>>>> Could you take logs (dmesg) of both 4.9 and 4.10-rc1 with both usb core and xhci debugging enabled: >>>>> >>>>> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control >>>>> echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control >>>> >>>> >>>> You will find below the 2 requested logs, in both case , no USB cable are plugged. >>>> >>>> v4.9 dmesg logs : >>>> >>>> ..... >>>> [ 31.368022] hub 6-0:1.0: hub_resume >>>> [ 31.368064] xhci-hcd xhci-hcd.0.auto: get port status, actual port 0 status = 0x400340 >>>> [ 31.368071] xhci-hcd xhci-hcd.0.auto: Get port status returned 0x400340 >>>> [ 31.368224] hub 6-0:1.0: state 7 ports 1 chg 0000 evt 0000 >>> >>> Looks like port is stuck in Compliance mode, this should only happen if a connect was detected, >>> moving the port from RxDetect to Polling, and then timeout on polling to get to compliance. >>> >>> 4.9 is then stuck in a hub_suspend/resume loop with a port in compliance mode. >>>> >>>> >>>> v4.10-rc1 dmesg logs : >>>> >>>> .... >>>> [ 269.436617] usb usb6-port1: cannot disable (err = -32) >>>> [ 269.464728] hub 6-0:1.0: state 7 ports 1 chg 0000 evt 0002 >>>> [ 269.464756] xhci-hcd xhci-hcd.0.auto: get port status, actual port 0 status = 0x340 >>> >>> I 4.10-rc1 we are again stuck in Compliance mode, but this time there was a event (00002) visible, maybe because >>> the flags were not cleared or some other reason. Now are stuck in a port reset loop with the >>> port in compliance mode. >>> So the real question is why is the port in compliance mode when there are no devices connected.? >>> >>> I'll continue looking at this more tomorrow >> >> So with both 4.9 and 4.10-rc the ports are in Compliance mode state, even if nothing is connected >> To me this looks like the real underlying issue. For some reason it didn't try to reset the port in 4.9, >> and never got to the loop of port reset. >> >> In 4.9 the hub->events_bits[0] are not set, so the hub thread won't even ask for port status >> on any ports. >> >> In 4.10-rc the hub->events_bits[0] is set for port 1, It will start handling port events, see the >> compliance state and try to warm reset the port, and fail as it never sees port >> >> Any chance you could add extra debugging to xhci to show why the change bit is set? >> Something like this: >> >> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c >> index 0ef1690..e425d82 100644 >> --- a/drivers/usb/host/xhci-hub.c >> +++ b/drivers/usb/host/xhci-hub.c >> @@ -1271,6 +1271,10 @@ int xhci_hub_status_data(struct usb_hcd *hcd, char *buf) >> jiffies, bus_state->resume_done[i]))) { >> buf[(i + 1) / 8] |= 1 << (i + 1) % 8; >> status = 1; >> + xhci_err(xhci, "portsc 0x%x, port_c_suspend:%d, resume_done:%d\n", >> + temp, >> + bus_state->port_c_suspend, >> + bus_state->resume_done[i]); >> } >> if ((temp & PORT_RC)) >> reset_change = true; >> >> >> and run with 4.9 and 4.10-rc >> >> -Mathias > > > Hi Mathias > > Thanks for your time > > For information i replaced xhci_err() by xhci_dbg() to avoid flooding kernel log which > avoid to complete the kernel boot. > > Please find attached both kernel log files for v4.9 and v4.10-rc1. > > In v4.9_xhci-hub_additionnal_debug.txt, at the beginning of the trace > portsc value is 0xa400340, then at [ 18.366392] it becomes 0x400340 (link state change) and then 0x2802a0 and back again to 0xa400340 > > In v4.10-rc1_xhci-hub_additionnal_debug.txt, at the beginning of the trace > portsc value is 0xa400340, then at [ 5.557976] it becomes 0x400340 (link state change), and then 0x2802a0 or 0x680340 for ever .... > > Patrice > Hi Mathias Did new logs give you some clue about the XHCi problem ? Regards Patrice��.n��������+%������w��{.n�����{���)��jg��������ݢj����G�������j:+v���w�m������w�������h�����٥