On Mon, Feb 10, 2014 at 1:26 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, 31 Jan 2014, Dan Williams wrote:
>
>> Resuming a powered down port sometimes results in the port state being
>> stuck in USB_SS_PORT_LS_POLLING:
>>
>>   hub 3-0:1.0: debounce: port 1: total 2000ms stable 0ms status 0x2e0
>>   port1: can't get reconnection after setting port power on, status -110
>>   hub 3-0:1.0: port 1 status 0000.02e0 after resume, -19
>>   usb 3-1: can't resume, status -19
>>   hub 3-0:1.0: logical disconnect on port 1
>
> It's not obvious that this illustrates your point. Are we supposed to
> know offhand that 0x2e0 means USB_SS_PORT_LS_POLLING?

Hmm, no, I'll make that clearer in the revised change log.

>> In the case above we wait 2 seconds for the port to reconnect and for
>> that entire time the port remained in the polling state. A warm reset
>> triggers the device to reconnect and resume as normal. With this patch
>> we get:
>>
>>   hub 3-0:1.0: debounce: port 1: total 2000ms stable 0ms status 0x2e0
>>   usb usb3: port1 usb_port_runtime_resume requires warm reset
>>   hub 3-0:1.0: port 1 not warm reset yet, waiting 50ms
>>   usb 3-1: reset SuperSpeed USB device number 2 using xhci_hcd
>
> Could this be improved? We still spent 2 seconds waiting for a port
> that remained in the polling state.

This, and at least one other question, is why the cc list includes the
participants from the thread "[PATCH] USB: core: Add warm reset while
reset-resuming SuperSpeed HUBs":

http://marc.info/?l=linux-usb&m=138678842000703&w=2

I believe I am seeing a "polling livelock" scenario as described by
Julius.

>> With this in place we may want to consider reducing the timeout and
>> relying on warm reset for recovery.
>
> Why? I'm not familiar with the intricacies of USB-3 link state
> changes, but there seem to be only two possibilities:
>
>       Either PORT_LS_POLLING is a valid state to be in while
>       trying to establish a SuperSpeed connection, in which case
>       we don't want to reduce the timeout,
>
>       Or it isn't a valid state, in which case we should abort
>       the debounce immediately.
>
> One other thought (I don't know if it is the right thing to do) is that
> we might _always_ perform a warm reset after powering-on a SuperSpeed
> port, without bothering to call hub_port_debounce_be_connected.

I'm leaning in that direction. However, the decision comes down to how
often devices fall into this trap compared with how often they recover
on their own and would only suffer the additional latency of a warm
reset. There's just no way to know whether the device on the other side
is legitimately causing a polling condition or whether this is a result
of the aforementioned livelock. So far I only have one USB3 device that
requires this, our favorite ax88179_178a net adapter.

The spec says that the only way to reliably sync the state machines is
to remove power from the device, but from the kernel we have no real way
to force a port off and know that it is physically powered off. I'll
look and see how costly, latency-wise, it would be to always warm reset,
but we may want to just quirk temperamental devices and hosts as we find
them and use the timeout as a backstop.

>> Other xHCs that fail to propagate
>> warm resets on hub resume may want to trigger this behavior via a quirk.
>
> What do you mean by "other xHCs"? Other than what?

"Other xHCs" refers again to that warm reset thread and the hypothesis
that the Synopsys xHC is not propagating warm resets on host resume.

> I don't want to go over this patch in detail, because it's pretty
> confusing and the code is messy. Still, it seems odd to add all those
> port status manipulations in usb_port_runtime_resume, when
> hub_port_debounce_be_connected is already doing them.
>
> And why do we need another special flag to indicate that a warm reset
> is needed? Can't check_port_resume_type figure that out from the port
> status? That routine was meant for exactly this sort of thing.

check_port_resume_type() does not have the context to make the
determination. LS_POLLING is a valid state; we only know that a warm
reset is required when the port has been in this state for "too long".
Unfortunately, the timeout needs to consider that the device is coming
from a physically powered-off condition (rather than just a logical
one), so it needs to be at least 2 seconds for a connection (per commit
ad493e5, "usb: add usb port auto power off mechanism").
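
For anyone else decoding the "status 0x2e0" in the debounce log quoted
in this message, here is a minimal sketch of how that value breaks down.
The bit definitions are copied from include/uapi/linux/usb/ch11.h; the
two helper functions are hypothetical names made up for illustration and
are not kernel functions.

```c
#include <stdint.h>

/* Port status bits, as defined in include/uapi/linux/usb/ch11.h */
#define USB_PORT_STAT_CONNECTION   0x0001
#define USB_PORT_STAT_LINK_STATE   0x01e0  /* SuperSpeed link state field */
#define USB_SS_PORT_STAT_POWER     0x0200
#define USB_SS_PORT_LS_RX_DETECT   0x00a0
#define USB_SS_PORT_LS_POLLING     0x00e0

/* Hypothetical helper: extract the SuperSpeed link state field. */
static uint16_t ss_link_state(uint16_t portstatus)
{
	return portstatus & USB_PORT_STAT_LINK_STATE;
}

/*
 * Hypothetical helper: true if the port is powered, reports no
 * connection, and the link is in Polling. A single reading like this
 * cannot say the port is stuck; Polling is a valid transient state.
 */
static int ss_port_polling_unconnected(uint16_t portstatus)
{
	return (portstatus & USB_SS_PORT_STAT_POWER) &&
	       !(portstatus & USB_PORT_STAT_CONNECTION) &&
	       ss_link_state(portstatus) == USB_SS_PORT_LS_POLLING;
}
```

With portstatus = 0x02e0, the power bit (0x0200) is set, the connection
bit (0x0001) is clear, and the link state field is 0x00e0, which is
USB_SS_PORT_LS_POLLING. Only seeing that same reading for the whole
debounce timeout suggests the port is stuck.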