Re: [PATCH v4 13/14] usb: force warm reset to break resume livelock

Julius Werner <jwerner@xxxxxxxxxxxx> · Tue, 11 Feb 2014 19:29:40 -0800

>> I believe I am seeing a "polling livelock" scenario as described by Julius.
>
> Julius was talking about what happens when the host controller itself
> gets reset (and therefore remembers nothing about any device) whereas
> the device still thinks it is in U3.  Is that the scenario you're
> encountering?  I thought you were working on normal runtime PM.

When you turn the power back on for a port, it should start out in
RxDetect and switch to Polling as it detects Rx terminations. If the
downstream device is unhappy for any reason (e.g. in SS.Inactive or
still in U3) and sends no response or a wrong response to Polling.LFPS,
the hub's port will either move to ComplianceMode or keep cycling back
and forth between RxDetect and Polling. The latter is especially
dangerous because it's hard to detect: if you just sample the port
status you might see RxDetect, which would also be expected if there is
nothing connected at all. So I'm thinking an unconditional warm reset
might be unavoidable. That is why we proposed to go that route for the
Synopsys controller, and I think the same will apply to this situation:
I think turning off a PortPower bit in xHCI will make the controller
"forget" a previous U3 state and return to RxDetect when you turn it
back on again, even though the actual VBUS line to the device may not
have been disabled after all.
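
To illustrate the detection problem, here is a minimal user-space sketch
in C. It is not kernel code: the enum values, function names, and the
idea of keeping a sample history are all made up for illustration. It
shows that a single sample of the link state cannot tell an empty port
from one that is cycling between RxDetect and Polling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified link states; illustrative names, not the kernel's. */
enum link_state { LS_RX_DETECT, LS_POLLING, LS_COMPLIANCE, LS_U0 };

enum port_verdict { PORT_EMPTY, PORT_LINK_UP, PORT_STUCK };

/* Naive check: judge the port by the most recent sample only. */
enum port_verdict classify_last_sample(const enum link_state *samples,
                                       size_t n)
{
    assert(samples != NULL && n > 0);
    switch (samples[n - 1]) {
    case LS_U0:
        return PORT_LINK_UP;
    case LS_RX_DETECT:
        return PORT_EMPTY;      /* also what a livelocked port can show */
    default:
        return PORT_STUCK;
    }
}

/*
 * History check: if the port does not end in U0 but was ever seen
 * outside RxDetect, treat it as stuck rather than empty.
 */
enum port_verdict classify_history(const enum link_state *samples,
                                   size_t n)
{
    bool saw_activity = false;

    assert(samples != NULL && n > 0);
    if (samples[n - 1] == LS_U0)
        return PORT_LINK_UP;
    for (size_t i = 0; i < n; i++) {
        if (samples[i] != LS_RX_DETECT)
            saw_activity = true;
    }
    return saw_activity ? PORT_STUCK : PORT_EMPTY;
}
```

Even the history variant only helps if a sample happens to land while
the port is in Polling. With sparse sampling it can still miss the
cycling entirely, which is the argument for resetting unconditionally.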

>> > One other thought (I don't know if it is the right thing to do) is that
>> > we might _always_ perform a warm reset after powering-on a SuperSpeed
>> > port, without bothering to call hub_port_debounce_be_connected.
>>
>> I'm leaning in that direction.  However, the decision comes down to
>> the relative occurrence frequency of devices that fall into this trap
>> vs those that successfully recover and would suffer the additional
>> latency of a warm reset.
>
> Is a warm reset significantly longer than an ordinary reset?  We have
> to do some kind of reset in any case.  After all, the power session
> _has_ been interrupted.  (Assuming the power switching worked...)

USB 3.0 ports don't need to be reset on connect as a matter of course.
They should usually just start training themselves and eventually
become ready as soon as the wires touch. An extra warm reset would add
80-120ms delay to the port resume. (In comparison, a hot reset should
not take more than 12ms, probably even much less.)

>> >> With this in place we may want to consider reducing the timeout and
>> >> relying on warm reset for recovery.
>> >
>> > Why?  I'm not familiar with the intricacies of USB-3 link state
>> > changes, but there seem to be only two possibilities:
>> >
>> >         Either PORT_LS_POLLING is a valid state to be in while
>> >         trying to establish a SuperSpeed connection, in which case
>> >         we don't want to reduce the timeout,
>> >
>> >         Or it isn't a valid state, in which case we should abort
>> >         the debounce immediately.

It is a valid transitional state, unfortunately, but in a working case
it should resolve itself within a few milliseconds (probably less than
10). Maybe we should try to differentiate between USB 2.0 and 3.0
devices in hub_port_debounce()? I think due to the built-in link
training in USB 3.0, the classic debouncing doesn't really make sense
anymore (and wastes a lot of time since SuperSpeed links can train
really fast when they work).

As for this patch, I think the best approach would be to wait for the
device to come back in usb_port_runtime_resume() (through
hub_port_debounce() or something else), and if it doesn't show up,
always set the bit to warm reset the port (regardless of LTSSM state,
since even if it says RxDetect I wouldn't be sure that there is really
nothing connected). We could then also use those bits in the "lost
power" case of xhci_resume() to try and work around the problems with
that Synopsys controller.