Re: [PATCH v4 13/14] usb: force warm reset to break resume livelock

On Mon, 10 Feb 2014, Dan Williams wrote:

> I believe I am seeing a "polling livelock" scenario as described by Julius.

Julius was talking about what happens when the host controller itself
gets reset (and therefore remembers nothing about any device) whereas
the device still thinks it is in U3.  Is that the scenario you're 
encountering?  I thought you were working on normal runtime PM.

> >> With this in place we may want to consider reducing the timeout and
> >> relying on warm reset for recovery.
> >
> > Why?  I'm not familiar with the intricacies of USB-3 link state
> > changes, but there seem to be only two possibilities:
> >
> >         Either PORT_LS_POLLING is a valid state to be in while
> >         trying to establish a SuperSpeed connection, in which case
> >         we don't want to reduce the timeout,
> >
> >         Or it isn't a valid state, in which case we should abort
> >         the debounce immediately.
> >
> > One other thought (I don't know if it is the right thing to do) is that
> > we might _always_ perform a warm reset after powering-on a SuperSpeed
> > port, without bothering to call hub_port_debounce_be_connected.
> 
> I'm leaning in that direction.  However, the decision comes down to
> the relative occurrence frequency of devices that fall into this trap
> vs those that successfully recover and would suffer the additional
> latency of a warm reset.

Is a warm reset significantly longer than an ordinary reset?  We have
to do some kind of reset in any case.  After all, the power session
_has_ been interrupted.  (Assuming the power switching worked...)

>  There's just no way to know if the device on
> the other side is legitimately causing a polling condition or whether
> this is a result of the aforementioned livelock.  So far I only have
> one USB3 device that requires this, our favorite ax88179_178a net
> adapter.
> 
> The spec says that the only way to reliably sync the state machines is
> to remove power from the device, but we have no real way from the
> kernel to force and know a port is physically powered off.  I'll look
> and see how imposing latency-wise it would be to always warm reset,
> but we may want to just quirk temperamental devices and hosts as we
> find them and use the timeout as a backstop.
> 
> >
> >>  Other xHCs that fail to propagate
> >> warm resets on hub resume may want to trigger this behavior via a quirk.
> >
> > What do you mean by "other xHCs"?  Other than what?
> >
> 
> Other "xHCs" referring again to that warm reset thread and the
> hypothesis that the Synopsys xHC is not propagating warm resets on
> host resume.

It looks like there were two separate considerations: Whether the host
controller issues a reset signal over the bus when it gets reset
itself, and whether the host issues a reset signal to a port if it
doesn't believe any device is attached.

> > I don't want to go over this patch in detail, because it's pretty
> > confusing and the code is messy.  Still, it seems odd to add all those
> > port status manipulations in usb_port_runtime_resume, when
> > hub_port_debounce_be_connected is already doing them.
> >
> > And why do we need another special flag to indicate that a warm reset
> > is needed?  Can't check_port_resume_type figure that out from the port
> > status?  That routine was meant for exactly this sort of thing.
> >
> 
> check_port_resume_type() does not have the context to make the
> determination.  LS_POLLING is a valid state, we only know that a warm
> reset is required when it has been in this state for "too long".

If the port is in that state when check_port_resume_type runs then it
certainly has been "too long".  According to section 7.5 of the spec,
ports don't go into LS_POLLING when the power session wasn't
interrupted.
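
To make that argument concrete, here is a minimal sketch of the kind of
test I have in mind.  It is not the real check_port_resume_type; the
EX_/ex_ names are hypothetical, and the numeric values are assumptions
modeled on the SuperSpeed port link-state field, so check them against
the headers before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical constants for illustration only.  The values are
 * assumptions modeled on the SuperSpeed wPortStatus link-state field.
 */
#define EX_PORT_STAT_CONNECTION  0x0001u
#define EX_PORT_STAT_LINK_STATE  0x01e0u
#define EX_PORT_LS_U0            0x0000u
#define EX_PORT_LS_U3            0x0060u
#define EX_PORT_LS_POLLING       0x00e0u

/* Extract the link-state field from a port status word. */
static inline uint16_t ex_link_state(uint16_t portstatus)
{
	return portstatus & EX_PORT_STAT_LINK_STATE;
}

/*
 * At resume time the caller expects the power session to be intact.
 * If the port is nevertheless found in Polling, treat that alone as
 * evidence that a warm reset is needed; no extra flag is consulted.
 */
static inline bool ex_resume_needs_warm_reset(uint16_t portstatus)
{
	return ex_link_state(portstatus) == EX_PORT_LS_POLLING;
}
```

The point is only that the decision can be derived from the port
status word itself, without carrying a separate flag around.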

> Unfortunately, the timeout needs to consider that the device is coming
> from a physically powered-off condition (rather than just a logical
> one), so it needs to be at least 2 seconds for a connection (per commit
> ad493e5 usb: add usb port auto power off mechanism).

Okay, that's reasonable.  Unfortunate, but reasonable.
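
For what it's worth, the "timeout as a backstop" idea could look
roughly like the sketch below.  This is not the code in the patch: the
ex_ names, the 25 ms polling step, the fake port used to exercise the
loop, and the numeric status values are all assumptions for
illustration.  Only the 2-second bound comes from the discussion above.

```c
#include <stdint.h>

/* Hypothetical status bits; the values are assumptions for illustration. */
#define EX_PORT_STAT_CONNECTION  0x0001u
#define EX_PORT_STAT_LINK_STATE  0x01e0u
#define EX_PORT_LS_U0            0x0000u
#define EX_PORT_LS_POLLING       0x00e0u

#define EX_TIMEOUT_MS  2000	/* 2-second bound from the discussion */
#define EX_STEP_MS     25	/* assumed polling interval */

enum ex_outcome { EX_CONNECTED, EX_WARM_RESET };

/* Callback that returns the current port status word. */
typedef uint16_t (*ex_read_status_fn)(void *ctx);

/*
 * Poll until the link reaches U0 with a device connected.  If the
 * port has not got there when the timeout expires, report that a
 * warm reset is the fallback.
 */
static enum ex_outcome ex_wait_or_escalate(ex_read_status_fn read_status,
					   void *ctx)
{
	int elapsed;

	for (elapsed = 0; elapsed < EX_TIMEOUT_MS; elapsed += EX_STEP_MS) {
		uint16_t st = read_status(ctx);

		if ((st & EX_PORT_STAT_CONNECTION) &&
		    (st & EX_PORT_STAT_LINK_STATE) == EX_PORT_LS_U0)
			return EX_CONNECTED;
		/* A real implementation would sleep EX_STEP_MS here. */
	}
	return EX_WARM_RESET;
}

/* Fake port for trying the loop: reaches U0 after a number of polls. */
struct ex_fake_port {
	int polls_until_u0;	/* negative: stuck in Polling forever */
	int polls;		/* how many times status was read */
};

static uint16_t ex_fake_read(void *ctx)
{
	struct ex_fake_port *p = ctx;

	p->polls++;
	if (p->polls_until_u0 >= 0 && p->polls > p->polls_until_u0)
		return EX_PORT_STAT_CONNECTION | EX_PORT_LS_U0;
	return EX_PORT_LS_POLLING;
}
```

A well-behaved device leaves the loop as soon as it trains; only a
port that sits in Polling for the whole interval pays for the warm
reset.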

Alan Stern
