On Tue, 2019-06-04 at 16:53 +0300, Mathias Nyman wrote: > On 27.5.2019 14.28, Nicolas Saenz Julienne wrote: > > Hi Matthias, > > thanks for the review. > > > > On Mon, 2019-05-27 at 14:16 +0300, Mathias Nyman wrote: > > > On 24.5.2019 17.52, Nicolas Saenz Julienne wrote: > > > > This was seen on a Dell Precision 5520 using it's WD15 dock. The dock's > > > > Ethernet device interfaces with the laptop through one of it's USB3 > > > > ports. While idle, the Ethernet device and HCD are suspended by runtime > > > > PM, being the only device connected on the bus. Then, both are resumed > > > > on > > > > behalf of the Ethernet device, which has remote wake-up capabilities. > > > > > > > > The Ethernet device was observed to randomly disconnect from the USB > > > > port shortly after submitting it's remote wake-up request. Probably a > > > > weird timing issue yet to be investigated. This causes runtime PM to > > > > busyloop causing some tangible CPU load. The reason is the port gets > > > > stuck in the middle of a remote wake-up operation, waiting for the > > > > device to switch to U0. This never happens, leaving "port_remote_wakeup" > > > > enabled, and automatically triggering a failure on any further suspend > > > > operation. > > > > > > > > This patch clears "port_remote_wakeup" upon detecting a device with a > > > > wrong resuming port state (see Table 4-9 in 4.15.2.3). Making sure the > > > > above mentioned situation doesn't trigger a PM busyloop. > > > > > > > > > > There was a similar case where the USB3 link had transitioned to a > > > lower power U1 or U2 state after resume, before driver read the state, > > > leaving port_remote_wakeup flag uncleared. > > > > > > This was fixed in 5.1 kernel by commit: > > > > > > 6cbcf59 xhci: Fix port resume done detection for SS ports with LPM enable > > > > > > Can you check if you have it? > > > It should be in recent stable releases as well. > > > > I was aware of that patch, unfortunately it doesn't address the same issue. > > In > > my case I never get a second port status event (so no PLC == 1 or any state > > change seen in PLS). The device simply disconnects from the bus. > > > > I see, ok, then we need to clear the flag in the hub thread. > > But to me it looks like this patch could cause a small race risk in the > successful > device initiated resume cases. > > If the hub thread, i.e. the get_port_status() function, notices the U0 state > before > the interrupt handler, i.e. handle_port_status() function, then > port_remote_wakeup > flag is cleared in the hub thread and the wakeup notification is never called > from > handle_port_status(). > > Would it be enough to just check for (port_remote_wakeup flag && > !PORT_CONNECT) in the hub thread? > USB3 PORT_CONNECT bit is lost in most error cases. I get your concerns, there is a race indeed. On top of that, checking PORT_CONNECT works fine for me. So if I undertood your suggestion right, would this be fine? diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c index 3abe70ff1b1e..253957dc62de 100644 --- a/drivers/usb/host/xhci-hub.c +++ b/drivers/usb/host/xhci-hub.c @@ -1057,6 +1057,9 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd, bus_state->resume_done[wIndex] = 0; clear_bit(wIndex, &bus_state->resuming_ports); usb_hcd_end_port_resume(&hcd->self, wIndex); + } else if (bus_state->port_remote_wakeup & (1 << port->hcd_portnum) && + !(raw_port_status & PORT_CONNECT)) { + bus_state->port_remote_wakeup &= ~(1 << port->hcd_portnum); } if (bus_state->port_c_suspend & (1 << wIndex)) Regards, Nicolas
Attachment:
signature.asc
Description: This is a digitally signed message part