> -----Original Message----- > From: Mathias Nyman [mailto:mathias.nyman@xxxxxxxxxxxxxxx] > Sent: Wednesday, March 23, 2016 7:52 PM > To: Rajesh Bhagat <rajesh.bhagat@xxxxxxx> > Cc: gregkh@xxxxxxxxxxxxxxxxxxx; linux-usb@xxxxxxxxxxxxxxx; linux- > kernel@xxxxxxxxxxxxxxx; Sriram Dash <sriram.dash@xxxxxxx> > Subject: Re: [PATCH] usb: xhci: Fix incomplete PM resume operation due to XHCI > commmand timeout > > On 23.03.2016 05:53, Rajesh Bhagat wrote: > > >>> IMO, The assumption that "xhci_abort_cmd_ring would always generate > >>> an event and handle_cmd_completion would be called" will not be > >>> always be true if HW > >> is in bad state. > >>> > >>> Please share your opinion. > >>> > >> > >> writing the CA (command abort) bit in CRCR (command ring control > >> register) will stop the command ring, and CRR (command ring running) will be set > to 0 by xHC. > >> xhci_abort_cmd_ring() polls this bit up to 5 seconds. > >> If it's not 0 then the driver considers the command abort as failed. > >> > >> The scenario you're thinking of is that xHC would still react to CA > >> bit set, it would stop the command ring and set CRR 0, but not send a command > completion event. > >> > >> Have you tried adding some debug to handle_cmd_completion() and see > >> if you receive any event after command abortion? > >> > > > > Yes. We have added debug prints at first line of > > handle_cmd_completion, and we are not getting those prints. The last > > print messages that we get are as below from xhci_alloc_dev while > > resume > > operation: > > > > xhci-hcd xhci-hcd.0.auto: Command timeout xhci-hcd xhci-hcd.0.auto: > > Abort command ring > > > > May be somehow, USB controller is in bad state and not responding to the > commands. > > > > Please suggest how XHCI driver can handle such situations. > > > > Restart the command timeout timer when writing the command abort bit. > If we get theIf we get the abort event the timer is deleted. > > Otherwise if the timout triggers a second time we end up calling > xhci_handle_command_timeout() with a stopped ring, This will call > xhci_handle_stopped_cmd_ring(), turn the aborted command to no-op, restart the > command ring, and finally when the no-op completes it should call the missing > completion. > > If command ring doesn't start then additional code could be added to > xhci_handle_command_timeout() that clears the command ring if it is called a second > time (=current command is already in abort state and command ring is stopped when > entering xhci_handle_command_timeout) > > There might be some details missing, I'm not able to test any of this, but try > something like this: > > diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index > 3e1d24c..576819e 100644 > --- a/drivers/usb/host/xhci-ring.c > +++ b/drivers/usb/host/xhci-ring.c > @@ -319,7 +319,10 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci) > xhci_halt(xhci); > return -ESHUTDOWN; > } > - > + /* writing the CMD_RING_ABORT bit should create a command completion > + * event, add a command completion timeout for it as well > + */ > + mod_timer(&xhci->cmd_timer, jiffies + XHCI_CMD_DEFAULT_TIMEOUT); > return 0; > } Hello Mathias, Thanks for the patch. After application of above patch, I'm getting following prints constantly: xhci-hcd xhci-hcd.0.auto: Command timeout xhci-hcd xhci-hcd.0.auto: Abort command ring xhci-hcd xhci-hcd.0.auto: Command timeout on stopped ring xhci-hcd xhci-hcd.0.auto: Turn aborted command be56e000 to no-op xhci-hcd xhci-hcd.0.auto: // Ding dong! ... xhci-hcd xhci-hcd.0.auto: Command timeout xhci-hcd xhci-hcd.0.auto: Abort command ring xhci-hcd xhci-hcd.0.auto: Command timeout on stopped ring xhci-hcd xhci-hcd.0.auto: Turn aborted command be56e000 to no-op xhci-hcd xhci-hcd.0.auto: // Ding dong! As expected, xhci_handle_command_timeout is called again and next time ring state is __not__ CMD_RING_STATE_RUNNING, Hence xhci_handle_stopped_cmd_ring is called which turn all the aborted commands to no-ops and again makes the ring state as CMD_RING_STATE_RUNNING, and rings the door bell. But again in this case, no response from USB controller and xhci_alloc_dev is still waiting for wait_for_completion. Does rescheduling the xhci->cmd_timer ensures command completion will be called. IMO, it is still dependent on HW response. Please share your opinion. > > -Mathias -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html