RE: [PATCH] usb: xhci: Fix incomplete PM resume operation due to XHCI commmand timeout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: Mathias Nyman [mailto:mathias.nyman@xxxxxxxxxxxxxxx]
> Sent: Wednesday, March 23, 2016 7:52 PM
> To: Rajesh Bhagat <rajesh.bhagat@xxxxxxx>
> Cc: gregkh@xxxxxxxxxxxxxxxxxxx; linux-usb@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; Sriram Dash <sriram.dash@xxxxxxx>
> Subject: Re: [PATCH] usb: xhci: Fix incomplete PM resume operation due to XHCI
> commmand timeout
> 
> On 23.03.2016 05:53, Rajesh Bhagat wrote:
> 
> >>> IMO, The assumption that "xhci_abort_cmd_ring would always generate
> >>> an event and handle_cmd_completion would be called" will not be
> >>> always be true if HW
> >> is in bad state.
> >>>
> >>> Please share your opinion.
> >>>
> >>
> >> writing the CA (command abort) bit in CRCR (command ring control
> >> register)  will stop the command ring, and CRR (command ring running) will be set
> to 0 by xHC.
> >> xhci_abort_cmd_ring() polls this bit up to 5 seconds.
> >> If it's not 0 then the driver considers the command abort as failed.
> >>
> >> The scenario you're thinking of is that xHC would still react to CA
> >> bit set, it would stop the command ring and set CRR 0, but not send a command
> completion event.
> >>
> >> Have you tried adding some debug to handle_cmd_completion() and see
> >> if you receive any event after command abortion?
> >>
> >
> > Yes. We have added debug prints at first line of
> > handle_cmd_completion, and we are not getting those prints. The last
> > print messages that we get are as below from xhci_alloc_dev while
> > resume
> > operation:
> >
> > xhci-hcd xhci-hcd.0.auto: Command timeout xhci-hcd xhci-hcd.0.auto:
> > Abort command ring
> >
> > May be somehow, USB controller is in bad state and not responding to the
> commands.
> >
> > Please suggest how XHCI driver can handle such situations.
> >
> 
> Restart the command timeout timer when writing the command abort bit.
> If we get theIf we get the abort event the timer is deleted.
> 
> Otherwise if the timout triggers a second time we end up calling
> xhci_handle_command_timeout() with a stopped ring, This will call
> xhci_handle_stopped_cmd_ring(), turn the aborted command to no-op, restart the
> command ring, and finally when the no-op completes it should call the missing
> completion.
> 
> If command ring doesn't start then additional code could be added to
> xhci_handle_command_timeout() that clears the command ring if it is called a second
> time (=current command is already in abort state and command ring is stopped when
> entering xhci_handle_command_timeout)
> 
> There might be some details missing, I'm not able to test any of this, but try
> something like this:
> 
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index
> 3e1d24c..576819e 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -319,7 +319,10 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci)
>                  xhci_halt(xhci);
>                  return -ESHUTDOWN;
>          }
> -
> +       /* writing the CMD_RING_ABORT bit should create a command completion
> +        * event, add a command completion timeout for it as well
> +        */
> +       mod_timer(&xhci->cmd_timer, jiffies + XHCI_CMD_DEFAULT_TIMEOUT);
>          return 0;
>   }

Hello Mathias, 

Thanks for the patch. 

After application of above patch, I'm getting following prints constantly: 

xhci-hcd xhci-hcd.0.auto: Command timeout
xhci-hcd xhci-hcd.0.auto: Abort command ring
xhci-hcd xhci-hcd.0.auto: Command timeout on stopped ring
xhci-hcd xhci-hcd.0.auto: Turn aborted command be56e000 to no-op
xhci-hcd xhci-hcd.0.auto: // Ding dong!
...
xhci-hcd xhci-hcd.0.auto: Command timeout
xhci-hcd xhci-hcd.0.auto: Abort command ring
xhci-hcd xhci-hcd.0.auto: Command timeout on stopped ring
xhci-hcd xhci-hcd.0.auto: Turn aborted command be56e000 to no-op
xhci-hcd xhci-hcd.0.auto: // Ding dong!

As expected, xhci_handle_command_timeout is called again and next time ring state
is __not__ CMD_RING_STATE_RUNNING, Hence xhci_handle_stopped_cmd_ring is called 
which turn all the aborted commands to no-ops and again makes the ring state as 
CMD_RING_STATE_RUNNING, and rings the door bell. 

But again in this case, no response from USB controller and xhci_alloc_dev is still waiting for
wait_for_completion. 

Does rescheduling the xhci->cmd_timer ensures command completion will be called. IMO, it is still 
dependent on HW response. 

Please share your opinion. 

> 
> -Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux