RE: [PATCH] usb: xhci: Fix incomplete PM resume operation due to XHCI commmand timeout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: Mathias Nyman [mailto:mathias.nyman@xxxxxxxxxxxxxxx]
> Sent: Friday, March 18, 2016 4:51 PM
> To: Rajesh Bhagat <rajesh.bhagat@xxxxxxx>; linux-usb@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx
> Cc: gregkh@xxxxxxxxxxxxxxxxxxx; mathias.nyman@xxxxxxxxx; Sriram Dash
> <sriram.dash@xxxxxxx>
> Subject: Re: [PATCH] usb: xhci: Fix incomplete PM resume operation due to XHCI
> commmand timeout
> 
> On 18.03.2016 09:01, Rajesh Bhagat wrote:
> > We are facing issue while performing the system resume operation from
> > STR where XHCI is going to indefinite hang/sleep state due to
> > wait_for_completion API called in function xhci_alloc_dev for command
> > TRB_ENABLE_SLOT which never completes.
> >
> > Now, xhci_handle_command_timeout function is called and prints
> > "Command timeout" message but never calls complete API for above
> > TRB_ENABLE_SLOT command as xhci_abort_cmd_ring is successful.
> >
> > Solution to above problem is:
> > 1. calling xhci_cleanup_command_queue API even if xhci_abort_cmd_ring
> >     is successful or not.
> > 2. checking the status of reset_device in usb core code.
> 
> 
> Hi
> 
> I think clearing the whole command ring is a bit too much in this case.
> It may cause issues for all attached devices when one command times out.
> 


Hi Mathias, 

I understand your point, But I want to understand how would completion handler be called 
if a command is timed out and xhci_abort_cmd_ring is successful. In this case all the code 
would be waiting on completion handler forever. 
	

> We need to look in more detail why we fail to call completion for that one aborted
> command.
> 

I checked the below code, Please correct me if I am wrong

code waiting on wait_for_completion: 
int xhci_alloc_dev(struct usb_hcd *hcd, struct usb_device *udev)
{
...
        ret = xhci_queue_slot_control(xhci, command, TRB_ENABLE_SLOT, 0);
...

        wait_for_completion(command->completion); <=== waiting for command to complete 


code calling completion handler:
1. handle_cmd_completion -> xhci_complete_del_and_free_cmd
2. xhci_handle_command_timeout -> xhci_abort_cmd_ring(failure) -> xhci_cleanup_command_queue -> xhci_complete_del_and_free_cmd

In our case command is timed out, Hence we hit the case #2 but xhci_abort_cmd_ring is success which 
does not calls complete. 


> The bigger question is why the timeout happens in the first place?
> 

We are doing suspend resume operation, It might be controller issue :(, IMO software should not 
hang/stop if hardware is not behaving correct. 

> What kernel version, and what xhci vendor was this triggered on?
> 

We are using 4.1.8 kernel

> It's possible that the timeout is related either to the locking issue found by Chris
> Bainbridge:
> http://marc.info/?l=linux-usb&m=145493945408601&w=2
> 
> or the resume issues in this thread, (see full thread)
> http://marc.info/?l=linux-usb&m=145477850706552&w=2
> 
> Does any of those proposed solutions fix the command timeout for you?
> 

I will check the above patches and share status.

> -Mathias
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux