> -----Original Message-----
> From: Mathias Nyman [mailto:mathias.nyman@xxxxxxxxxxxxxxx]
> Sent: Friday, March 18, 2016 4:51 PM
> To: Rajesh Bhagat <rajesh.bhagat@xxxxxxx>; linux-usb@xxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx
> Cc: gregkh@xxxxxxxxxxxxxxxxxxx; mathias.nyman@xxxxxxxxx; Sriram Dash
> <sriram.dash@xxxxxxx>
> Subject: Re: [PATCH] usb: xhci: Fix incomplete PM resume operation due to XHCI
> commmand timeout
>
> On 18.03.2016 09:01, Rajesh Bhagat wrote:
> > We are facing issue while performing the system resume operation from
> > STR where XHCI is going to indefinite hang/sleep state due to
> > wait_for_completion API called in function xhci_alloc_dev for command
> > TRB_ENABLE_SLOT which never completes.
> >
> > Now, xhci_handle_command_timeout function is called and prints
> > "Command timeout" message but never calls complete API for above
> > TRB_ENABLE_SLOT command as xhci_abort_cmd_ring is successful.
> >
> > Solution to above problem is:
> > 1. calling xhci_cleanup_command_queue API even if xhci_abort_cmd_ring
> >    is successful or not.
> > 2. checking the status of reset_device in usb core code.
> >
>
> Hi
>
> I think clearing the whole command ring is a bit too much in this case.
> It may cause issues for all attached devices when one command times out.
>

Hi Mathias,

I understand your point, but I want to understand how the completion handler
would be called if a command times out and xhci_abort_cmd_ring is successful.
In that case all the code would be waiting on the completion handler forever.

> We need to look in more detail why we fail to call completion for that one aborted
> command.
>

I checked the code below. Please correct me if I am wrong.

Code waiting on wait_for_completion:

int xhci_alloc_dev(struct usb_hcd *hcd, struct usb_device *udev)
{
...
        ret = xhci_queue_slot_control(xhci, command, TRB_ENABLE_SLOT, 0);
...
        wait_for_completion(command->completion); <=== waiting for command to complete

Code calling the completion handler:

1. handle_cmd_completion -> xhci_complete_del_and_free_cmd
2. xhci_handle_command_timeout -> xhci_abort_cmd_ring(failure) ->
   xhci_cleanup_command_queue -> xhci_complete_del_and_free_cmd

In our case the command timed out, so we hit case #2, but xhci_abort_cmd_ring
succeeds, and that path does not call complete.

> The bigger question is why the timeout happens in the first place?
>

We are doing a suspend/resume operation. It might be a controller issue :(
IMO software should not hang/stop if the hardware is not behaving correctly.

> What kernel version, and what xhci vendor was this triggered on?
>

We are using the 4.1.8 kernel.

> It's possible that the timeout is related either to the locking issue found by Chris
> Bainbridge:
> http://marc.info/?l=linux-usb&m=145493945408601&w=2
>
> or the resume issues in this thread, (see full thread)
> http://marc.info/?l=linux-usb&m=145477850706552&w=2
>
> Does any of those proposed solutions fix the command timeout for you?
>

I will check the above patches and share the status.

> -Mathias
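
To make the two completion paths above easier to discuss, here is a rough userspace model of the control flow as I described it. It is only a sketch, not driver code: every model_* name is made up for illustration, the "completed" flag stands in for the kernel completion, and the assumption that a successful abort leaves the command untouched is taken from my reading above, not verified against the driver.

```c
#include <stdbool.h>

/* Hypothetical stand-in for one queued command and its completion. */
struct model_cmd {
	bool completed;	/* true once a waiter would have been woken */
};

/* Stand-in for xhci_complete_del_and_free_cmd(): wakes the waiter. */
void model_complete(struct model_cmd *cmd)
{
	cmd->completed = true;
}

/* Path 1: a command completion event arrives from the controller. */
void model_handle_cmd_completion(struct model_cmd *cmd)
{
	model_complete(cmd);
}

/*
 * Path 2: the timeout handler as described in this mail. The command
 * queue is cleaned up (and the waiter woken) only when the abort fails.
 */
void model_handle_timeout(struct model_cmd *cmd, bool abort_ok)
{
	if (!abort_ok)
		model_complete(cmd);	/* models xhci_cleanup_command_queue */
	/* abort_ok: nothing on this path completes cmd (assumption) */
}

/* True if a caller blocked in wait_for_completion() would still be stuck. */
bool model_waiter_stuck(const struct model_cmd *cmd)
{
	return !cmd->completed;
}
```

Calling model_handle_timeout() with abort_ok set to true leaves model_waiter_stuck() returning true unless path 1 runs later, which is the hang we see if the controller never delivers the completion event for the aborted command.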