On 2 December 2016 at 09:17, Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote: > Hi, > > On 12/01/2016 04:03 PM, Baolin Wang wrote: >> On 1 December 2016 at 15:44, Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote: >>> Hi, >>> >>> On 12/01/2016 03:35 PM, Baolin Wang wrote: >>>> On 1 December 2016 at 14:35, Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote: >>>>> Hi, >>>>> >>>>> On 12/01/2016 02:04 PM, Baolin Wang wrote: >>>>>> Hi Baolu, >>>>>> >>>>>> On 1 December 2016 at 13:45, Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote: >>>>>>> Hi, >>>>>>> >>>>>>> On 11/30/2016 05:02 PM, Baolin Wang wrote: >>>>>>>> If the hardware never responds to the stop endpoint command, the >>>>>>>> URBs will never be completed, and we might hang the USB subsystem. >>>>>>>> The original watchdog timer is used to watch if one stop endpoint >>>>>>>> command is timeout, if timeout, then the watchdog timer will set >>>>>>>> XHCI_STATE_DYING, try to halt the xHCI host, and give back all >>>>>>>> pending URBs. >>>>>>>> >>>>>>>> But now we already have one command timer to control command timeout, >>>>>>>> thus we can also use the command timer to watch the stop endpoint >>>>>>>> command, instead of one duplicate watchdog timer which need to be >>>>>>>> removed. >>>>>>>> >>>>>>>> Meanwhile we don't need the 'stop_cmds_pending' flag to identy if >>>>>>>> this is the last stop endpoint command of one endpoint. Since we >>>>>>>> can make sure we only set one stop endpoint command for one endpoint >>>>>>>> by 'EP_HALT_PENDING' flag in xhci_urb_dequeue() function. Thus remove >>>>>>>> this flag. >>>>>>> I am afraid you can't do this. "stop_cmds_pending" was added >>>>>>> to fix the problem described in the comments that you want to >>>>>>> remove. But I didn't find any fix of this problem in your patch. >>>>>> Now we can not pending another stop endpoint command for the same one >>>>>> endpoint, since will check 'EP_HALT_PENDING' flag in >>>>>> xhci_urb_dequeue() function to avoid this. But after some >>>>>> investigation, I think I missed the stop endpoint command in >>>>>> xhci_stop_device() which did not check the 'EP_HALT_PENDING' flag, >>>>>> maybe need to add 'EP_HALT_PENDING' flag checking in >>>>>> xhci_stop_device() function. DId I miss something else? Thanks. >>>>> Consider below three threads running on different CPUs at the same time. >>>>> >>>>> Thread A: xhci_handle_cmd_stop_ep() --- in interrupt handler >>>>> Thread B: xhci_stop_endpoint_command_timeout() --- timer expired >>>>> Thread C: xhci_urb_dequeue --- called by usb core >>>>> >>>>> They are serialized by xhci->lock. Let's consider below sequence: >>>>> >>>>> Thread A: >>>>> - delete xhci->cmd_timer), but will fail due to Thread B. >>>>> - clear EP_HALT_PENDING bit >>>>> >>>>> Thread B: >>>>> - halt the host controller >>>>> >>>>> Thread C: >>>>> - set EP_HALT_PENDING bit >>>>> - enqueue another stop endpoint command >>>>> - add the timer back >>>> Ah, thanks for you comments. >>>> If thread B halted the host, then the thread C xhci_urb_dequeue() will >>>> check host state failed, then just return and can not set another stop >>>> endpoint command. >>> Not so simple. Thread B will release xhci->lock before xhci_halt(). >> Yes. >> >>>> But from your example, I think your meaning is we >>>> should not halt the host by thread B, since we have handled the stop >>>> endpoint command event. >>>> >>>> So I think we need to check the xhci command of this stop endpoint >>>> command in xhci_stop_endpoint_command_timeout(), if the xhci command >>>> of this stop endpoint command is not in the command list (which means >>>> the stop endpoint command event has been handled), then just return >>>> and do not halt the controller. Right? >>>> >>> I'd like suggest you to put this change into a separated patch. >>> It's actually a different topic from the main purpose of this patch. >> OK, I will. Thanks for your comments. >> > > If you are going to work out a v2 patch, please consider whether > we can totally leverage the common command mechanism to > handle this stop endpoint command. > > Currently, when a stop endpoint command is issued for urb unlink, > there are two timers for this command. This is duplicated and we > should remove the stop-endpoint-only timer. The timeout functions > are also different. The stop-endpoint-only timer just halt the host > controller (this should be the last sort of way), while the common > command timer will try to recover the situation by aborting and > restart the command ring. Yes, thanks for your reminding and I will think about that. -- Baolin.wang Best Regards -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html