Hi, On 12/01/2016 02:04 PM, Baolin Wang wrote: > Hi Baolu, > > On 1 December 2016 at 13:45, Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote: >> Hi, >> >> On 11/30/2016 05:02 PM, Baolin Wang wrote: >>> If the hardware never responds to the stop endpoint command, the >>> URBs will never be completed, and we might hang the USB subsystem. >>> The original watchdog timer is used to watch if one stop endpoint >>> command is timeout, if timeout, then the watchdog timer will set >>> XHCI_STATE_DYING, try to halt the xHCI host, and give back all >>> pending URBs. >>> >>> But now we already have one command timer to control command timeout, >>> thus we can also use the command timer to watch the stop endpoint >>> command, instead of one duplicate watchdog timer which need to be >>> removed. >>> >>> Meanwhile we don't need the 'stop_cmds_pending' flag to identy if >>> this is the last stop endpoint command of one endpoint. Since we >>> can make sure we only set one stop endpoint command for one endpoint >>> by 'EP_HALT_PENDING' flag in xhci_urb_dequeue() function. Thus remove >>> this flag. >> I am afraid you can't do this. "stop_cmds_pending" was added >> to fix the problem described in the comments that you want to >> remove. But I didn't find any fix of this problem in your patch. > Now we can not pending another stop endpoint command for the same one > endpoint, since will check 'EP_HALT_PENDING' flag in > xhci_urb_dequeue() function to avoid this. But after some > investigation, I think I missed the stop endpoint command in > xhci_stop_device() which did not check the 'EP_HALT_PENDING' flag, > maybe need to add 'EP_HALT_PENDING' flag checking in > xhci_stop_device() function. DId I miss something else? Thanks. Consider below three threads running on different CPUs at the same time. Thread A: xhci_handle_cmd_stop_ep() --- in interrupt handler Thread B: xhci_stop_endpoint_command_timeout() --- timer expired Thread C: xhci_urb_dequeue --- called by usb core They are serialized by xhci->lock. Let's consider below sequence: Thread A: - delete xhci->cmd_timer), but will fail due to Thread B. - clear EP_HALT_PENDING bit Thread B: - halt the host controller Thread C: - set EP_HALT_PENDING bit - enqueue another stop endpoint command - add the timer back Best regards, Lu Baolu > >> - * The timer may also fire if the host takes a very long time to respond to the >> - * command, and the stop endpoint command completion handler cannot delete the >> - * timer before the timer function is called. Another endpoint cancellation may >> - * sneak in before the timer function can grab the lock, and that may queue >> - * another stop endpoint command and add the timer back. So we cannot use a >> - * simple flag to say whether there is a pending stop endpoint command for a >> - * particular endpoint. >> - * >> - * Instead we use a combination of that flag and a counter for the number of >> - * pending stop endpoint commands. If the timer is the tail end of the last >> - * stop endpoint command, and the endpoint's command is still pending, we assume >> - * the host is dying. >> >> Best regards, >> Lu Baolu >> >>> We also need to clean up the command queue before trying to halt the >>> xHCI host in xhci_stop_endpoint_command_timeout() function. >>> >>> Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxx> > > -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html