On 06.05.2016 23:32, Joe Lawrence wrote:
Hello Mathias,
In c311e391a7ef "xhci: rework command timeout and cancellation," xHCI
command timeouts were refactored to flow through
xhci_handle_command_timeout. We've seen a few instances of
hangs/crashes with upstream and RHEL7.2 kernels where someone gets stuck
on wait_for_completion(cmd->completion) when the underlying host
controller has been hotplug removed.
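To make the failure mode concrete, the wait pattern involved looks roughly like the sketch below (illustrative only, not the actual xhci code; the struct and function names here are made up):

#include <linux/completion.h>

/* A submitter queues a command and then blocks with no timeout.  If the
 * controller is hot-removed and nothing ever calls complete() on
 * cmd->completion, this task sleeps in D state indefinitely -- while still
 * holding whatever locks it took on the way in (e.g. the usb_device's
 * dev->mutex in the traces further down). */
struct fake_cmd {
        struct completion *completion;
        int status;
};

static int submit_and_wait(struct fake_cmd *cmd)
{
        struct completion done;

        init_completion(&done);
        cmd->completion = &done;

        /* ... queue the command on the (possibly dead) command ring ... */

        wait_for_completion(cmd->completion);   /* no timeout, not interruptible */
        return cmd->status;
}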
Since these callers use plain wait_for_completion() rather than
wait_for_completion_interruptible_timeout(), I was wondering how
xhci_handle_command_timeout() would handle scenarios that avoid the code
inside the -ESHUTDOWN case below:
void xhci_handle_command_timeout(unsigned long data)
{
        struct xhci_hcd *xhci;
        int ret;
        unsigned long flags;
        u64 hw_ring_state;
        struct xhci_command *cur_cmd = NULL;

        xhci = (struct xhci_hcd *) data;

        /* mark this command to be cancelled */
        spin_lock_irqsave(&xhci->lock, flags);
        if (xhci->current_cmd) {
                cur_cmd = xhci->current_cmd;
                cur_cmd->status = COMP_CMD_ABORT;
        }

        /* Make sure command ring is running before aborting it */
        hw_ring_state = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
        if ((xhci->cmd_ring_state & CMD_RING_STATE_RUNNING) &&
            (hw_ring_state & CMD_RING_RUNNING)) {
                spin_unlock_irqrestore(&xhci->lock, flags);
                xhci_dbg(xhci, "Command timeout\n");
                ret = xhci_abort_cmd_ring(xhci);
                if (unlikely(ret == -ESHUTDOWN)) {
                        xhci_err(xhci, "Abort command ring failed\n");
                        xhci_cleanup_command_queue(xhci);
                        usb_hc_died(xhci_to_hcd(xhci)->primary_hcd);
                        xhci_dbg(xhci, "xHCI host controller is dead.\n");
                }
                return;
        }
        /* command timeout on stopped ring, ring can't be aborted */
        xhci_dbg(xhci, "Command timeout on stopped ring\n");
        xhci_handle_stopped_cmd_ring(xhci, xhci->current_cmd);
        spin_unlock_irqrestore(&xhci->lock, flags);
        return;
}
When xhci_abort_cmd_ring returns -ESHUTDOWN, it's clear that
xhci_cleanup_command_queue will iterate through the cmd_list, calling
complete on the cmd->completion variable when necessary.
I was uncertain about cases that don't fall into the -ESHUTDOWN path.
By what mechanism will they (re)timeout or hit handle_cmd_completion?
What happens if the device were to be removed before then?
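For reference, the cleanup path mentioned above is essentially the following (paraphrased from the driver, so details may differ between kernel versions):

static void xhci_complete_del_and_free_cmd(struct xhci_command *cmd, u32 status)
{
        list_del(&cmd->cmd_list);

        if (cmd->completion) {
                cmd->status = status;
                complete(cmd->completion);      /* wakes the blocked waiter */
        } else {
                kfree(cmd);
        }
}

void xhci_cleanup_command_queue(struct xhci_hcd *xhci)
{
        struct xhci_command *cur_cmd, *tmp_cmd;

        /* complete (or free) every command still pending on the ring */
        list_for_each_entry_safe(cur_cmd, tmp_cmd, &xhci->cmd_list, cmd_list)
                xhci_complete_del_and_free_cmd(cur_cmd, COMP_CMD_ABORT);
}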
To give you an idea of the kinds of lockups we're seeing, here's an
example from our RHEL7.2 machine. PID 4892 wants the dev->mutex that
PID 44765 holds. The latter is waiting for a command completion, and the
former eventually trips the hung task detector:
INFO: task java:4892 blocked for more than 120 seconds.
...
PID: 4892 TASK: ffff880852305080 CPU: 42 COMMAND: "java"
#0 [ffff880fed2078d0] __schedule at ffffffff8163a78d
#1 [ffff880fed207938] schedule_preempt_disabled at ffffffff8163bf09
#2 [ffff880fed207948] __mutex_lock_slowpath at ffffffff81639c05
#3 [ffff880fed2079a8] mutex_lock at ffffffff8163906f
#4 [ffff880fed2079c0] usb_disconnect at ffffffff81438957
#5 [ffff880fed207a08] usb_remove_hcd at ffffffff8143e3e1
#6 [ffff880fed207a38] usb_hcd_pci_remove at ffffffff814509ff
#7 [ffff880fed207a60] xhci_pci_remove at ffffffff81456901
#8 [ffff880fed207a80] pci_device_remove at ffffffff81328d9b
#9 [ffff880fed207aa8] __device_release_driver at ffffffff813f60ef
#10 [ffff880fed207ac8] device_release_driver at ffffffff813f6183
#11 [ffff880fed207ae8] pci_stop_bus_device at ffffffff81322104
#12 [ffff880fed207b10] pci_stop_bus_device at ffffffff813220ab
#13 [ffff880fed207b38] pci_stop_bus_device at ffffffff813220ab
#14 [ffff880fed207b60] pci_stop_and_remove_bus_device at ffffffff813221f2
#15 [ffff880fed207b78] ioboard_bringdown at ffffffffa05eb4c7 [ftmod]
#16 [ffff880fed207ba0] ftmod_ioboard_event at ffffffffa05eb7c6 [ftmod]
#17 [ffff880fed207c28] io_state_change at ffffffffa05e92d1 [ftmod]
#18 [ffff880fed207c78] OsIoStateChange at ffffffffa06ba2e4 [ccmod]
#19 [ffff880fed207c90] IoStateChange at ffffffffa06bad18 [ccmod]
#20 [ffff880fed207dc0] CcIoBringDownEx at ffffffffa06baee0 [ccmod]
#21 [ffff880fed207e20] CcIoBringDownEx at ffffffffa05e54db [ftmod]
#22 [ffff880fed207e48] cmd_bringdown at ffffffffa05e68b8 [ftmod]
#23 [ffff880fed207e60] ctl_ioctl at ffffffffa05e6c22 [ftmod]
#24 [ffff880fed207eb8] do_vfs_ioctl at ffffffff811f1f15
#25 [ffff880fed207f30] sys_ioctl at ffffffff811f2191
#26 [ffff880fed207f80] system_call_fastpath at ffffffff81645e89
RIP: 00007f88860cd257 RSP: 00007f87a0d323d8 RFLAGS: 00010206
RAX: 0000000000000010 RBX: ffffffff81645e89 RCX: 0000000000000000
RDX: 00007f87a0d32480 RSI: 0000000000000000 RDI: 0000000000000021
RBP: 00007f87a0d323d0 R8: 0000000000000031 R9: 0000000000000002
R10: 00007f8728000b72 R11: 0000000000000217 R12: 00007f8758073000
R13: 00007f87a0d369c0 R14: 00000000feac7a48 R15: 0000000000000000
ORIG_RAX: 0000000000000010 CS: 0033 SS: 002b
PID: 44765 TASK: ffff8807e215d080 CPU: 3 COMMAND: "kworker/3:0"
#0 [ffff880753e639e0] __schedule at ffffffff8163a78d
#1 [ffff880753e63a48] schedule at ffffffff8163ae29
#2 [ffff880753e63a58] schedule_timeout at ffffffff81638b19
#3 [ffff880753e63b08] wait_for_completion at ffffffff8163b1f6
#4 [ffff880753e63b68] xhci_configure_endpoint at ffffffff814754ad
#5 [ffff880753e63bc0] xhci_check_bandwidth at ffffffff81476106
#6 [ffff880753e63c08] usb_hcd_alloc_bandwidth at ffffffff814401b5
#7 [ffff880753e63c60] usb_disable_device at ffffffff8144345a
#8 [ffff880753e63ca0] usb_disconnect at ffffffff814389ad
#9 [ffff880753e63ce8] hub_port_connect at ffffffff8143a4e1
#10 [ffff880753e63d68] hub_event at ffffffff8143b5af
#11 [ffff880753e63e20] process_one_work at ffffffff8109d5db
#12 [ffff880753e63e68] worker_thread at ffffffff8109e3ab
#13 [ffff880753e63ec8] kthread at ffffffff810a5acf
#14 [ffff880753e63f50] ret_from_fork at ffffffff81645dd8
crash utility claims that 44765 hasn't run in ~2.5 minutes:
crash> ps -m 44765
[0 00:02:30.754] [UN] PID: 44765 TASK: ffff8807e215d080 CPU: 3 COMMAND: "kworker/3:0"
And xhci_handle_command_timeout is still 4412 jiffies (roughly 4.4 seconds at HZ=1000) out:
crash> timer
...
TVEC_BASES[3]: ffff880853884000
JIFFIES
4418029572
EXPIRES TIMER_LIST FUNCTION
...
4418033984 ffff880813a6e0a8 ffffffff8147e0f0 <xhci_handle_command_timeout>
...
Given that the default command timeout is 5 seconds, it seems strange to
hit a 120 second hung task warning in this instance. I can only think
that maybe something goofy is going on with xhci_handle_command_timeout
and an unfortunately timed host controller removal.
The idea is that when xhci is removed, xhci_mem_cleanup() will take care of the pending completions:
xhci_mem_cleanup()
  xhci_cleanup_command_queue()
    list_for_each_entry_safe(cur_cmd, tmp_cmd, &xhci->cmd_list, cmd_list)
      xhci_complete_del_and_free_cmd(cur_cmd, COMP_CMD_ABORT);
        if (cmd->completion) {
          complete(cmd->completion);
The issue you see might be related to two things. The first is that, as an
optimization, xhci_mem_cleanup() was called after removing the first hcd.
The second is that while removing the secondary hcd we still issue configure
endpoint commands when endpoints are dropped and xhci_check_bandwidth() is
called. Those configure endpoint commands are not needed at this stage if we
know that the whole host is being removed.
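To illustrate that second point, the fix is essentially to flag the host as going away during PCI removal and then refuse to queue the configure endpoint command; a rough sketch of the idea (not the literal diff, see the commit referenced below):

/* in xhci_pci_remove(), before the hcds are torn down (sketch) */
xhci->xhc_state |= XHCI_STATE_REMOVING;

/* in the check_bandwidth / configure endpoint path (sketch): a host
 * that is being removed gets no new configure endpoint commands */
if (xhci->xhc_state & XHCI_STATE_REMOVING)
        return -ENODEV;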
Do the following patches resolve the issue?
(will be in 4.6, removes the extra check_bandwidth() call if the host is being removed)
98d74f9ceaefc2b6c4a6440050163a83be0abede
xhci: fix 10 second timeout on removal of PCI hotpluggable xhci controllers
(will be sent after 4.7-rc1 is out, call xhci_mem_cleanup after second hcd)
http://marc.info/?l=linux-usb&m=146134802313366&w=2
-Mathias