Re: [PROBLEM] usb: xhci_bus_resume cause irq lost issue

Hi

On 5.3.2025 12.07, liudingyuan wrote:


Hi

I'm running into an issue where enumeration of a USB2.0 device fails due to lost interrupts.

The issue occurs randomly, and we can only reproduce it on xHCI controllers that provide both USB3.0 and USB2.0 root hubs. In addition, the first-level device under the controller must be a USB2.0 device.
The scenario is shown in the following figure.
   +-------------------------------------------------------+
   |              +-------------------+                    |
   |              |  xHCI Controller  |                    |
   |              +-------------------+                    |
   |                    /       \                          |
   |                   /         \                         |
   |                  /           \                        |
   |  +------------------+     +------------------+        |
   |  | USB 3.0 Root Hub |     | USB 2.0 Root Hub |        |
   |  +------------------+     +------------------+        |
   +-----------|------------------------|------------------+
               |                        |
               | [NO device]            | [Device A]
               |                        |
The USB topology displayed in the OS looks like this:
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/ , 480M
     ID 1d6b:0002 Linux Foundation 2.0 root hub
     |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M

/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/ , 5000M
     ID 1d6b:0003 Linux Foundation 3.0 root hub

Odd that the USB3 roothub is registered before the USB2 roothub.
I thought the xhci driver always registers the USB2 hcd first.


This issue only occurs when the system reboots, when we insmod the xhci driver, or when we register the xhci controller.
When the issue occurs, the CPU receives fewer interrupts than enumeration normally generates, and error logs report command timeouts:
	[ 2040.039438] xhci_hcd 0000:8a:00.7: Command timeout, USBSTS: 0x00000018 EINT PCD
	[ 2040.039444] xhci_hcd 0000:8a:00.7: Command timeout
	[ 2040.039446] xhci_hcd 0000:8a:00.7: Abort command ring
	[ 2042.055435] xhci_hcd 0000:8a:00.7: No stop event for abort, ring start fail?
	[ 2042.055469] xhci_hcd 0000:8a:00.7: Timeout while waiting for setup device command
	[ 2042.064048] usb 15-1: hub failed to enable device, error -62
	[ 2054.343446] xhci_hcd 0000:8a:00.7: Unsuccessful disable slot 1 command, status 25
	[ 2066.631449] xhci_hcd 0000:8a:00.7: Error while assigning device slot ID: Command Aborted
	[ 2066.640633] xhci_hcd 0000:8a:00.7: Max number of devices this xHCI host supports is 64.
	[ 2066.649713] usb usb15-port1: couldn't allocate usb_device
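
For readers decoding the first log line: 0x00000018 has bits 3 and 4 set, which the xHCI specification names EINT (Event Interrupt) and PCD (Port Change Detect). That is consistent with an event being pending in the controller while the driver is still waiting for it. A minimal sketch of the decoding follows. decode_usbsts() and the STS_* macro names are hypothetical helpers for illustration, not the kernel's code; only the bit positions are taken from the specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* USBSTS bit positions from the xHCI specification (names are illustrative). */
#define STS_HCH   (1u << 0)   /* HC Halted */
#define STS_HSE   (1u << 2)   /* Host System Error */
#define STS_EINT  (1u << 3)   /* Event Interrupt */
#define STS_PCD   (1u << 4)   /* Port Change Detect */
#define STS_SSS   (1u << 8)   /* Save State Status */
#define STS_RSS   (1u << 9)   /* Restore State Status */
#define STS_SRE   (1u << 10)  /* Save/Restore Error */
#define STS_CNR   (1u << 11)  /* Controller Not Ready */
#define STS_HCE   (1u << 12)  /* Host Controller Error */

/*
 * Hypothetical helper: write the names of all set USBSTS bits into buf,
 * separated by single spaces, lowest bit first. Returns buf.
 */
static const char *decode_usbsts(uint32_t sts, char *buf, size_t len)
{
    static const struct { uint32_t bit; const char *name; } flags[] = {
        { STS_HCH, "HCH" },   { STS_HSE, "HSE" }, { STS_EINT, "EINT" },
        { STS_PCD, "PCD" },   { STS_SSS, "SSS" }, { STS_RSS, "RSS" },
        { STS_SRE, "SRE" },   { STS_CNR, "CNR" }, { STS_HCE, "HCE" },
    };
    size_t i;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
        if (!(sts & flags[i].bit))
            continue;
        /* strncat always terminates; the bound leaves room for the NUL. */
        if (buf[0] != '\0')
            strncat(buf, " ", len - strlen(buf) - 1);
        strncat(buf, flags[i].name, len - strlen(buf) - 1);
    }
    return buf;
}
```

Calling decode_usbsts(0x00000018, buf, sizeof(buf)) with a 64-byte buffer yields the same "EINT PCD" string that the driver printed.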

After verification, we can confirm that the interrupts are lost because, while the USB2 device is being enumerated,
the USB3 port is going through its suspend/resume procedure and disables interrupts in this function:

	drivers/usb/host/xhci-hub.c
		xhci_bus_resume()
			/* delay the irqs */
			temp = readl(&xhci->op_regs->command);
			temp &= ~CMD_EIE;
			writel(temp, &xhci->op_regs->command);

We can temporarily avoid this issue by disabling autosuspend through the usbcore module parameter:
echo -1 > /sys/module/usbcore/parameters/autosuspend

I am wondering whether interrupts can also be lost outside the scenario described above, whenever a controller raises an interrupt at exactly the moment the driver is disabling that controller's interrupts.

I read version 1.2 of the xHCI specification and did not find a detailed description of such special cases. So what is the main reason for disabling interrupts in the xHCI driver during the resume process?


This is from a time before I started maintaining the xhci driver. I guess
it was done to allow bus suspended usb2 ports to resume fully to U0 before
xhci_bus_resume() returns.

Resuming a usb2 port from U3 suspend to U0 is a two-stage process.
In 'host initiated resume' the xhci driver will first transition the port from
'U3' to 'Resume' state, then wait in Resume state for 20ms, and finally move
it to U0 state.
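
The sequence above can be sketched as a small userspace model. This is not the driver's code: struct port_model, model_sleep() and host_initiated_resume_usb2() are hypothetical names, and the sleep is simulated by a counter so the ordering of the steps is easy to check.

```c
#include <assert.h>
#include <stddef.h>

/* Subset of port link states that matter for a usb2 host initiated resume. */
enum port_link_state { PLS_U0, PLS_U3, PLS_RESUME };

/* Time the port is held in Resume state, per the description above. */
#define USB2_RESUME_HOLD_MS 20u

/* Hypothetical model of one port: its link state and simulated elapsed time. */
struct port_model {
    enum port_link_state pls;
    unsigned int elapsed_ms;
};

/* Stand-in for msleep(): only advances the simulated clock. */
static void model_sleep(struct port_model *p, unsigned int ms)
{
    p->elapsed_ms += ms;
}

/*
 * Two-stage resume: U3 -> Resume, hold for 20 ms, then Resume -> U0.
 * Returns 0 on success, -1 if the port was not suspended in U3.
 */
static int host_initiated_resume_usb2(struct port_model *p)
{
    assert(p != NULL);
    if (p->pls != PLS_U3)
        return -1;

    p->pls = PLS_RESUME;                    /* stage 1: start resume signalling */
    model_sleep(p, USB2_RESUME_HOLD_MS);    /* the real driver sleeps here */
    p->pls = PLS_U0;                        /* stage 2: finish the resume */
    return 0;
}
```

The point of the model is the middle step: the port sits in Resume state for 20 ms, and anything that reacts to port events during that window could interfere with the transition.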

I assume the driver stopped the xHC from triggering interrupts to prevent the event handler
from messing with this usb2 port resume process.

spin_lock_irqsave(&xhci->lock) would normally be used to prevent the interrupt
handler from interfering, but holding the spin lock across msleep(20) was not
possible.

The device initiated usb2 resume is much better: it uses the event handler,
hub thread, timestamps and completions. The same should be done here, but
the implementation isn't trivial.

However, I don't think there is a reason to turn off interrupts while resuming
the USB3 bus. It doesn't sleep, so just holding spin_lock_irqsave() should be
enough. This should be an easy fix. Something like this:
(untested, copy-pasted diff)


diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index 9693464c0520..7cf7ee84fc96 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -1873,6 +1873,7 @@ int xhci_bus_resume(struct usb_hcd *hcd)
        u32 temp, portsc;
        struct xhci_hub *rhub;
        struct xhci_port **ports;
+       bool disabled_irq = false;
        rhub = xhci_get_rhub(hcd);
        ports = rhub->ports;
@@ -1888,17 +1889,19 @@ int xhci_bus_resume(struct usb_hcd *hcd)
                return -ESHUTDOWN;
        }
-       /* delay the irqs */
-       temp = readl(&xhci->op_regs->command);
-       temp &= ~CMD_EIE;
-       writel(temp, &xhci->op_regs->command);
-
        /* bus specific resume for ports we suspended at bus_suspend */
-       if (hcd->speed >= HCD_USB3)
+       if (hcd->speed >= HCD_USB3) {
                next_state = XDEV_U0;
-       else
+       } else {
                next_state = XDEV_RESUME;
-
+               if (bus_state->bus_suspended) {
+                       /* delay the irqs if we need to resume usb2 ports */
+                       temp = readl(&xhci->op_regs->command);
+                       temp &= ~CMD_EIE;
+                       writel(temp, &xhci->op_regs->command);
+                       disabled_irq = true;
+               }
+       }
        port_index = max_ports;
        while (port_index--) {
                portsc = readl(ports[port_index]->addr);
@@ -1967,10 +1970,12 @@ int xhci_bus_resume(struct usb_hcd *hcd)
        bus_state->next_statechange = jiffies + msecs_to_jiffies(5);
        /* re-enable irqs */
-       temp = readl(&xhci->op_regs->command);
-       temp |= CMD_EIE;
-       writel(temp, &xhci->op_regs->command);
-       temp = readl(&xhci->op_regs->command);
+       if (disabled_irq) {
+               temp = readl(&xhci->op_regs->command);
+               temp |= CMD_EIE;
+               writel(temp, &xhci->op_regs->command);
+               temp = readl(&xhci->op_regs->command);
+       }
        spin_unlock_irqrestore(&xhci->lock, flags);
        return 0;
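
The decision the diff makes can also be modelled outside the kernel. The sketch below is an illustration only: resume_mask_irqs() and resume_unmask_irqs() are hypothetical names, the USBCMD register is a plain variable, and bus_suspended is treated as a bitmask of suspended ports. CMD_EIE is defined as bit 2 to match the driver's header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_EIE (1u << 2)   /* USBCMD interrupter enable bit */

/*
 * Hypothetical model of the start of the patched resume path: clear
 * CMD_EIE only for a usb2 bus that has ports suspended by bus_suspend.
 * Returns true if interrupts were masked, mirroring disabled_irq.
 */
static bool resume_mask_irqs(bool is_usb3_hcd, uint32_t bus_suspended,
                             uint32_t *usbcmd)
{
    assert(usbcmd != NULL);
    if (is_usb3_hcd || bus_suspended == 0)
        return false;       /* USB3 bus, or nothing to resume: leave irqs on */

    *usbcmd &= ~CMD_EIE;    /* delay the irqs while usb2 ports resume */
    return true;
}

/* Hypothetical model of the end of the resume path: undo only what was done. */
static void resume_unmask_irqs(bool disabled_irq, uint32_t *usbcmd)
{
    assert(usbcmd != NULL);
    if (disabled_irq)
        *usbcmd |= CMD_EIE; /* re-enable irqs */
}
```

With this split, resuming the USB3 roothub never touches CMD_EIE, so it can no longer mask interrupts that the USB2 side needs during enumeration. The usb2 path still masks them while it has suspended ports to resume.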

Thanks
Mathias



