Re: regression: Wrong device reset in port_event()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 08-01-15 16:22, Mathias Nyman wrote:
On 30.12.2014 08:54, Zhuang Jin Can wrote:
Pasting the log to better explain the issue.

<7>[  589.157489] usb 2-5: usb auto-suspend, wakeup 1
<7>[  589.168496] xhci_hcd 0000:00:14.0: set port remote wake mask, actual port 0 status  = 0xe0002a0
<7>[  589.168543] xhci_hcd 0000:00:14.0: set port remote wake mask, actual port 1 status  = 0xe0002a0
<7>[  589.168574] xhci_hcd 0000:00:14.0: set port remote wake mask, actual port 2 status  = 0xe0002a0
<7>[  589.168600] xhci_hcd 0000:00:14.0: set port remote wake mask, actual port 3 status  = 0xe0002a0
<7>[  589.168625] xhci_hcd 0000:00:14.0: set port remote wake mask, actual port 4 status  = 0xe001263
<7>[  589.168650] xhci_hcd 0000:00:14.0: set port remote wake mask, actual port 5 status  = 0xe0002a0
<7>[  589.168666] hub 2-0:1.0: hub_suspend
<7>[  589.168686] usb usb2: bus auto-suspend, wakeup 1
<7>[  589.168715] xhci_hcd 0000:00:14.0: xhci_hub_status_data: stopping port polling.
<7>[  589.336396] xhci_hcd 0000:00:14.0: Port Status Change Event for port 12
		<<--- Port 2-5 remote wakeup begins
<7>[  589.336407] xhci_hcd 0000:00:14.0: resume root hub
<7>[  589.336420] xhci_hcd 0000:00:14.0: port resume event for port 12
<7>[  589.336425] xhci_hcd 0000:00:14.0: remote wake SS port 12
		<<--- link state if set to U0.
<7>[  589.336457] usb usb2: usb wakeup-resume
<7>[  589.336465] usb usb2: usb auto-resume
<7>[  589.336497] hub 2-0:1.0: hub_resume
<7>[  589.336514] xhci_hcd 0000:00:14.0: get port status, actual port 0 status  = 0x2a0
<7>[  589.336519] xhci_hcd 0000:00:14.0: Get port status returned 0x2a0
<7>[  589.336565] xhci_hcd 0000:00:14.0: get port status, actual port 1 status  = 0x2a0
<7>[  589.336570] xhci_hcd 0000:00:14.0: Get port status returned 0x2a0
<7>[  589.336592] xhci_hcd 0000:00:14.0: get port status, actual port 2 status  = 0x2a0
<7>[  589.336596] xhci_hcd 0000:00:14.0: Get port status returned 0x2a0
<7>[  589.336614] xhci_hcd 0000:00:14.0: get port status, actual port 3 status  = 0x2a0
<7>[  589.336619] xhci_hcd 0000:00:14.0: Get port status returned 0x2a0
<7>[  589.336637] xhci_hcd 0000:00:14.0: get port status, actual port 4 status  = 0x401203
<7>[  589.336641] xhci_hcd 0000:00:14.0: Get port status returned 0x400203
<7>[  589.336656] hub 2-0:1.0: port 5: status 0203 change 0040
<7>[  589.336665] xhci_hcd 0000:00:14.0: get port status, actual port 5 status  = 0x2a0
<7>[  589.336668] xhci_hcd 0000:00:14.0: Get port status returned 0x2a0
<7>[  589.336701] hub 2-0:1.0: state 7 ports 6 chg 0020 evt 0000
<7>[  589.336712] xhci_hcd 0000:00:14.0: get port status, actual port 4 status  = 0x401203
		<<- port_event() calls hub_port_status() when the port happens to change
		from RESUME to U0 but IRQ of Port change event is not called yet.
<7>[  589.336716] xhci_hcd 0000:00:14.0: Get port status returned 0x400203
<7>[  589.336732]  port5: link state change
<7>[  589.336743] xhci_hcd 0000:00:14.0: clear port link state change, actual port 4 status  = 0x1203
<7>[  589.336771] xhci_hcd 0000:00:14.0: Port Status Change Event for port 12
		<<- port change event irq handling is called.
<7>[  589.336800] xhci_hcd 0000:00:14.0: handle_port_status: starting port polling.
<7>[  589.336824] xhci_hcd 0000:00:14.0: xhci_hub_status_data: stopping port polling.
<7>[  589.347478] usb 2-5: usb wakeup-resume
<7>[  589.347506] xhci_hcd 0000:00:14.0: get port status, actual port 4 status  = 0x1203
<7>[  589.347511] xhci_hcd 0000:00:14.0: Get port status returned 0x203
<7>[  589.347539] usb 2-5: finish resume
<7>[  589.350107] usb 2-5: interface 0 remotewakeup disabled
<7>[  589.350719] usb 2-5: interface 2 remotewakeup disabled
<7>[  589.351257] usb 2-5: interface 4 remotewakeup disabled
<7>[  589.351800] usb 2-5: interface 6 remotewakeup disabled
<7>[  589.352347] usb 2-5: interface 8 remotewakeup disabled
<7>[  589.353011] usb 2-5: interface 10 remotewakeup disabled
<7>[  589.353560] usb 2-5: interface 12 remotewakeup disabled
<7>[  589.353639] xhci_hcd 0000:00:14.0: ep 0x82 - asked for 2048 bytes, 1986 bytes untransferred
<7>[  589.353740] hub 2-0:1.0: resume on port 5, status 0
<7>[  589.353754] port_event: usb_reset_device
		<<- port_event() sees port is U0 && PLC && Superspeed
		and wrongly reset the device

the messege of "port_event: usb_reest_device" added by me in the snippet:
	/*
	 * On disconnect USB3 protocol ports transit from U0 to
	 * SS.Inactive to Rx.Detect. If this happens a warm-
	 * reset is not needed, but a (re)connect may happen
	 * before khubd runs and sees the disconnect, and the
	 * device may be an unknown state.
	 *
	 * If the port went through SS.Inactive without khubd
	 * seeing it the C_LINK_STATE change flag will be set,
	 * and we reset the dev to put it in a known state.
	 */
	if (reset_device || (udev && hub_is_superspeed(hub->hdev)
				&& (portchange & USB_PORT_STAT_C_LINK_STATE)
				&& (portstatus & USB_PORT_STAT_CONNECTION))) {
		usb_unlock_port(port_dev);
		usb_lock_device(udev);
		printk("%s: usb_reset_device\n", __func__);
		usb_reset_device(udev);
		usb_unlock_device(udev);
		usb_lock_port(port_dev);
		connect_change = 0;
	}

I think the gap between port change and the irq is handled always exists,
so it's hard to fix the issue unless we put more strict condtion on the
issue of commit a82b76f7f fixed. Any Idea?

Thanks
Jincan

On Wed, Dec 24, 2014 at 03:12:59PM +0800, Zhuang Jin Can wrote:
Hi Hans de Goede, Sarah, Mathias,

This is about a regression caused by commit a82b76f7f usb: Reset USB-3
devices on USB-3 link bounce.

When we do some stress remote wakeup tests with usb3 internal roothub
autosuspend enabled, it's found that hub thread calling the port_event()
may run before handle_port_status() clears the PLC. Thus, port_event()
hits the condition of PLC && U0 && Superspeed and wrongly reset the
device.

Things happen like this:
1. device initiates the resume
2. xHCI receives the port change, handle_port_status() resume the
roothub.
portsc is in RESUME and set link to U0.
3. hub_active() finds portsc in RESUME state and sets the
hub->change_bits and kick the hub.
4. link change RESUME->U0 happens, but irq is not generated or processed
yet.
5. port_event() finds portsc is 0x401203 (PLC && U0), and reset the
device.
6. irq is generated now for point 4. But it's too late.

The problem is that port_event() falls in the gap between RESUME->U0
transtion (portsc 0x401203) and port change irq is generated and
processed.

I'd say this is a regression of commit a82b76f7 due to the PLC && U0 &&
Superspeed condition can also be met in remote wakeup case.

Do you have a suggestion to fix the issue? Or should we revert this
commit?


Read through this and it seems we really may end up doing a extra reset if
the hub wq happends to to catch the device in middle of a remote wakeup.
(i.e. RESUME -> U0, PLC set, but interrupt handler clearing PLC flag not yet run)

reverting a82b76f7 is one option, but then we get back the issue it fixed, so I'd rather
try to find a fix for both cases before reverting.

Right, note that I wrote a82b76f7 while seriously stress testing stuff on what is
in essence a laptop with broken firmware which causes it to toggle on/off the vcc of
the usb-port while the kernel is resuming (so after the BIOS has transferred control
back). Even with a82b76f7 some devices still become rather unhappy (and rightfully so)
when the firmware does this. As mentioned in the commit one may (theoretically) see
similar problems with a bad / loose connector, but still this fixes a rather rare
corner case, so if it is causing problems in a more common case it might be best to
revert this for now.

It seems there are other issues with usb3 port states as well, Taegil Bae reported
a re-connect issue:

http://marc.info/?l=linux-usb&m=142055403029943&w=2

Looks like I need to really dig into the usb3 port states and figure out exactly how it is
supposed to work

Yes, that does seem like a good idea.

Regards,

Hans
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux