Re: OHCI-PCI: Thin client does not shutdown properly

On Mon, 15 May 2017, Rainer Koenig wrote:

> Hi,
> 
> I'm working on a very strange and bad problem related to USB and system
> shutdown.
> 
> Problem description:
> --------------------
> We have a Thin Client system based on the AMD eKabini-Chipset that
> does not shutdown properly if
> - we allow "Power-On via USB" in our BIOS
> - we use a kernel that includes this commit:
>   2cdbdd49853dfa856082edb0f4c4c0249d9df07
>   driver core: correct device's shutdown order
> 
> Background:
> -----------
> Customers in the banking sector use those thin clients and
> the machines are very well hidden inside their desks, so the
> only way for them to power the system on is via keyboard with
> a power switch. (Note: The problem does occur no matter what
> keyboard/mouse is used).
> 
> On the technical side we have the following USB controllers:
> 00:10.0 USB controller: Advanced Micro Devices,
> 	Inc. [AMD] FCH USB XHCI Controller (rev 01)
> 00:12.0 USB controller: Advanced Micro Devices,
> 	Inc. [AMD] FCH USB OHCI Controller (rev 39)
> 00:12.2 USB controller: Advanced Micro Devices,
>         Inc. [AMD] FCH USB EHCI Controller (rev 39)
> 00:13.0 USB controller: Advanced Micro Devices,
> 	Inc. [AMD] FCH USB OHCI Controller (rev 39)
> 00:13.2 USB controller: Advanced Micro Devices,
> 	Inc. [AMD] FCH USB EHCI Controller (rev 39)
> 
> What we did so far to analyze the problem:
> ------------------------------------------
> We prepared a machine and compiled a kernel from Linus' git tree where
> HEAD is positioned on that commit; we refer to this as the "bad" kernel.
> Then we compiled another kernel where HEAD is one commit before the
> critical commit, and refer to this as the "good" kernel.
> 
> Quick test shows that the problem does not occur on the "good" kernel,
> but occurs on the "bad" kernel.
> 
> The critical commit just changes the order in which devices are
> shut down. To understand what is going on we inserted some more
> printk/dev_info calls into the device_shutdown function in
> drivers/base/core.c. Then we captured the kernel output over a serial
> null-modem cable.
> 
> In case of the "good" kernel the USB-related output shows the following
> shutdown order (note that "rkoenig handling" is just a debug statement
> that shows which device is currently being processed while going through
> the list in device_shutdown):
> 
> usbhid 5-2:1.0: rkoenig handling
> usb 5-2: rkoenig handling
> usbhid 5-1:1.1: rkoenig handling
> usbhid 5-1:1.0: rkoenig handling
> usb 5-1: rkoenig handling
> usb usb6-port4: rkoenig handling
> usb usb6-port3: rkoenig handling
> usb usb6-port2: rkoenig handling
> usb usb6-port1: rkoenig handling
> usb usb6: rkoenig handling
> usb usb5-port4: rkoenig handling
> usb usb5-port3: rkoenig handling
> usb usb5-port2: rkoenig handling
> usb usb5-port1: rkoenig handling
> usb usb5: rkoenig handling
> usb usb4-port4: rkoenig handling
> usb usb4-port3: rkoenig handling
> usb usb4-port2: rkoenig handling
> usb usb4-port1: rkoenig handling
> usb usb4: rkoenig handling
> usb usb3-port4: rkoenig handling
> usb usb3-port3: rkoenig handling
> usb usb3-port2: rkoenig handling
> usb usb3-port1: rkoenig handling
> usb usb3: rkoenig handling
> usb usb2-port2: rkoenig handling
> usb usb2-port1: rkoenig handling
> usb usb2: rkoenig handling
> usb usb1-port2: rkoenig handling
> usb usb1-port1: rkoenig handling
> usb usb1: rkoenig handling
> ehci-pci 0000:00:13.2: rkoenig handling
> ehci-pci 0000:00:13.2: shutdown
> ohci-pci 0000:00:13.0: rkoenig handling
> ohci-pci 0000:00:13.0: shutdown
> ehci-pci 0000:00:12.2: rkoenig handling
> ehci-pci 0000:00:12.2: shutdown
> usb 5-2: USB disconnect, device number 3
> ohci-pci 0000:00:12.0: rkoenig handling
> ohci-pci 0000:00:12.0: shutdown
> ahci 0000:00:11.0: rkoenig handling
> xhci_hcd 0000:00:10.0: rkoenig handling
> xhci_hcd 0000:00:10.0: shutdown
> usb 5-2: new low-speed USB device number 4 using ohci-pci
> usb 5-2: new low-speed USB device number 5 using ohci-pci
> usb 5-2: new low-speed USB device number 6 using ohci-pci
> 
> Interesting in this case is that we see a "USB disconnect" message
> for device number 3. Even stranger are the last 3 lines, which show
> that new low-speed USB devices are found even after all USB
> controllers have been shut down.

The shutdown routine for ohci-hcd turns off all of the controller's
autonomous functionality, but it doesn't stop the kernel from polling
the controller for port-status changes.  It seems likely that these
status changes are what give rise to those "new device" messages.

> Now we look at the "bad" kernel:
> 
> usbhid 5-2:1.0: rkoenig handling
> usbhid 5-1:1.1: rkoenig handling
> usbhid 5-1:1.0: rkoenig handling
> ahci 0000:00:11.0: rkoenig handling
> ahci 0000:00:11.0: shutdown
> usb 5-2: rkoenig handling
> usb 5-1: rkoenig handling
> usb usb6-port4: rkoenig handling
> usb usb6-port3: rkoenig handling
> usb usb6-port2: rkoenig handling
> usb usb6-port1: rkoenig handling
> usb usb6: rkoenig handling
> ohci-pci 0000:00:13.0: rkoenig handling
> ohci-pci 0000:00:13.0: shutdown
> usb usb5-port4: rkoenig handling
> usb usb5-port3: rkoenig handling
> usb usb5-port2: rkoenig handling
> usb usb5-port1: rkoenig handling
> usb usb5: rkoenig handling
> ohci-pci 0000:00:12.0: rkoenig handling
> ohci-pci 0000:00:12.0: shutdown
> usb usb4-port4: rkoenig handling
> usb usb4-port3: rkoenig handling
> usb usb4-port2: rkoenig handling
> usb usb4-port1: rkoenig handling
> usb usb4: rkoenig handling
> ehci-pci 0000:00:13.2: rkoenig handling
> ehci-pci 0000:00:13.2: shutdown
> usb usb3-port4: rkoenig handling
> usb usb3-port3: rkoenig handling
> usb usb3-port2: rkoenig handling
> usb usb3-port1: rkoenig handling
> usb usb3: rkoenig handling
> ehci-pci 0000:00:12.2: rkoenig handling
> ehci-pci 0000:00:12.2: shutdown
> usb usb2-port2: rkoenig handling
> usb usb2-port1: rkoenig handling
> usb usb2: rkoenig handling
> usb usb1-port2: rkoenig handling
> usb usb1-port1: rkoenig handling
> usb usb1: rkoenig handling
> xhci_hcd 0000:00:10.0: rkoenig handling
> xhci_hcd 0000:00:10.0: shutdown
> 
> In this case the shutdown order seems logically correct.
> We shut down the USB HID devices, then the ports, then the controller
> that is connected to the ports.
> 
> The big difference is that in the "good" case we shut down ehci-pci
> before ohci-pci, while in the "bad" case we shut down ohci-pci first and
> then ehci-pci (which seems logical because "modinfo ohci-pci" lists
> a "softdep: pre: ehci_pci").
> 
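
One way to compare the two captures is to pull out just the
host-controller shutdown lines.  A minimal sketch, assuming each capture
was saved to a text file and that the log lines look like the ones quoted
above (the function name and the file names are made up for
illustration):

```shell
# Print the order in which the USB host controllers were shut down,
# based on the "<driver> <PCI address>: shutdown" lines in a captured log.
# Reads the named files, or standard input if none are given.
hcd_shutdown_order() {
    grep -ohE '(ohci-pci|ehci-pci|xhci_hcd) [0-9a-f:.]+: shutdown' "$@" |
        sed 's/: shutdown$//'
}

# Hypothetical usage, with good.log and bad.log as the two serial captures:
#   hcd_shutdown_order good.log > good.order
#   hcd_shutdown_order bad.log  > bad.order
#   diff good.order bad.order
```
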
> Nevertheless in case of the "bad" kernel the system does not shutdown
> but instantly powers on again after power is off for a fraction of a
> second.
> 
> We also attached an USB analyzer to the system to see what is going on.
> In the "bad" case we actually see a "resume" on the USB bus when the
> machine is shutdown. Problem is that we cannot see *who* initiated this
> resume, but my own guess is that it comes from the host controller and
> not from any HID device.

The host controller is not supposed to initiate a resume signal unless
the computer tells it to.  It's possible that the kernel is doing this
-- but it's also possible that the BIOS is.  In fact, I would expect 
the BIOS to do this any time it decided to restart the system.

(And of course, the resume signal could be coming from an attached 
device.  However, that wouldn't explain why you don't see the signal 
when you run the "good" kernel...)

> Additional notes:
> -----------------
> We only see this problem on the AMD eKabini chipset. Other machines
> with a similar "USB power on" feature in the BIOS do not show this
> problem.
> 
> We also tried the latest 4.11 release kernel and here things get
> a bit better. The affected system has 4 USB sockets on the back, and
> with kernel 4.11 we have 2 sockets that work (means system shuts off
> and remains off) and 2 that still show the problem.
> 
> This message is a follow up for this message on the linux-usb
> list that was issued by our partners:
> http://marc.info/?l=linux-usb&m=148828103627597

CC'ed.

> Reverse engineering with systems from other vendors that are based
> on the same chipset indicates that they also suffer from this issue:
> we found that none of those systems will stay in the S3 sleep state;
> they instantly wake up again (which looks like a similar problem).
> 
> Questions:
> ----------
> - What could be the root cause for this?

It's very hard to say.  I'm inclined to blame the BIOS, but the truth 
is that testing and debugging a kernel while it is shutting down (and 
afterward!) are quite difficult.

> - How can we find out, what further commits have made the situation
>   better in 4.11?

You can always use git bisect to do this.
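
Since you are looking for the commit that improved things rather than
one that broke them, the usual good/bad labels are backwards.  A rough
sketch of how that could be set up, assuming a git new enough to support
custom bisect terms (2.7 or later, as far as I know).  The function name
is made up, and the endpoints are whatever revisions you actually tested
(for example your "bad" kernel and v4.11):

```shell
# Start a bisection that looks for the first commit where behaviour improved.
#   $1 = older revision that still shows the problem
#   $2 = newer revision where the problem is (partly) gone
start_fix_bisect() {
    git bisect start --term-old=broken --term-new=fixed &&
    git bisect broken "$1" &&
    git bisect fixed "$2"
}

# After that, build and boot each kernel git checks out, then report:
#   git bisect fixed     # behaves like the newer kernel
#   git bisect broken    # still powers back on
# and finish with "git bisect reset".
```

Because 4.11 only fixes two of the four sockets, pick one socket for the
test beforehand and use the same one for every step.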

> Any hints are welcome.

You should try doing an rmmod (or unbind) of ehci-pci or ohci-pci or
both before shutting down.  Maybe the presence or absence of one of the
drivers will matter.  (Note that after you rmmod or unbind ohci-pci, a
USB keyboard will become unusable -- you will have to start the
shutdown beforehand or over a network login.)
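
For the unbind variant, something along these lines should do.  This is
only a sketch: the function name is made up, and SYSFS_ROOT exists just
so it can be tried against a scratch directory.  The PCI addresses are
the ones from your controller list.  It has to run as root.

```shell
# Detach one PCI device from its driver through sysfs.
#   $1 = driver name as it appears under /sys/bus/pci/drivers
#   $2 = full PCI address of the device
unbind_pci() {
    printf '%s' "$2" > "${SYSFS_ROOT:-/sys}/bus/pci/drivers/$1/unbind"
}

# Hypothetical test run, started over a network login since the
# keyboard stops working once ohci-pci is unbound:
#   unbind_pci ohci-pci 0000:00:12.0
#   unbind_pci ohci-pci 0000:00:13.0
#   poweroff
```
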

Also, it would be interesting to know whether the patch below has any 
effect.  Even if that effect is just to change the log messages you 
record with the good or bad kernel.

Alan Stern



Index: usb-4.x/drivers/usb/core/driver.c
===================================================================
--- usb-4.x.orig/drivers/usb/core/driver.c
+++ usb-4.x/drivers/usb/core/driver.c
@@ -1889,8 +1889,30 @@ int usb_set_usb2_hardware_lpm(struct usb
 
 #endif /* CONFIG_PM */
 
+/**
+ * usb_dev_shutdown - stop using a USB device when the system shuts down
+ * @dev: device to stop using
+ *
+ * Called by the device core at the start of a system shutdown.
+ * Don't delay the shutdown by taking any mutexes or changing the
+ * device's configuration; just mark its state as NOTATTACHED.
+ * This will prevent any more URBs from being submitted.
+ */
+static void usb_dev_shutdown(struct device *dev)
+{
+	struct usb_device *udev;
+
+	/* Interfaces also sit on the USB bus; only handle usb_device entries */
+	if (!is_usb_device(dev))
+		return;
+
+	udev = to_usb_device(dev);
+	usb_set_device_state(udev, USB_STATE_NOTATTACHED);
+}
+
 struct bus_type usb_bus_type = {
 	.name =		"usb",
 	.match =	usb_device_match,
 	.uevent =	usb_uevent,
+	.shutdown =	usb_dev_shutdown,
 };
