Hi,

I'm working on a very strange and nasty problem related to USB and
system shutdown.

Problem description:
--------------------
We have a Thin Client system based on the AMD eKabini chipset that does
not shut down properly if

- we allow "Power-On via USB" in our BIOS
- we use a kernel that includes this commit:
  2cdbdd49853dfa856082edb0f4c4c0249d9df07
  driver core: correct device's shutdown order

Background:
-----------
Customers in the banking sector use those thin clients, and the machines
are very well hidden inside their desks, so the only way for them to
power the system on is via a keyboard with a power switch. (Note: the
problem occurs no matter what keyboard/mouse is used.)

On the technical side we have the following USB controllers:

00:10.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 01)
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 39)
00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller (rev 39)
00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 39)
00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller (rev 39)

What we did so far to analyze the problem:
------------------------------------------
We prepared a machine and compiled a kernel from Linus' git tree with
HEAD positioned on that commit; we refer to this as the "bad" kernel.
Then we compiled another kernel with HEAD one commit before the critical
commit, and refer to this as the "good" kernel.

A quick test shows that the problem does not occur on the "good" kernel,
but does occur on the "bad" kernel.

The critical commit just changes the order in which devices are shut
down. To understand what is going on, we inserted some more
printk/dev_info calls into the function device_shutdown in
drivers/base/core.c. Then we captured the kernel output via a serial
null-modem cable.

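For anyone who wants to compare such captures, here is a minimal Python
sketch that reduces a log to the order in which the USB host controllers
are visited. It is only an illustration under some assumptions: each
captured line has the form "<driver> <device>: <message>", our debug
marker is the literal text "rkoenig handling", and host controller
drivers can be recognized by a "-pci" or "_hcd" name suffix. The function
name controller_order is made up for this example.

```python
import re

# Assumption: captured lines look like "<driver> <device>: <message>".
LINE_RE = re.compile(r"^(?P<driver>\S+) (?P<device>\S+): (?P<msg>.+)$")

# Assumption: host controller drivers end in "-pci" or "_hcd"
# (ehci-pci, ohci-pci, xhci_hcd in our logs).
HCD_SUFFIXES = ("-pci", "_hcd")


def controller_order(log_text, marker="rkoenig handling"):
    """Return (driver, device) pairs in the order the marker was logged."""
    order = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        if match.group("msg") != marker:
            continue
        if match.group("driver").endswith(HCD_SUFFIXES):
            order.append((match.group("driver"), match.group("device")))
    return order


sample = """\
usb usb1: rkoenig handling
ehci-pci 0000:00:13.2: rkoenig handling
ehci-pci 0000:00:13.2: shutdown
ohci-pci 0000:00:13.0: rkoenig handling
ohci-pci 0000:00:13.0: shutdown
"""

for driver, device in controller_order(sample):
    print(driver, device)
```

Running the same function over the captures from two kernels and
comparing the resulting lists shows at a glance where the controller
order differs.
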
In the case of the "good" kernel, the USB-related output shows the
following shutdown order (note that "rkoenig handling" is just a debug
statement that shows which device is currently being processed while
going through the list in device_shutdown).

usbhid 5-2:1.0: rkoenig handling
usb 5-2: rkoenig handling
usbhid 5-1:1.1: rkoenig handling
usbhid 5-1:1.0: rkoenig handling
usb 5-1: rkoenig handling
usb usb6-port4: rkoenig handling
usb usb6-port3: rkoenig handling
usb usb6-port2: rkoenig handling
usb usb6-port1: rkoenig handling
usb usb6: rkoenig handling
usb usb5-port4: rkoenig handling
usb usb5-port3: rkoenig handling
usb usb5-port2: rkoenig handling
usb usb5-port1: rkoenig handling
usb usb5: rkoenig handling
usb usb4-port4: rkoenig handling
usb usb4-port3: rkoenig handling
usb usb4-port2: rkoenig handling
usb usb4-port1: rkoenig handling
usb usb4: rkoenig handling
usb usb3-port4: rkoenig handling
usb usb3-port3: rkoenig handling
usb usb3-port2: rkoenig handling
usb usb3-port1: rkoenig handling
usb usb3: rkoenig handling
usb usb2-port2: rkoenig handling
usb usb2-port1: rkoenig handling
usb usb2: rkoenig handling
usb usb1-port2: rkoenig handling
usb usb1-port1: rkoenig handling
usb usb1: rkoenig handling
ehci-pci 0000:00:13.2: rkoenig handling
ehci-pci 0000:00:13.2: shutdown
ohci-pci 0000:00:13.0: rkoenig handling
ohci-pci 0000:00:13.0: shutdown
ehci-pci 0000:00:12.2: rkoenig handling
ehci-pci 0000:00:12.2: shutdown
usb 5-2: USB disconnect, device number 3
ohci-pci 0000:00:12.0: rkoenig handling
ohci-pci 0000:00:12.0: shutdown
ahci 0000:00:11.0: rkoenig handling
xhci_hcd 0000:00:10.0: rkoenig handling
xhci_hcd 0000:00:10.0: shutdown
usb 5-2: new low-speed USB device number 4 using ohci-pci
usb 5-2: new low-speed USB device number 5 using ohci-pci
usb 5-2: new low-speed USB device number 6 using ohci-pci

What is interesting in this case is that we see a "USB disconnect"
message for device number 3.
Even stranger are the last 3 lines, which show that new low-speed USB
devices are found even after all USB controllers have been shut down.

Now we look at the "bad" kernel:

usbhid 5-2:1.0: rkoenig handling
usbhid 5-1:1.1: rkoenig handling
usbhid 5-1:1.0: rkoenig handling
ahci 0000:00:11.0: rkoenig handling
ahci 0000:00:11.0: shutdown
usb 5-2: rkoenig handling
usb 5-1: rkoenig handling
usb usb6-port4: rkoenig handling
usb usb6-port3: rkoenig handling
usb usb6-port2: rkoenig handling
usb usb6-port1: rkoenig handling
usb usb6: rkoenig handling
ohci-pci 0000:00:13.0: rkoenig handling
ohci-pci 0000:00:13.0: shutdown
usb usb5-port4: rkoenig handling
usb usb5-port3: rkoenig handling
usb usb5-port2: rkoenig handling
usb usb5-port1: rkoenig handling
usb usb5: rkoenig handling
ohci-pci 0000:00:12.0: rkoenig handling
ohci-pci 0000:00:12.0: shutdown
usb usb4-port4: rkoenig handling
usb usb4-port3: rkoenig handling
usb usb4-port2: rkoenig handling
usb usb4-port1: rkoenig handling
usb usb4: rkoenig handling
ehci-pci 0000:00:13.2: rkoenig handling
ehci-pci 0000:00:13.2: shutdown
usb usb3-port4: rkoenig handling
usb usb3-port3: rkoenig handling
usb usb3-port2: rkoenig handling
usb usb3-port1: rkoenig handling
usb usb3: rkoenig handling
ehci-pci 0000:00:12.2: rkoenig handling
ehci-pci 0000:00:12.2: shutdown
usb usb2-port2: rkoenig handling
usb usb2-port1: rkoenig handling
usb usb2: rkoenig handling
usb usb1-port2: rkoenig handling
usb usb1-port1: rkoenig handling
usb usb1: rkoenig handling
xhci_hcd 0000:00:10.0: rkoenig handling
xhci_hcd 0000:00:10.0: shutdown

In this case the shutdown order looks entirely logical. We shut down the
USB HID devices, then the ports, then the controller that the ports are
connected to.

The big difference is that in the "good" case we shut down ehci-pci
before ohci-pci, while in the "bad" case we shut down ohci-pci first and
then ehci-pci (which seems logical, because "modinfo ohci-pci" lists a
"softdep: pre: ehci_pci").

Nevertheless, with the "bad" kernel the system does not shut down but
instantly powers on again after power has been off for a fraction of a
second.

We also attached a USB analyzer to the system to see what is going on.
In the "bad" case we actually see a "resume" on the USB bus when the
machine is shut down. The problem is that we cannot see *who* initiated
this resume, but my own guess is that it comes from the host controller
and not from any HID device.

Additional notes:
-----------------
We only see this problem on the AMD eKabini chipset. Other machines with
a similar "USB power on" feature in the BIOS do not show this problem.

We also tried the latest 4.11 release kernel, and here things get a bit
better. The affected system has 4 USB sockets on the back, and with
kernel 4.11 we have 2 sockets that work (meaning the system shuts off
and stays off) and 2 that still show the problem.

This message is a follow-up to this message on the linux-usb list that
was posted by our partners:
http://marc.info/?l=linux-usb&m=148828103627597

Reverse engineering with systems from other vendors that are based on
the same chipset indicates that they also suffer from this issue: we
found that none of those systems will go to the S3 sleep state; they
instantly wake up again (which looks very much like a similar problem).

Questions:
----------
- What could be the root cause of this?
- How can we find out which further commits have made the situation
  better in 4.11?

Any hints are welcome.

Best regards
Rainer
-- 
Dipl.-Inf. (FH) Rainer Koenig
Project Manager Linux Clients
FJ EMEIA PR PSO PM&D CCD ENG SW OSS&C

Fujitsu Technology Solutions
Bürgermeister-Ullrich-Str. 100
86199 Augsburg
Germany

Telephone: +49-821-804-3321
Telefax:   +49-821-804-2131
Mail:      mailto:Rainer.Koenig@xxxxxxxxxxxxxx

Internet         ts.fujitsu.com
Company Details  ts.fujitsu.com/imprint.html