On Tue, 16 May 2017, Rainer Koenig wrote:

> >> We also attached a USB analyzer to the system to see what is going on.
> >> In the "bad" case we actually see a "resume" on the USB bus when the
> >> machine is shut down. Problem is that we cannot see *who* initiated this
> >> resume, but my own guess is that it comes from the host controller and
> >> not from any HID device.
> >
> > The host controller is not supposed to initiate a resume signal unless
> > the computer tells it to.  It's possible that the kernel is doing this
> > -- but it's also possible that the BIOS is.  In fact, I would expect
> > the BIOS to do this any time it decided to restart the system.
>
> Well, when we did the analysis the BIOS developer was involved, it's a
> colleague that is located in the same building at our site. And BIOS
> says they're innocent. ;-)

You've got a BIOS developer in the same building?  That's a great
resource!  Maybe together you can find out what condition is causing
the BIOS to initiate a reboot.  For example, exactly what does
"Power-On via USB" in the BIOS do?

> >> Any hints are welcome.
> >
> > You should try doing an rmmod (or unbind) of ehci-pci or ohci-pci or
> > both before shutting down.  Maybe the presence or absence of one of the
> > drivers will matter.  (Note that after you rmmod or unbind ohci-pci, a
> > USB keyboard will become unusable -- you will have to start the
> > shutdown beforehand or over a network login.)
> >
> > Also, it would be interesting to know whether the patch below has any
> > effect.  Even if that effect is just to change the log messages you
> > record with the good or bad kernel.
> >
> > Index: usb-4.x/drivers/usb/core/driver.c
> > ===================================================================
> > --- usb-4.x.orig/drivers/usb/core/driver.c
> > +++ usb-4.x/drivers/usb/core/driver.c
> > @@ -1889,8 +1889,26 @@ int usb_set_usb2_hardware_lpm(struct usb
> >  
> >  #endif /* CONFIG_PM */
> >  
> > +/**
> > + * usb_dev_shutdown - stop using a USB device when the system shuts down
> > + * @dev: device to stop using
> > + *
> > + * Called by the device core at the start of a system shutdown.
> > + * Don't delay the shutdown by taking any mutexes or changing the
> > + * device's configuration; just mark its state as NOTATTACHED.
> > + * This will prevent any more URBs from being submitted.
> > + */
> > +static void usb_dev_shutdown(struct device *dev)
> > +{
> > +	struct usb_device	*udev;
> > +
> > +	udev = to_usb_device(dev);
> > +	usb_set_device_state(udev, USB_STATE_NOTATTACHED);
> > +}
> > +
> >  struct bus_type usb_bus_type = {
> >  	.name =		"usb",
> >  	.match =	usb_device_match,
> >  	.uevent =	usb_uevent,
> > +	.shutdown =	usb_dev_shutdown,
> >  };
>
> Ok. Tried the patch first. Doesn't work with the bad kernel, but the
> logs slightly change. Now those devices that didn't have a shutdown
> callback before now have one, but this does not solve the problem.

I didn't expect the patch to solve the problem.  Nevertheless, I would
like to know exactly what effect it has on both kernels.  Can you
provide more details?

> Next thing I tried was the unbind approach. Since ehci and ohci were
> compiled into the kernel I tried to unbind every USB device I found
> under /sys/bus/usb/drivers/, but even with everything gone there the
> machine doesn't shut down at the end.

You should have unbound the controllers, not the devices.  That is, you
should have unbound PCI devices 0000:00:12.0 and 0000:00:13.0 from
ohci-pci (in /sys/bus/pci/drivers/ohci-pci), and 0000:00:12.2 and
0000:00:13.2 from ehci-pci (in /sys/bus/pci/drivers/ehci-pci).
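
As a rough sketch of what I mean (untested on your hardware): the helper
name unbind_controller and the SYSFS_ROOT variable below are made up for
illustration, and the PCI addresses are the ones mentioned above.  The
helper follows each device's "driver" link in sysfs, so it doesn't need
to know the driver's directory name.

```shell
#!/bin/sh
# Sketch only: unbind a USB host controller (a PCI device) from its driver.
# SYSFS_ROOT is an illustrative override so the logic can be exercised
# against a scratch directory; on a real system it defaults to /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# unbind_controller is a hypothetical helper name, not an existing tool.
# Usage: unbind_controller <pci-address>, e.g. 0000:00:12.2
unbind_controller() {
    dev="$1"
    # Each bound PCI device has a "driver" link pointing at its driver's
    # sysfs directory, which contains the "unbind" attribute.
    drv="$SYSFS_ROOT/bus/pci/devices/$dev/driver"
    if [ -e "$drv/unbind" ]; then
        # Writing the device's address to "unbind" detaches the driver.
        printf '%s' "$dev" > "$drv/unbind"
    else
        echo "$dev: no driver bound" >&2
        return 1
    fi
}
```

Run as root, something like "unbind_controller 0000:00:12.2" followed by
"unbind_controller 0000:00:13.2" would detach the two EHCI controllers.
Doing the same for 0000:00:12.0 and 0000:00:13.0 detaches the OHCI
controllers, after which a USB keyboard will stop working.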

> Next approach was that I changed the kernel config so that ehci and ohci
> are modules instead of being compiled into the kernel. Then I booted the
> "bad" kernel and did
>
> rmmod ehci-pci
> rmmod ehci-hcd

That works too.

> The keyboard/mouse still continued to work on my system (which btw is

Are they connected over USB?  If they are, removing ehci-pci won't make
any difference.  But without ohci-pci, they won't work -- unless they
are plugged into a USB-3 port.

> running Ubuntu 16.04 for these tests). But now it's getting strange:
>
> - if I shutdown the system at this point with "init 0" from a root shell
>   it performs a shutdown, and it turns off! Yeah.
>
> - if I shutdown the system at this point by using the shutdown menu from
>   the Ubuntu menu then the shutdown ends up in a kernel panic.

Don't you get any information about the panic on your serial console?
I would expect it to have a stack dump, at least.  A panic means
there's a bug, and it needs to be fixed.

> Both results are reproducible. "init 0" shuts the system down and keeps
> it off, shutdown from menu crashes.
>
> Since keyboard/mouse are still functional without the ehci stuff I tried
> to blacklist them by putting a blacklist-ehci.conf file into
> /etc/modprobe.d/ that had 2 lines:
> blacklist ehci_pci
> blacklist ehci_hcd
>
> I also rebuilt the initrd image, but I really couldn't get rid of those
> modules, after every new start lsmod still showed the ehci modules
> despite the blacklist entries.

You probably have to tell the program that creates the initrd image to
blacklist them or leave them out entirely.  I don't know how to do this
for Ubuntu.

> Next step was disabling ehci support in the kernel config. Rebuilding
> everything and now I have a bad kernel without ehci support that boots
> up, is able to handle keyboard and mouse, and when I shutdown the system
> (even from the menu) it shuts down and keeps off. So now it seems to
> behave like the "good" kernel.

Therefore it appears that the problem is somehow caused by the
operation of shutting down the EHCI controller.  Perhaps it interrupts
the connections to the OHCI controller briefly, in a way that leads the
BIOS to believe that a "Power-On via USB" event has occurred.

> So at least we would have a workaround, but I would really prefer that I
> can blacklist those modules because then our partner could build his own
> kernel for the thin client system in the usual way and a "workaround"
> could be disabling the ehci stuff from loading.

Another possibility is to unbind ehci-pci just before shutting down,
for example as part of a shutdown script.

> Makes me really wonder if something is wrong with the ehci part of the
> hardware on that machine. Well, we also shipped one system to AMD for
> further analysis, maybe they can explain this strange behaviour.
>
> Thanks a lot for your input, it was really helpful.

Let me know what you find out.

Alan Stern