Re: OHCI-PCI: Thin client does not shutdown properly

On 15.05.2017 at 20:56, Alan Stern wrote:
>> Interesting in this case is that we see a "USB disconnect" message
>> for device number 3. And even more strange are the last 3 lines
>> that show that new low-speed USB devices are found even after all
>> USB controllers are shut down.
> 
> The shutdown routine for ohci-hcd turns off all of the controller's
> autonomous functionality, but it doesn't stop the kernel from polling
> the controller for port-status changes.  It seems likely that these
> status changes are what give rise to those "new device" messages.

Ok, understood.

>> We also attached a USB analyzer to the system to see what is going on.
>> In the "bad" case we actually see a "resume" on the USB bus when the
>> machine is shut down. Problem is that we cannot see *who* initiated this
>> resume, but my own guess is that it comes from the host controller and
>> not from any HID device.
> 
> The host controller is not supposed to initiate a resume signal unless
> the computer tells it to.  It's possible that the kernel is doing this
> -- but it's also possible that the BIOS is.  In fact, I would expect 
> the BIOS to do this any time it decided to restart the system.

Well, when we did the analysis the BIOS developer was involved; he is a
colleague located in the same building at our site. And the BIOS team
says they're innocent. ;-)

> (And of course, the resume signal could be coming from an attached 
> device.  However, that wouldn't explain why you don't see the signal 
> when you run the "good" kernel...)

That is why I assumed that it comes from the controller itself,
otherwise I couldn't explain why it works in the "good" case.

>> - What could be the root cause for this?
> 
> It's very hard to say.  I'm inclined to blame the BIOS, but the truth 
> is that testing and debugging a kernel while it is shutting down (and 
> afterward!) are quite difficult.

Yes, I have already experienced that. At least I capture the serial
console output, and the last lines always say:

ACPI: Preparing to enter system sleep state S5
PM: Calling mce_syscore_shutdown+0x0/0x10
PM: Calling ledtrig_cpu_syscore_shutdown+0x0/0x20
PM: Calling irq_gc_shutdown+0x0/0x60
PM: Calling i8259A_shutdown+0x0/0x20
PM: Calling cpufreq_suspend+0x0/0x110
reboot: Power down
acpi_power_off called

So I assume I got everything of interest in my capture file.
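
When many test boots have to be compared, it can help to check each capture file mechanically for that final line. The following is only a minimal sketch: the helper name and the 20-line window are assumptions, and the only string it relies on is the "acpi_power_off called" line from the log above.

```shell
#!/bin/sh
# Report whether a serial capture reached the final power-off call.
# Hypothetical helper: the name and the 20-line window are assumptions.
capture_reached_poweroff() {
    capture="$1"   # path to a serial console capture file
    if tail -n 20 "$capture" | grep -q 'acpi_power_off called'; then
        echo "reached acpi_power_off"
    else
        echo "stopped before acpi_power_off"
        return 1
    fi
}

# Usage (the file name is an example only):
#   capture_reached_poweroff serial-capture.log
```

A capture that ends with this line only shows that the kernel got as far as asking ACPI to power off; it says nothing about what the hardware did afterwards.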

>> - How can we find out, what further commits have made the situation
>>   better in 4.11?
> 
> You can always use git bisect to do this.

I'll have a look at this.
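
Because the goal here is to find the commit that made things better rather than the one that broke them, the usual good/bad labels are inverted. Git's alternative "old"/"new" terms fit this case. The sketch below assumes a kernel git tree as the working directory; the function name is hypothetical and both revisions are placeholders, since only 4.11 is known to behave well.

```shell
#!/bin/sh
# Start a bisect that hunts for the commit that FIXED the problem.
# "old" = still shows the shutdown failure, "new" = shuts down cleanly.
# Both revision arguments are placeholders supplied by the caller.
start_fix_bisect() {
    old_rev="$1"   # a revision known to show the failure
    new_rev="$2"   # a revision known to power off correctly, e.g. v4.11
    git bisect start &&
    git bisect old "$old_rev" &&
    git bisect new "$new_rev"
}

# Usage inside a kernel tree (revisions are examples only):
#   start_fix_bisect <failing-revision> v4.11
#   ... build, boot, test the shutdown, then mark the result:
#   git bisect old     # this build still fails to power off
#   git bisect new     # this build powers off correctly
#   git bisect reset   # when finished
```

Each step needs a manual build, boot and shutdown test, so this cannot be handed to "git bisect run" as it stands.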

>> Any hints are welcome.
> 
> You should try doing an rmmod (or unbind) of ehci-pci or ohci-pci or
> both before shutting down.  Maybe the presence or absence of one of the
> drivers will matter.  (Note that after you rmmod or unbind ohci-pci, a
> USB keyboard will become unusable -- you will have to start the
> shutdown beforehand or over a network login.)


> Also, it would be interesting to know whether the patch below has any 
> effect.  Even if that effect is just to change the log messages you 
> record with the good or bad kernel.
> 
> Index: usb-4.x/drivers/usb/core/driver.c
> ===================================================================
> --- usb-4.x.orig/drivers/usb/core/driver.c
> +++ usb-4.x/drivers/usb/core/driver.c
> @@ -1889,8 +1889,26 @@ int usb_set_usb2_hardware_lpm(struct usb
>  
>  #endif /* CONFIG_PM */
>  
> +/**
> + * usb_dev_shutdown - stop using a USB device when the system shuts down
> + * @dev: device to stop using
> + *
> + * Called by the device core at the start of a system shutdown.
> + * Don't delay the shutdown by taking any mutexes or changing the
> + * device's configuration; just mark its state as NOTATTACHED.
> + * This will prevent any more URBs from being submitted.
> + */
> +static void usb_dev_shutdown(struct device *dev)
> +{
> +	struct usb_device *udev;
> +
> +	udev = to_usb_device(dev);
> +	usb_set_device_state(udev, USB_STATE_NOTATTACHED);
> +}
> +
>  struct bus_type usb_bus_type = {
>  	.name =		"usb",
>  	.match =	usb_device_match,
>  	.uevent =	usb_uevent,
> +	.shutdown =	usb_dev_shutdown,
>  };

Ok. I tried the patch first. It doesn't work with the bad kernel, but the
logs change slightly. The devices that didn't have a shutdown callback
before now have one, but this does not solve the problem.

The next thing I tried was the unbind approach. Since ehci and ohci were
compiled into the kernel, I tried to unbind every USB device I found
under /sys/bus/usb/drivers/, but even with everything gone there the
machine doesn't shut down at the end.
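
One thing worth noting: the entries under /sys/bus/usb/drivers/ are the drivers for USB devices and interfaces, while ehci-pci and ohci-pci are PCI drivers and appear under /sys/bus/pci/drivers/. Unbinding the host controllers themselves, as suggested, would look roughly like the sketch below. The function name and the SYSFS_ROOT override are assumptions added so the loop can be tried against a scratch directory; on a real system it needs root and the default /sys.

```shell
#!/bin/sh
# Unbind every PCI device currently bound to the given PCI driver.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
unbind_pci_driver() {
    driver="$1"
    root="${SYSFS_ROOT:-/sys}"
    drvdir="$root/bus/pci/drivers/$driver"
    [ -d "$drvdir" ] || { echo "driver $driver not present" >&2; return 1; }
    # Bound devices show up as entries named by PCI address, e.g. 0000:00:12.2
    for dev in "$drvdir"/[0-9a-f]*:*; do
        [ -e "$dev" ] || continue
        addr=$(basename "$dev")
        echo "unbinding $addr from $driver"
        printf '%s' "$addr" > "$drvdir/unbind"
    done
}

# Usage (as root, from a network login if the keyboard hangs off OHCI):
#   unbind_pci_driver ehci-pci
#   unbind_pci_driver ohci-pci
```

Whether this changes the shutdown behaviour on the thin client is untested here; it only performs the unbind at the host controller level.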

As the next approach, I changed the kernel config so that ehci and ohci
are modules instead of being compiled into the kernel. Then I booted the
"bad" kernel and ran

rmmod ehci-pci
rmmod ehci-hcd

The keyboard/mouse still continued to work on my system (which btw is
running Ubuntu 16.04 for these tests). But now it's getting strange:

- if I shut down the system at this point with "init 0" from a root shell
  it performs a shutdown, and it turns off! Yeah.

- if I shut down the system at this point by using the shutdown menu from
  the Ubuntu menu then the shutdown ends up in a kernel panic.

Both results are reproducible. "init 0" shuts the system down and keeps
it off, shutdown from the menu crashes.

Since keyboard/mouse are still functional without the ehci stuff, I tried
to blacklist the ehci modules by putting a blacklist-ehci.conf file into
/etc/modprobe.d/ that had 2 lines:
blacklist ehci_pci
blacklist ehci_hcd

I also rebuilt the initrd image, but I really couldn't get rid of those
modules; after every new start lsmod still showed the ehci modules
despite the blacklist entries.
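
A possible explanation is that a "blacklist" line only makes modprobe ignore the module's built-in device aliases, so a module can still arrive through an explicit request or as a dependency of another module. A commonly used, stricter variant adds "install" lines that replace the load with a command that fails. The sketch below writes such a file; the function name is hypothetical, the target directory is a parameter so the result can be inspected before copying it to /etc/modprobe.d, and whether this is enough on this system is an assumption that still needs testing.

```shell
#!/bin/sh
# Write a modprobe.d fragment that blacklists the ehci modules and also
# overrides explicit load requests for them.
# The target directory is a parameter; on a real system it would be
# /etc/modprobe.d (needs root).
write_ehci_block_conf() {
    dir="$1"
    cat > "$dir/blacklist-ehci.conf" <<'EOF'
# Ignore the modules' built-in device aliases (no automatic matching).
blacklist ehci_pci
blacklist ehci_hcd
# Replace any load request with a command that does nothing and fails.
install ehci_pci /bin/false
install ehci_hcd /bin/false
EOF
    echo "$dir/blacklist-ehci.conf"
}

# Usage (as root), followed by an initramfs rebuild on Ubuntu:
#   write_ehci_block_conf /etc/modprobe.d
#   update-initramfs -u
```

None of this has any effect if the drivers are compiled into the kernel; it only applies to the modular build.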

The next step was disabling ehci support in the kernel config. After
rebuilding everything I now have a bad kernel without ehci support that
boots up and is able to handle keyboard and mouse, and when I shut down
the system (even from the menu) it shuts down and stays off. So now it
seems to behave like the "good" kernel.
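
To keep track of which variant a given build actually is (built-in, modular, or without ehci), the relevant options can be read back from the kernel's .config. This is a minimal sketch: the function name is hypothetical, and the list of options is my assumption of the symbols that matter for the EHCI and OHCI PCI host controller drivers.

```shell
#!/bin/sh
# Print the state of the EHCI/OHCI host controller options in a .config.
# "=y" means built in, "=m" means module, anything else is reported as unset.
ehci_config_report() {
    config="${1:-.config}"
    for opt in CONFIG_USB_EHCI_HCD CONFIG_USB_EHCI_PCI \
               CONFIG_USB_OHCI_HCD CONFIG_USB_OHCI_HCD_PCI; do
        if line=$(grep -E "^$opt=" "$config"); then
            echo "$line"
        else
            echo "$opt is not set"
        fi
    done
}

# Usage in a kernel build tree:
#   ehci_config_report .config
# The options can be changed with the kernel's own helper script, e.g.:
#   scripts/config --disable USB_EHCI_HCD   # drop EHCI support
#   scripts/config --module USB_EHCI_HCD    # build it as a module
```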

So at least we would have a workaround, but I would really prefer to be
able to blacklist those modules, because then our partner could build his
own kernel for the thin client system in the usual way and the
"workaround" would simply be to keep the ehci modules from loading.

Makes me really wonder if something is wrong with the ehci part of the
hardware on that machine. Well, we also shipped one system to AMD for
further analysis, maybe they can explain this strange behaviour.

Thanks a lot for your input, it was really helpful.

Best regards
Rainer
-- 
Dipl.-Inf. (FH) Rainer Koenig
Project Manager Linux Clients
FJ EMEIA PR PSO PM&D CCD ENG SW OSS&C

Fujitsu Technology Solutions
Bürgermeister-Ullrich-Str. 100
86199 Augsburg
Germany

Telephone: +49-821-804-3321
Telefax:   +49-821-804-2131
Mail:      mailto:Rainer.Koenig@xxxxxxxxxxxxxx

Internet         ts.fujtsu.com
Company Details  ts.fujitsu.com/imprint.html