Re: Missing USB XHCI and EHCI reset for kexec

On Tue, Apr 15, 2014 at 08:42:58PM +0200, Stefani Seibold wrote:
> On Tuesday, 15.04.2014, at 15:33 -0300, Thadeu Lima de Souza Cascardo
> wrote:
> > On Tue, Apr 15, 2014 at 05:00:28PM +0200, stefani@xxxxxxxxxxx wrote:
> > > 
> > > Quoting Thadeu Lima de Souza Cascardo <cascardo@xxxxxxxxxxxxxxxxxx>:
> > > 
> > > >On Tue, Apr 15, 2014 at 12:04:17PM +0200, stefani@xxxxxxxxxxx wrote:
> > > >>
> > > >>Quoting Thadeu Lima de Souza Cascardo <cascardo@xxxxxxxxxxxxxxxxxx>:
> > > >>
> > > >>>On Mon, Apr 14, 2014 at 05:44:58PM +0200, stefani@xxxxxxxxxxx wrote:
> > > >>>>
> > > >>>>Quoting Benjamin Herrenschmidt <benh@xxxxxxxxxxx>:
> > > >>>>
> > > >>>>>I don't know about EHCI specifically but this is a known issue with
> > > >>>>>XHCI, I observe similar issues on other powerpc platforms (servers)
> > > >>>>>and this isn't architecture specific (it looks more like it is
> > > >>>>>specific to the actual xHC implementation).
> > > >>>>>
> > > >>>>>Thadeu Cascardo (on CC) has been the one investigating that on our side,
> > > >>>>>he might have more to add including patches.
> > > >>>>>
> > > >>>>
> > > >>>>I now have a kernel 3.14 dmesg log of the problem. After a kexec, the
> > > >>>>kexec'ed 3.14 kernel shows:
> > > >>>>
> > > >>>>[    1.170029] xhci_hcd 0001:03:00.0: xHCI Host Controller
> > > >>>>[    1.175306] xhci_hcd 0001:03:00.0: new USB bus registered, assigned bus number 1
> > > >>>>[    1.212561] xhci_hcd 0001:03:00.0: Host not halted after 16000 microseconds.
> > > >>>>[    1.219621] xhci_hcd 0001:03:00.0: can't setup: -110
> > > >>>>[    1.224597] xhci_hcd 0001:03:00.0: USB bus 1 deregistered
> > > >>>>[    1.230021] xhci_hcd 0001:03:00.0: init 0001:03:00.0 fail, -110
> > > >>>>[    1.235955] xhci_hcd: probe of 0001:03:00.0 failed with error -110
> > > >>>>
> > > >>>
> > > >>>What are your controller's vendor and device IDs? Is that a TI chip?
> > > >>>
> > > >>
> > > >>Yes it is a TI chip, vendor ID 104c and product ID 8241.
> > > >>
> > > >>>Can you check if the patch I sent a month ago fixes it? [1] There's the
> > > >>>whole story there. In fact, you will also need something like the patch
> > > >>>below. Can you apply only the first one, verify, and, then, the other
> > > >>>one as well, and report what worked for you?
> > > >>>
> > > >>>[1] http://marc.info/?l=linux-usb&m=139483181809062&w=2
> > > >>>
> > > >>
> > > >>I tried the attached patch and it did not help. This is what I
> > > >>expected, because it is a fix in the shutdown path, which is
> > > >>never called when doing a forced kexec.
> > > >
> > > >Hi, Stefani.
> > > >
> > > >Did you try with both patches applied? How do you invoke the forced
> > > >kexec? Is that a kexec on panic? Does it really need to be forced? With
> > > >no clean shutdown, platform and drivers would need to issue resets, like
> > > >you mentioned below, to get the system into a clean state.
> > > >
> > > 
> > > Yes, I applied both patches, but without success.
> > > 
> > > IMHO it is necessary to bring the device into a clean state
> > > when the driver uses the HW.
> > > 
> > > >>
> > > >>I am running a 3.10.23 kernel. This kernel does a kexec into a
> > > >>3.14 kernel. Since the 3.10.23 kernel did not perform a clean
> > > >>shutdown, the state of the XHCI controller is undefined. So when
> > > >
> > > >And the clean shutdown requires both of my patches, for TI chips, as far
> > > >as I know. It looks like the problem is issuing a halt when there are
> > > >pending URBs.
> > > >
> > > >>kernel 3.14 probes XHCI, it will find an XHCI controller which has
> > > >>not been reset.
> > > >>
> > > >
> > > >The problem is not that a reset hasn't been issued. A PCI function reset
> > > >should fix most of the problems with a bad device state, when the reset
> > > >works. However, the problem is that it was not cleanly shut down. URBs
> > > >should have been canceled and removed from the controller queue, and it
> > > >should have halted after that.
> > > 
> > > Again, I think it is the job of the driver to bring the chip into a clean
> > > state before using it. A driver should never expect a chip to be in its
> > > reset state.
> > > 
> > > >
> > > >>So I think it is necessary to reset the XHCI controller and all
> > > >>devices on this bus. This is what I do with
> > > >>"echo 1 > /sys/bus/pci/drivers/xhci_hcd/0001:03:00.0/reset"
> > > >>before the kexec.
> > > >>
> > > >
> > > >One way to look at that is making the PCI code issue resets to all buses
> > > >before doing any other access. That will make booting slower, and
> > > >there are a lot of other corner cases where this might not be enough.
> > > >It's probably more sane to try to get the 3.10.23 kernel to do a clean
> > > >shutdown, if possible.
> > > >
> > > 
> > > With this driver design, the kexec functionality is useless on PowerPC.
> > > X86 looks a little bit better.
> > > 
> > > - Stefani
> > > 
> > > 
> > 
> > What are the vendor and device IDs you are using on your X86 system? This
> > is not a matter of what architecture you are using, it's the XHCI
> > controller which does not behave as well as the one you are using on
> > X86, which is likely an Intel one.
> > 
> 
> It is an Intel 8086:8c31. But this was only a side note. We need a
> generic solution not a vendor specific one. Otherwise kexec is useless
> on other architectures.
> 
> - Stefani
> 
> 

It's probably "useless" on X86 with a TI XHCI board. I just don't have
such an environment to test. Can you arrange to test that? If that proves
me wrong, we certainly need to investigate this even further.
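
For anyone reproducing this, the workaround described above (a PCI
function reset through sysfs before the forced kexec) could be scripted
roughly as below. This is only a sketch under assumptions: the function
names and the SYSFS_ROOT override are hypothetical and exist only so the
script can be dry-run against a scratch directory, and it uses the
/sys/bus/pci/devices/<address>/reset attribute rather than the path under
drivers/xhci_hcd quoted earlier. Whether the reset actually leaves the TI
controller in a usable state is exactly what still needs testing.

```shell
#!/bin/sh
# Sketch: issue a PCI function reset through sysfs before a forced kexec.
# SYSFS_ROOT is a hypothetical override for dry runs; on a real system it
# stays at /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the sysfs reset attribute for a PCI address such as 0001:03:00.0.
pci_reset_path() {
    printf '%s/bus/pci/devices/%s/reset\n' "$SYSFS_ROOT" "$1"
}

# Write 1 to the reset attribute. Returns 1 if the attribute is missing,
# which is the case when the device exposes no reset method.
pci_reset() {
    path=$(pci_reset_path "$1")
    if [ ! -e "$path" ]; then
        echo "no reset attribute for $1" >&2
        return 1
    fi
    echo 1 > "$path"
}
```

Run as root, something like "pci_reset 0001:03:00.0 && kexec -e" would
then reset the controller from this thread right before jumping into the
already loaded kernel.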

Thanks.
Cascardo.
